pandas 数据框每天重新采样,没有日期时间索引

pandas dataframe resample per day without date time index( pandas 数据框每天重新采样,没有日期时间索引)
本文介绍了 pandas 数据框每天重新采样,没有日期时间索引的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着跟版网的小编来一起学习吧!

问题描述

我有一个如下形式的熊猫数据框:

I have a dataframe in pandas of the following form:

      timestamps         light
7   2004-02-28 00:58:45 150.88
26  2004-02-28 00:59:45 143.52
34  2004-02-28 01:00:45 150.88
42  2004-02-28 01:01:15 150.88
59  2004-02-28 01:02:15 150.88

这里注意索引不是时间戳列.但我想重新采样(或以某种方式对数据进行分类)以反映每分钟、每小时、每天等光柱的平均值.我研究了 pandas 提供的 resample 方法,它需要数据框有一个数据时间索引以便该方法工作(除非我误解了这一点).

Here note that the index is not the timestamps column. But I want to resample (or bin the data somehow) to reflect the average value of the light column per minute , hour, day etc.. I have looked into the resample method that pandas offers and it requires the dataframe to have a datatime index for the method to work (unless I've misunderstood this).

  1. 所以我的第一个问题是,我可以重新索引数据帧以将时间戳作为索引(请注意,并非每一行都有唯一的时间戳,对于每个时间戳,大约有 30 行具有相同的时间戳,每个代表一个传感器).

  1. So my first question is, can I re-index the dataframe to have timestamps as the index (note that not each row has a unique timestamp and for each timestamp, there are about 30 rows with the same timestamp,each representing a sensor).

如果没有,是否有其他方法可以实现另一个数据帧,该数据帧具有每小时、每天、每月等的平均值?

If not, is there some other way to possibly achieve another dataframe which has the average value of light per hour , per day , per month etc..?

任何帮助将不胜感激.

推荐答案

你是对的 - 需要 DatetimeIndex, TimedeltaIndexPeriodIndex 否则错误:

You are right - need DatetimeIndex, TimedeltaIndex or PeriodIndex else error:

TypeError:仅对 DatetimeIndex、TimedeltaIndex 或 PeriodIndex 有效,但获得了 'Index' 的实例

TypeError: Only valid with DatetimeIndex, TimedeltaIndex or PeriodIndex, but got an instance of 'Index'

所以你必须先 reset_indexset_index 如果原始 index 很重要:

So you have to first reset_index and set_index if original index is important:

print (df.reset_index().set_index('timestamps'))
                     index   light
timestamps                        
2004-02-28 00:58:45      7  150.88
2004-02-28 00:59:45     26  143.52
2004-02-28 01:00:45     34  150.88
2004-02-28 01:01:15     42  150.88
2004-02-28 01:02:15     59  150.88

如果不只是 set_index:

if not only set_index:

print (df.set_index('timestamps'))
                      light
timestamps                 
2004-02-28 00:58:45  150.88
2004-02-28 00:59:45  143.52
2004-02-28 01:00:45  150.88
2004-02-28 01:01:15  150.88
2004-02-28 01:02:15  150.88

然后 resample:

print (df.reset_index().set_index('timestamps').resample('1D').mean())
            index    light
timestamps                
2004-02-28   33.6  149.408

这篇关于 pandas 数据框每天重新采样,没有日期时间索引的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持跟版网!

本站部分内容来源互联网,如果有图片或者内容侵犯了您的权益,请联系我们,我们会在确认后第一时间进行删除!

相关文档推荐

Seasonal Decomposition of Time Series by Loess with Python(Loess 用 Python 对时间序列进行季节性分解)
Resample a time series with the index of another time series(使用另一个时间序列的索引重新采样一个时间序列)
How can I simply calculate the rolling/moving variance of a time series in python?(如何在 python 中简单地计算时间序列的滚动/移动方差?)
How to use Dynamic Time warping with kNN in python(如何在python中使用动态时间扭曲和kNN)
Keras LSTM: a time-series multi-step multi-features forecasting - poor results(Keras LSTM:时间序列多步多特征预测 - 结果不佳)
Python pandas time series interpolation and regularization(Python pandas 时间序列插值和正则化)