pandas.Series() creation using DataFrame columns returns NaN data entries




I'm attempting to convert a DataFrame into a Series using code which, simplified, looks like this:

import pandas as pd

dates = ['2016-1-{}'.format(i) for i in range(1, 21)]
values = list(range(20))
data = {'Date': dates, 'Value': values}
df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])
# Attempt to build a date-indexed Series from the 'Value' column
ts = pd.Series(df['Value'], index=df['Date'])
print(ts)


However, the printed output looks like this:

2016-01-01   NaN
2016-01-02   NaN
2016-01-03   NaN
2016-01-04   NaN
2016-01-05   NaN
2016-01-06   NaN
2016-01-07   NaN
2016-01-08   NaN
2016-01-09   NaN
2016-01-10   NaN
2016-01-11   NaN
2016-01-12   NaN
2016-01-13   NaN
2016-01-14   NaN
2016-01-15   NaN
2016-01-16   NaN
2016-01-17   NaN
2016-01-18   NaN
2016-01-19   NaN
2016-01-20   NaN
Name: Value, dtype: float64

Where does the NaN come from? Is a view on a DataFrame object not a valid input for the Series class?

I have found the to_series function for pd.Index objects; is there something similar for DataFrames?


I think you can use values, which converts the Value column to a NumPy array:

ts = pd.Series(df['Value'].values, index=df['Date'])

import pandas as pd

dates = ['2016-1-{}'.format(i) for i in range(1, 21)]
values = list(range(20))
data = {'Date': dates, 'Value': values}
df = pd.DataFrame(data)
df['Date'] = pd.to_datetime(df['Date'])
print(df['Value'].values)
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19]

# .values drops the integer index, so the values are assigned by position
ts = pd.Series(df['Value'].values, index=df['Date'])
print(ts)

2016-01-01     0
2016-01-02     1
2016-01-03     2
2016-01-04     3
2016-01-05     4
2016-01-06     5
2016-01-07     6
2016-01-08     7
2016-01-09     8
2016-01-10     9
2016-01-11    10
2016-01-12    11
2016-01-13    12
2016-01-14    13
2016-01-15    14
2016-01-16    15
2016-01-17    16
2016-01-18    17
2016-01-19    18
2016-01-20    19
dtype: int64


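Alternatively, build the Series directly from the original lists, skipping the DataFrame:

ts1 = pd.Series(data=values, index=pd.to_datetime(dates))
print(ts1)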
2016-01-01     0
2016-01-02     1
2016-01-03     2
2016-01-04     3
2016-01-05     4
2016-01-06     5
2016-01-07     6
2016-01-08     7
2016-01-09     8
2016-01-10     9
2016-01-11    10
2016-01-12    11
2016-01-13    12
2016-01-14    13
2016-01-15    14
2016-01-16    15
2016-01-17    16
2016-01-18    17
2016-01-19    18
2016-01-20    19
dtype: int64
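
On the question of whether DataFrames have something similar to to_series: one option, which is not part of the original answer, is to move the Date column into the index with set_index and then select the Value column. The sketch below assumes the same column names as the question and uses zero-padded date strings for simplicity; ts2 is just an illustrative name.

```python
import pandas as pd

# Same shape of data as in the question (zero-padded dates are an assumption)
dates = ['2016-01-{:02d}'.format(i) for i in range(1, 21)]
df = pd.DataFrame({'Date': pd.to_datetime(dates), 'Value': list(range(20))})

# Make 'Date' the index first, then pick the column. The values travel
# with their rows, so no reindexing happens and nothing becomes NaN.
ts2 = df.set_index('Date')['Value']
print(ts2.head())
```

Unlike the .values approach, this keeps the Series name (Value) and the index name (Date).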

Thanks to @ajcr for a better explanation of why you get NaN:

When you give a Series or DataFrame column to pd.Series, it will reindex it using the index you specify. Since your DataFrame column has an integer index (not a date index) you get lots of missing values.
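
To make the reindexing explanation concrete, here is a minimal sketch on a three-row frame (the small frame and the variable names aligned, reindexed and positional are illustrative, not from the original post). It shows that passing a Series together with an index behaves like calling reindex, and that stripping the old index first avoids the problem.

```python
import pandas as pd

df = pd.DataFrame({
    'Date': pd.to_datetime(['2016-01-01', '2016-01-02', '2016-01-03']),
    'Value': [0, 1, 2],
})

# df['Value'] is labelled 0, 1, 2. Asking for date labels finds no
# matches, so every entry is NaN and the dtype is upcast to float64.
aligned = pd.Series(df['Value'], index=df['Date'])
print(aligned.isna().all())   # True
print(aligned.dtype)          # float64

# The constructor call above gives the same result as an explicit reindex.
reindexed = df['Value'].reindex(df['Date'])
print(aligned.equals(reindexed))

# A plain array carries no labels, so values are assigned by position.
positional = pd.Series(df['Value'].to_numpy(), index=df['Date'])
print(positional.tolist())    # [0, 1, 2]
```

Here to_numpy() plays the same role as .values in the answer above.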
