Convert strings to float in all pandas columns, where this is possible

Question

I created a pandas DataFrame from a list of lists:

import numpy as np
import pandas as pd

df_list = [["a", "1", "2"], ["b", "3", np.nan]]
df = pd.DataFrame(df_list, columns = list("ABC"))
print (df)
   A  B    C
0  a  1    2
1  b  3  NaN

Is there a way to convert every column of the DataFrame that can be converted to float, i.e. B and C? The following works if you know which columns to convert:

  df[["B", "C"]] = df[["B", "C"]].astype("float")

But what do you do if you don't know in advance which columns contain numbers? When I tried

  df = df.astype("float", errors = "ignore")

all columns were still strings/objects. Similarly,

df[["B", "C"]] = df[["B", "C"]].apply(pd.to_numeric)

converts both columns (though "B" becomes int and "C" becomes float, because of the NaN value), but

df = df.apply(pd.to_numeric)

obviously raises an error, and I don't see a way to suppress it.
Is it possible to perform this string-to-float conversion without looping through each column and trying .astype("float", errors = "ignore") on it?

Recommended answer

I think you need the parameter errors='ignore' in to_numeric:

df = df.apply(pd.to_numeric, errors='ignore')
print (df.dtypes)
A     object
B      int64
C    float64
dtype: object
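
As a side note, newer pandas releases (2.2 and later, as far as I know) deprecate errors='ignore' in to_numeric and suggest catching the exception explicitly instead. A minimal sketch of that idea follows; the helper name to_numeric_if_possible is hypothetical and not part of pandas:

```python
import numpy as np
import pandas as pd


def to_numeric_if_possible(column: pd.Series) -> pd.Series:
    """Return the column as a numeric dtype, or unchanged if it cannot be parsed."""
    try:
        return pd.to_numeric(column)
    except (ValueError, TypeError):
        # At least one value is not numeric, so keep the column as it is.
        return column


df = pd.DataFrame([["a", "1", "2"], ["b", "3", np.nan]], columns=list("ABC"))

# apply() calls the helper once per column, so no explicit loop is needed.
converted = df.apply(to_numeric_if_possible)
print(converted.dtypes)
```

Column A should stay as it was, while B and C should come back with numeric dtypes.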

errors='ignore' works well as long as a column does not mix numbers with non-numeric strings; a mixed column is left unchanged as object:

df_list = [["a", "t", "2"], ["b", "3", np.nan]]
df = pd.DataFrame(df_list, columns = list("ABC"))

df = df.apply(pd.to_numeric, errors='ignore')
print (df)
   A  B    C
0  a  t  2.0  <= "t" added to column B to create mixed values
1  b  3  NaN

print (df.dtypes)
A     object
B     object
C    float64
dtype: object
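
If the numeric entries of such a mixed column are still wanted, one option (not part of the original answer) is errors='coerce', which replaces unparseable values with NaN. A small sketch, assuming it is acceptable to lose the "t" entry in column B:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([["a", "t", "2"], ["b", "3", np.nan]], columns=list("ABC"))

# Coerce only column B: entries that cannot be parsed, such as "t", become NaN.
b_numeric = pd.to_numeric(df["B"], errors="coerce")
print(b_numeric)
```

Applying this to every column would turn a purely textual column such as A into all NaN, so it only makes sense for columns that are expected to be mostly numeric.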

You can also downcast ints to floats (shown here with the original DataFrame from the question):

df = df.apply(pd.to_numeric, errors='ignore', downcast='float')
print (df.dtypes)
A     object
B    float32
C    float32
dtype: object

The same, written with a lambda:

df = df.apply(lambda x: pd.to_numeric(x, errors='ignore', downcast='float'))
print (df.dtypes)
A     object
B    float32
C    float32
dtype: object
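
To see what downcast='float' does in isolation, here is a small sketch on a single integer Series. The variable names are only illustrative. The float32 result matches the output shown above, and the pandas documentation describes 'float' as the smallest float dtype, with np.float32 as the minimum:

```python
import numpy as np
import pandas as pd

ints = pd.Series([1, 3], dtype="int64")

# downcast="float" asks to_numeric for the smallest float dtype that can hold the data.
floats = pd.to_numeric(ints, downcast="float")
print(ints.dtype, "->", floats.dtype)
```

Since float32 carries less precision than float64, this is mainly useful when memory matters more than exactness for large values.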
