emax

Reputation: 7235

How to get a list from a list?

I read a pandas DataFrame df from a .csv file. Each cell of the DataFrame contains a string like the following:

for i in df.index:
    for j in df.columns:
        df[i][j]
        # '[0.109, 0.1455, 0.0, 1.80e-48, 42.070, -14.582]'

I would like to get a list with the values as np.float. I tried:

df[i][j].split()
['[0.109,',
 '0.1455,',
 '0.0,',
 '1.80e-48,',
 '42.070,',
 '-14.582]']

Upvotes: 2

Views: 85

Answers (4)

jpp

Reputation: 164673

You can use ast.literal_eval. I also recommend avoiding chained indexing; use pd.DataFrame.at for fast scalar access instead. Note that iterating over a DataFrame directly yields its column labels, so you don't need to access pd.DataFrame.columns:

from ast import literal_eval

for i in df.index:
    for j in df:
        print(literal_eval(df.at[i, j]))

If you need to apply this to an entire series, you can use pd.Series.map or a list comprehension:

df['col1'] = df['col1'].map(literal_eval)
df['col1'] = [literal_eval(i) for i in df['col1']]
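Since the question says every cell holds such a string, the same mapping can be repeated per column. A minimal sketch, assuming a small hypothetical frame (the column names "a" and "b" and the sample values are made up for illustration):

```python
from ast import literal_eval

import pandas as pd

# Hypothetical two-column frame standing in for the data read from the CSV.
df = pd.DataFrame({
    "a": ["[0.109, 0.1455]", "[0.0, 1.80e-48]"],
    "b": ["[42.070, -14.582]", "[1.0, 2.0]"],
})

# Parse each column in turn; every cell becomes a list of built-in floats.
parsed = pd.DataFrame({col: df[col].map(literal_eval) for col in df})

print(parsed.at[0, "a"])
```

The result still stores Python lists in object columns, so it is convenient for lookup but not for vectorised arithmetic.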

If each list has the same number of items, I strongly suggest you split them into separate columns to permit vectorised operations:

df = df.join(pd.DataFrame(df.pop('col1').map(literal_eval).values.tolist()))

Pandas is not designed to hold lists in series, and for big data workflows you will likely face efficiency and memory issues with such a data structure.
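To illustrate what splitting into columns buys you, here is a small sketch on made-up data (the "id" column and the values are assumptions, not from the question). After the split, the new columns are plain float64 and support column-wise arithmetic:

```python
from ast import literal_eval

import pandas as pd

# Hypothetical frame: one ordinary column plus one column of list-like strings.
df = pd.DataFrame({
    "id": [10, 20],
    "col1": ["[1.0, 2.0, 3.0]", "[4.0, 5.0, 6.0]"],
})

# Parse the strings, then expand each list into its own numeric column.
values = df.pop("col1").map(literal_eval).tolist()
df = df.join(pd.DataFrame(values, index=df.index))

# The expanded columns are labelled 0, 1, 2 and hold float64 values.
row_sums = df[[0, 1, 2]].sum(axis=1)
print(row_sums.tolist())
```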

Upvotes: 0

Guimoute

Reputation: 4629

Without external modules, it's pretty easy to do with a list comprehension:

A = df[i][j]                      # '[0.109, 0.1455, 0.0, 1.80e-48, 42.070, -14.582]'
B = A.strip("[]").split(",")      # ['0.109', ' 0.1455', ' 0.0', ' 1.80e-48', ' 42.070', ' -14.582']
C = [float(x) for x in B]         # [0.109, 0.1455, 0.0, 1.8e-48, 42.07, -14.582]

So the one-liner would be:

My_list_of_floats = [float(x) for x in df[i][j].strip("[]").split(",")]
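If you want to reuse this, the same steps can be wrapped in a small function. This is only a sketch: the name parse_float_list is hypothetical, and the empty-list check is an addition that the one-liner above does not have (float('') would raise on a cell containing '[]').

```python
def parse_float_list(text):
    """Parse a string such as '[0.1, 2.0]' into a list of floats."""
    inner = text.strip().strip("[]")
    # An empty list ('[]') would otherwise yield [''] and make float() fail.
    if not inner.strip():
        return []
    # float() ignores the leading space left behind by split(",").
    return [float(item) for item in inner.split(",")]


values = parse_float_list('[0.109, 0.1455, 0.0, 1.80e-48, 42.070, -14.582]')
print(values)
```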

Upvotes: 2

Ravi Patel

Reputation: 366

You can use Python's built-in eval() function to convert the string into a Python object, then convert the items to np.float objects:

map(np.float, eval(df[i][j]))

This turns the string into a Python list first, then casts each item to np.float.

Since np.float was just an alias for the built-in float (the alias was deprecated in NumPy 1.20 and removed in 1.24), you can skip the cast to np.float and just do

eval(df[i][j])
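One caveat worth sketching: eval() evaluates any expression it is given, whereas ast.literal_eval (used in the other answers) only accepts literals. Assuming the cell contents might not be fully trusted, the difference looks roughly like this (the sample strings are made up for illustration):

```python
from ast import literal_eval

cell = '[0.109, 0.1455, 0.0, 1.80e-48, 42.070, -14.582]'

# The parsed elements are already built-in floats, so no extra cast is needed.
values = literal_eval(cell)

# A string containing a function call is rejected rather than executed.
try:
    literal_eval("__import__('os').getcwd()")
    rejected = False
except ValueError:
    rejected = True

print(values, rejected)
```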

Upvotes: 0

blhsing

Reputation: 106553

You can use ast.literal_eval to parse the string as a list of floats:

>>> import ast
>>> ast.literal_eval('[0.109, 0.1455, 0.0, 1.80e-48, 42.070, -14.582]')
[0.109, 0.1455, 0.0, 1.8e-48, 42.07, -14.582]
>>>
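If NumPy floats are really what you are after, one option is to pass the parsed list to np.array. A minimal sketch, assuming the string from the question; the variable names are only illustrative:

```python
import ast

import numpy as np

cell = '[0.109, 0.1455, 0.0, 1.80e-48, 42.070, -14.582]'

# Parse the literal, then build a float64 array from the resulting list.
arr = np.array(ast.literal_eval(cell), dtype=np.float64)

print(arr.dtype, arr.shape)
```

Each element of arr is then an np.float64, and the array supports vectorised operations directly.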

Upvotes: 4
