galaxyan

Reputation: 6111

pandas convert list to float

How can I convert columns b and c to float, and also expand column b into two columns?

Example dataframe:

    a                              b             c
0  36   [-212828.804308, 100000067.554]  [-3079773936.0]
1  39  [-136.358761948, -50000.0160325]  [1518911.64408]
2  40  [-136.358761948, -50000.0160325]  [1518911.64408]

Expected:

    a              b1              b2              c
0  36  -212828.804308   100000067.554  -3079773936.0
1  39  -136.358761948  -50000.0160325  1518911.64408
2  40  -136.358761948  -50000.0160325  1518911.64408

Upvotes: 4

Views: 6026

Answers (3)

titipata

Reputation: 5389

I extended @ayhan's solution so that the expanded columns also get renamed, which is useful when several columns hold lists. Note that I assume all lists within a column have the same length.

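# Build the new column names: a list column of length n becomes col1..coln.
col_names = []
for col in df.columns:
    if df[col].dtype == 'O' and len(df[col].iloc[0]) > 1:
        col_names.extend([col + str(i + 1) for i in range(len(df[col].iloc[0]))])
    else:
        col_names.extend([col])

# Expand every column into its own DataFrame, then join them side by side.
df_new = pd.concat([df[col].apply(pd.Series) for col in df], axis=1)
df_new.columns = col_names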
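
The renaming idea above can also be wrapped in a small function. This is only a sketch: `expand_list_columns` is a hypothetical helper name, not a pandas function, the sample frame is copied from the question, and it uses `tolist()` instead of `apply(pd.Series)`. It assumes all lists within a column have the same length and that column names are strings.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "a": [36, 39, 40],
    "b": [[-212828.804308, 100000067.554],
          [-136.358761948, -50000.0160325],
          [-136.358761948, -50000.0160325]],
    "c": [[-3079773936.0], [1518911.64408], [1518911.64408]],
})


def expand_list_columns(frame):
    """Hypothetical helper: split list-valued columns into numbered columns."""
    parts = []
    for col in frame.columns:
        if isinstance(frame[col].iloc[0], list):
            # One new column per list element, keeping the original row labels.
            part = pd.DataFrame(frame[col].tolist(), index=frame.index)
            width = part.shape[1]
            if width == 1:
                part.columns = [col]  # single-element lists keep the old name
            else:
                part.columns = [f"{col}{i + 1}" for i in range(width)]
        else:
            part = frame[[col]]  # non-list columns pass through unchanged
        parts.append(part)
    return pd.concat(parts, axis=1)


df_expanded = expand_list_columns(df)
print(df_expanded.dtypes)
```

Because the lists already contain Python floats, the expanded columns come out as float64 without an explicit cast.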

Upvotes: 1

Serenity

Reputation: 36635

Construct new columns from 'b' and then drop 'b'. Column 'c' can be replaced in place.

df[['b1','b2']] = pd.DataFrame([x for x in df.b]) # new b1,b2
df.drop('b',axis=1,inplace=True) # drop b
df['c'] = pd.DataFrame([x for x in df.c]) # remove list from c
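
One thing to watch with this approach: assigning a DataFrame to columns aligns on index labels, so if df does not have the default 0..n-1 index, the new values may not line up with the original rows. A minimal variant, assuming the question's sample data and a deliberately non-default index chosen for illustration, passes `index=df.index` and casts 'c' explicitly:

```python
import pandas as pd

# Sample data from the question; the index labels 10, 20, 30 are an
# assumption made here to show why index alignment matters.
df = pd.DataFrame({
    "a": [36, 39, 40],
    "b": [[-212828.804308, 100000067.554],
          [-136.358761948, -50000.0160325],
          [-136.358761948, -50000.0160325]],
    "c": [[-3079773936.0], [1518911.64408], [1518911.64408]],
}, index=[10, 20, 30])

# index=df.index keeps the expanded rows aligned with the original rows.
df[["b1", "b2"]] = pd.DataFrame(df["b"].tolist(), index=df.index)
df = df.drop(columns="b")

# Each list in 'c' holds a single value: take it and cast to float.
df["c"] = df["c"].map(lambda values: values[0]).astype(float)

# Reorder to match the layout asked for in the question.
df = df[["a", "b1", "b2", "c"]]
print(df.dtypes)
```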

Upvotes: 2

user2285236

Here are two alternatives:

1) Convert each column to a list, then construct a DataFrame from it:

pd.concat((df['a'], pd.DataFrame(df['b'].tolist()), pd.DataFrame(df['c'].tolist())), axis=1)
Out: 
    a              0             1             0
0  36 -212828.804308  1.000001e+08 -3.079774e+09
1  39    -136.358762 -5.000002e+04  1.518912e+06
2  40    -136.358762 -5.000002e+04  1.518912e+06

Or in a loop:

pd.concat((pd.DataFrame(df[col].tolist()) for col in df), axis=1)
Out: 
    0              0             1             0
0  36 -212828.804308  1.000001e+08 -3.079774e+09
1  39    -136.358762 -5.000002e+04  1.518912e+06
2  40    -136.358762 -5.000002e+04  1.518912e+06

2) Apply pd.Series to each column (possibly slower):

pd.concat((df[col].apply(pd.Series) for col in df), axis=1)
Out: 
    0              0             1             0
0  36 -212828.804308  1.000001e+08 -3.079774e+09
1  39    -136.358762 -5.000002e+04  1.518912e+06
2  40    -136.358762 -5.000002e+04  1.518912e+06
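
To check the "possibly slower" remark on your own data, the two alternatives can be timed side by side. This is a rough sketch only: the function names `via_tolist` and `via_apply`, the repetition factor, and the number of timing runs are arbitrary choices made here, and the timings will vary with machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Sample data from the question, repeated to get a slightly larger frame.
df = pd.DataFrame({
    "a": [36, 39, 40],
    "b": [[-212828.804308, 100000067.554],
          [-136.358761948, -50000.0160325],
          [-136.358761948, -50000.0160325]],
    "c": [[-3079773936.0], [1518911.64408], [1518911.64408]],
})
big = pd.concat([df] * 200, ignore_index=True)


def via_tolist(frame):
    # Alternative 1: build a DataFrame from each column's list of values.
    return pd.concat([pd.DataFrame(frame[col].tolist()) for col in frame], axis=1)


def via_apply(frame):
    # Alternative 2: apply pd.Series to every element of each column.
    return pd.concat([frame[col].apply(pd.Series) for col in frame], axis=1)


# Both approaches should yield the same numbers.
same_values = np.allclose(
    via_tolist(big).to_numpy(dtype=float),
    via_apply(big).to_numpy(dtype=float),
)

t_tolist = timeit.timeit(lambda: via_tolist(big), number=3)
t_apply = timeit.timeit(lambda: via_apply(big), number=3)
print(f"same values: {same_values}")
print(f"tolist: {t_tolist:.4f}s  apply(pd.Series): {t_apply:.4f}s")
```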

Upvotes: 4
