hek18

Reputation: 57

Convert an array column into multiple columns Python

I have a dataframe in the following format:

0 [[2387, 1098], [1873, 6792], ....

1 [0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, ...

I need to split the array in the first column into two further columns. I have seen similar questions, but their solutions are written for small data; I have around 300 rows and can't write them all out manually. I tried to_list(), but it raises an error.

What code should I use to split it into two? Also, why is my dataframe displayed as rows rather than as columns?

Upvotes: 0

Views: 2437

Answers (2)

furas

Reputation: 142879

EDIT:

It may look strange, but you can use .str[0] to get the first element of each list in a DataFrame column.

import pandas as pd

df = pd.DataFrame({0:[[2387, 1098], [1873, 6792],], 1:[0,1]})

new_df = pd.DataFrame({
    0: df[0].str[0],
    1: df[0].str[1],
    2: df[1],
})

print(new_df)
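
The reason .str[0] works here is that the .str accessor indexes each element of the column, and it accepts lists as well as strings. A minimal sketch on the same sample frame; the `ragged` series at the end is a hypothetical extra, added only to show what happens when a list is too short:

```python
import pandas as pd

# Same sample frame as above: column 0 holds [x, y] pairs.
df = pd.DataFrame({0: [[2387, 1098], [1873, 6792]], 1: [0, 1]})

first = df[0].str[0]   # element 0 of every list
second = df[0].str[1]  # element 1 of every list

print(first.tolist())   # [2387, 1873]
print(second.tolist())  # [1098, 6792]

# Hypothetical data: a list that is too short gives a missing value
# for that row instead of raising an error.
ragged = pd.Series([[1, 2], [3]])
print(ragged.str[1].isna().tolist())
```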

OLDER:

Using apply() with pandas.Series, you can convert the first column into a new DataFrame with two columns:

import pandas as pd

df = pd.DataFrame({0:[[2387, 1098], [1873, 6792],], 1:[0,1]})

new_df = df[0].apply(pd.Series)

print(new_df)

Result:

      0     1
0  2387  1098
1  1873  6792

Later you can assign them back to the old DataFrame:

df[2] = df[1]       # move `[0,1,...]` to column 2
df[[0,1]] = new_df  # put `new_df` in columns 0,1

Result:

      0     1  2
0  2387  1098  0
1  1873  6792  1

Or you can copy the column [0,1,...] from the old df into new_df:

import pandas as pd

df = pd.DataFrame({0:[[2387, 1098], [1873, 6792],], 1:[0,1]})

new_df = df[0].apply(pd.Series)
new_df[2] = df[1]

print(new_df)
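
With a few hundred rows apply(pd.Series) is fine, but it builds one Series per row and tends to get slow on large frames. A sketch of an alternative, assuming every list in column 0 has the same length: pass the plain list of lists to the DataFrame constructor and reuse the original index so the rows stay aligned.

```python
import pandas as pd

df = pd.DataFrame({0: [[2387, 1098], [1873, 6792]], 1: [0, 1]})

# Build both new columns in one step from a list of lists,
# keeping the original index so rows line up with df.
new_df = pd.DataFrame(df[0].tolist(), index=df.index)
new_df[2] = df[1]

print(new_df)
```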

Upvotes: 1

XtianP

Reputation: 389

You can convert your dataframe in this way:

import pandas as pd
import numpy as np

df = pd.DataFrame({0:[[2387, 1098], [1873, 6792],], 1:[0,1]})
arr = np.array(df.loc[:,0].to_list())
df2 = pd.DataFrame({0:arr[:,0], 1:arr[:,1], 2:df.loc[:,1]})
print(df2)

The result is:

      0     1  2
0  2387  1098  0
1  1873  6792  1

A second way to solve the problem, shown here with scikit-learn's "moons" sample dataset, is:

import pandas as pd
import sklearn
import sklearn.datasets

X, y = sklearn.datasets.make_moons()
pd.DataFrame({'x0':X[:,0], 'x1': X[:,1], 'y':y})

and the result is:

          x0        x1  y
0   0.981559  0.191159  0
1   0.967948 -0.499486  1
2   0.018441  0.308841  1
3  -0.981559  0.191159  0
4   0.967295  0.253655  0
..       ...       ... ..
95  0.238554 -0.148228  1
96  0.096023  0.995379  0
97  0.327699 -0.240278  1
98  0.900969  0.433884  0
99  1.981559  0.308841  1

[100 rows x 3 columns]
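
The printout in the question suggests that the whole array of pairs sits in row 0 and the whole label array in row 1, which would explain why the data shows up as rows instead of columns. If that is the case, the same pattern applies once the two arrays are pulled out. A sketch under that assumption; the name `raw` and its two-sample contents are hypothetical stand-ins for the real data:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data: a two-element Series where
# row 0 holds every [x0, x1] pair and row 1 holds every label.
pairs = [[2387, 1098], [1873, 6792]]
labels = [0, 1]
raw = pd.Series([pairs, labels], dtype=object)

X = np.array(raw.iloc[0])  # shape (n_samples, 2)
y = np.array(raw.iloc[1])  # shape (n_samples,)

# One row per sample, one column per feature plus the label.
df = pd.DataFrame({'x0': X[:, 0], 'x1': X[:, 1], 'y': y})
print(df)
```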

Upvotes: 1
