Reputation: 49
I have a DataFrame with one column called cardinality that holds a list of n numbers in each row. I would like to iterate over each row, take the first element from the list, and put it into a new column, repeating this until I have n columns.
Input:
import pandas as pd
input_data = pd.DataFrame({'cardinality': [[0,1,0,0,0], [1,0,1,0,0], [0,0,1,3,0], [0,0,0,1,1]]})
Desired output:
c_0 c_1 c_2 c_3 c_4
0 1 0 0 0
1 0 1 0 0
0 0 1 3 0
0 0 0 1 1
Here is my code:
iterator = 0
for row in input_data.itertuples():
    for cardinality in input_data['cardinality']:
        col_name = 'c_' + str(iterator)
        input_data[col_name] = [item[0] for item in input_data['cardinality']]
        input_data['cardinality'] = input_data['cardinality'].apply(lambda x: x.pop(0))
        iterator += 1
It seems like the .pop method is not the right way to remove the first item from the list.
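A minimal sketch of what appears to go wrong, using a small stand-in Series (the name s and its values are made up for illustration, not taken from the code above): list.pop(0) returns the removed element rather than the shortened list, so the column ends up holding integers and the next item[0] lookup has nothing to index.

```python
import pandas as pd

# Hypothetical stand-in for the 'cardinality' column.
s = pd.Series([[0, 1, 0], [1, 0, 1]])

# pop(0) removes the first element in place and RETURNS that element,
# so apply() produces a Series of the popped integers.
popped = s.apply(lambda x: x.pop(0))

print(popped.tolist())  # the popped first elements
print(s.tolist())       # the original lists, now one element shorter
```

Assigning popped back to the column therefore replaces the lists with plain integers, which would explain why a second pass over the column fails.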
Upvotes: 2
Views: 93
Reputation: 29635
You can use tolist on the series to create a new dataframe, then add_prefix to name the columns as expected.
output = pd.DataFrame(input_data['cardinality'].tolist()).add_prefix('c_')
print(output)
   c_0  c_1  c_2  c_3  c_4
0    0    1    0    0    0
1    1    0    1    0    0
2    0    0    1    3    0
3    0    0    0    1    1
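For completeness, here is a self-contained sketch of the same idea that also keeps the new columns next to the original one. Passing index= and calling join are additions for illustration, not part of the one-liner above; the names expanded and combined are made up.

```python
import pandas as pd

input_data = pd.DataFrame(
    {'cardinality': [[0, 1, 0, 0, 0], [1, 0, 1, 0, 0],
                     [0, 0, 1, 3, 0], [0, 0, 0, 1, 1]]}
)

# Turn the column of lists into a list of lists and build a new frame.
# index= keeps the rows aligned with input_data even when its index
# is not the default 0..n-1.
expanded = pd.DataFrame(
    input_data['cardinality'].tolist(), index=input_data.index
).add_prefix('c_')

# Optionally attach the new columns to the original frame.
combined = input_data.join(expanded)
print(combined)
```

This assumes every list has the same length; with ragged lists the shorter rows would be padded with missing values.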
Upvotes: 3
Reputation: 15872
You can use the pd.DataFrame constructor and rename:
>>> pd.DataFrame(input_data['cardinality'].tolist()).rename(columns='c_{}'.format)
   c_0  c_1  c_2  c_3  c_4
0    0    1    0    0    0
1    1    0    1    0    0
2    0    0    1    3    0
3    0    0    0    1    1
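The rename call works because columns= accepts a function that is applied to each column label, and 'c_{}'.format is such a function. A small sketch on a made-up two-column frame (df, renamed, and same are illustrative names) showing that it behaves like an explicit lambda:

```python
import pandas as pd

# Hypothetical frame with the default integer column labels 0 and 1.
df = pd.DataFrame([[0, 1], [1, 0]])

# The bound method 'c_{}'.format is called once per column label.
renamed = df.rename(columns='c_{}'.format)

# The same mapping written out as a lambda.
same = df.rename(columns=lambda label: f'c_{label}')

print(list(renamed.columns))
```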
Upvotes: 3