Rory LM

Reputation: 180

Tuple elements to dataframe column in python

I have 2D lists containing 0-3 sets of pairs (data will always be paired).

examples:

[[2.0, 0.1], [7.0, 0.6], [1.0, 0.3]] or
[[9.0, 0.7], [1.0, 0.2]]             or
[[]]

I want to be able to append each element of each pair to its own column in an existing dataframe.

Desired dataframe using above data:

other_data,    pair_0_0, pair_0_1, pair_1_0, pair_1_1, pair_2_0, pair_2_1
'blah',        2.0,      0.1,      7.0,      0.6,      1.0,      0.3    
'blah blah',   9.0,      0.7,      1.0,      0.2
'blaah'       

It needs to be able to handle nulls, and preserve the order of the list.

I've tried the following, but it raises an IndexError if I don't have exactly 3 pairs.

df.loc[len(df)] = ['blah blah', list2D[0][0], list2D[0][1], list2D[1][0], list2D[1][1], list2D[2][0], list2D[2][1]]

I think it would involve a list comprehension, but I'm not sure how to do it.

Upvotes: 3

Views: 1199

Answers (2)

Dos

Reputation: 2507

Another simple way is to use a Python dict comprehension to build the row for a new entry e:

row = {f'pair_{j}_{i}': e[j][i] for j in range(len(e)) for i in range(len(e[j]))}

Example:

e1 = [[2.0, 0.1], [7.0, 0.6], [1.0, 0.3]]
e2 = [[9.0, 0.7], [1.0, 0.2]]
e3 = [[]]

import pandas as pd

rows = []
for e in [e1, e2, e3]:
    row = {f'pair_{j}_{i}': e[j][i] for j in range(len(e)) for i in range(len(e[j]))}
    rows.append(row)

# DataFrame.append was removed in pandas 2.0, so build the frame from the collected rows
df = pd.DataFrame(rows)

print(df)

   pair_0_0  pair_0_1  pair_1_0  pair_1_1  pair_2_0  pair_2_1
0       2.0       0.1       7.0       0.6       1.0       0.3
1       9.0       0.7       1.0       0.2       NaN       NaN
2       NaN       NaN       NaN       NaN       NaN       NaN
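The question also has an other_data column and a frame that already exists. The same dict comprehension can be folded into a small helper for that case. This is only a minimal sketch: the helper name pairs_to_row and the starting one-row frame are assumptions for illustration, not something from the question.

```python
import pandas as pd

def pairs_to_row(other_data, pairs):
    """Build one row dict; pairs_to_row is a hypothetical helper name."""
    row = {'other_data': other_data}
    # Empty inner lists (e.g. [[]]) contribute no keys, so those cells end up as NaN.
    row.update({f'pair_{j}_{i}': value
                for j, pair in enumerate(pairs)
                for i, value in enumerate(pair)})
    return row

# Assumed existing frame that already holds one row.
df = pd.DataFrame([pairs_to_row('blah', [[2.0, 0.1], [7.0, 0.6], [1.0, 0.3]])])

new_rows = [pairs_to_row('blah blah', [[9.0, 0.7], [1.0, 0.2]]),
            pairs_to_row('blaah', [[]])]

# pd.concat aligns on column names, so rows with fewer pairs are padded with NaN.
df = pd.concat([df, pd.DataFrame(new_rows)], ignore_index=True)
print(df)
```

Because the keys are generated in list order, the pair order is preserved within each row.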

Upvotes: 1

Chris Adams

Reputation: 18647

How about numpy.ravel in a list comprehension:

import numpy as np
import pandas as pd

l1 = [[2.0, 0.1], [7.0, 0.6], [1.0, 0.3]]
l2 = [[9.0, 0.7], [1.0, 0.2]]
l3 = [[]]

df = pd.DataFrame([np.ravel(x) for x in [l1, l2, l3]])

# Fix column headers
df.columns = [f'pair_{x//2}_{x%2}' for x in range(df.shape[1])]

[out]

   pair_0_0  pair_0_1  pair_1_0  pair_1_1  pair_2_0  pair_2_1
0       2.0       0.1       7.0       0.6       1.0       0.3
1       9.0       0.7       1.0       0.2       NaN       NaN
2       NaN       NaN       NaN       NaN       NaN       NaN
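To see why the header fix works: np.ravel flattens each 2D list in row-major order, so flat position x belongs to pair x//2 and is element x%2 within that pair. A small sketch of that mapping, where the variable names pairs, flat and names are only illustrative:

```python
import numpy as np

pairs = [[9.0, 0.7], [1.0, 0.2]]

# Row-major flatten to 1D: 9.0, 0.7, 1.0, 0.2
flat = np.ravel(pairs)

# Flat position x -> pair index x // 2, element index x % 2
names = [f'pair_{x // 2}_{x % 2}' for x in range(len(flat))]

print(dict(zip(names, flat.tolist())))
```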

Update

For example, to append an individual list to an existing DataFrame, use:

l4 = [[3.0, 0.2], [6.0, 0.8], [1.2, 0.6]]

# DataFrame.append was removed in pandas 2.0; pd.concat gives the same result
pd.concat([df, pd.DataFrame([np.ravel(l4)]).rename(columns=lambda x: f'pair_{x//2}_{x%2}')])

[out]

   pair_0_0  pair_0_1  pair_1_0  pair_1_1  pair_2_0  pair_2_1
0       2.0       0.1       7.0       0.6       1.0       0.3
1       9.0       0.7       1.0       0.2       NaN       NaN
2       NaN       NaN       NaN       NaN       NaN       NaN
0       3.0       0.2       6.0       0.8       1.2       0.6

Or, to build a DataFrame from scratch, use pandas.concat in a loop:

df = pd.DataFrame()

for l in [l1, l2, l3]:
    df = pd.concat([df, pd.DataFrame([np.ravel(l)]).rename(columns=lambda x: f'pair_{x//2}_{x%2}')],
                   sort=True)
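If you would rather keep the df.loc[len(df)] = [...] style from the question, one option is to flatten each list and pad it to a fixed width first, so every row has the same number of values. This is a hedged sketch, not the method from either answer: flatten_and_pad is a hypothetical helper, and MAX_PAIRS = 3 assumes the question's stated upper bound of 3 pairs.

```python
from itertools import chain

import pandas as pd

MAX_PAIRS = 3  # assumption: the question says there are at most 3 pairs
columns = ['other_data'] + [f'pair_{j}_{i}' for j in range(MAX_PAIRS) for i in range(2)]

def flatten_and_pad(pairs, width=2 * MAX_PAIRS):
    """Flatten [[a, b], [c, d]] to [a, b, c, d], then pad with None up to width.

    flatten_and_pad is a hypothetical helper, not part of pandas or numpy.
    """
    flat = list(chain.from_iterable(pairs))
    return flat + [None] * (width - len(flat))

df = pd.DataFrame(columns=columns)
for label, pairs in [('blah', [[2.0, 0.1], [7.0, 0.6], [1.0, 0.3]]),
                     ('blah blah', [[9.0, 0.7], [1.0, 0.2]]),
                     ('blaah', [[]])]:
    # Every row now has exactly len(columns) values, so indexing never runs off the end.
    df.loc[len(df)] = [label] + flatten_and_pad(pairs)

print(df)
```

Adding rows one at a time like this is fine for a handful of rows; for many rows, collecting them in a list and building the frame once is usually faster.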

Upvotes: 3
