Reputation: 5646
Sample code:
import pandas as pd
df = pd.DataFrame({'id': [1, 2, 3], 'bbox': [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [9.0, 10.0, 11.0, 12.0]]})
Goal:
df = pd.DataFrame({'id': [1, 2, 3], 'bbox': [[1.0, 2.0, 3.0, 4.0], [5.0, 6.0, 7.0, 8.0], [9.0, 10.0, 11.0, 12.0]], 'x1': [1, 5, 9], 'y1': [2, 6, 10], 'x2': [4, 12, 20], 'y2': [6, 14, 22]})
In words, I want to add four integer columns to the dataframe: the first two are just the first two elements of each list in bbox, and the last two are, respectively, the sum of the first and third elements of each list and the sum of the second and fourth. Currently, I do this:
df[['x1', 'y1', 'w', 'h']] = pd.DataFrame(df['bbox'].values.tolist(), index=df.index).astype(int)
df = df.assign(x2=df['x1'] + df['w'], y2=df['y1'] + df['h'])
df = df.drop(['w', 'h'], axis=1)
It seems a bit convoluted to me. Is there a way to avoid creating the intermediate columns w and h, or would that make the code less readable? Readability is a higher priority for me than saving one line of code, so if there is no readable alternative, I'll settle for this solution.
Upvotes: 1
Views: 81
Reputation: 862591
I think you can create x2 and y2 in the first step:
df1 = pd.DataFrame(df['bbox'].values.tolist(),index=df.index).astype(int)
df[['x1', 'y1', 'x2', 'y2']] = df1
df = df.assign(x2 = df['x1']+df['x2'], y2 = df['y1']+df['y2'])
print(df)
   id                     bbox  x1  y1  x2  y2
0   1     [1.0, 2.0, 3.0, 4.0]   1   2   4   6
1   2     [5.0, 6.0, 7.0, 8.0]   5   6  12  14
2   3  [9.0, 10.0, 11.0, 12.0]   9  10  20  22
Or use +=:
df1 = pd.DataFrame(df['bbox'].values.tolist(),index=df.index).astype(int)
df[['x1', 'y1', 'x2', 'y2']] = df1
df['x2'] += df['x1']
df['y2'] += df['y1']
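If you would rather not have x2 and y2 briefly hold the width and height, another possible variant is to do the arithmetic in NumPy before any column is added. This is only a sketch: boxes and corners are names I picked for illustration, and it assumes every bbox list has exactly four numeric values in x, y, w, h order.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3],
                   'bbox': [[1.0, 2.0, 3.0, 4.0],
                            [5.0, 6.0, 7.0, 8.0],
                            [9.0, 10.0, 11.0, 12.0]]})

# Assumption: each bbox is [x, y, w, h]; build an (n, 4) integer array.
boxes = np.array(df['bbox'].tolist()).astype(int)

# Turn (w, h) into (x2, y2) by adding (x1, y1) in place.
boxes[:, 2:] += boxes[:, :2]

# Attach the four finished columns in one step, aligned on the original index.
corners = pd.DataFrame(boxes, columns=['x1', 'y1', 'x2', 'y2'], index=df.index)
df = df.join(corners)
```

Whether this reads better than the two += lines is a matter of taste; it mainly keeps the dataframe free of half-computed columns.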
Upvotes: 3