Reputation: 43
I have a very large dataframe containing the following columns:
RegAddress.CareOf,RegAddress.POBox,RegAddress.AddressLine1,RegAddress.AddressLine2,RegAddress.PostTown,RegAddress.County,RegAddress.Country,RegAddress.PostCode
I am inserting this dataframe (loaded from a CSV) into a relational database, so I would like to convert these columns into a single column, RegAddress, containing a dictionary with the keys CareOf, POBox, AddressLine1, and so on. I cannot figure out how to do this in a vectorised fashion, i.e. go from:
RegAddress.CareOf,RegAddress.POBox
Me,2
You,3
to:
RegAddress
{"CareOf": "Me", "POBox": 2}
{"CareOf": "You", "POBox": 3}
efficiently.
Upvotes: 0
Views: 95
Reputation: 867
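You can use the .apply() method with axis=1 to build one dictionary per row. Note that this calls a Python function for every row, so it is not truly vectorised: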
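import pandas as pd

# df is the DataFrame loaded from the CSV
selected_cols = ['RegAddress.CareOf', 'RegAddress.POBox']
df2 = pd.DataFrame()
df2['RegAddress'] = df.apply(
    lambda row: {
        # use the part after the "RegAddress." prefix as the key
        col.split('.')[1]: row[col] for col in row.index
        if col in selected_cols
    },
    axis=1
)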
Result:
RegAddress
0 {'CareOf': 'Me', 'POBox': 2}
1 {'CareOf': 'You', 'POBox': 3}
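If the row-wise .apply() turns out to be too slow on a very large dataframe, one alternative worth trying is DataFrame.to_dict(orient="records"), which builds all the dictionaries in a single call. The following is only a sketch: the sample df stands in for the real CSV data, and the names prefix, address_cols, and records are made up for illustration. It still creates one Python dict per row internally, so it is not vectorised in the NumPy sense, but it avoids constructing a Series for every row and is usually faster in practice.

```python
import pandas as pd

# Sample data standing in for the real CSV (assumption for illustration).
df = pd.DataFrame({
    "RegAddress.CareOf": ["Me", "You"],
    "RegAddress.POBox": [2, 3],
})

prefix = "RegAddress."  # hypothetical name; the shared column prefix
address_cols = [c for c in df.columns if c.startswith(prefix)]

# Strip the prefix from the column names, then convert every row to a dict
# in one call instead of calling a lambda per row.
records = (
    df[address_cols]
    .rename(columns=lambda c: c[len(prefix):])
    .to_dict(orient="records")
)

# Wrap the list of dicts in a single-column frame aligned to the original index.
df2 = pd.Series(records, index=df.index, name="RegAddress").to_frame()
print(df2)
```

Selecting the columns by prefix means the full set (AddressLine1, PostTown, and so on) is picked up without listing each name by hand.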
Upvotes: 1