Reputation: 3717
I have created a dataframe from csv which has values like this
position, location
1,0,1, 1.2,2.2
based on comma separated values I want to create sub columns like shown in the diagram. No of comma separated will remain same for all. For e.g. If one location value has 2 fields then all of them will have 2 fields. So I need to separate these values by comma and create sub columns based on that.
Upvotes: 1
Views: 1737
Reputation: 71689
Use Series.str.split
on each of the column and create a new dataframe from each of this column, then use pd.concat
along axis=1
and with optional parameter keys
to concat the dataframes creating a MultiIndex
columns.
dfs = [pd.DataFrame(df[c].astype(str).str.split(',').tolist())
.rename(lambda x: f'Field{x + 1}', axis=1) for c in df.columns]
df1 = pd.concat(dfs, axis=1, keys=df.columns)
Result:
position location
Field1 Field2 Field3 Field1 Field2
0 1 0 1 1.2 2.2
Upvotes: 1
Reputation: 31146
I think simplest way to achieve this is to use a dict
comprehension from the dictionary produced by to_dict(orient="list")
This could be extended to deal with multiple before rows
df = pd.read_csv(io.StringIO("""position location
1,0,1 ,1.2,2.2"""), sep="\s\s+", engine="python")
d = df.to_dict(orient="list")
df2 = pd.DataFrame({tuple([k, f"field{e+1}"]):[i]
for k in d.keys()
for e,i in enumerate(d[k][0].strip(",").split(","))})
output
position location
field1 field2 field3 field1 field2
1 0 1 1.2 2.2
Upvotes: 0