user1298426
user1298426

Reputation: 3717

Create sub column in csv based on field values

I have created a dataframe from csv which has values like this

position, location
1,0,1, 1.2,2.2

based on comma separated values I want to create sub columns like shown in the diagram. No of comma separated will remain same for all. For e.g. If one location value has 2 fields then all of them will have 2 fields. So I need to separate these values by comma and create sub columns based on that.

enter image description here

Upvotes: 1

Views: 1737

Answers (2)

Shubham Sharma
Shubham Sharma

Reputation: 71689

Use Series.str.split on each of the column and create a new dataframe from each of this column, then use pd.concat along axis=1 and with optional parameter keys to concat the dataframes creating a MultiIndex columns.

dfs = [pd.DataFrame(df[c].astype(str).str.split(',').tolist())
                   .rename(lambda x: f'Field{x + 1}', axis=1) for c in df.columns]      
df1 = pd.concat(dfs, axis=1, keys=df.columns)

Result:

    position               location       
    Field1 Field2 Field3   Field1 Field2
0        1      0      1      1.2    2.2

Upvotes: 1

Rob Raymond
Rob Raymond

Reputation: 31146

I think simplest way to achieve this is to use a dict comprehension from the dictionary produced by to_dict(orient="list")

This could be extended to deal with multiple before rows

df = pd.read_csv(io.StringIO("""position  location

1,0,1   ,1.2,2.2"""), sep="\s\s+", engine="python")

d = df.to_dict(orient="list")
df2 = pd.DataFrame({tuple([k, f"field{e+1}"]):[i] 
 for k in d.keys() 
 for e,i in enumerate(d[k][0].strip(",").split(","))})

output

position               location       
  field1 field2 field3   field1 field2
       1      0      1      1.2    2.2

Upvotes: 0

Related Questions