Ruslan

Reputation: 423

How to parse a field in a CSV file to create additional rows in a pandas DataFrame?

Imagine I have a CSV file:

user, visits, session_time, payment
1, home|deals|cart, 224, 500
2, home|cart|orders|account, 545, 600

The second field lists the pages viewed by the user, separated by |.

How can I create a pandas dataframe with the following structure:

user page    order    session_time  payment
1    home    1        224           500
1    deals   2        224           500
...
2    account 4        545           600

where the order field gives each page's position within the visits field of the CSV file:

home|deals|cart 
1    2     3

Upvotes: 1

Views: 38

Answers (1)

Mark Wang

Reputation: 2757

Steps:

  • Split the visits column into lists (Series.str.split)
  • Expand each list into one row per page (DataFrame.explode)
  • Assign the order number within each user (groupby, then reset_index())
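# Assumes df was read with pd.read_csv(..., skipinitialspace=True),
# so that the column names carry no leading spaces.
(df.assign(page=df.visits.str.split('|'))    # list of pages per user
   .explode(column='page')                    # one row per visited page
   .groupby('user')
   .apply(lambda g: g.reset_index().rename(lambda i: i + 1))  # number rows 1..n per user
   .rename_axis([None, 'order'])              # name the inner index level 'order'
   .reset_index()                             # turn 'order' into a regular column
   .filter(['user', 'page', 'order', 'session_time', 'payment']))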
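
For a self-contained check, the same three steps can also be sketched with groupby(...).cumcount() in place of groupby(...).apply(...). This is an alternative, not the code above. The inline CSV text is the sample from the question, and skipinitialspace=True is an assumption made here to strip the blank that follows each comma in that sample.

```python
import io

import pandas as pd

# Sample data copied from the question; the inline string stands in for the real file.
csv_text = """user, visits, session_time, payment
1, home|deals|cart, 224, 500
2, home|cart|orders|account, 545, 600
"""

# skipinitialspace=True drops the blank after each comma (assumed file layout).
df = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)

result = (
    df.assign(page=df["visits"].str.split("|"))  # list of pages per user
      .explode("page")                           # one row per visited page
      .reset_index(drop=True)                    # replace the duplicated index labels
)

# cumcount() numbers the rows 0..n-1 within each user, keeping the visit order.
result["order"] = result.groupby("user").cumcount() + 1

# Keep only the columns asked for, in the requested order.
result = result[["user", "page", "order", "session_time", "payment"]]
print(result)
```

cumcount() relies on the rows of each user still being in visit order, which explode preserves.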

Upvotes: 1
