JP1
JP1

Reputation: 749

pandas - make multiple rows for each column

I am trying to convert some data into a more useful format from .xls to .csv with pandas.

I have loaded the data like so:

xls = pd.ExcelFile('file.xls')

which returns a data frame which looks like:

Name    Event1    Date1    Event2    Date2    Event3    Date3
Joe     jump      1.1.13   skip      1.2.13   hop       1.3.14
Jack    skip      1.2.12   run       1.5.14   NA        NA

I would like to reformat the data so names are repeated multiple times for each event they participated in with their date. i.e.

Name    Event   Date
Joe     jump    1.1.13
Joe     skip    1.2.13

In a way I can remove all NA. I have multiple sheets in xcel which is why i want to concatenate everything in this way.

Is there a simple command or am I stuck with for loops?

thanks!

Upvotes: 0

Views: 1168

Answers (1)

Nickil Maveli
Nickil Maveli

Reputation: 29711

Use pd.lreshape which is the go to method for converting a wide formatted DF to a long one such as this.

This method accepts a dictionary as it's groups parameter wherein the column names starting with a certain prefix are clustered under a single wholesome column.

d = dict(Event=df.filter(regex="^Event").columns, Date=df.filter(regex="^Date").columns)
pd.lreshape(df, d)

enter image description here

Upvotes: 3

Related Questions