Reputation: 749
I am trying to convert some data into a more useful format from .xls to .csv with pandas.
I have loaded the data like so:
xls = pd.ExcelFile('file.xls')
which returns a data frame which looks like:
Name Event1 Date1 Event2 Date2 Event3 Date3
Joe jump 1.1.13 skip 1.2.13 hop 1.3.14
Jack skip 1.2.12 run 1.5.14 NA NA
I would like to reformat the data so names are repeated multiple times for each event they participated in with their date. i.e.
Name Event Date
Joe jump 1.1.13
Joe skip 1.2.13
In a way I can remove all NA. I have multiple sheets in xcel which is why i want to concatenate everything in this way.
Is there a simple command or am I stuck with for loops?
thanks!
Upvotes: 0
Views: 1168
Reputation: 29711
Use pd.lreshape
which is the go to method for converting a wide formatted DF
to a long one such as this.
This method accepts a dictionary as it's groups parameter wherein the column names starting with a certain prefix are clustered under a single wholesome column.
d = dict(Event=df.filter(regex="^Event").columns, Date=df.filter(regex="^Date").columns)
pd.lreshape(df, d)
Upvotes: 3