Reputation: 1815
I have a dataset that looks as follows:
Name : joe
Job : Crazy Consultant
Hired : 4/12/2011 3:38:55 AM
Stats : crazy, bald head
Pay : $5000 Monthly
Name : Matt
Job : Crazy Receptionist
Hired : 4/12/2014 3:38:55 PM
Stats : crazy, Lots of hair
Name : Adam
Job : Crazy Drinker
Hired : 4/12/2017 3:38:55 AM
Stats : crazy, unknown
Term : 4/12/2017 3:38:55 PM
I read the data in as follows:
df = pd.read_csv(r"pathtomycsv.csv", encoding="UTF-16", delimiter='\s+:').transpose()
Output of above: (just as an example)
Name Job Hired Stats Name Job Hired Stats
Joe Crazy Consultant 4/12/2011 3:38:55 AM crazy, bald head Matt Crazy Consultant 4/12/2011 3:38:55 AM crazy, bald head
Ultimately, I would like to transform the dataset above into one row per person, with the union of all field names as the columns, like this:
Name  Job                 Hired                 Stats                Pay            Term
Joe   Crazy Consultant    4/12/2011 3:38:55 AM  crazy, bald head     $5000 Monthly  N/A
Matt  Crazy Receptionist  4/12/2014 3:38:55 PM  crazy, Lots of hair  N/A            N/A
Adam  Crazy Drinker       4/12/2017 3:38:55 AM  crazy, unknown       N/A            4/12/2017 3:38:55 PM
Upvotes: 3
Views: 3917
Reputation: 17054
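You can try it like this:
import pandas as pd
# Split each line into key/value on the whitespace-padded colon, then spread the keys into columns
df = pd.read_csv('file_name', sep=r'\s+:\s+', header=None, engine='python').pivot(columns=0, values=1)
# Number the records: every non-null Name starts a new one
df.index = [df.index, df.Name.notnull().cumsum() - 1]
# Go back to long form, then pivot with the record number as the row index
df = df.stack().reset_index(name='val')
df = df.pivot(index='Name', columns=0, values='val')
df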
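To make the record-numbering step easier to follow, here is a minimal sketch of the cumsum trick on its own. The `keys` Series is made-up illustration data standing in for the key column, and it tests for the literal key 'Name' rather than for non-null values in a pivoted column, but the idea is the same:

```python
import pandas as pd

# Hypothetical key column, as it might look after reading the file
keys = pd.Series(["Name", "Job", "Pay", "Name", "Job", "Name", "Term"])

# True on each 'Name' row; the running sum then numbers the records from 0
record_id = keys.eq("Name").cumsum() - 1
print(record_id.tolist())  # [0, 0, 0, 1, 1, 2, 2]
```

Every row between one 'Name' and the next gets the same number, which is what lets the later pivot put those rows on one line.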
Upvotes: 2
Reputation: 57033
The timestamps contain colons of their own, so the separator has to be specific. Use "\s+:\s+" as the separator, which only matches a colon with whitespace on both sides. (Yes, the separator can be a regex.)
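As a quick check of what that pattern does, here is a small sketch using the standard `re` module on one line from the question (pandas applies the same kind of regex split when given a regex separator with the Python parsing engine):

```python
import re

line = "Hired : 4/12/2011 3:38:55 AM"

# Split on a colon that has whitespace on both sides; the colons inside
# the time have no surrounding whitespace, so they are left alone.
key, value = re.split(r"\s+:\s+", line)
print(key)    # Hired
print(value)  # 4/12/2011 3:38:55 AM
```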
The following code works for me to convert your file into the table you want. I assume that 'Name' is always the first row in a set.
import pandas as pd
df = pd.read_csv("yourfile", delimiter=r'\s+:\s+', header=None, engine='python')
df = df.reset_index()
# Keep the row number only on 'Name' rows, then forward-fill it so it acts as a record id
df['index'] = df['index'].where(df[0] == 'Name').ffill().astype(int)
df.set_index(['index', 0])[1].unstack().set_index('Name')
#0 Hired Job Pay
#Name
#joe 4/12/2011 3:38:55 AM Crazy Consultant $5000 Monthly
#Matt 4/12/2014 3:38:55 PM Crazy Receptionist None
#Adam 4/12/2017 3:38:55 AM Crazy Drinker None
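If the reshaping steps feel opaque, a different route is to build the records in plain Python and hand them to pandas at the end. This is only a sketch, not the method above: it relies on the same assumption that 'Name' opens every record, it uses an inline excerpt of the question's data (`raw`) in place of the real file, and the names `records` and `table` are made up for illustration.

```python
import io
import re

import pandas as pd

# Inline excerpt of the question's data; a real script would open the file instead
raw = io.StringIO(
    "Name : joe\n"
    "Job : Crazy Consultant\n"
    "Pay : $5000 Monthly\n"
    "Name : Adam\n"
    "Hired : 4/12/2017 3:38:55 AM\n"
    "Term : 4/12/2017 3:38:55 PM\n"
)

records = []
for line in raw:
    line = line.strip()
    if not line:
        continue  # skip blank lines
    key, value = re.split(r"\s+:\s+", line, maxsplit=1)
    if key == "Name":  # assumption: 'Name' is the first row of each record
        records.append({})
    records[-1][key] = value

# Keys missing from a record become NaN, which is then shown as N/A
table = pd.DataFrame(records).fillna("N/A")
print(table)
```

Because each record is a dict, fields that only some people have (Pay, Term) simply become extra columns.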
Upvotes: 3