Reputation: 2689
I have a sheet that looks like this.
Fleet Risk Control | Communication | Interpersonal relationships | Demographic | Demographic |
---|---|---|---|---|
Q_21086 | Q_21087 | Q_21088 | AGE | GENDER |
1 | 3 | 4 | 27 | Male |
What I'm trying to achieve is where there is a row with 'Q_' inside of it, merge that cell with the top row and return a new dataframe.
So the existing data above would become something like this:
Fleet Risk Control - Q_21086 | Communication - Q_21087 | Interpersonal relationships - Q_21088 |
---|---|---|
1 | 3 | 4 |
I honestly have no idea where to even begin with something like this.
Upvotes: 0
Views: 449
Reputation: 106
# rename columns
df.columns = [x + ' - ' + y if y.startswith('Q_') else x for x, y in zip(df.columns, df.iloc[0])]
#drop not matching columns
to_drop = [c for c, _ in df.iloc[0].apply(lambda x: not x.startswith('Q_')).items() if _]
df.drop(to_drop, axis=1)[1:]
Upvotes: 0
Reputation: 81
You could try this one. This is for input:
import pandas as pd
df = pd.DataFrame({'Fleet Risk Control': ['Q_21086', 1],
'Communication': ['Q_21087', 3],
'Interpersonal relationships': ['Q_21088', 4],
'Demographic': ['AGE', 27],
'Demographic 2': ['Gender', 'Male']})
Now concat the header line with the first line of df:
df.columns = df.columns + ' - ' + df.iloc[0, :]
Extract every line without the first and dropping the last both columns
df = df.iloc[1:, :-2]
Upvotes: 2