Reputation: 49
I have pandas DataFrame in the below format
|Header|Text |
|------|-------|
| AB |yale |
| BC |grass |
| BC |window |
| AB |school |
| BC |student|
My expectation is to convert that DataFrame into the below format
| AB | BC |
|------|-------|
| yale | grass |
| yale | window|
|school|student|
Until now I can't seem to find a way to get the expected format. kindly need help.
Upvotes: 1
Views: 64
Reputation: 863791
Use DataFrame.insert
for new column for first column with compare AB
in Header
column with Series.eq
and replace non matched values of Text
to misisng values by Series.where
with forward filling them, then reove rows if same values in Text
and AB
column by compare for not equal in Series.ne
and filter only columns specified in DataFrame.loc
:
df.insert(0, 'AB', df['Text'].where(df['Header'].eq('AB')).ffill())
df = df.loc[df['Text'].ne(df['AB']), ['AB','Text']].rename(columns={'Text':'BC'})
print (df)
AB BC
1 yale grass
2 yale window
4 school student
Upvotes: 1