Reputation: 515
I have the following dataframe:
Name rollNumber external_roll_number testDate marks
0 John 34 234 2021-04-28 15
1 John 34 234 2021-03-28 25
I would like to convert it like this:
Name rollNumber external_roll_number testMonth marks testMonth marks
0 John 34 234 April 15 March 25
If the above is not possible then I would atleast want it to be like this:
Name rollNumber external_roll_number testDate marks testDate marks
0 John 34 234 2021-04-28 15 2021-03-28 25
How can I convert my dataframe to the desired output? This change will be based on the Name
column of the rows.
EDIT 1
I tried using pivot_table
like this but I did not get the desired result.
merged_df_pivot = pd.pivot_table(merged_df, index=["name", "testDate"], aggfunc="first", dropna=False).fillna("")
When I try to iterate through the merged_df_pivot
like this:
for index, details in merged_df_pivot.iterrows():
I am again getting two rows and also I was not able to add the new testMonth
column by the above method.
Upvotes: 2
Views: 136
Reputation: 31236
unstack()
month to be columnsdf = pd.read_csv(io.StringIO(""" Name rollNumber external_roll_number testDate marks
0 John 34 234 2021-04-28 15
1 John 34 234 2021-03-28 25
"""), sep="\s+")
df["testDate"] =pd.to_datetime(df["testDate"])
df = df.assign(testMonth = df["testDate"].dt.strftime("%B")).drop(columns="testDate")
dft = (df.set_index([c for c in df.columns if c!="marks"])
.unstack("testMonth") # make month a column
.droplevel(0, axis=1) # remove unneeded level in columns
# create columns for months from column names and rename marks columns
.pipe(lambda d: d.assign(**{f"testMonth_{i+1}":c
for i,c in enumerate(d.columns)}).rename(columns={c:f"marks_{i+1}"
for i,c in enumerate(d.columns)}))
.reset_index()
)
Name | rollNumber | external_roll_number | marks_1 | marks_2 | testMonth_1 | testMonth_2 | |
---|---|---|---|---|---|---|---|
0 | John | 34 | 234 | 15 | 25 | April | March |
Upvotes: 1