Reputation: 21
I have a dataframe with two colums of interest: a column with names (repeated for each course), and a column with the course name they've taken or started. The problem is, if I use the dataframe as-is, if a person has not attempted the course, they won't have a value showing for that course at all, and therefore I won't be able to tell who is missing the course. Is there a way to auto generate the missing rows with a name? The other columns can be N/A or anything else.
Dataframe:
Name | Course Title | Completion Date |
---|---|---|
Mark | Math | 1/1/2021 |
Mark | English | 2/1/2021 |
Mark | Econ | 3/1/2021 |
David | Math | 7/1/2021 |
David | English | 4/1/2021 |
Cindy | Math | 6/1/2021 |
Cindy | English | 9/1/2021 |
Cindy | Econ | 11/1/2021 |
What I'm looking for:
Name | Course Title | Completion Date |
---|---|---|
Mark | Math | 1/1/2021 |
Mark | English | 2/1/2021 |
Mark | Econ | 3/1/2021 |
David | Math | 7/1/2021 |
David | English | 4/1/2021 |
David | Econ | N/A |
Cindy | Math | 6/1/2021 |
Cindy | English | 9/1/2021 |
Cindy | Econ | 11/1/2021 |
Upvotes: 1
Views: 69
Reputation: 28644
You can make missing values explicit with the complete function from pyjanitor:
# pip install pyjanitor
import pandas as pd
import janitor
df.complete('Name', 'Course Title')
Name Course Title Completion Date
0 Cindy Econ 11/1/2021
1 Cindy English 9/1/2021
2 Cindy Math 6/1/2021
3 David Econ NaN
4 David English 4/1/2021
5 David Math 7/1/2021
6 Mark Econ 3/1/2021
7 Mark English 2/1/2021
8 Mark Math 1/1/2021
Upvotes: 1
Reputation: 71570
Try with pivot
and stack
with reset_index
:
>>> df.pivot(*df).stack(dropna=False)[::-1].reset_index(name='Completion Date')
Name Course Title Completion Date
0 Mark Math 1/1/2021
1 Mark English 2/1/2021
2 Mark Econ 3/1/2021
3 David Math 7/1/2021
4 David English 4/1/2021
5 David Econ NaN
6 Cindy Math 6/1/2021
7 Cindy English 9/1/2021
8 Cindy Econ 11/1/2021
>>>
If you specifically want it to be N/A
, try:
>>> df.pivot(*df).stack(dropna=False)[::-1].reset_index(name='Completion Date').fillna('N/A')
Name Course Title Completion Date
0 Mark Math 1/1/2021
1 Mark English 2/1/2021
2 Mark Econ 3/1/2021
3 David Math 7/1/2021
4 David English 4/1/2021
5 David Econ N/A
6 Cindy Math 6/1/2021
7 Cindy English 9/1/2021
8 Cindy Econ 11/1/2021
>>>
Upvotes: 1