Reputation: 65
I have the following dataframe:
Company  Year  Status
A        2021  Unpaid
B        2021  Paid
C        2021  Unpaid
D        2021  Paid
A        2020  Unpaid
B        2020  Unpaid
C        2020  Paid
D        2020  Paid
I want to get a list of the companies that were paid in 2020 but unpaid in 2021 (so just C). I can do this in Excel with no problem, but I can't figure out how to do it in pandas.
Upvotes: 0
Views: 46
Reputation: 3706
You can pivot so that each status becomes a column holding the year, then filter with query:
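import pandas as pd

data = {
    "Company": ["A", "B", "C", "D", "A", "B", "C", "D"],
    "Year": [2021, 2021, 2021, 2021, 2020, 2020, 2020, 2020],
    "Status": ["Unpaid", "Paid", "Unpaid", "Paid", "Unpaid", "Unpaid", "Paid", "Paid"],
}

answer = (
    pd.DataFrame(data)
    # One row per company, one column per status. Each cell is the mean Year
    # (pivot_table's default aggregation), so it equals a single year only
    # when that status occurred in exactly one year.
    .pivot_table(index="Company", columns="Status", values="Year")
    .reset_index()
    # Keep companies that were Paid in 2020 and Unpaid in 2021.
    .query("Paid == 2020 & Unpaid == 2021")
    ["Company"]
    .tolist()
)
print(answer)  # ['C']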
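If averaging the years feels fragile (it would stop working with more than two years of data), a possible alternative is to pivot the other way, with one column per year and the status string in each cell. This is only a sketch under the assumption that each (Company, Year) pair appears once; the names df, wide, mask and paid_then_unpaid are just illustrative. It applies the same condition as the query above (Paid in 2020, Unpaid in 2021).

```python
import pandas as pd

df = pd.DataFrame({
    "Company": ["A", "B", "C", "D", "A", "B", "C", "D"],
    "Year": [2021, 2021, 2021, 2021, 2020, 2020, 2020, 2020],
    "Status": ["Unpaid", "Paid", "Unpaid", "Paid", "Unpaid", "Unpaid", "Paid", "Paid"],
})

# One row per company, one column per year, cells hold the status string.
# Assumption: (Company, Year) pairs are unique; pivot() raises otherwise.
wide = df.pivot(index="Company", columns="Year", values="Status")

# Boolean mask: Paid in 2020 and Unpaid in 2021.
mask = (wide[2020] == "Paid") & (wide[2021] == "Unpaid")

paid_then_unpaid = wide.index[mask].tolist()
print(paid_then_unpaid)  # ['C']
```

Swapping the two status strings in the mask gives the opposite transition (Unpaid in 2020, Paid in 2021), which for this data is B.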
Upvotes: 1