hk2

Reputation: 487

Create a subset by filtering on Year

I have a sample dataset as shown below:

| Id | Year | Price |
|----|------|-------|
| 1  | 2000 | 10    |
| 1  | 2001 | 12    |
| 1  | 2002 | 15    |
| 2  | 2000 | 16    |
| 2  | 2001 | 20    |
| 2  | 2002 | 22    |
| 3  | 2000 | 15    |
| 3  | 2001 | 19    |
| 3  | 2002 | 26    |

I want to subset the dataset so that only the values for the last two years are considered. To do this, I want to create a variable 'end_year', assign a year to it, and use it to filter the original dataframe. I want a variable rather than a hard-coded value because new data keeps arriving. I tried the code below, but I get an error.

end_year="2002"
df1=df[(df['Year'] >= end_year-1)]

Upvotes: 0

Views: 337

Answers (1)

tdy

Reputation: 41327

Per the comments, Year has dtype object in the raw data. In addition, end_year is assigned as the string "2002", so end_year-1 raises a TypeError. We should first cast Year to int and then compare it with a numeric end_year:

df.Year = df.Year.astype(int)  # cast `Year` to `int`
end_year = 2002  # now we can use `int` here too
df1 = df[df['Year'] >= end_year - 1]

Output:

   Id  Year  Price
1   1  2001     12
2   1  2002     15
4   2  2001     20
5   2  2002     22
7   3  2001     19
8   3  2002     26
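
For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same idea. It assumes Year arrives as strings (which is what gives the object dtype), and `last_n_years` and `n_years` are hypothetical names introduced only for illustration. Unlike the plain `>=` comparison above, it also applies an upper bound at `end_year`, which may or may not be what you want once newer data arrives.

```python
import pandas as pd

# Sample data from the question; Year is stored as strings here
# to mimic the object dtype described in the comments (an assumption).
df = pd.DataFrame({
    "Id": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "Year": ["2000", "2001", "2002"] * 3,
    "Price": [10, 12, 15, 16, 20, 22, 15, 19, 26],
})


def last_n_years(frame, end_year, n_years=2):
    """Hypothetical helper: keep rows whose Year lies in the
    n_years-long window ending at end_year (both ends inclusive)."""
    end_year = int(end_year)  # accepts "2002" as well as 2002
    # Work on a copy with Year cast to int so the comparison is numeric.
    out = frame.assign(Year=frame["Year"].astype(int))
    start_year = end_year - n_years + 1
    return out[out["Year"].between(start_year, end_year)]


df1 = last_n_years(df, end_year="2002")
print(df1)
```

Because the helper converts `end_year` with `int()`, passing either `"2002"` or `2002` should work, and the original `df` is left unchanged since `assign` returns a new dataframe.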

Upvotes: 1
