Reputation: 4388
I have a dataset where I would like to replace and update values within a column when a data condition is met.
Data
ID Date State File
AA Q122 NY closed
AA Q222 NY closed
AA Q322 NY closed
AA Q422 NY closed
AA Q123 NY closed
AA Q223 NY closed
AA Q323 NY closed
AA Q423 NY closed
BB Q124 CA closed
BB Q224 CA open
BB Q324 CA open
BB Q424 CA open
BB Q125 CA open
BB Q225 CA open
BB Q325 CA open
BB Q425 CA open
Desired
ID Date State File
AA Q122 NY yes
AA Q222 NY yes
AA Q322 NY yes
AA Q422 NY yes
AA Q123 NY yes
AA Q223 NY closed
AA Q323 NY closed
AA Q423 NY closed
BB Q124 CA closed
BB Q224 CA open
BB Q324 CA open
BB Q424 CA open
BB Q125 CA open
BB Q225 CA open
BB Q325 CA open
BB Q425 CA open
Doing
df.loc[(df['ID'].eq('AA')) & (df['Date'] <= 'Q123'), 'File'] = 'yes'
**However this is not capturing the quarter order and assigns the string 'yes' to rows that do not meet the condition of less than or = to Q123.**
Still researching, any suggestion is appreciated- Perhaps I need to convert quarters to datetime longdate and base the condition off of this column.
Upvotes: 0
Views: 558
Reputation: 11650
here is one way to do it
to apply the condition, Date need to be in Year and Quarter, and then applying np.where condition. However, you need to put the condition in the same format i.e. YYQ (year, quarter)
df['File']= np.where( ( df['ID'] =='AA') &
((df['Date'].str[-2:] + df['Date'].str[1:2]).astype(int) <=231)
, 'yes'
, df['File'] )
df
ID Date State File
0 AA Q122 NY yes
1 AA Q222 NY yes
2 AA Q322 NY yes
3 AA Q422 NY yes
4 AA Q123 NY yes
5 AA Q223 NY closed
6 AA Q323 NY closed
7 AA Q423 NY closed
8 BB Q124 CA closed
9 BB Q224 CA open
10 BB Q324 CA open
11 BB Q424 CA open
12 BB Q125 CA open
13 BB Q225 CA open
14 BB Q325 CA open
15 BB Q425 CA open
Upvotes: 2