Reputation: 1
I want to find in SOFTWARE column which is the new software with respect to VIN column.
for example 'c5D2-14N450-CBQ' to 'c5D2-14N450-CBR'(for both software C column value should be less than or equal to 10) so, 'c5D2-14N450-CBR' is my new software
condition:- Update should be done when the value of column C should be less than or equal to 10
Below is my data frame
import pandas as pd
data = {'VIN': ['aaaa','aaaa','aaaa','aaaa','bbb','bbb','bbb','bbb','CCCC','CCCC','CCCC','CCCC'],
'SOFTWARE': ['P8G2-14B570-PRC','c5D2-14N450-CBR','P8G2-14B570-PRA','c5D2-14N450-CBQ',
'K9A2-13V570-BAI','K9A2-13V570-BAH','K9A2-13V570-BAH','K9A2-13V570-BAH',
'J4E2-12K532-K7N','J4E2-12K532-K7O','J4E2-12K532-K7O','J4E2-12K532-K7N'],
'C': [1,3,15,9,9,12,17,88,3,5,9,10]
}
df = pd.DataFrame(data)
I tried below method but not getting what I expected:
df['RESULT'] = df.apply(lambda x: x['SOFTWARE'] if x['C'] >= 10 else (x['SOFTWARE']), axis=1)
df
I also tried by masking:
mask = df.groupby('VIN')['C'].diff().le(10)
df['Result'] = np.where(mask|mask.groupby(df['VIN']),1,0)
Below is my expected output:
data = {'VIN': ['aaaa','aaaa','aaaa','aaaa','bbb','bbb','bbb','bbb','CCCC','CCCC','CCCC','CCCC'],
'SOFTWARE': ['P8G2-14B570-PRC','c5D2-14N450-CBR','P8G2-14B570-PRA','c5D2-14N450-CBQ',
'K9A2-13V570-BAI','K9A2-13V570-BAH','K9A2-13V570-BAH','K9A2-13V570-BAH',
'J4E2-12K532-K7N','J4E2-12K532-K7O','J4E2-12K532-K7O','J4E2-12K532-K7N'],
'C': [1,3,15,9,9,12,17,88,3,5,9,10],
'RESULT': ['old software','new software','old software','old software','old software','old software',
'old software','old software','old software','new software','new software','old software',]
}
df = pd.DataFrame(data)
print (df)
Upvotes: 0
Views: 51
Reputation: 11474
You should do the following:
import pandas as pd
data = {'VIN': ['aaaa','aaaa','aaaa','aaaa','bbb','bbb','bbb','bbb','CCCC','CCCC','CCCC','CCCC'],
'SOFTWARE': ['P8G2-14B570-PRC','c5D2-14N450-CBR','P8G2-14B570-PRA','c5D2-14N450-CBQ',
'K9A2-13V570-BAI','K9A2-13V570-BAH','K9A2-13V570-BAH','K9A2-13V570-BAH',
'J4E2-12K532-K7N','J4E2-12K532-K7O','J4E2-12K532-K7O','J4E2-12K532-K7N'],
'C': [1,3,15,9,9,12,17,88,3,5,9,10]
}
df = pd.DataFrame(data)
g = df.groupby('VIN')
mask = (g['C'].diff() <= 10) & (g['C'].diff().shift(-1) >= 0)
df['RESULT'] = np.where(mask, 'new software', 'old software')
which returns your desired output:
VIN SOFTWARE C RESULT
0 aaaa P8G2-14B570-PRC 1 old software
1 aaaa c5D2-14N450-CBR 3 new software
2 aaaa P8G2-14B570-PRA 15 old software
3 aaaa c5D2-14N450-CBQ 9 old software
4 bbb K9A2-13V570-BAI 9 old software
5 bbb K9A2-13V570-BAH 12 old software
6 bbb K9A2-13V570-BAH 17 old software
7 bbb K9A2-13V570-BAH 88 old software
8 CCCC J4E2-12K532-K7N 3 old software
9 CCCC J4E2-12K532-K7O 5 new software
10 CCCC J4E2-12K532-K7O 9 new software
11 CCCC J4E2-12K532-K7N 10 old software
Upvotes: 0