Cla B
Cla B

Reputation: 57

Dataframe: cannot replace elements with others in DF

I have imported data with read_csv and have a DataFrame called praw. It has several columns, like 'TIME', 'COUNTRY', 'PRODUCT' and 'VALUE'.

I want to replace the missing data for 'PRODUCT' == product1 with the data I have for product2, given a selected year and country.

I use the following code

praw.loc[(praw['COUNTRY'] == country) & (praw['TIME'] == year) & 
(praw['PRODUCT'] == 'product1'),'Value'] = praw.loc[(praw['COUNTRY'] == country) & 
(praw['TIME'] == year) & (praw['PRODUCT'] == 'product2'),'Value']

If I print

praw.loc[(praw['COUNTRY'] == country) & 
(praw['TIME'] == year) & 
(praw['PRODUCT'] == 'product1'),'Value'] 

I still have NaN. What am I doing wrong?

I am sure the value for 'product2' is not missing :)

Upvotes: 1

Views: 71

Answers (1)

jezrael
jezrael

Reputation: 863166

I think you need fillna with values[0] for replace by first value by m2 condition:

m1 = (praw['COUNTRY'] == country) & (praw['TIME'] == year)& (praw['PRODUCT'] == 'product1')
m2 = (praw['COUNTRY'] == country) & (praw['TIME'] == year)& (praw['PRODUCT'] == 'product2')
praw.loc[m1,'Value'] = praw.loc[m1,'Value'].fillna(praw.loc[m2,'Value'].values[0])

Sample:

praw = pd.DataFrame({'COUNTRY':list('aaadka'),
                   'TIME':[4,4,4,5,4,4],
                   'C':[7,8,9,4,2,3],
                   'D':[1,3,5,7,1,0],
                   'Value':[5,np.nan,np.nan,9,np.nan,0],
                   'PRODUCT':list('aaabbb')})

print (praw)
   C COUNTRY  D PRODUCT  TIME  Value
0  7       a  1       a     4    5.0
1  8       a  3       a     4    NaN
2  9       a  5       a     4    NaN
3  4       d  7       b     5    9.0
4  2       k  1       b     4    NaN
5  3       a  0       b     4    0.0

country = 'a'
year = 4
m1 = (praw['COUNTRY'] == country) & (praw['TIME'] == year)& (praw['PRODUCT'] == 'a')
m2 = (praw['COUNTRY'] == country) & (praw['TIME'] == year)& (praw['PRODUCT'] == 'b')
print (m1)
0     True
1     True
2     True
3    False
4    False
5    False
dtype: bool

print (m2)
0    False
1    False
2    False
3    False
4    False
5     True
dtype: bool

print (praw.loc[m2,'Value'].values[0])
0.0

praw.loc[m1,'Value'] = praw.loc[m1,'Value'].fillna(praw.loc[m2,'Value'].values[0])
print (praw)
   C COUNTRY  D PRODUCT  TIME  Value
0  7       a  1       a     4    5.0
1  8       a  3       a     4    0.0
2  9       a  5       a     4    0.0
3  4       d  7       b     5    9.0
4  2       k  1       b     4    NaN
5  3       a  0       b     4    0.0

Upvotes: 1

Related Questions