Nabih Bawazir

Reputation: 7255

Filling NaN and empty values with another column

I want to fill NaN and empty values with the value from another column. In this case, column barcode_y should be filled from column barcode_x.

Here's my data

    id      barcode_x     barcode_y A   B
0   7068    38927887      38927895  0   12
1   7068    38927895      38927895  0   1
2   7068    39111141      38927895  0   4
3   7116    73094237                18  309
4   7154    37645215      37645215  0   9
5   7342    86972909           NaN  7   25

Here's what I need

    id      barcode_x     barcode_y A   B
0   7068    38927887      38927895  0   12
1   7068    38927895      38927895  0   1
2   7068    39111141      38927895  0   4
3   7116    73094237      73094237  18  309
4   7154    37645215      37645215  0   9
5   7342    86972909      86972909  7   25

How should I do this?

Upvotes: 2

Views: 225

Answers (5)

harpan

Reputation: 8631

You can convert empty values to NaN and then use .fillna().

df['barcode_y'].replace(r'^\s*$', np.nan, regex=True).fillna(df['barcode_x']).astype(int)

Output:

0    38927895
1    38927895
2    38927895
3    73094237
4    37645215
5    86972909
Name: barcode_y, dtype: int32
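
For reference, here is a self-contained sketch of the same replace-then-fillna idea on the question's data. It assumes barcode_y is an object column holding digit strings, one empty string and one NaN; the real column may be stored differently, so treat the construction of df as illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data from the question. Storing barcode_y as an object column of
# strings is an assumption made for this sketch.
df = pd.DataFrame({
    "id": [7068, 7068, 7068, 7116, 7154, 7342],
    "barcode_x": [38927887, 38927895, 39111141, 73094237, 37645215, 86972909],
    "barcode_y": pd.Series(
        ["38927895", "38927895", "38927895", "", "37645215", np.nan],
        dtype=object,
    ),
    "A": [0, 0, 0, 18, 0, 7],
    "B": [12, 1, 4, 309, 9, 25],
})

# Turn empty or whitespace-only cells into NaN, then fill them from barcode_x.
blank_as_nan = df["barcode_y"].replace(r"^\s*$", np.nan, regex=True)
df["barcode_y"] = blank_as_nan.fillna(df["barcode_x"]).astype("int64")

print(df)
```

The pattern is anchored with ^ and $ so that only cells that are entirely blank are treated as missing.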

Upvotes: 1

piRSquared

Reputation: 294258

Using mask

x, y = df['barcode_x'], df['barcode_y']
y.mask(y.eq('') | y.isna(), x)

0    38927895
1    38927895
2    38927895
3    73094237
4    37645215
5    86972909
Name: barcode_y, dtype: object
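
If some cells might contain only spaces rather than a truly empty string, the same mask idea can be wrapped in a small helper. This is just a sketch: fill_blank_from is a hypothetical name, and the sample frame below is made up for illustration.

```python
import numpy as np
import pandas as pd


def fill_blank_from(target: pd.Series, source: pd.Series) -> pd.Series:
    """Return target with NaN or blank-string cells replaced by source."""
    # astype(str) turns NaN into the text "nan", so missing values are
    # detected separately with isna().
    is_blank = target.astype(str).str.strip().eq("")
    return target.mask(target.isna() | is_blank, source)


# Hypothetical sample: one empty string, one NaN, one whitespace-only cell.
df = pd.DataFrame({
    "barcode_x": [38927887, 73094237, 86972909, 37645215],
    "barcode_y": pd.Series(["38927895", "", np.nan, "   "], dtype=object),
})

filled = fill_blank_from(df["barcode_y"], df["barcode_x"])
print(filled)
```

The result keeps dtype object here because the surviving barcode_y values are strings while the filled-in values are integers; cast afterwards if you need a numeric column.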

Upvotes: 4

Narendra Prasath

Reputation: 1531

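Try this:

def fillValues(x):
    # Fall back to barcode_x when barcode_y is NaN or an empty/blank string.
    # pd.isna is used because np.isnan raises TypeError on strings.
    y = x['barcode_y']
    if pd.isna(y) or str(y).strip() == '':
        return x['barcode_x']
    return y

df["barcode_y"] = df.apply(fillValues, axis=1)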
print(df)

Upvotes: -1

Orenshi

Reputation: 1873

I'd use combine_first in this case... especially if barcode_y is not dtype object

df.barcode_y.combine_first(df.barcode_x)
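
As a quick sketch of the non-object case, assume barcode_y was read as float64 with NaN for the missing entries (the values below are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical numeric frame: barcode_y is float64 because it contains NaN.
df = pd.DataFrame({
    "barcode_x": [38927887, 73094237, 86972909],
    "barcode_y": [38927895.0, np.nan, np.nan],
})

# combine_first keeps barcode_y where it is present and takes barcode_x
# where it is NaN; the cast restores an integer column afterwards.
df["barcode_y"] = df["barcode_y"].combine_first(df["barcode_x"]).astype("int64")

print(df)
```

Note that combine_first only treats missing values (NaN/None) as gaps, so empty strings are left untouched.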

If barcode_y is dtype object, you can add an extra step to handle empty strings, as below:

>>> df
   barcode_x barcode_y
0          1         0
1        123      None
2        543
>>> df.barcode_y = df.barcode_y.combine_first(df.barcode_x)
>>> df
   barcode_x barcode_y
0          1         0
1        123       123
2        543
>>> df.loc[df.barcode_y.str.strip()=='', 'barcode_y'] = df.loc[df.barcode_y.str.strip()=='', 'barcode_x']
>>> df
   barcode_x  barcode_y
0          1          0
1        123        123
2        543        543

Upvotes: 0

JE_Muc

Reputation: 5774

I recommend masking to accomplish what you want:

mask = df['barcode_y'].isna()
df.loc[mask, 'barcode_y'] = df.loc[mask, 'barcode_x']

This works regardless of how the columns are ordered, for example whether barcode_y comes before or after barcode_x.
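
A minimal sketch of this boolean-mask assignment, extended (as an assumption beyond the line above) to also catch empty strings. The sample frame is made up, with barcode_y deliberately placed before barcode_x to show that column order does not matter.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; barcode_y comes first on purpose.
df = pd.DataFrame({
    "barcode_y": pd.Series(["38927895", "", np.nan], dtype=object),
    "barcode_x": [38927887, 73094237, 86972909],
})

# Rows where barcode_y is missing or an empty string.
needs_fill = df["barcode_y"].isna() | df["barcode_y"].eq("")

# A single .loc assignment avoids chained indexing, which can silently
# write to a temporary copy instead of df.
df.loc[needs_fill, "barcode_y"] = df.loc[needs_fill, "barcode_x"]

print(df)
```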

Upvotes: 0
