Ting Wang

Reputation: 231

How can I replace NaN in a row with values from another row in Pandas?

I tried several methods to replace the NaN values in one row with the values from another row, but none of them worked as expected. Here is my DataFrame:

import numpy as np
import pandas as pd

test = pd.DataFrame(
    {
        "a": [1, 2, 3, 4, 5],
        "b": [4, 5, 6, np.nan, np.nan],
        "c": [7, 8, 9, np.nan, np.nan],
        "d": [7, 8, 9, np.nan, np.nan],
    }
)

   a    b    c    d
0  1   4.0  7.0  7.0
1  2   5.0  8.0  8.0
2  3   6.0  9.0  9.0
3  4   NaN  NaN  NaN
4  5   NaN  NaN  NaN

I need to replace the NaN values in the 4th row with the values from the first row, i.e.,

   a     b     c     d
0  1   **4.0   7.0   7.0**
1  2    5.0   8.0   8.0
2  3    6.0   9.0   9.0
3  4   **4.0   7.0   7.0**
4  5    NaN   NaN   NaN

My second question: how can I multiply only part of a row by a number? For example, I need to double the values in the second row for the columns ['b', 'c', 'd'], so that the result is:

   a     b     c     d
0  1    4.0   7.0   7.0
1  2   **10.0  16.0  16.0**
2  3    6.0   9.0   9.0
3  4    NaN   NaN   NaN
4  5    NaN   NaN   NaN

Upvotes: 2

Views: 1924

Answers (2)

jpp

Reputation: 164613

Indexing with labels

If you wish to filter by a, and the values in a are unique, consider making a the index. This simplifies the logic and makes lookups more efficient:

test = test.set_index('a')
test.loc[4] = test.loc[4].fillna(test.loc[1])
test.loc[2] *= 2
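As a quick check of the label-based approach, here is a self-contained sketch that rebuilds the frame from the question and applies the same three steps. The names `indexed` and `result` are only illustrative, and restoring a as a column with reset_index() is an assumption about what you want at the end.

```python
import numpy as np
import pandas as pd

test = pd.DataFrame(
    {
        "a": [1, 2, 3, 4, 5],
        "b": [4, 5, 6, np.nan, np.nan],
        "c": [7, 8, 9, np.nan, np.nan],
        "d": [7, 8, 9, np.nan, np.nan],
    }
)

# Use the unique values in a as row labels.
indexed = test.set_index("a")

# Fill the NaNs of the row labelled 4 from the row labelled 1.
# fillna aligns the two rows on their column labels (b, c, d).
indexed.loc[4] = indexed.loc[4].fillna(indexed.loc[1])

# Double the row labelled 2; a is the index now, so it is not affected.
indexed.loc[2] *= 2

# Turn a back into an ordinary column.
result = indexed.reset_index()
print(result)
```

Because a is the index while the arithmetic runs, only b, c and d are doubled.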

Boolean masks

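If a is not unique and Boolean masks are required, you can still use fillna with an additional step:

mask = test['a'].eq(4)
# idxmax() returns the label of the first row where a == 1
test.loc[mask] = test.loc[mask].fillna(test.loc[test['a'].eq(1).idxmax()])
# restrict to b, c, d so that column a itself is not doubled
test.loc[test['a'].eq(2), ['b', 'c', 'd']] *= 2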
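To illustrate the mask-based variant when a really does contain duplicates, here is a minimal sketch. The frame `dup`, the list `value_cols` and the variable `source` are hypothetical names made up for this example; they are not part of the question.

```python
import numpy as np
import pandas as pd

# Hypothetical frame in which a == 4 occurs twice.
dup = pd.DataFrame(
    {
        "a": [1, 2, 4, 4],
        "b": [4.0, 5.0, np.nan, np.nan],
        "c": [7.0, 8.0, np.nan, 9.0],
    }
)

value_cols = ["b", "c"]

# First row where a == 1, restricted to the value columns.
source = dup.loc[dup["a"].eq(1).idxmax(), value_cols]

# Fill NaNs in every row where a == 4; existing values (the 9.0) are kept.
mask = dup["a"].eq(4)
dup.loc[mask, value_cols] = dup.loc[mask, value_cols].fillna(source)

# Double only the value columns where a == 2, leaving a itself alone.
dup.loc[dup["a"].eq(2), value_cols] *= 2

print(dup)
```

Passing a Series to DataFrame.fillna fills each column with the value stored under that column's label, which is why `source` can be reused for every matching row.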

Upvotes: 1

yatu

Reputation: 88226

First of all, I suggest you do some reading on Indexing and selecting data in pandas. Regarding the first question, you can use .loc with isnull() to perform boolean indexing on the column values:

mask_nans = test.loc[3,:].isnull()
test.loc[3, mask_nans] = test.loc[0, mask_nans]

To double the values, you can multiply the selected slice by 2 in place, again using .loc:

test.loc[1,'b':] *= 2

   a     b     c     d
0  1   4.0   7.0   7.0
1  2  10.0  16.0  16.0
2  3   6.0   9.0   9.0
3  4   4.0   7.0   7.0
4  5   NaN   NaN   NaN
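One detail worth knowing about the slice 'b': used above is that label slices in .loc include both endpoints. The sketch below repeats the NaN fill on the frame from the question and then, as an illustrative variation that is not part of the original request, doubles only b and c by slicing 'b':'c'.

```python
import numpy as np
import pandas as pd

test = pd.DataFrame(
    {
        "a": [1, 2, 3, 4, 5],
        "b": [4, 5, 6, np.nan, np.nan],
        "c": [7, 8, 9, np.nan, np.nan],
        "d": [7, 8, 9, np.nan, np.nan],
    }
)

# Boolean mask over the columns: True where the row labelled 3 holds NaN.
mask_nans = test.loc[3, :].isnull()
test.loc[3, mask_nans] = test.loc[0, mask_nans]

# Label slices are inclusive, so "b":"c" selects b and c but not d.
test.loc[1, "b":"c"] *= 2

print(test)
```

With an open-ended slice such as 'b': every column from b to the last one is selected, which is what doubles b, c and d in the answer.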

Upvotes: 2
