bearwinterfirth

Reputation: 3

Append a value to a single df column, without concatenating a row

I have a pandas DataFrame where each column has some numerical values followed by some NaNs. The number of values and NaNs differs between columns. I want to append a single value at the first NaN position in a specific column.

My pandas df looks something like this:

   A   B   C   D
0  5   7   2   3
1  2   1  NaN  4
2  4   6  NaN  5
3 NaN  4  NaN  6
4 NaN  3  NaN NaN
5 NaN NaN NaN NaN

I want to add ("append") a value at the bottom of a specific column. For instance, I want to change the first NaN in column 'A' to the value 3. The desired outcome is:

   A   B   C   D
0  5   7   2   3
1  2   1  NaN  4
2  4   6  NaN  5
3  3   4  NaN  6
4 NaN  3  NaN NaN
5 NaN NaN NaN NaN

I would like to use 'append', but it's deprecated. I have tried 'concat', but I don't want a whole new row; I just want to append a single value at the bottom of a single column.

Upvotes: 0

Views: 30

Answers (2)

wjandrea

Reputation: 33145

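Series.first_valid_index() does the opposite of what you want: it finds the first non-missing value. Pandas doesn't have a "first_invalid_index" (to find the first missing value), but we can combine .shift() and .last_valid_index() to accomplish the same thing in this case. Shifting the column down by one row moves its last value into the row of the first trailing NaN, so the last valid index of the shifted column is the position to write to.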

col = 'A'
first_invalid_index = df[col].shift().last_valid_index()
df.at[first_invalid_index, col] = 3

Result:

     A    B    C    D
0  5.0  7.0  2.0  3.0
1  2.0  1.0  NaN  4.0
2  4.0  6.0  NaN  5.0
3  3.0  4.0  NaN  6.0
4  NaN  3.0  NaN  NaN
5  NaN  NaN  NaN  NaN

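I'm not sure I would actually recommend this, since it seems fragile: it assumes the NaNs are all trailing, it overwrites the last value if the column has no NaNs, and last_valid_index() returns None if the column is entirely NaN. I just want to give you options.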
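As a rough sketch of how the shift approach could be guarded against a full column and an all-NaN column, something like the following might work. The helper name append_with_shift is hypothetical (not a pandas function), and the sketch assumes a non-empty frame whose NaNs in the target column are all trailing.

```python
import numpy as np
import pandas as pd

nan = np.nan
df = pd.DataFrame({
    "A": [5, 2, 4, nan, nan, nan],
    "B": [7, 1, 6, 4, 3, nan],
    "C": [2, nan, nan, nan, nan, nan],
    "D": [3, 4, 5, 6, nan, nan],
})


def append_with_shift(frame, col, value):
    """Hypothetical helper: write `value` into the first trailing NaN of `col`.

    Assumes the frame is non-empty and that NaNs in `col` are trailing only.
    Modifies `frame` in place and returns the index label that was written.
    """
    s = frame[col]
    # If the last row already holds a value, there is no trailing NaN to fill.
    if s.notna().iloc[-1]:
        raise ValueError(f"column {col!r} has no trailing NaN to fill")
    # After shifting down by one, the last valid label is the first trailing NaN.
    target = s.shift().last_valid_index()
    # An all-NaN column yields None; in that case start at the first row.
    if target is None:
        target = s.index[0]
    frame.at[target, col] = value
    return target


target = append_with_shift(df, "A", 3)
print(target)
print(df)
```

Raising on a full column is only one possible design choice; depending on the use case, growing the frame by one row could be more appropriate.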

Upvotes: 1

ouroboros1

Reputation: 14369

You can use Series.fillna with limit=1:

df['A'] = df['A'].fillna(3, limit=1)

Output:

     A    B    C    D
0  5.0  7.0  2.0  3.0
1  2.0  1.0  NaN  4.0
2  4.0  6.0  NaN  5.0
3  3.0  4.0  NaN  6.0
4  NaN  3.0  NaN  NaN
5  NaN  NaN  NaN  NaN
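For a self-contained version of the fillna approach, here is a minimal sketch that rebuilds the example frame from the question and appends two values in a row. It also shows that a column without NaNs is simply left unchanged, so no extra check is needed. The second value (9) and the use of column 'B' for the full-column case are just illustrations.

```python
import numpy as np
import pandas as pd

nan = np.nan
df = pd.DataFrame({
    "A": [5, 2, 4, nan, nan, nan],
    "B": [7, 1, 6, 4, 3, nan],
    "C": [2, nan, nan, nan, nan, nan],
    "D": [3, 4, 5, 6, nan, nan],
})

# limit=1 fills only the first NaN encountered from the top of the column.
df["A"] = df["A"].fillna(3, limit=1)
print(df["A"].tolist())  # [5.0, 2.0, 4.0, 3.0, nan, nan]

# Repeating the call "appends" the next value one row further down.
df["A"] = df["A"].fillna(9, limit=1)
print(df["A"].tolist())  # [5.0, 2.0, 4.0, 3.0, 9.0, nan]

# A column with no NaN left is returned unchanged.
full = df["B"].dropna()
unchanged = full.fillna(0, limit=1)
print(unchanged.equals(full))  # True
```

Note that if a column had a gap in the middle, fillna would fill that gap first, since it targets the first NaN from the top rather than the position after the last value.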

An alternative could be to use Series.isna + Series.idxmax and assign via df.loc. Here you first want to check that isna results in at least one True value. Without that check, idxmax returns the first index label when every value is False, and you would overwrite the first value in the column:

if df['A'].isna().any():
    df.loc[df['A'].isna().idxmax(), 'A'] = 3
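To illustrate why the any() check matters, here is a small sketch with the example frame plus a made-up Series that has no NaNs (the name full is just for illustration). On an all-False boolean Series, idxmax returns the first label, which is the position that would be overwritten without the guard.

```python
import numpy as np
import pandas as pd

nan = np.nan
df = pd.DataFrame({
    "A": [5, 2, 4, nan, nan, nan],
    "B": [7, 1, 6, 4, 3, nan],
    "C": [2, nan, nan, nan, nan, nan],
    "D": [3, 4, 5, 6, nan, nan],
})

# Illustrative Series without any NaN: the mask is all False,
# so idxmax falls back to the first index label.
full = pd.Series([1.0, 2.0, 3.0])
first_label = full.isna().idxmax()
print(first_label)  # 0

# Guarded assignment: only write if there is a NaN to replace.
mask = df["A"].isna()
if mask.any():
    df.loc[mask.idxmax(), "A"] = 3

print(df["A"].tolist())  # [5.0, 2.0, 4.0, 3.0, nan, nan]
```

Computing the mask once and reusing it avoids calling isna() twice on the same column.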

Upvotes: 1
