Meh

Reputation: 7166

Remove leading NaN in pandas

How can I remove leading NaNs from a pandas Series?

pd.Series([np.nan, np.nan, np.nan, 1, 2, np.nan, 3])

I want to remove only the first three NaNs from the series above, so the result should be:

pd.Series([1, 2, np.nan, 3])

Upvotes: 20

Views: 5782

Answers (4)

EdChum

Reputation: 393933

Here is another method using pandas methods only:

In [103]:
s = pd.Series([np.nan, np.nan, np.nan, 1, 2, np.nan, 3])
first_valid = s[s.notnull()].index[0]
s.iloc[first_valid:]

Out[103]:
3     1
4     2
5   NaN
6     3
dtype: float64

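Here we filter the series with notnull to get the index label of the first valid value, then slice the series from there with iloc. This works because the default integer index makes labels and positions coincide.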

EDIT

As @ajcr has pointed out, it is better to use the built-in method first_valid_index, since it does not create the temporary series that I use as a mask in the answer above. Additionally, loc slices by index label, whereas iloc slices by ordinal position, so loc also works in the general case where the index is not an Int64Index:

In [104]:
s = pd.Series([np.nan, np.nan, np.nan, 1, 2, np.nan, 3])
s.loc[s.first_valid_index():]

Out[104]:
3     1
4     2
5   NaN
6     3
dtype: float64
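
To illustrate why label-based slicing matters, here is a minimal sketch with a string index, where the label returned by first_valid_index could not be passed to iloc. The helper name drop_leading_nans is made up for this example, and returning an empty series when every value is NaN is an assumption about the desired behaviour, not something the question specifies:

```python
import numpy as np
import pandas as pd


def drop_leading_nans(s: pd.Series) -> pd.Series:
    """Return s without its leading NaN values (hypothetical helper)."""
    first = s.first_valid_index()  # label of the first non-NaN value, or None
    if first is None:
        # Assumption: if every value is NaN (or s is empty), keep nothing.
        return s.iloc[0:0]
    # Label-based slice, so it works for non-integer indexes too.
    return s.loc[first:]


s = pd.Series([np.nan, np.nan, 1, np.nan, 3], index=list("abcde"))
trimmed = drop_leading_nans(s)   # keeps labels 'c', 'd', 'e'
all_nan = drop_leading_nans(pd.Series([np.nan, np.nan]))  # empty series
```

Without the None check, s.loc[None:] would behave like an unbounded slice and return the all-NaN series unchanged, which may or may not be what you want.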

Upvotes: 18

Divakar

Reputation: 221514

Here are two more approaches, assuming A is the input series.

Approach #1: With slicing -

A[np.where(~np.isnan(A))[0][0]:] 

Approach #2: With masking -

A[np.maximum.accumulate(~np.isnan(A))]

Sample run -

In [219]: A = pd.Series([np.nan, np.nan, np.nan, 1, 2, np.nan, 3])

In [220]: A
Out[220]: 
0   NaN
1   NaN
2   NaN
3     1
4     2
5   NaN
6     3
dtype: float64

In [221]: A[np.where(~np.isnan(A))[0][0]:]       # Approach 1
Out[221]: 
3     1
4     2
5   NaN
6     3
dtype: float64

In [222]: A[np.maximum.accumulate(~np.isnan(A))]  # Approach 2
Out[222]: 
3     1
4     2
5   NaN
6     3
dtype: float64
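
For anyone wondering why the masking approach works, here is a small sketch that spells out the intermediate arrays. The variable names valid and keep are just for illustration, and I use notna().to_numpy() instead of ~np.isnan(A) so the accumulate step runs on a plain boolean array:

```python
import numpy as np
import pandas as pd

A = pd.Series([np.nan, np.nan, np.nan, 1, 2, np.nan, 3])

# True where the value is not NaN.
valid = A.notna().to_numpy()          # [F, F, F, T, T, F, T]

# Running maximum: once a True has been seen, every later entry stays True,
# so only the leading NaNs remain False.
keep = np.maximum.accumulate(valid)   # [F, F, F, T, T, T, T]

result = A[keep]                      # boolean mask keeps index labels 3..6
```

Since this is a boolean mask rather than a slice, it does not depend on the index type, and an all-NaN series simply yields an empty result.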

Upvotes: 1

clemtoy

Reputation: 1731

To remove the leading np.nan:

tab = [np.nan, np.nan, np.nan, 1, 2, np.nan, 3]
# Take the first non-NaN value, look up its position, and slice from there
pd.Series(tab[tab.index([n for n in tab if not np.isnan(n)].pop(0)):])

Upvotes: -1

bakkal

Reputation: 55448

Find first non-NaN index

To find the index of the first non-NaN item:

s = pd.Series([np.nan, np.nan, np.nan, 1, 2, np.nan, 3])

nans = s.apply(np.isnan)

first_non_nan = nans[~nans].index[0]  # index label of the first non-NaN value

Output

s[first_non_nan:]
Out[44]:
3     1
4     2
5   NaN
6     3
dtype: float64

Upvotes: 2
