Python adding column to dataframe causes NaN

Question

I have a Series and an empty DataFrame:

import pandas as pd

s = pd.Series([1, 2, 3, 5])
df = pd.DataFrame()

When I add columns to df like this

df.loc[:, "0-2"] = s.iloc[0:3]
df.loc[:, "1-3"] = s.iloc[1:4]

I get df

   0-2  1-3
0    1  NaN
1    2  2.0
2    3  3.0

Why am I getting NaN? I tried creating a new series with the correct indices, but adding it to df still causes NaN.

What I want is

3novak · Accepted Answer

Try either of the following lines.

df.loc[:, "1-3"] = s.iloc[1:4].values
# -OR-
df.loc[:, "1-3"] = s.iloc[1:4].reset_index(drop=True)

Your original code tries to align the index of the data frame df with the index of the subset series s.iloc[1:4]. The series has no label 0, so pandas places a NaN in df at that row, and the value at label 3 is dropped because df has no such row. You can get around this either by keeping only the values, so no index matching takes place, or by resetting the index on the subset series.

>>> s.iloc[1:4]
1    2
2    3
3    5
dtype: int64

Notice the index values: they run from 1 to 3, not 0 to 2, because they are carried over from the original, unsubset series, which is the following.

>>> s
0    1
1    2
2    3
3    5
dtype: int64

The index of the first row in df is 0, a label the subset series does not have. By dropping the index with the .values call, you bypass the index matching that produces the NaN. By resetting the index, as in the second option, you make the subset's index 0, 1, 2, the same as df's.
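To see all of this side by side, here is a minimal sketch that reproduces the NaN and then applies both fixes. The column names "aligned", "by_values" and "by_reset" are made up for illustration, and df is built directly from the first slice rather than from an empty frame. It uses .to_numpy() in place of .values; for this purpose the two should behave the same.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 5])
df = pd.DataFrame({"0-2": s.iloc[0:3]})  # df index is 0, 1, 2

subset = s.iloc[1:4]  # index is 1, 2, 3

# Plain assignment aligns on index labels: df row 0 has no match
# in subset (NaN), and subset label 3 has no row in df (dropped).
df["aligned"] = subset

# The alignment is comparable to reindexing the subset to df's index.
reindexed = subset.reindex(df.index)

# Fix 1: assign the raw values, so there is no index to align on.
df["by_values"] = subset.to_numpy()

# Fix 2: renumber the subset's index to 0, 1, 2 so the labels match.
df["by_reset"] = subset.reset_index(drop=True)

print(df)
```

Both fixes rely on the subset having exactly as many rows as df. Assigning an array of a different length raises an error, while a reset-index series of a different length would be aligned on labels again.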
