Adding a column to pandas data frame fills it with NA

Question

I have this pandas dataframe:

          SourceDomain                           1  2         3
0  www.theguardian.com     profile.theguardian.com  1  Directed
1  www.theguardian.com  membership.theguardian.com  2  Directed
2  www.theguardian.com   subscribe.theguardian.com  3  Directed
3  www.theguardian.com            www.google.co.uk  4  Directed
4  www.theguardian.com        jobs.theguardian.com  5  Directed

I would like to add a new column which is a pandas series created like this:

Weights = Weights.value_counts()

However, when I try to add the new column using edgesFile[4] = Weights, the column is filled with NaN instead of the values:

          SourceDomain                           1  2         3   4
0  www.theguardian.com     profile.theguardian.com  1  Directed NaN
1  www.theguardian.com  membership.theguardian.com  2  Directed NaN
2  www.theguardian.com   subscribe.theguardian.com  3  Directed NaN
3  www.theguardian.com            www.google.co.uk  4  Directed NaN
4  www.theguardian.com        jobs.theguardian.com  5  Directed NaN

How can I add the new column while keeping the values? Thanks!

Dani

unutbu · Accepted Answer

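You are getting NaNs because the index of Weights does not match the index of edgesFile: value_counts() returns a Series indexed by the distinct values it counted, while edgesFile is indexed 0 to 4, and column assignment aligns on index labels. If you want Pandas to ignore Weights.index and just paste the values in order, pass the underlying NumPy array instead (its length must equal the number of rows in edgesFile):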

edgesFile[4] = Weights.values

Here is an example which demonstrates the difference:

In [12]: import numpy as np

In [13]: import pandas as pd

In [14]: df = pd.DataFrame(np.arange(4)*10, index=list('ABCD'))

In [15]: df
Out[15]: 
    0
A   0
B  10
C  20
D  30

In [16]: s = pd.Series(np.arange(4), index=list('CDEF'))

In [17]: s
Out[17]: 
C    0
D    1
E    2
F    3
dtype: int64

Here we see Pandas aligning on the index, so only the labels present in both df and s (C and D) receive values:

In [18]: df[4] = s

In [19]: df
Out[19]: 
    0   4
A   0 NaN
B  10 NaN
C  20   0
D  30   1

Here, Pandas simply pastes the values in s into the column:

In [20]: df[4] = s.values

In [21]: df
Out[21]: 
    0  4
A   0  0
B  10  1
C  20  2
D  30  3
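To tie this back to the value_counts() case in the question, here is a minimal sketch. The frame and the column names "source", "target", "aligned" and "weight" are made up for illustration, and it assumes the goal is for each row to carry the count of its own value, which the question does not state. Under that assumption, Series.map looks each row's value up in the counts, which avoids both the NaNs from label alignment and the length mismatch that .values would hit whenever there are fewer distinct values than rows:

```python
import pandas as pd

# Hypothetical stand-in for edgesFile; column names are invented for this sketch.
edges = pd.DataFrame({
    "source": ["a.com", "a.com", "a.com", "a.com", "a.com"],
    "target": ["x.com", "y.com", "x.com", "z.com", "x.com"],
})

# value_counts() returns a Series indexed by the distinct values of "target",
# not by the row labels 0..4 of edges.
counts = edges["target"].value_counts()

# Direct assignment aligns on index labels; none of 0..4 appear in
# counts.index, so every row becomes NaN.
edges["aligned"] = counts

# counts.values has 3 entries for 5 rows, so assigning it would raise a
# ValueError here. Instead, look up each row's own value in counts.
edges["weight"] = edges["target"].map(counts)

print(edges)
```

If the counts really are meant to be pasted in positional order, .values as shown above is the right tool, provided the lengths match.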
