JSolomonCulp

Reputation: 1574

Added column to existing dataframe but entered all numbers as NaN

I created two DataFrames from existing CSV files, both consisting entirely of numbers. The second DataFrame has an index from 0 to 8783 and a single column of numbers. I want to add it as a new column to the first DataFrame, whose index consists of a month, day and hour. I tried append, merge and concat, and none of them worked, so I then tried simply using:

x1GBaverage['Power'] = x2_cut

where x1GBaverage is the first DataFrame and x2_cut is the second. This added the new column, but every value was entered as NaN instead of the number it should be. How should I be approaching this?

Upvotes: 2

Views: 78

Answers (2)

marwan

Reputation: 494

x1GBaverage['Power'] = x2_cut.values

problem solved :)

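In pandas, values are tied to their index labels, and column assignment aligns on those labels. Your two frames share no index labels, so every row has nothing to match and comes out as NaN. Passing .values hands over only the underlying array, so the numbers are assigned by position instead (the row counts must be equal for this to work).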
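To make the alignment behaviour concrete, here is a minimal sketch. The frames, column names and index labels are made-up stand-ins for x1GBaverage and x2_cut, not the asker's real data. It shows the NaN result from label alignment, the positional assignment, and a third option of relabelling the second frame so the indexes match.

```python
import pandas as pd

# Hypothetical stand-ins: x1 has a non-default index, x2 has the default 0..n-1.
x1 = pd.DataFrame({"load": [5.0, 6.0, 7.0]}, index=["a", "b", "c"])
x2 = pd.DataFrame({"power": [10.0, 20.0, 30.0]})

# Assigning a Series aligns on index labels; no labels match, so all NaN.
x1["aligned"] = x2["power"]

# Assigning the raw array ignores labels and fills by position.
x1["positional"] = x2["power"].to_numpy()

# Alternative: give the Series the target index first, then assign.
x1["relabelled"] = x2["power"].set_axis(x1.index)

print(x1)
```

Both of the last two approaches assume the rows of the two frames are already in the same order and have the same length.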

Upvotes: 1

Randy

Reputation: 14847

If they have the same row count and you just want to tack it on the end, either the indexes need to match or you need to pass just the underlying values. In the example below, columns 3 and 5 are the index-matching and values versions, and column 4 is what you're running into now:

In [58]: df = pd.DataFrame(np.random.random((3,3)))

In [59]: df
Out[59]:
          0         1         2
0  0.670812  0.500688  0.136661
1  0.185841  0.239175  0.542369
2  0.351280  0.451193  0.436108

In [61]: df2 = pd.DataFrame(np.random.random((3,1)))

In [62]: df2
Out[62]:
          0
0  0.638216
1  0.477159
2  0.205981

In [64]: df[3] = df2

In [66]: df.index = ['a', 'b', 'c']

In [68]: df[4] = df2

In [70]: df[5] = df2.values

In [71]: df
Out[71]:
          0         1         2         3   4         5
a  0.670812  0.500688  0.136661  0.638216 NaN  0.638216
b  0.185841  0.239175  0.542369  0.477159 NaN  0.477159
c  0.351280  0.451193  0.436108  0.205981 NaN  0.205981

If the row counts differ, you'll need to use df.merge and tell it which columns to join the two frames on.
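A rough sketch of that merge case, assuming the second frame also carries month, day and hour columns to join on (the question does not say it does, so treat the column names and data here as hypothetical):

```python
import pandas as pd

# Hypothetical first frame, indexed by month/day/hour as in the question.
left = pd.DataFrame(
    {"month": [1, 1, 1], "day": [1, 1, 1], "hour": [0, 1, 2],
     "load": [5.0, 6.0, 7.0]}
).set_index(["month", "day", "hour"])

# Hypothetical second frame with fewer rows and the same key columns.
right = pd.DataFrame(
    {"month": [1, 1], "day": [1, 1], "hour": [0, 2],
     "Power": [10.0, 30.0]}
)

# Move the index levels back to columns, then left-join on the shared keys.
# Rows of `left` with no match in `right` get NaN in the Power column.
merged = left.reset_index().merge(right, on=["month", "day", "hour"], how="left")

print(merged)
```

A left join keeps every row of the first frame. Use how="inner" instead if only rows present in both frames should survive.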

Upvotes: 0
