Pythonista anonymous

Reputation: 8970

How to join a Series to a DataFrame?

Is there any way to join a Series to a DataFrame directly?

The join would be on a field of the dataframe and on the index of the series.

The only way I found was to convert the series to a dataframe first, as in the code below.

import numpy as np
import pandas as pd

df = pd.DataFrame()
df['a'] = np.arange(0, 4)
df['b'] = np.arange(100, 104)


s = pd.Series(data=np.arange(100, 103))

# this doesn't work
# myjoin = pd.merge(df, s, how='left', left_on='a', right_index=True)

# this does
s = s.reset_index()
# s becomes a Dataframe
# note you cannot reset the index of a series inplace
myjoin = pd.merge(df, s, how='left', left_on='a', right_on='index')

print(myjoin)

Upvotes: 18

Views: 24709

Answers (3)

Ando Jurai

Reputation: 1049

This is a very late answer, but here is what worked for me: build a dataframe with the columns you want to retrieve from your series, name the series as the index you need, and append the series to that dataframe (any supplementary elements in the series are added to the dataframe, which can be convenient in some applications). Then join the final dataframe, by this index, to the original dataframe you want to expand. Admittedly it is not direct, but it is still the most convenient way if you have a lot of series, since you avoid transforming each one into a dataframe first.
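The steps above might look roughly like the sketch below. It is only one reading of the description: the series `s1` and `s2`, their names, and the helper frame `lookup` are hypothetical, and `pd.concat` stands in for "appending" the series to the dataframe.

```python
import numpy as np
import pandas as pd

# The dataframe from the question.
df = pd.DataFrame({"a": np.arange(0, 4), "b": np.arange(100, 104)})

# Hypothetical series to attach; each name becomes a column name.
s1 = pd.Series(np.arange(100, 103), name="s1")
s2 = pd.Series(np.arange(200, 205), name="s2")

# Collect the series into one lookup frame, aligned on their index.
# Labels present in only one series are kept and padded with NaN.
lookup = pd.concat([s1, s2], axis=1)

# Join once: each value in column "a" is matched against the index of lookup.
joined = df.join(lookup, on="a")
print(joined)
```

With many series this keeps the join itself to a single call, at the cost of building the intermediate frame.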

Upvotes: 0

Alex

Reputation: 487

I guess http://pandas.pydata.org/pandas-docs/stable/generated/pandas.concat.html might help.

For example, it can do an outer join (the default) or an inner join:

In [26]: pd.concat((df,s), axis=1)
Out[26]: 
   a    b    0
0  0  100  100
1  1  101  101
2  2  102  102
3  3  103  NaN

In [27]: pd.concat((df,s), axis=1, join='inner')
Out[27]: 
   a    b    0
0  0  100  100
1  1  101  101
2  2  102  102
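One caveat worth keeping in mind: `concat` with `axis=1` lines rows up by index label, so the output above matches the question only because column `a` happens to equal the row index. The sketch below uses a reordered column `a` and a series name `s` (both assumptions made for illustration) to contrast this with `DataFrame.join(..., on='a')`, which looks up each value of `a` in the index of the series.

```python
import numpy as np
import pandas as pd

# Like the question's data, but column "a" no longer equals the row index.
df = pd.DataFrame({"a": [2, 0, 1, 3], "b": np.arange(100, 104)})

# join needs a named Series; the name "s" is an assumption.
s = pd.Series(np.arange(100, 103), name="s")

# concat pairs rows by index label and ignores column "a".
by_index = pd.concat((df, s), axis=1)

# join with on="a" matches each value of "a" against the index of s.
by_column = df.join(s, on="a")

print(by_index)
print(by_column)
```

In `by_index` the new column follows the row order, while in `by_column` it follows the values of `a`; rows with no match get NaN in both cases.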

Upvotes: 6

Alex

Reputation: 2459

Try concat():

import numpy as np
import pandas as pd

df = pd.DataFrame()
df['a'] = np.arange(0, 4)
df['b'] = np.arange(100, 104)

s = pd.Series(data=np.arange(100, 103))

# concat with axis=1 places the series next to df, aligned on the index
new_df = pd.concat((df, s), axis=1)
print(new_df)

This prints:

   a    b    0
0  0  100  100
1  1  101  101
2  2  102  102
3  3  103  NaN

Upvotes: -1
