Reputation: 71
I have a dataframe (df) as such:
A B
1 a
2 b
3 c
And a series: S = pd.Series(['x','y','z'])
I want to repeat the dataframe df for each value in the series. The expected result looks like this:
result:
S A B
x 1 a
y 1 a
z 1 a
x 2 b
y 2 b
z 2 b
x 3 c
y 3 c
z 3 c
How do I achieve this kind of output? I'm thinking of merge or join, but merging gives me a memory error. I am dealing with a rather large dataframe and series. Thanks!
Upvotes: 6
Views: 3657
Reputation: 19947
Setup
import numpy as np
import pandas as pd
df = pd.DataFrame({'A': {0: 1, 1: 2, 2: 3}, 'B': {0: 'a', 1: 'b', 2: 'c'}})
S = pd.Series(['x','y','z'], name='S')
Solution
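# Build a DataFrame with the shape of the output: df's index repeated once per value of S,
# filled with the S values tiled once per row of df.
# Then join df on the index to bring in the A and B columns.
# Using len(S) and len(df) instead of a hard-coded 3 also covers inputs of different lengths.
df_S = pd.DataFrame(index=np.repeat(df.index, len(S)), columns=['S'], data=np.tile(S.values, len(df)))
df_S.join(df)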
Out[54]:
S A B
0 x 1 a
0 y 1 a
0 z 1 a
1 x 2 b
1 y 2 b
1 z 2 b
2 x 3 c
2 y 3 c
2 z 3 c
Upvotes: 0
Reputation: 210832
UPDATE:
Here is a slightly modified version of @A-Za-z's solution. It might save some memory, but it's slower:
x = pd.DataFrame(index=range(len(df) * len(S)))
for col in df.columns:
    # repeat the raw values; repeating the Series itself would carry a duplicated index
    x[col] = np.repeat(df[col].to_numpy(), len(S))
x['S'] = np.tile(S.to_numpy(), len(df))
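As a rough illustration of why the column-by-column version might use less memory, here is a small sketch comparing it with repeating `df.values` as a whole. The names `by_column` and `whole_frame` are made up for this example, and the actual savings depend on your data, so treat the printed numbers as something to measure rather than a guarantee.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3], "B": ["a", "b", "c"]})
S = pd.Series(["x", "y", "z"], name="S")

# Column by column: each column is repeated as its own 1-D array,
# so the integer column stays an integer column.
by_column = pd.DataFrame(index=range(len(df) * len(S)))
for col in df.columns:
    by_column[col] = np.repeat(df[col].to_numpy(), len(S))
by_column["S"] = np.tile(S.to_numpy(), len(df))

# Whole frame: df.values needs one dtype that fits every column,
# which for an int column plus a string column is object.
whole_frame = pd.DataFrame(
    np.repeat(df.values, len(S), axis=0), columns=df.columns
)

print(by_column.dtypes)
print(whole_frame.dtypes)
# Compare the footprint of both results on your own data.
print(by_column.memory_usage(deep=True).sum())
print(whole_frame.memory_usage(deep=True).sum())
```

On mixed-dtype frames the whole-frame route stores numbers as Python objects, which is usually where the extra memory goes.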
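Old incorrect answer (it stacks copies of df with S aligned row by row, instead of pairing every row with every value of S):
In [94]: pd.concat([df.assign(S=S)] * len(S))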
Out[94]:
A B S
0 1 a x
1 2 b y
2 3 c z
0 1 a x
1 2 b y
2 3 c z
0 1 a x
1 2 b y
2 3 c z
Upvotes: 3
Reputation: 38415
Using numpy. Let's say you have a series and a df of different lengths:
s = pd.Series(['X', 'Y', 'Z', 'A'])  # added a fourth value so s and df differ in length
s_n = len(s)
df_n = len(df)
# repeat each row of df s_n times, use s tiled df_n times as the index, then turn that index into column S
pd.DataFrame(np.repeat(df.values, s_n, axis=0), columns=df.columns, index=np.tile(s, df_n)).rename_axis('S').reset_index()
S A B
0 X 1 a
1 Y 1 a
2 Z 1 a
3 A 1 a
4 X 2 b
5 Y 2 b
6 Z 2 b
7 A 2 b
8 X 3 c
9 Y 3 c
10 Z 3 c
11 A 3 c
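If keeping the original column dtypes matters, one possible variation on the same repeat/tile idea is to repeat row positions and select them with `iloc`, rather than repeating `df.values`. This is only a sketch: `repeat_for_each` is a hypothetical helper name, not a pandas function, and it assumes the new column name is not already present in df.

```python
import numpy as np
import pandas as pd


def repeat_for_each(df: pd.DataFrame, s: pd.Series, name: str = "S") -> pd.DataFrame:
    """Pair every row of df with every value of s (hypothetical helper)."""
    # Row positions 0,0,...,1,1,...: each df row is repeated len(s) times.
    positions = np.repeat(np.arange(len(df)), len(s))
    out = df.iloc[positions].reset_index(drop=True)
    # The values of s cycled once per original df row, placed as the first column.
    out.insert(0, name, np.tile(s.to_numpy(), len(df)))
    return out


df = pd.DataFrame({"A": [1, 2, 3], "B": ["a", "b", "c"]})
s = pd.Series(["X", "Y", "Z", "A"])

result = repeat_for_each(df, s)
print(result)
print(result.dtypes)
```

The row order matches the output above, and column A keeps its integer dtype because each column is indexed separately instead of going through a single mixed-type array.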
Upvotes: 12