Reputation: 3729
This is my DataFrame, which should be repeated 5 times:
>>> x = pd.DataFrame({'a':1,'b':2}, index = range(1))
>>> x
a b
0 1 2
I want a result like this:
>>> x.append(x).append(x).append(x)
a b
0 1 2
0 1 2
0 1 2
0 1 2
But there must be a smarter way than appending 4 times. Actually the DataFrame I’m working on should be repeated 50 times.
I haven't found anything practical, including things like np.repeat, which just doesn't work on a DataFrame.
Could anyone help?
Upvotes: 80
Views: 102476
Reputation:
Without numpy, we could also use Index.repeat and loc (or reindex):
x.loc[x.index.repeat(5)].reset_index(drop=True)
or
x.reindex(x.index.repeat(5)).reset_index(drop=True)
Output:
a b
0 1 2
1 1 2
2 1 2
3 1 2
4 1 2
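As a rough illustration on a frame with more than one row (the two-row `df` below is made up for this sketch and is not from the question), `Index.repeat` repeats each label in place, so the copies of a row end up next to each other. This assumes the index labels are unique; with duplicate labels, `loc` would return every matching row for each repeated label.

```python
import pandas as pd

# Hypothetical two-row frame, used only for illustration.
df = pd.DataFrame({"a": [1, 3], "b": [2, 4]})

# Each index label is repeated 3 times in place: 0, 0, 0, 1, 1, 1.
repeated = df.loc[df.index.repeat(3)].reset_index(drop=True)
print(repeated)
```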
Upvotes: 2
Reputation: 71570
Try using numpy.repeat:
>>> import numpy as np
>>> df = pd.DataFrame(np.repeat(x.to_numpy(), 5, axis=0), columns=x.columns)
>>> df
a b
0 1 2
1 1 2
2 1 2
3 1 2
4 1 2
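One thing to watch with this route, sketched below on a made-up frame with mixed column types (`mixed` is hypothetical, not from the question): `to_numpy()` has to fit all columns into one array, so the rebuilt frame may come back with object columns. Casting back with the original dtypes is one possible way to recover them.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with mixed column types, used only for illustration.
mixed = pd.DataFrame({"a": [1], "b": ["x"]})

# to_numpy() on mixed columns yields a single object-dtype array,
# so column "a" is no longer an integer column after the round trip.
via_numpy = pd.DataFrame(np.repeat(mixed.to_numpy(), 5, axis=0), columns=mixed.columns)
print(via_numpy.dtypes)

# Cast each column back to the dtype it had in the original frame.
restored = via_numpy.astype(mixed.dtypes.to_dict())
print(restored.dtypes)
```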
Upvotes: 11
Reputation: 111
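Using apply with a lambda is a universal approach in my opinion. Note that with axis=0 the lambda receives each column in turn, and Series.repeat then repeats every element of that column:
df = pd.DataFrame([[1, 2], [3, 4]], columns=["A", "B"])
df.apply(lambda col: col.repeat(2), axis=0) #.reset_index()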
Out[1]:
A B
0 1 2
0 1 2
1 3 4
1 3 4
Upvotes: 0
Reputation: 375445
I think it's cleaner/faster to use iloc nowadays:
In [11]: np.full(3, 0)
Out[11]: array([0, 0, 0])
In [12]: x.iloc[np.full(3, 0)]
Out[12]:
a b
0 1 2
0 1 2
0 1 2
More generally, you can use tile or repeat with arange:
In [21]: df = pd.DataFrame([[1, 2], [3, 4]], columns=["A", "B"])
In [22]: df
Out[22]:
A B
0 1 2
1 3 4
In [23]: np.tile(np.arange(len(df)), 3)
Out[23]: array([0, 1, 0, 1, 0, 1])
In [24]: np.repeat(np.arange(len(df)), 3)
Out[24]: array([0, 0, 0, 1, 1, 1])
In [25]: df.iloc[np.tile(np.arange(len(df)), 3)]
Out[25]:
A B
0 1 2
1 3 4
0 1 2
1 3 4
0 1 2
1 3 4
In [26]: df.iloc[np.repeat(np.arange(len(df)), 3)]
Out[26]:
A B
0 1 2
0 1 2
0 1 2
1 3 4
1 3 4
1 3 4
Note: This will work with non-integer indexed DataFrames (and Series).
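To make that note concrete, here is a small sketch with string labels (the `"x"`/`"y"` index is an assumption chosen for illustration). Because `iloc` selects by position, the labels play no part in the selection and are simply carried along.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with string labels, used only for illustration.
df = pd.DataFrame([[1, 2], [3, 4]], columns=["A", "B"], index=["x", "y"])

# Positions 0, 1, 0, 1, 0, 1: the whole frame is cycled three times.
tiled = df.iloc[np.tile(np.arange(len(df)), 3)]
print(tiled)
```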
Upvotes: 38
Reputation: 11568
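Append should work too. Note that DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so this only runs on older versions, and that appending [x]*5 to x gives six rows in total (use [x]*4 for five):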
In [589]: x = pd.DataFrame({'a':1,'b':2},index = range(1))
In [590]: x
Out[590]:
a b
0 1 2
In [591]: x.append([x]*5, ignore_index=True) #Ignores the index as per your need
Out[591]:
a b
0 1 2
1 1 2
2 1 2
3 1 2
4 1 2
5 1 2
In [592]: x.append([x]*5)
Out[592]:
a b
0 1 2
0 1 2
0 1 2
0 1 2
0 1 2
0 1 2
Upvotes: 2
Reputation: 16478
I would generally not repeat and/or append unless your problem really makes it necessary. It is highly inefficient and typically comes from not understanding the proper way to attack a problem.
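I don't know your exact use case, but if your values are stored in an array, you can build the frame directly:
import numpy as np
import pandas as pd
values = np.array([1, 2])  # one value per column
df2 = pd.DataFrame(index=np.arange(0, 50), columns=['a', 'b'])
df2[['a', 'b']] = values  # each value is broadcast down its column
That will do the job. Perhaps you could explain in more detail what you're trying to achieve?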
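If the values are plain scalars, a shorter sketch of the same idea is to let the DataFrame constructor broadcast them over the index, just as the question's own `x` does with `range(1)`. The name `df50` is only illustrative.

```python
import pandas as pd

# Scalars are broadcast across the supplied index, giving 50 identical rows.
df50 = pd.DataFrame({"a": 1, "b": 2}, index=range(50))
print(df50.shape)  # (50, 2)
```

This avoids building a small frame first and copying it, though it only applies when every row is the same set of scalar values.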
Upvotes: 3
Reputation: 139162
You can use the concat function:
In [13]: pd.concat([x]*5)
Out[13]:
a b
0 1 2
0 1 2
0 1 2
0 1 2
0 1 2
If you only want to repeat the values and not the index, you can do:
In [14]: pd.concat([x]*5, ignore_index=True)
Out[14]:
a b
0 1 2
1 1 2
2 1 2
3 1 2
4 1 2
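For the 50 repetitions mentioned in the question, the same call scales by changing the multiplier. A minimal sketch, reusing the question's `x` (the name `result` is only illustrative):

```python
import pandas as pd

x = pd.DataFrame({"a": 1, "b": 2}, index=range(1))

# [x] * 50 is a list holding 50 references to the same frame;
# concat then copies the rows into one new frame with a fresh 0..49 index.
result = pd.concat([x] * 50, ignore_index=True)
print(result.shape)  # (50, 2)
```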
Upvotes: 132