Reputation: 83
Suppose you have a dataframe
df = pd.DataFrame({'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42]})
and another dataframe
df1 = pd.DataFrame({'Name':['Anna', 'Susie'],'Age':[20,50]})
as well as a list of index positions
pos = [0, 2]
What is the most Pythonic way to create a new dataframe df2 in which the rows of df1 are inserted into df right before the index positions of df specified in pos?
So, the new dataframe should look like this:
df2 =
Age Name
0 20 Anna
1 28 Tom
2 34 Jack
3 50 Susie
4 29 Steve
5 42 Ricky
Thank you very much.
Best,
Nathan
Upvotes: 2
Views: 2425
Reputation: 95948
The behavior you are looking for is implemented by numpy.insert. It does not work directly on pandas.DataFrame objects, but that does not matter: a pandas.DataFrame holds a numpy.ndarray inside it (sort of; depending on various factors it may be multiple arrays, but you can think of them as one array accessible via the .values attribute).
You simply have to reattach the columns of your dataframe afterwards; otherwise, I suspect this is the easiest and fastest way:
In [1]: import pandas as pd, numpy as np
In [2]: df = pd.DataFrame({'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':
...: [28,34,29,42]})
In [3]: df1 = pd.DataFrame({'Name':['Anna', 'Susie'],'Age':[20,50]})
In [4]: np.insert(df.values, (0,2), df1.values, axis=0)
Out[4]:
array([['Anna', 20],
['Tom', 28],
['Jack', 34],
['Susie', 50],
['Steve', 29],
['Ricky', 42]], dtype=object)
So this returns an array, but this array is exactly what you need to build a dataframe. The remaining piece, the columns, is already available on the original dataframes, so you can just do:
In [5]: pd.DataFrame(np.insert(df.values, (0,2), df1.values, axis=0), columns=df.columns)
Out[5]:
Name Age
0 Anna 20
1 Tom 28
2 Jack 34
3 Susie 50
4 Steve 29
5 Ricky 42
So that single line is all you need.
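One caveat worth noting: the intermediate array above has dtype=object, so the columns of the rebuilt dataframe are object-typed as well. A minimal sketch of how the one-liner could be wrapped to restore the original dtypes follows. The helper name insert_rows is hypothetical (not part of pandas or numpy), and it assumes both dataframes share the same column names.

```python
import numpy as np
import pandas as pd


def insert_rows(df, pos, new_rows):
    """Insert the rows of new_rows before the positions pos of df.

    Hypothetical helper; assumes new_rows has the same columns as df
    and one row per entry in pos.
    """
    # Align the column order of the new rows with the target dataframe.
    new_rows = new_rows[df.columns]
    # np.insert places row i of the values before position pos[i] of the original array.
    values = np.insert(
        df.to_numpy(dtype=object), pos, new_rows.to_numpy(dtype=object), axis=0
    )
    out = pd.DataFrame(values, columns=df.columns)
    # The object array loses per-column dtypes, so cast back to the originals.
    return out.astype(df.dtypes.to_dict())


df = pd.DataFrame({'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42]})
df1 = pd.DataFrame({'Name': ['Anna', 'Susie'], 'Age': [20, 50]})
df2 = insert_rows(df, [0, 2], df1)
print(df2)
```

The final astype call is what keeps Age as an integer column instead of object.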
Upvotes: 1
Reputation: 2427
Tricky solution with float indexes:
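import pandas as pd
df = pd.DataFrame({'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42]})
# Fractional index labels sort just before the target positions 0 and 2
df1 = pd.DataFrame({'Name': ['Anna', 'Susie'], 'Age': [20, 50]}, index=[-0.5, 1.5])
# pd.concat replaces DataFrame.append, which was removed in pandas 2.0
result = pd.concat([df, df1]).sort_index().reset_index(drop=True)
print(result)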
Output:
Name Age
0 Anna 20
1 Tom 28
2 Jack 34
3 Susie 50
4 Steve 29
5 Ricky 42
Pay attention to the index parameter used when creating df1. You can construct the index from pos using a simple list comprehension:
[x - 0.5 for x in pos]
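Putting the pieces together, the float-index trick can be driven directly from pos. This is only a sketch under two assumptions: df has the default integer index 0..n-1, and df1 has one row per entry in pos. The function name insert_before is made up for illustration.

```python
import pandas as pd


def insert_before(df, df1, pos):
    """Insert row i of df1 before position pos[i] of df (hypothetical helper).

    Assumes df has the default integer index 0..n-1.
    """
    # Work on a copy so the caller's df1 keeps its own index.
    shifted = df1.copy()
    # A label of p - 0.5 sorts between p - 1 and p, i.e. right before position p.
    shifted.index = [p - 0.5 for p in pos]
    # Sorting by label interleaves the rows; reset_index renumbers them 0..n-1.
    return pd.concat([df, shifted]).sort_index().reset_index(drop=True)


df = pd.DataFrame({'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42]})
df1 = pd.DataFrame({'Name': ['Anna', 'Susie'], 'Age': [20, 50]})
result = insert_before(df, df1, [0, 2])
print(result)
```

Unlike the numpy.insert route, this keeps the numeric dtype of Age, since the values never pass through an object array.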
Upvotes: 1