Nat1

Reputation: 83

How to insert rows at specific positions into a dataframe in Python?

Suppose you have a dataframe

df = pd.DataFrame({'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':[28,34,29,42]})

and another dataframe

df1 = pd.DataFrame({'Name':['Anna', 'Susie'],'Age':[20,50]})

as well as a list with indices

pos = [0,2]

What is the most pythonic way to create a new dataframe df2 where df1 is integrated into df right before the index positions of df specified in pos?

So, the new dataframe should look like this:

df2 = 
     Age   Name
0    20    Anna
1    28    Tom
2    34    Jack
3    50    Susie
4    29    Steve
5    42    Ricky

Thank you very much.

Best,

Nathan

Upvotes: 2

Views: 2425

Answers (2)

juanpa.arrivillaga

Reputation: 95948

The behavior you are looking for is implemented by numpy.insert. It does not work directly on pandas.DataFrame objects, but that does not matter: a DataFrame is backed by numpy.ndarray data (depending on various factors it may be several arrays internally, but you can think of it as one array accessible via the .values attribute).

You will simply have to reconstruct the columns of your data-frame, but otherwise, I suspect this is the easiest and fastest way:

In [1]: import pandas as pd, numpy as np

In [2]: df = pd.DataFrame({'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age':
   ...: [28,34,29,42]})

In [3]: df1 = pd.DataFrame({'Name':['Anna', 'Susie'],'Age':[20,50]})

In [4]: np.insert(df.values, (0,2), df1.values, axis=0)
Out[4]:
array([['Anna', 20],
       ['Tom', 28],
       ['Jack', 34],
       ['Susie', 50],
       ['Steve', 29],
       ['Ricky', 42]], dtype=object)

This returns an array, but that array is exactly what you need to build a data-frame. The remaining piece, the column labels, is already available on the original data-frame, so you can just do:

In [5]: pd.DataFrame(np.insert(df.values, (0,2), df1.values, axis=0), columns=df.columns)
Out[5]:
    Name Age
0   Anna  20
1    Tom  28
2   Jack  34
3  Susie  50
4  Steve  29
5  Ricky  42

So that single line is all you need.
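One caveat worth checking: because the values pass through an object-dtype array, the rebuilt columns may come back as object rather than their original types. The sketch below wraps the same np.insert call in a helper and casts the columns back. The name insert_rows is hypothetical (not part of pandas or numpy), and it assumes df1 has the same columns as df.

```python
import numpy as np
import pandas as pd

def insert_rows(df, df1, pos):
    """Hypothetical helper: insert the rows of df1 before positions pos of df."""
    # Reorder df1's columns to match df so the two value arrays line up.
    values = np.insert(df.to_numpy(), pos, df1[df.columns].to_numpy(), axis=0)
    out = pd.DataFrame(values, columns=df.columns)
    # The intermediate array is object dtype; restore df's column dtypes.
    return out.astype(df.dtypes.to_dict())

df = pd.DataFrame({'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42]})
df1 = pd.DataFrame({'Name': ['Anna', 'Susie'], 'Age': [20, 50]})

df2 = insert_rows(df, df1, [0, 2])
print(df2)
print(df2.dtypes)
```

The positions in pos refer to row positions in the original df, which is how np.insert interprets a sequence of indices.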

Upvotes: 1

Lev Zakharov

Reputation: 2427

Tricky solution with float indexes:

import pandas as pd

df = pd.DataFrame({'Name':['Tom', 'Jack', 'Steve', 'Ricky'],'Age': [28,34,29,42]})
df1 = pd.DataFrame({'Name':['Anna', 'Susie'],'Age':[20,50]}, index=[-0.5, 1.5])

# DataFrame.append was removed in pandas 2.0; pd.concat does the same job here
result = pd.concat([df, df1]).sort_index().reset_index(drop=True)
print(result)

Output:

    Name  Age
0   Anna   20
1    Tom   28
2   Jack   34
3  Susie   50
4  Steve   29
5  Ricky   42

Note the index argument used when creating df1. You can build that index from pos with a simple list comprehension:

[x - 0.5 for x in pos]
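Putting the pieces together, a minimal sketch might look like the following. It assumes df has the default 0..n-1 integer index, so that position and index label coincide, and it uses pd.concat to combine the frames. The variable name shifted is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42]})
df1 = pd.DataFrame({'Name': ['Anna', 'Susie'], 'Age': [20, 50]})
pos = [0, 2]

# Give each new row a label half a step below its target position,
# so that sorting by index places it right before that row.
shifted = df1.copy()
shifted.index = [x - 0.5 for x in pos]

# Combine, sort on the mixed int/float labels, then renumber from 0.
result = pd.concat([df, shifted]).sort_index().reset_index(drop=True)
print(result)
```

Here the sort order of the labels is -0.5, 0, 1, 1.5, 2, 3, which gives the row order shown in the output above.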

Upvotes: 1
