Reputation: 79
I'm working with a huge DataFrame in Python, and sometimes I need to add one or more empty rows at a specific position. For this question I created a small DataFrame, df, to show what I want to achieve.
import pandas as pd
cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])
If a row's Price is 27000, I want to add an empty row before it. I can insert a row after it with concat, but I can't think of a way to add one before it.
Upvotes: 4
Views: 3372
Reputation: 24314
You can also do this with the concat() and apply() methods:
import numpy as np
result = pd.concat((df.apply(lambda x: np.nan if x['Price'] == 27000 else x, axis=1), df))
Finally, use the sort_index(), drop_duplicates() and reset_index() methods:
result=result.sort_index(na_position='first').drop_duplicates().reset_index(drop=True)
Now if you print result, you will get your desired output:
            Brand    Price
0     Honda Civic  22000.0
1  Toyota Corolla  25000.0
2             NaN      NaN
3      Ford Focus  27000.0
4         Audi A4  35000.0
This will add a blank row before every row where Price is 27000 (DataFrame.append was removed in pandas 2.0, so pd.concat is used in its place):
result = pd.concat((df.apply(lambda x: np.nan if x['Price'] == 27000 else x, axis=1), df))
extra = result[result.isna().all(axis=1)].iloc[1:]
result = pd.concat([result.drop_duplicates(), extra]).sort_index(na_position='first').reset_index(drop=True)
Upvotes: 2
Reputation: 59529
Create an all-null DataFrame whose index labels are the rows that match your condition (this assumes df has a non-duplicated index). Then concat and sort_index, which places each empty row before its match because empty comes before df in the concat. Finally, reset_index to remove the duplicate index labels.
import pandas as pd
empty = pd.DataFrame(columns=df.columns, index=df[df.Price.eq(27000)].index)
df = pd.concat([empty, df]).sort_index().reset_index(drop=True)
#            Brand  Price
#0     Honda Civic  22000
#1  Toyota Corolla  25000
#2             NaN    NaN
#3      Ford Focus  27000
#4         Audi A4  35000
This will add a blank row before every 27000 row
cars = {'Brand': ['Honda Civic','Toyota Corolla','Ford Focus','Audi A4','Jeep'],
'Price': [22000,25000,27000,35000,27000]}
df = pd.DataFrame(cars, columns = ['Brand', 'Price'])
empty = pd.DataFrame(columns=df.columns, index=df[df.Price.eq(27000)].index)
df = pd.concat([empty, df]).sort_index().reset_index(drop=True)
#            Brand  Price
#0     Honda Civic  22000
#1  Toyota Corolla  25000
#2             NaN    NaN
#3      Ford Focus  27000
#4         Audi A4  35000
#5             NaN    NaN
#6            Jeep  27000
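The same idea can be wrapped in a small helper. This is only a sketch: the function name insert_blank_before is hypothetical, and it makes two assumptions that the code above does not. It resets the index first so the labels are guaranteed unique, and it passes kind="mergesort" to sort_index so that rows sharing a label keep their concat order (the default sort kind is not documented as stable).

```python
import numpy as np
import pandas as pd


def insert_blank_before(df, mask):
    """Hypothetical helper: insert an all-NaN row before each row where mask is True."""
    # Assumption: a fresh 0..n-1 index is acceptable, which guarantees unique labels.
    df = df.reset_index(drop=True)
    flags = np.asarray(mask, dtype=bool)
    # One all-NaN row per match, carrying the label of the row it should precede.
    empty = pd.DataFrame(np.nan, index=df.index[flags], columns=df.columns)
    # empty comes first in the concat; a stable sort keeps it ahead of its match.
    out = pd.concat([empty, df]).sort_index(kind="mergesort")
    return out.reset_index(drop=True)


cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4', 'Jeep'],
        'Price': [22000, 25000, 27000, 35000, 27000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])
out = insert_blank_before(df, df['Price'].eq(27000))
print(out)
```

Recent pandas versions may emit a FutureWarning about concatenating all-NA frames here; the blank rows still end up in the same positions.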
Upvotes: 2
Reputation: 71689
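Let us try cummax with concat (DataFrame.append was removed in pandas 2.0):
m = df['Price'].eq(27000).cummax()
# one all-NaN row with the same columns as df
blank = pd.DataFrame([[float('nan')] * df.shape[1]], columns=df.columns)
pd.concat([pd.concat([df[~m], blank], ignore_index=True), df[m]])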
            Brand    Price
0     Honda Civic  22000.0
1  Toyota Corolla  25000.0
2             NaN      NaN
2      Ford Focus  27000.0
3         Audi A4  35000.0
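To see why the split works: cummax turns the match into a mask that stays True from the first match onward, so ~m selects everything before it. The sketch below prints the mask and, as an assumption about what is wanted, also passes ignore_index=True to the final concat so the duplicate label 2 shown above disappears. Note that this approach only inserts a row before the first match.

```python
import numpy as np
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4'],
        'Price': [22000, 25000, 27000, 35000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])

# True from the first row where Price == 27000 to the end of the frame
m = df['Price'].eq(27000).cummax()
print(m.tolist())  # [False, False, True, True]

# A single all-NaN row with the same columns as df
blank = pd.DataFrame([[np.nan] * df.shape[1]], columns=df.columns)

# rows before the match, the blank row, then the match and everything after it
out = pd.concat([df[~m], blank, df[m]], ignore_index=True)
print(out)
```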
Upvotes: 2
Reputation: 75080
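You can build a helper cumsum key for groupby, add a blank row only to the first group, and then concat (pd.concat replaces DataFrame.append, which was removed in pandas 2.0):
blank = pd.DataFrame([[float('nan')] * df.shape[1]], columns=df.columns)
out = pd.concat([pd.concat([g, blank], ignore_index=True) if i == 0 else g
                 for i, g in df.groupby(df['Price'].eq(27000).cumsum())])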
print(out)
            Brand    Price
0     Honda Civic  22000.0
1  Toyota Corolla  25000.0
2             NaN      NaN
2      Ford Focus  27000.0
3         Audi A4  35000.0
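The cumsum key also extends to several matches, since every group with a key above 0 starts with a matching row. The following is a sketch of that variation rather than the code above: it puts the blank row in front of each such group instead of after group 0, and it uses ignore_index=True to get a clean index. The five-row frame with a second 27000 is an assumed example.

```python
import numpy as np
import pandas as pd

cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4', 'Jeep'],
        'Price': [22000, 25000, 27000, 35000, 27000]}
df = pd.DataFrame(cars, columns=['Brand', 'Price'])

# The key increases by 1 at every match: 0, 0, 1, 1, 2
key = df['Price'].eq(27000).cumsum()

# A single all-NaN row with the same columns as df
blank = pd.DataFrame([[np.nan] * df.shape[1]], columns=df.columns)

pieces = []
for k, g in df.groupby(key):
    if k > 0:
        # every group after key 0 begins with a matching row
        pieces.append(blank)
    pieces.append(g)

out = pd.concat(pieces, ignore_index=True)
print(out)
```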
Upvotes: 3