Reputation: 137
In pandas, how can I copy or move a row to the top of the DataFrame without creating a copy of the DataFrame?
For example, I managed to do almost what I want with the code below, but I have the impression that there might be a better way to accomplish this:
import pandas as pd
df = pd.DataFrame({'Probe':['Test1','Test2','Test3'], 'Sequence':['AATGCGT','TGCGTAA','ATGCATG']})
df
Probe Sequence
0 Test1 AATGCGT
1 Test2 TGCGTAA
2 Test3 ATGCATG
df_shifted = df.shift(1)
df_shifted
Probe Sequence
0 NaN NaN
1 Test1 AATGCGT
2 Test2 TGCGTAA
df_shifted.loc[0] = df.loc[2]  # .loc replaces .ix, which was removed in pandas 1.0
df_shifted
Probe Sequence
0 Test3 ATGCATG
1 Test1 AATGCGT
2 Test2 TGCGTAA
Upvotes: 7
Views: 18787
Reputation: 299
Here's an alternative that does not require a new column or sort_values.
With default index (RangeIndex):
df = pd.DataFrame({'Probe': ['Test1', 'Test2', 'Test3'],
'Sequence': ['AATGCGT', 'TGCGTAA', 'ATGCATG']})
Probe Sequence
0 Test1 AATGCGT
1 Test2 TGCGTAA
2 Test3 ATGCATG
to_appear_first = [2]
new_index_order = [*to_appear_first, *df.index.difference(to_appear_first)]
df.loc[new_index_order].reset_index(drop=True)
Probe Sequence
0 Test3 ATGCATG
1 Test1 AATGCGT
2 Test2 TGCGTAA
With Probe as the index column:
df = pd.DataFrame({'Probe': ['Test1', 'Test2', 'Test3'],
'Sequence': ['AATGCGT', 'TGCGTAA', 'ATGCATG']}).set_index("Probe")
Sequence
Probe
Test1 AATGCGT
Test2 TGCGTAA
Test3 ATGCATG
to_appear_first = ["Test3"]
new_index_order = [*to_appear_first, *df.index.difference(to_appear_first)]
df.loc[new_index_order]
Sequence
Probe
Test3 ATGCATG
Test1 AATGCGT
Test2 TGCGTAA
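One caveat with the snippets above: Index.difference returns its result sorted, so the remaining rows keep their original order only when the index is already sorted (as it is in both examples). Below is a minimal sketch that keeps the original order for any index with unique labels. The helper name move_rows_to_top and the unsorted index values are made up for illustration.

```python
import pandas as pd


def move_rows_to_top(df, labels):
    """Return a new DataFrame with `labels` first and the rest in original order.

    Hypothetical helper; assumes the index labels are unique.
    """
    labels = list(labels)
    # Boolean mask instead of Index.difference, which would sort the remainder.
    rest = df.index[~df.index.isin(labels)]
    return df.loc[[*labels, *rest]]


# Deliberately unsorted index to show that the remaining order is preserved.
df = pd.DataFrame(
    {"Probe": ["Test1", "Test2", "Test3"],
     "Sequence": ["AATGCGT", "TGCGTAA", "ATGCATG"]},
    index=[10, 5, 7],
)

moved = move_rows_to_top(df, [7])
print(moved)
```

Like df.loc[new_index_order], this returns a new DataFrame and leaves df untouched. Add .reset_index(drop=True) if you want a fresh 0..n-1 index.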
Upvotes: 0
Reputation: 25709
Try this. You don't need to make a copy of the dataframe.
df["new"] = range(1,len(df)+1)
Probe Sequence new
0 Test1 AATGCGT 1
1 Test2 TGCGTAA 2
2 Test3 ATGCATG 3
df.ix[2,'new'] = 0
df.sort_values("new").drop('new', axis=1)
Probe Sequence
2 Test3 ATGCATG
0 Test1 AATGCGT
1 Test2 TGCGTAA
Basically, since you can't insert the row into the index at 0, create a column so you can.
If you want the index ordered, use this:
df.sort_values("new").reset_index(drop=True).drop('new', axis=1)
Probe Sequence
0 Test3 ATGCATG
1 Test1 AATGCGT
2 Test2 TGCGTAA
Edit: df.ix is deprecated (and was removed in pandas 1.0). Here's the same method with .loc.
df["new"] = range(1,len(df)+1)
df.loc[df.index==2, 'new'] = 0
df.sort_values("new").drop('new', axis=1)
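For anyone who wants to run this end to end, here is a self-contained sketch of the same helper-column idea. It uses assign so the temporary column never lands on df itself. The variable names row_to_move, key, and result are my own, and note that sort_values still returns a new DataFrame rather than reordering df in place.

```python
import pandas as pd

df = pd.DataFrame({"Probe": ["Test1", "Test2", "Test3"],
                   "Sequence": ["AATGCGT", "TGCGTAA", "ATGCATG"]})

row_to_move = 2  # index label of the row that should come first

# Sort key: 0 for the row to move, 1..n for every other row in current order.
key = [0 if label == row_to_move else pos
       for pos, label in enumerate(df.index, start=1)]

# assign() adds the temporary "new" column to a copy, so df stays unchanged.
result = (
    df.assign(new=key)
      .sort_values("new")
      .drop(columns="new")
      .reset_index(drop=True)
)
print(result)
```

Drop the final reset_index call if you would rather keep the original index labels.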
Upvotes: 8
Reputation: 137
Okay, I think I came up with a solution. By all means, please feel free to add your own answer if you think yours is better:
import numpy as np
df.loc[3] = np.nan  # .loc replaces .ix, which was removed in pandas 1.0
df
Probe Sequence
0 Test1 AATGCGT
1 Test2 TGCGTAA
2 Test3 ATGCATG
3 NaN NaN
df = df.shift(1)
Probe Sequence
0 NaN NaN
1 Test1 AATGCGT
2 Test2 TGCGTAA
3 Test3 ATGCATG
df.loc[0] = df.loc[3]  # after the shift, Test3 sits at index 3
df
Probe Sequence
0 Test3 ATGCATG
1 Test1 AATGCGT
2 Test2 TGCGTAA
3 Test3 ATGCATG
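The result above is a copy of the row to the top: Test3 shows up at index 0 and again at index 3. If it helps, here is a rough sketch of both the "copy" and the "move" variants using pd.concat, which is not the shift approach from this answer but a different idiom. Both variants build a new DataFrame, and the names copied and moved are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Probe": ["Test1", "Test2", "Test3"],
                   "Sequence": ["AATGCGT", "TGCGTAA", "ATGCATG"]})

# Copy: row 2 is placed on top and also stays in its original position.
# df.loc[[2]] (double brackets) keeps the row as a one-row DataFrame.
copied = pd.concat([df.loc[[2]], df], ignore_index=True)

# Move: row 2 is placed on top and dropped from its original position.
moved = pd.concat([df.loc[[2]], df.drop(index=2)], ignore_index=True)

print(copied)
print(moved)
```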
Upvotes: 1