Reputation: 474
I have a dataframe in which every two consecutive rows are related, and I want to give each pair of rows a unique ID. I thought this would be easy, but I cannot figure it out. Say I have this dataframe:
import pandas as pd
df = pd.DataFrame({'Var1': ['A', 2, 'C', 7], 'Var2': ['B', 5, 'D', 9]})
print(df)
  Var1 Var2
0    A    B
1    2    5
2    C    D
3    7    9
I would like to add an ID that would result in a dataframe that looks like this:
df = pd.DataFrame({'ID' : [1,1,2,2],'Var1': ['A', 2, 'C', 7], 'Var2': ['B', 5, 'D', 9]})
print(df)
   ID Var1 Var2
0   1    A    B
1   1    2    5
2   2    C    D
3   2    7    9
This is just a sample, but every two rows are related, so I am just trying to count 1, 1, 2, 2, 3, 3, etc. in the ID column.
Thanks for any help!
Upvotes: 0
Views: 264
Reputation: 1292
I don't think there is a native pandas way to do it, but this works as long as the dataframe has the default 0, 1, 2, ... index:
import pandas as pd
df = pd.DataFrame({'Var1': ['A', 2, 'C', 7], 'Var2': ['B', 5, 'D', 9]})
df['ID'] = 1 + df.index // 2
df[['ID', 'Var1', 'Var2']]
Output:
   ID Var1 Var2
0   1    A    B
1   1    2    5
2   2    C    D
3   2    7    9
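Since `df.index // 2` works on the index labels rather than the row positions, it can break the pairing once the index is no longer 0, 1, 2, .... Here is a minimal sketch of that case, assuming a hypothetical frame `relabelled` whose index was left with gaps (for example after filtering rows); the names `relabelled`, `by_label` and `by_position` are made up for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Var1': ['A', 2, 'C', 7], 'Var2': ['B', 5, 'D', 9]})

# Hypothetical case: the index has gaps, e.g. after rows were filtered out.
relabelled = df.copy()
relabelled.index = [1, 2, 5, 6]

# Label-based IDs follow the index values, so consecutive rows are no longer paired.
by_label = 1 + relabelled.index // 2

# Position-based IDs ignore the labels and pair rows 0-1, 2-3, ...
by_position = 1 + np.arange(len(relabelled)) // 2

print(list(by_label))      # [1, 2, 3, 4]
print(list(by_position))   # [1, 1, 2, 2]
```

Calling `df.reset_index(drop=True)` first would be another way to get back to the default index before using the index-based one-liner.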
Upvotes: 1
Reputation: 215057
You can create a sequence of row positions first, integer-divide it by 2, and add 1 so the IDs start at 1:
import numpy as np
df['ID'] = np.arange(len(df)) // 2 + 1
df
# Var1 Var2 ID
#0 A B 1
#1 2 5 1
#2 C D 2
#3 7 9 2
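If the ID should be the first column, as in the question's desired output, or the groups may be larger than two rows, the same arithmetic can be wrapped in a small helper. This is only a sketch: `add_group_id` and its `size` and `name` parameters are hypothetical names, not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd

def add_group_id(frame, size=2, name='ID'):
    """Return a copy with a 1-based ID shared by each run of `size` consecutive rows."""
    out = frame.copy()
    # Row positions 0..n-1, integer-divided by the group size, shifted to start at 1.
    out.insert(0, name, np.arange(len(out)) // size + 1)
    return out

df = pd.DataFrame({'Var1': ['A', 2, 'C', 7], 'Var2': ['B', 5, 'D', 9]})
result = add_group_id(df)
print(result)
```

`DataFrame.insert` places the new column at position 0, so no column reordering is needed afterwards, and the original `df` is left unchanged because the helper works on a copy.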
Upvotes: 1