Reputation: 433
Assume that I have the following dataframe:
id item_name item_date item_quantity
0 computer hp 01/10/2018 50
1 computer hp 02/10/2018 201
2 computer dell 01/10/2018 45
3 computer dell 02/10/2018 59
I would like a way to create two dataframes from this one:
id item_name item_date item_quantity
0 computer hp 01/10/2018 50
1 computer hp 02/10/2018 201
id item_name item_date item_quantity
2 computer dell 01/10/2018 45
3 computer dell 02/10/2018 59
Can you explain how to do this in as little time as possible? Thank you. If anything is unclear, just let me know and I will rephrase it ;)
Upvotes: 0
Views: 55
Reputation: 3710
If you just want to split the dataframe at a given index, use the following:
import pandas as pd
df = pd.DataFrame({'Date': [1, 2, 3, 4], 'B': [1, 2, 3, 2], 'C': ['A','B','C','D']})
n = 2
df1 = df[:n]
df2 = df[n:]
Date B C
0 1 1 A
1 2 2 B
Date B C
2 3 3 C
3 4 2 D
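As a rough sketch, the same positional split can be applied to the dataframe from the question. The data is copied from the question; the split position n = 2 is an assumption that only holds because the two "computer hp" rows come first, and .iloc is used here to make the positional slicing explicit.

```python
import pandas as pd

# Data copied from the question
df = pd.DataFrame({
    "id": [0, 1, 2, 3],
    "item_name": ["computer hp", "computer hp", "computer dell", "computer dell"],
    "item_date": ["01/10/2018", "02/10/2018", "01/10/2018", "02/10/2018"],
    "item_quantity": [50, 201, 45, 59],
})

# Assumed split position: rows before n go to df1, the rest go to df2
n = 2
df1 = df.iloc[:n]
df2 = df.iloc[n:]
```

Here df1 holds the "computer hp" rows and df2 the "computer dell" rows. If the rows were not already ordered by item_name, a fixed position would no longer separate the items.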
Upvotes: 1
Reputation: 14103
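You can use groupby with transform('min') to find the min date in each group, then use np.split() to split the dataframe at the indices of those rows, which creates one new dataframe per group. This assumes that the rows of each group are contiguous and that each group starts with its min date.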
import numpy as np
# group df by name and find the min date of each group
group = df.groupby('item_name')['item_date'].transform('min')
# keep the rows whose date equals the min date of their group
x = df.loc[df['item_date'] == group]
# get the indices of those rows
idx = list(x.index.values)
# split the df at those indices; idx starts at 0, so dfs[0] is empty
dfs = np.split(df, idx)
dfs[1]
id item_name item_date item_quantity
0 0 computer hp 1/10/2018 50
1 1 computer hp 2/10/2018 201
dfs[2]
id item_name item_date item_quantity
2 2 computer dell 1/10/2018 45
3 3 computer dell 2/10/2018 59
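A different technique from the np.split() approach above, shown only as a minimal sketch: iterating over the groupby object yields one sub-dataframe per item_name directly, without computing split indices. The data is copied from the question; the names dfs, hp and dell are illustrative.

```python
import pandas as pd

# Data copied from the question
df = pd.DataFrame({
    "id": [0, 1, 2, 3],
    "item_name": ["computer hp", "computer hp", "computer dell", "computer dell"],
    "item_date": ["01/10/2018", "02/10/2018", "01/10/2018", "02/10/2018"],
    "item_quantity": [50, 201, 45, 59],
})

# Build one dataframe per item_name; sort=False keeps the groups
# in order of first appearance instead of sorting the keys
dfs = {name: group for name, group in df.groupby("item_name", sort=False)}

hp = dfs["computer hp"]
dell = dfs["computer dell"]
```

This does not depend on the rows of each group being contiguous, and it scales to any number of distinct item names.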
Upvotes: 1