Mistapopo

Reputation: 433

Create two dataframes from a given dataframe

Assume that I have the following dataframe

id  item_name      item_date   item_quantity
0   computer hp    01/10/2018   50
1   computer hp    02/10/2018  201
2   computer dell  01/10/2018  45
3   computer dell  02/10/2018  59

I would like a way to create two dataframes from this one:

new_df1

id  item_name      item_date   item_quantity
0   computer hp    01/10/2018   50
1   computer hp    02/10/2018  201

new_df2

id  item_name      item_date   item_quantity
2   computer dell  01/10/2018  45
3   computer dell  02/10/2018  59

Can you explain how to do this in as little time as possible? Thank you. If anything is unclear, just let me know and I will rephrase it ;)

Upvotes: 0

Views: 55

Answers (2)

2Obe

Reputation: 3710

If you just want to split the dataframe at a given row position, use the following:

import pandas as pd

df = pd.DataFrame({'Date': [1, 2, 3, 4], 'B': [1, 2, 3, 2], 'C': ['A','B','C','D']})

n = 2  # row position at which to split

df1 = df[:n]  # first n rows
df2 = df[n:]  # remaining rows

print(df1)
print(df2)

   Date  B  C
0     1  1  A
1     2  2  B
   Date  B  C
2     3  3  C
3     4  2  D
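
Slicing at a fixed position only works if you already know where one item ends and the next begins. If the goal is to split on the value of item_name, as in the question, a boolean mask may be closer to what is wanted. This is only a sketch: the sample data is copied from the question, and I am assuming the id column shown there is the dataframe's default index.

```python
import pandas as pd

# Sample data copied from the question; "id" is assumed to be the default index
df = pd.DataFrame({
    "item_name": ["computer hp", "computer hp", "computer dell", "computer dell"],
    "item_date": ["01/10/2018", "02/10/2018", "01/10/2018", "02/10/2018"],
    "item_quantity": [50, 201, 45, 59],
})

# Boolean mask: True for the rows that belong to the first item
is_hp = df["item_name"] == "computer hp"

new_df1 = df[is_hp]   # rows where item_name is "computer hp"
new_df2 = df[~is_hp]  # all remaining rows

print(new_df1)
print(new_df2)
```

The mask works regardless of row order, so the rows of one item do not have to be contiguous.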

Upvotes: 1

It_is_Chris

Reputation: 14103

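You can group by item_name and use transform('min') to find the min date in each group, then use np.split() to split the dataframe at the index of each group's min date, which creates one new dataframe per group. This assumes the rows of each item are contiguous and the index is the default 0..n-1.

import numpy as np

# group df on name and find the min date of each group
# (item_date holds strings here, so 'min' compares them as text)
group = df.groupby('item_name')['item_date'].transform('min')

# keep the rows whose date matches their group's min date
x = df.loc[df['item_date'] == group]

# get the indices of those rows
idx = list(x.index.values)

# split the df at those indices; dfs[0] is empty because the first index is 0
dfs = np.split(df, idx)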

dfs[1]

   id    item_name  item_date  item_quantity
0   0  computer hp  1/10/2018             50
1   1  computer hp  2/10/2018            201

dfs[2]

   id      item_name  item_date  item_quantity
2   2  computer dell  1/10/2018             45
3   3  computer dell  2/10/2018             59
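
As an alternative to computing split indices, iterating over the groupby object directly gives one dataframe per item_name. This is a hedged sketch rather than the method above: the sample data is copied from the question, the id column is assumed to be the default index, and the name frames is just illustrative.

```python
import pandas as pd

# Sample data copied from the question; "id" is assumed to be the default index
df = pd.DataFrame({
    "item_name": ["computer hp", "computer hp", "computer dell", "computer dell"],
    "item_date": ["01/10/2018", "02/10/2018", "01/10/2018", "02/10/2018"],
    "item_quantity": [50, 201, 45, 59],
})

# Iterating a groupby yields (key, sub-dataframe) pairs.
# sort=False keeps the groups in order of first appearance.
frames = {name: group for name, group in df.groupby("item_name", sort=False)}

new_df1 = frames["computer hp"]
new_df2 = frames["computer dell"]

print(new_df1)
print(new_df2)
```

This does not depend on the rows being sorted or on the index values, and it scales to any number of distinct items without listing them by hand.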

Upvotes: 1
