Reputation: 87
It's info for some purchases made by clients on phone accessories, my real data would look something like this:
Abstract Model 1 ~Samsung S6 | Sold: 4
I've got a dataset that looks something like this:
item sold
Design1 ~Model1 1
Design2 ~Model1 2
Design1 ~Model2 3
Design2 ~Model2 1
I want to break the item
column into 2 columns , design
and model
, and count each time a design
has been sold, and a model
has been sold, individually, based on the selling data of design+model combinations in the input.
My expected output, based on the first dataset, would look something like this:
expected output:
design design_sold model model_sold
Design1 4 Model1 3
Design2 3 Model2 4
Upvotes: 1
Views: 1171
Reputation: 11192
try this,
df[['Design','Model']]=df['item'].str.split(' ~',expand=True)
print pd.concat([df.groupby('Design',as_index=False)['sold'].sum().rename(columns={'sold':'Desgin Sold'}),df.groupby('Model',as_index=False)['sold'].sum().rename(columns={'sold':'Model Sold'})],axis=1)
Output:
Design Desgin Sold Model Model Sold
0 Design1 4 Model1 3
1 Design2 3 Model2 4
Explanation:'
1. .str.split()
used to split your series into frame.
groupby
model and design and perform sum
on grouped object.
rename
the column and concat
your dataframe.
Upvotes: 1