Pandas - split columns and count occurrences

Question

It's info for some purchases made by clients on phone accessories, my real data would look something like this: Abstract Model 1 ~Samsung S6 | Sold: 4

I've got a dataset that looks something like this:

item               sold
Design1 ~Model1 1
Design2 ~Model1 2
Design1 ~Model2 3
Design2 ~Model2 1

I want to break the item column into 2 columns , design and model, and count each time a design has been sold, and a model has been sold, individually, based on the selling data of design+model combinations in the input.

My expected output, based on the first dataset, would look something like this:

expected output:        

design    design_sold    model      model_sold

Design1   4              Model1     3
Design2   3              Model2     4

Mohamed Thasin ah · Accepted Answer

try this,

df[['Design','Model']]=df['item'].str.split(' ~',expand=True)
print pd.concat([df.groupby('Design',as_index=False)['sold'].sum().rename(columns={'sold':'Desgin Sold'}),df.groupby('Model',as_index=False)['sold'].sum().rename(columns={'sold':'Model Sold'})],axis=1)

Output:

    Design  Desgin Sold   Model  Model Sold
0  Design1            4  Model1           3
1  Design2            3  Model2           4

Explanation:' 1. .str.split() used to split your series into frame.

groupby model and design and perform sum on grouped object.
rename the column and concat your dataframe.

Pandas - split columns and count occurrences

Answers (1)

Related Questions