Rookie
Rookie

Reputation: 13

Pandas - Count occurences in a column

I have this file with 19 columns of mixed dtypes. One of the column Names contain elements which are separated by space. For example:

Col1     Col2      
adress1   x
adress2   a b
adress3   x c
adress4   a x d

What I want to do is go over Col2 and find out how many times each element occurs and put the result in a new column along with its corresponding in Col1

Note the above columns were already processed as a Dataframe. I have this which somewhat give me the results but not what I want ultimately.

new_df = pd.Dataframe(old_df.Col2.str.split(' ').tolist(), index=old_df.Col1).stack

How do I put the results in a new column (replacing Col2) and also have the remaining columnS?

Something like:

Col1     Col2      Col3
adress1   x        something
adress2   a        something1 
adress2   b        something1
adress3   x        NaN
adress3   c        NaN

Also calculate occurrence of items in Col2?

Upvotes: 0

Views: 56

Answers (1)

BENY
BENY

Reputation: 323226

We can do split first then do explode

s=df.assign(Col2=df.Col2.str.split()).explode('Col2')
s=s.groupby(['Col1','Col2']).size().to_frame('count').reset_index()
Out[48]: 
      Col1 Col2  count
0  adress1    x      1
1  adress2    a      1
2  adress2    b      1
3  adress3    c      1
4  adress3    x      1
5  adress4    a      1
6  adress4    d      1
7  adress4    x      1

Upvotes: 1

Related Questions