Value counts for specific items in a DataFrame

Question

I have a dataframe (df) of messages that looks similar to the following:

From                To
person1@gmail.com   stranger1@gmail.com
person2@gmail.com   stranger1@gmail.com, stranger2@gmail.com
person3@gmail.com   person1@gmail.com, stranger2@gmail.com

I want to count the number of times each email from a specific list appears. My list is:

lst = ['person1@gmail.com', 'stranger2@gmail.com', 'person3@gmail.com']

I'm hoping to receive a dataframe/series/dictionary with a result like this:

list_item              Total_Count
person1@gmail.com      2
stranger2@gmail.com    2
person3@gmail.com      1

I've tried several different things, but haven't succeeded. I thought I could try something like the for loop below (it raises a SyntaxError), but I cannot figure out the right way to write it.

for To,From in zip(df.To, df.From): 
    for item in lst:
        if To,From contains item in emails:
            Count(item)

Should this type of task be accomplished with a for loop, or are there out-of-the-box pandas methods that could solve this more easily?

cs95 · Accepted Answer

`stack`-based

Split your To column, stack everything and then do a value_counts:

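# Split To into one column per recipient, place From beside it, and stack into one Series
v = pd.concat([df.From, df.To.str.split(', ', expand=True)], axis=1).stack()
# Keep only the addresses in lst, then count them
v[v.isin(lst)].value_counts()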

stranger2@gmail.com    2
person1@gmail.com      2
person3@gmail.com      1
dtype: int64
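
For a reproducible run, the frame from the question can be rebuilt by hand. This is a minimal sketch, assuming the To column separates addresses with ', ' exactly as shown in the question; the names `recipients` and `counts` are introduced here for illustration only.

```python
import pandas as pd

# Rebuild the sample frame and list from the question.
df = pd.DataFrame({
    'From': ['person1@gmail.com', 'person2@gmail.com', 'person3@gmail.com'],
    'To': ['stranger1@gmail.com',
           'stranger1@gmail.com, stranger2@gmail.com',
           'person1@gmail.com, stranger2@gmail.com'],
})
lst = ['person1@gmail.com', 'stranger2@gmail.com', 'person3@gmail.com']

# One column per recipient; rows with fewer recipients are padded with missing values.
recipients = df['To'].str.split(', ', expand=True)

# stack() flattens all columns into one long Series of addresses.
# Any padding that survives is excluded by isin() below.
v = pd.concat([df['From'], recipients], axis=1).stack()

# Keep only the addresses of interest and count them.
counts = v[v.isin(lst)].value_counts()
print(counts.to_dict())
```

If a plain dictionary is preferred over a Series, `counts.to_dict()` gives one directly. The order of tied entries may differ between pandas versions.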

`melt`

Another option is to use melt:

v = (df.set_index('From')
      .To.str.split(', ', expand=True)
      .reset_index()
      .melt()['value']
)
v[v.isin(lst)].value_counts()

stranger2@gmail.com    2
person1@gmail.com      2
person3@gmail.com      1
Name: value, dtype: int64

Note that set_index + str.split + reset_index is equivalent to pd.concat([...])...
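
That the two routes build the same intermediate frame can be checked directly. This is a rough sketch, assuming the sample data from the question; `wide_concat`, `wide_index` and `same` are illustrative names, not part of the original answer.

```python
import pandas as pd

# Sample frame from the question.
df = pd.DataFrame({
    'From': ['person1@gmail.com', 'person2@gmail.com', 'person3@gmail.com'],
    'To': ['stranger1@gmail.com',
           'stranger1@gmail.com, stranger2@gmail.com',
           'person1@gmail.com, stranger2@gmail.com'],
})

# Route 1: concatenate From with the split To columns.
wide_concat = pd.concat([df['From'], df['To'].str.split(', ', expand=True)], axis=1)

# Route 2: carry From through as the index, split To, then restore From as a column.
wide_index = df.set_index('From')['To'].str.split(', ', expand=True).reset_index()

# Both frames should have the columns From, 0, 1 and identical cells.
same = wide_concat.equals(wide_index)
print(list(wide_concat.columns), list(wide_index.columns), same)
```

Since both routes give the same wide frame, the choice between stack and melt afterwards comes down to preference.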
