Is there a proper way to convert dataframe column containing in each row comma separated strings to a dictionary?

Question

ID	Genre
1	Drama, Comedy
2	Action, Science Fiction, Suspense
3	Horror, Suspense
4	Comedy
5	Action, Drama, Sports
6	Comedy, Science Fiction
7	Drama

I have a Pandas dataframe with around 10000 rows, these above are the first 7th rows. I would like to create another dataframe or a dictionary that will count the occurrence of every genre existing in the dataset.

Something like this:

Genre	Count
Drama	3
Comedy	3
Action	2
Science Fiction	2
Suspense	2
Sports	1

Ps. In the original df there are more then this 6 genres.

I have an idea how to do it iteratively with a for loop:

my_dict = {} 
for elem in df['Genre']:
    for genre in elem.split(','):
        if genre in my_dict:
            my_dict[genre] += 1
        else:
            my_dict[genre] = 1

However, is there any way to achieve this using NumPy or Pandas functions? As with 10000 rows it takes a lot of time.

Is there a proper way to convert dataframe column containing in each row comma separated strings to a dictionary?

Answers (1)

Related Questions