kitchenprinzessin

Reputation: 1043

Count occurrences of a value in a Series

I am new to Python. I am reading a CSV file with three columns (lib, imports, import_tuples). For each value in the second column (imports), how can I count the number of times it occurs in the third column (import_tuples) of the same row?

Example:

lib,imports,import_tuples 
lib1,"[0, 1, 2, 3, 4, 5]","[[5, 1, 2], [2,1,3],[2, 4, 1]]" 
lib2,"[4, 65, 99, 100]","[[4, 65, 100], [100, 4],[99, 65]]"

Expected Output (for lib1)
0 1 2 3 4 5
0 3 3 1 1 1 

import pandas
from collections import Counter
df = pandas.read_csv('temp_data.csv')
myList = second.values.T.tolist()

c = df["import_tuples"].str.split(',').apply(Counter)
data = pandas.DataFrame({n: c.apply(lambda x: x.get(n, 0)) for n in myList})
data =  c.to_frame()

Upvotes: 0

Views: 1334

Answers (1)

Stefan

Reputation: 42905

You can use pandas.Series.str.findall() to extract the numbers from the strings and then use collections.Counter:

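from collections import Counter

# Extract every run of digits from each string as a list of number strings
df['imports'] = df.imports.str.findall(r'\d+')
df['import_tuples'] = df.import_tuples.str.findall(r'\d+')
# Count how often each import occurs in import_tuples (.get() returns None if it never occurs)
df['imports_counted'] = df.apply(lambda x: {i: Counter(x.import_tuples).get(i) for i in x.imports}, axis=1)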

    lib             imports                 import_tuples  \
0  lib1  [0, 1, 2, 3, 4, 5]   [5, 1, 2, 2, 1, 3, 2, 4, 1]   
1  lib2    [4, 65, 99, 100]  [4, 65, 100, 100, 4, 99, 65]   

                                     imports_counted  
0  {'2': 3, '5': 1, '0': None, '3': 1, '4': 1, '1...  
1               {'99': 1, '4': 2, '100': 2, '65': 2}  
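
Note that the result above reports None for a value that never occurs, and its keys are strings, whereas the expected output in the question shows 0. A minimal sketch of one way to get integer keys and zero counts follows. The helper name count_imports and the inline csv_text are assumptions made for illustration (the sample rows are copied from the question); indexing a Counter with a missing key returns 0, which is what replaces the None.

```python
import io
import re
from collections import Counter

import pandas as pd

# Sample rows copied from the question, inlined so the sketch is self-contained
csv_text = '''lib,imports,import_tuples
lib1,"[0, 1, 2, 3, 4, 5]","[[5, 1, 2], [2,1,3],[2, 4, 1]]"
lib2,"[4, 65, 99, 100]","[[4, 65, 100], [100, 4],[99, 65]]"
'''
df = pd.read_csv(io.StringIO(csv_text))


def count_imports(imports_str, tuples_str):
    """Hypothetical helper: count each import's occurrences in the tuples string."""
    imports = [int(n) for n in re.findall(r"\d+", imports_str)]
    # Build the Counter once per row; a missing key yields 0 when indexed
    seen = Counter(int(n) for n in re.findall(r"\d+", tuples_str))
    return {n: seen[n] for n in imports}


df["imports_counted"] = [
    count_imports(a, b) for a, b in zip(df["imports"], df["import_tuples"])
]

lib1_counts = df.loc[df["lib"] == "lib1", "imports_counted"].iloc[0]
lib2_counts = df.loc[df["lib"] == "lib2", "imports_counted"].iloc[0]
print(lib1_counts)  # {0: 0, 1: 3, 2: 3, 3: 1, 4: 1, 5: 1}
```

Building the Counter once per row, rather than once per looked-up value, also avoids recounting the same list for every entry in imports.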

Upvotes: 1
