Reputation: 337
i am working on python code to calculate the occurrences of few values in a column within a CSV.
Example - CSV1 is as below
**Type Value**
Simple test
complex problem
simple formula
complex theory
simple idea
simple task
I need to get the content of value for type simple and complex i.e
**Type Value**
simple test
simple formula
simple idea
simple task
complex theory
complex problem
And query other CSV which is CSV1 on the total count of occurrences of simple list i.e [test, formula, idea, task] and complex list i.e [theory, problem]
Other CSV2 is
**Category**
test
test
test
formula
formula
formula
test
test
idea
task
task
idea
task
idea
task
problem
problem
theory
problem
problem
idea
task
problem
test
Both CSV1 and CSV2 are dynamic, from CSV1 as example for type "simple' get the list of the corresponding values and refer CSV2 to know what's count for each value. i.e counts of test, idea, task, formula.
Same for Complex type I tried multiple methods with pandas but not expecting result as expected. Any pointers please.
Upvotes: 1
Views: 126
Reputation: 863751
Use:
df2['cat'] = df2['Category'].map(df1.set_index('Value')['Type'])
df2 = df2['cat'].value_counts().rename_axis('a').reset_index(name='b')
print (df2)
a b
0 simple 18
1 complex 6
Upvotes: 2
Reputation: 46
Much like @jezrael,however I would first groupby the second csv. This would help in merging if the second csv is very large.
df2=cv2.groupby('value').agg(cnt=('value','count')).reset_index()
This would give me a dataframe with two columns, value and count.
Now, you can merge it with CV1
df1 = cv1.merge(df2,on=['value'],how='inner')
Upvotes: 1