Reputation: 221
I have 3 dataframes in Pandas:
1) user_interests:
With 'user' as an id, and 'interest' as an interest:
2) similarity_score:
With 'user' as a unique id matching ids in user_interests:
3) similarity_total:
With 'interest' being a list of all the unique interests in user_interests:
What I need to do:
Step 1: Look up each interest from similarity_total in user_interests
Step 2: Take the corresponding user from user_interests and match it to the user in similarity_score
Step 3: Take the corresponding similarity_score from similarity_score and add it to the corresponding interest in similarity_total
The ultimate objective is to total the similarity scores of all users interested in each subject in similarity_total. A diagram may help:
I know this can be done in Pandas in one line; however, I am not there yet. If anyone can point me in the right direction, that would be amazing. Thanks!
Upvotes: 0
Views: 45
Reputation: 153500
IIUC, I think you need:
user_interests['similarity_score'] = user_interests['user'].map(similarity_score.set_index('user')['similarity_score'])
similarity_total = user_interests.groupby('interest', as_index=False)['similarity_score'].sum()
Output:
            interest  similarity_score
0           Big Data          1.000000
1          Cassandra          1.338062
2              HBase          0.338062
3              Hbase          1.000000
4               Java          1.154303
5            MongoDB          0.338062
6              NoSQL          0.338062
7           Postgres          0.338062
8             Python          0.154303
9                  R          0.154303
10             Spark          1.000000
11             Storm          1.000000
12     decision tree          0.000000
13            libsvm          0.000000
14  machine learning          0.000000
15             numpy          0.000000
16            pandas          0.000000
17       probability          0.000000
18        regression          0.000000
19      scikit-learn          0.000000
20             scipy          0.000000
21        statistics          0.000000
22       statsmodels          0.000000
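For anyone who wants to try the map-then-groupby idea without the original data, here is a minimal sketch. The three users, their interests, and their scores are made up for illustration (the question's actual frames are not shown), and scores_by_user is just a helper name I introduced; only the column names 'user', 'interest', and 'similarity_score' come from the question.

```python
import pandas as pd

# Hypothetical toy data; these values are NOT from the question.
user_interests = pd.DataFrame({
    "user": [0, 0, 1, 1, 2],
    "interest": ["Java", "Big Data", "Java", "NoSQL", "pandas"],
})
similarity_score = pd.DataFrame({
    "user": [0, 1, 2],
    "similarity_score": [1.0, 0.5, 0.0],
})

# Build a Series indexed by user id, then use it as a lookup table
# to attach each user's score to every (user, interest) row.
scores_by_user = similarity_score.set_index("user")["similarity_score"]
user_interests["similarity_score"] = user_interests["user"].map(scores_by_user)

# Sum the attached scores per interest.
similarity_total = (
    user_interests.groupby("interest", as_index=False)["similarity_score"].sum()
)
print(similarity_total)
```

Note that map leaves NaN for any user missing from similarity_score, and the groupby sum skips those NaN values by default, so such rows simply contribute nothing to the total.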
Upvotes: 2
Reputation: 367
I'm not sure what code you have already written, but have you tried something similar to this for the merging? It's not one line, though.
# Merge user_interests with the similarity_total dataframe
ui_st_df = user_interests.merge(similarity_total, on='interest', how='left')
# Merge ui_st_df with the similarity_score dataframe
ui_ss_df = ui_st_df.merge(similarity_score, on='user', how='left')
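The merges above only line the rows up; to get the per-interest totals the question asks for, a groupby is still needed at the end. Here is a rough sketch of how that might look, with made-up toy data (the values and the names scored, merged, and totals are my own assumptions, not from the question). I also reordered the merges so that similarity_total is the left-hand frame, which keeps interests that no user has.

```python
import pandas as pd

# Hypothetical toy data; these values are NOT from the question.
user_interests = pd.DataFrame({
    "user": [0, 0, 1],
    "interest": ["Java", "Spark", "Java"],
})
similarity_score = pd.DataFrame({
    "user": [0, 1],
    "similarity_score": [1.0, 0.25],
})
similarity_total = pd.DataFrame({"interest": ["Java", "Spark", "R"]})

# Attach each user's score to their (user, interest) rows.
scored = user_interests.merge(similarity_score, on="user", how="left")

# Left-merge from the interest list so every interest survives,
# including ones with no matching user (their score is NaN here).
merged = similarity_total.merge(scored, on="interest", how="left")

# Sum per interest; an all-NaN group sums to 0.0 by default.
totals = merged.groupby("interest", as_index=False)["similarity_score"].sum()
print(totals)
```

In this toy case "R" has no interested users and ends up with a total of 0.0, which seems to match the zero rows in the other answer's output.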
Upvotes: 0