jalazbe
jalazbe

Reputation: 2005

get information from pandas' dataframe and python 3

I have two csv files. Contacts and Users.

How I load data into dataframes and merge them

First, I load a dataframe with the name of the users:

import pandas as pd
import numpy as np
df_users= pd.read_csv('./Users_001.csv',sep=',',usecols=[0,2,3])

Then I load the information from contacts of each user

df_contacts = pd.read_csv('./Contacts_001.csv',sep=',',usecols=[0,1,5,48,55,56,57,83,58])

df_users columns name are: user_id, Name, Surname

df_contacts columns name are: Contact ID, id user owner, fullname, qualification, ...

I want to merge both dataframes using user_id and 'id user owner' since they represent the same information. To to this I first change the name of the columns on df_contacts and then I merge

dfcontactos.columns = ['ID de Contacto','user_id','fullname','qualification','accesibility' ... ]
df_us_cont = pd.merge(dfcontactos,df_usuarios,on='user_id')

Now df_us_cont has the information from users and contacts.

What I want to do

There are only 18 user_id but there are 500 contacts. For each user I want to know:

What I have tried and fail

df_qua_lower100 = df_us_cont[df_us_cont['qualification']<100]
df_qua_lower100['user_id'].value_counts()

So far with this I am able to get the information on how many contacts with qualification<100 has each user_id. But I am unable to look how many have 'accesibility>=4'

I have tried to explain the best I could

Upvotes: 1

Views: 138

Answers (1)

Abhishek Sharma
Abhishek Sharma

Reputation: 2039

First thing you can merge without changing column names

df_us_cont = dfcontactos.merge(dfcontactos,left_on='id user owner',right_on='user_id')

You can add as many conditions as you want if you use loc

df_us_cont.loc[(df_us_cont['qualification']<100) & (df_us_cont['accesibility']>=4),'user_id'].value_counts()

Number of contacts with qualification > 100 and < 300

df_us_cont.loc[(df_us_cont['qualification']>100) &(df_us_cont['qualification']<300) & (df_us_cont['accesibility']>=4),'user_id'].value_counts()

Number of contacts with qualification > 300

df_us_cont.loc[(df_us_cont['qualification']>300) & (df_us_cont['accesibility']>=4),'user_id'].value_counts()

Upvotes: 1

Related Questions