Harish reddy
Harish reddy

Reputation: 431

Using VLookup Function on a dataframe

I have two data frames l3 which contains 559 rows and the other day frame 'mains' contains 58 rows and now I want to create a column 'new' in l3 such that if the value in l3 column 'dum' is present in any of the rows of the column 'Datecomb' in mains dataframe then the new column should contain True else False

I Tried :

def f(i):
    if((i == (mains['Datecomb']).any()) == True):
        return 'True'
    else:
        return 'False'

 l3['new']=l3['dum'].apply(f)

But this function returned only "False" I used value counts for both l3['dum'] and mnains['Datecomb'] and i could see common values using Ctrl+F option on chrome...

Upvotes: 0

Views: 46

Answers (1)

Mike Tomaino
Mike Tomaino

Reputation: 170

For your specific request, we can actually do this in one line with some method chaining and list comprehension:

I3['new'] = [mains['Datecomb'].where(mains['Datecomb'] == i).count() > 0 for i in I3['dum'].values]

This expression uses list comprehension to count the number of entries in mains['Datecomb'] for each value in I3['dum'], returns True if that count is > 0, and sets the resulting list as a column I3['new'].

If you are unfamiliar with list comprehension, take a look online as some examples. It is a powerful tool!

Alternatively, you can use the DataFrame.join() method to join the results of a value_counts() > 0 operation on mains:

I3 = I3.join(mains['Datecomb'].value_counts() > 0, on = 'dum').fillna(False)
I3 = I3.rename(columns={'Datecomb': 'new'})

For more general lookup operations, using the join(), merge(), or concat() methods are probably better. I just found the first method above to work well for your specific request

I tested this on the following sample data set and both methods worked

I3 = pd.DataFrame([
     ['dum1', 'old1'],
     ['dum2', 'old2'],
     ['dum3', 'old3'],
     ['dum4', 'old4']],
     columns = ['dum', 'old'] )
mains = pd.DataFrame([
     ['dum2', 'not_used2'],
     ['dum3', 'not_used3'],
     ['dum3', 'not_used4']],
     columns = ['Datecomb', 'NotUsed'])

Upvotes: 1

Related Questions