Rv R

Reputation: 311

Find the closest match from a list among the column names of a DataFrame in Python

I have a list of possible column names for 'net amount', i.e.

list1 = ['total amount', 'total cash', 'net amount']

I have a dataframe whose column names are, for instance:

df.columns = ['accounts receivables ffa', 'net amount of the year', 'cash refunded', 'payement']

I want to match the candidate names in list1 against the column names of df, and it should return 'net amount of the year'.

In other words: compare list1 with df.columns and get the most similar column name from df.

Any suggestions please?

Thanks in advance

Upvotes: 0

Views: 2134

Answers (2)

Oscar Cefetra

Reputation: 1

How about looping over both the list and the column names, then checking whether each list item (a string) is a substring of a column name (also a string)?

for el in list1:
    for col_name in df.columns:
        if el in col_name:
            print(col_name)
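
If the matches should be collected rather than printed, the same substring check could be wrapped in a small function. This is only a sketch: find_substring_matches is a hypothetical helper name, the plain list columns stands in for df.columns, and the case-insensitive comparison is an added assumption that the loop above does not make.

```python
def find_substring_matches(candidates, columns):
    """Return every column name that contains at least one candidate string.

    Hypothetical helper; comparison is case-insensitive (an assumption).
    """
    lowered = [c.lower() for c in candidates]
    return [col for col in columns if any(c in col.lower() for c in lowered)]


# Stand-in for df.columns from the question
columns = ['accounts receivables ffa', 'net amount of the year', 'cash refunded', 'payement']
list1 = ['total amount', 'total cash', 'net amount']

matches = find_substring_matches(list1, columns)
print(matches)  # ['net amount of the year']
```

Note that a substring test only finds columns that contain a candidate verbatim, so a misspelled or reordered column name would be missed.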

Upvotes: 0

Ran Cohen

Reputation: 751

You can use pyjarowinkler (https://pypi.org/project/pyjarowinkler/) to score how similar each list item is to each column name:

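from pyjarowinkler import distance
import pandas as pd

# Empty dataframe with the column names from the question
df = pd.DataFrame([], columns=['accounts receivables ffa', 'net amount of the year', 'cash refunded', 'payement'])
lst1 = ['total amount', 'total cash', 'net amount']

# Compare every candidate name with every column name
for item in lst1:
    for col in df.columns:
        # Keep only pairs whose similarity score exceeds the 0.85 threshold
        if distance.get_jaro_distance(item, col) > 0.85:
            print(item, ";", col)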
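
If installing an extra package is not an option, a similar idea could be sketched with difflib from the standard library. This swaps in a different similarity measure (difflib's SequenceMatcher ratio, not Jaro-Winkler), so the scores and a sensible threshold will differ from the code above. The name closest_columns is a hypothetical helper, the plain list columns stands in for df.columns, and the cutoff of 0.6 is simply difflib's default rather than a tuned value.

```python
import difflib


def closest_columns(candidates, columns, cutoff=0.6):
    """Map each candidate to the column names difflib rates at or above cutoff.

    Hypothetical helper; uses difflib's ratio, not Jaro-Winkler.
    """
    return {
        cand: difflib.get_close_matches(cand, columns, n=3, cutoff=cutoff)
        for cand in candidates
    }


# Stand-in for df.columns from the question
columns = ['accounts receivables ffa', 'net amount of the year', 'cash refunded', 'payement']
lst1 = ['total amount', 'total cash', 'net amount']

result = closest_columns(lst1, columns)
for cand, found in result.items():
    print(cand, ";", found)
```

With this data, 'net amount' is paired with 'net amount of the year'. Raising the cutoff makes matching stricter, so it is worth trying a few values on real column names.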


Upvotes: 1
