Reputation: 311
I have a list of possible column names for 'net amount', i.e.
list1 = ['total amount', 'total cash', 'net amount']
I have a dataframe whose column names are, for instance:
df.columns = ['accounts receivables ffa', 'net amount of the year', 'cash refunded', 'payement']
I want to match the candidate names in list1 against df.columns and get back the most similar column name from the dataframe.
For the example above, it should fetch me 'net amount of the year'.
Any suggestions please?
Thanks in advance
Upvotes: 0
Views: 2134
Reputation: 1
How about looping over both the list and the column names, then checking whether each list item (a string) is a substring of the column name (also a string)?
for el in list1:
    for col_name in df.columns:
        if el in col_name:
            print(col_name)
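If you want the matches collected rather than printed, the same substring check can be wrapped in a small helper. This is only a sketch: `find_columns` is a made-up name, the case-insensitive comparison is my own assumption about what you want, and the plain `columns` list stands in for `df.columns` (you can pass `df.columns` directly).

```python
def find_columns(candidates, columns):
    """Return the columns that contain any candidate name as a substring."""
    matches = []
    for col in columns:
        col_lower = str(col).lower()
        # Assumption: compare case-insensitively so 'Net Amount' also matches.
        if any(name.lower() in col_lower for name in candidates):
            matches.append(col)
    return matches


list1 = ['total amount', 'total cash', 'net amount']
# Stand-in for df.columns from the question.
columns = ['accounts receivables ffa', 'net amount of the year',
           'cash refunded', 'payement']

matched = find_columns(list1, columns)
print(matched)  # ['net amount of the year']
```

Note that a substring check only finds columns that contain a candidate verbatim, so a misspelled column name would be missed.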
Upvotes: 0
Reputation: 751
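You can use pyjarowinkler (https://pypi.org/project/pyjarowinkler/) to score how similar each candidate name is to each column name, and keep the pairs above a threshold: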
from pyjarowinkler import distance
import pandas as pd
df = pd.DataFrame([], columns=['accounts receivables ffa', 'net amount of the year', 'cash refunded', 'payement'])
lst1 = ['total amount', 'total cash', 'net amount']
for item in lst1:
    for col in df.columns:
        if distance.get_jaro_distance(item, col) > 0.85:
            print(item, ";", col)
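If installing an extra package is not an option, a similar idea can be sketched with the standard library's `difflib`. Note this is a swapped-in technique, not Jaro-Winkler: `get_close_matches` ranks by `SequenceMatcher` ratio, so the scores and a sensible cutoff differ from the 0.85 used above. The cutoff of 0.6 is just `difflib`'s default and is an assumption you would need to tune, and the plain `columns` list stands in for `list(df.columns)`.

```python
from difflib import get_close_matches

lst1 = ['total amount', 'total cash', 'net amount']
# Stand-in for list(df.columns) from the question.
columns = ['accounts receivables ffa', 'net amount of the year',
           'cash refunded', 'payement']

fuzzy_matches = {}
for item in lst1:
    # n=1 keeps only the best-scoring column; cutoff=0.6 is difflib's
    # default and is an assumption here, tune it for your data.
    close = get_close_matches(item, columns, n=1, cutoff=0.6)
    if close:
        fuzzy_matches[item] = close[0]

print(fuzzy_matches)
```

Candidates with no column above the cutoff are simply left out of the dictionary.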
Upvotes: 1