Mining for Term that is "Included In" Entry Rather than "Equal To"

Question

I am doing some data mining. I have a database that looks like this (pulling out three lines):

    100324822$10032482$1$PS$BENICAR$OLMESARTAN MEDOXOMIL$1$Oral$UNK$$$Y$$$$021286$$$TABLET$
    1014687010$10146870$2$SS$BENICAR HCT$HYDROCHLOROTHIAZIDE\OLMESARTAN MEDOXOMIL$1$Oral$1/2 OF 40/25MG TABLET$$$Y$$$$$.5$DF$FILM-COATED TABLET$QD
    115700162$11570016$5$C$Olmesartan$OLMESARTAN$1$Unknown$UNK$$$U$U$$$$$$$

My code looks like this:

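    import pandas as pd

    # Read the '$'-delimited file; each line becomes a tuple of string fields.
    # The with-block closes the file, so no explicit close() is needed.
    with open('DRUG20Q4.txt') as fileDrug20Q4:
        drugTupleList20Q4 = [tuple(i.split('$')) for i in fileDrug20Q4]

    # Keep only the primary ID (field 0), role (field 3), and drug name (field 5)
    drug20Q4 = []
    for entryDrugPrimaryID20Q4 in drugTupleList20Q4:
        drug20Q4.append((entryDrugPrimaryID20Q4[0], entryDrugPrimaryID20Q4[3], entryDrugPrimaryID20Q4[5]))

    drugNameDataFrame20Q4 = pd.DataFrame(drug20Q4, columns=['PrimaryID', 'Role', 'Drug Name'])
    # Exact match: keeps only rows whose drug name is exactly 'OLMESARTAN'
    drugNameDataFrame20Q4 = pd.DataFrame(drugNameDataFrame20Q4.loc[drugNameDataFrame20Q4['Drug Name'] == 'OLMESARTAN'])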

Currently the code pulls out only entries whose name is exactly "OLMESARTAN". How do I capture all the variations, for instance "OLMESARTAN MEDOXOMIL"? I can't simply list every variant because there are too many of them, so I need something that captures any entry containing the term "OLMESARTAN".

Thanks!
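
One possible way to do the "included in" match, sketched here as an assumption rather than a confirmed answer, is pandas' `Series.str.contains`, which tests each entry for a substring instead of comparing the whole string. The sample rows below reuse the three entries from the question plus one made-up non-matching row (`ASPIRIN`), and the names `df`, `mask`, and `olmesartan` are illustrative only.

```python
import pandas as pd

# Sample rows: the three entries from the question, plus one
# hypothetical non-matching row added purely for illustration.
rows = [
    ("100324822", "PS", "OLMESARTAN MEDOXOMIL"),
    ("1014687010", "SS", "HYDROCHLOROTHIAZIDE\\OLMESARTAN MEDOXOMIL"),
    ("115700162", "C", "OLMESARTAN"),
    ("999999999", "C", "ASPIRIN"),  # made-up row, not from the source data
]
df = pd.DataFrame(rows, columns=["PrimaryID", "Role", "Drug Name"])

# Substring test instead of equality:
#   case=False  -> also matches "Olmesartan" or "olmesartan"
#   na=False    -> missing names count as non-matches
#   regex=False -> treat the term as literal text, not a regular expression
mask = df["Drug Name"].str.contains("OLMESARTAN", case=False, na=False, regex=False)
olmesartan = df[mask]

print(olmesartan)
```

In the question's code, the same mask could replace the `== 'OLMESARTAN'` comparison on `drugNameDataFrame20Q4['Drug Name']`. Note that a plain substring match would also pick up any name that merely embeds the term, so it is worth inspecting the matched names before relying on the result.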
