BVG

Reputation: 65

Finding a long word broken by a new line

I'm trying to find every occurrence of each word in a list, so I wrote this code:

narrative = "Lasix 40 mg b.i.d., for three days along with potassium chloride slow release 20 mEq b.i.d. for three days, Motrin 400 mg q.8h"

meds_name_final_list = ["lasix", "potassium chloride slow release", ...]


def all_occurences(text, word):
    # Yield the start index of every non-overlapping match of word in text.
    initial = 0
    while True:
        initial = text.find(word, initial)
        if initial == -1:
            return
        yield initial
        initial += len(word)


# Collect one list of start indices per search term.
offset = []
for item in meds_name_final_list:
    number = list(all_occurences(narrative.lower(), item))
    offset.append(number)

Desired output: a list of the starting index or indices in the text for each word being searched for, e.g.:

offset = [[1], [3, 10], [5, 50].....]

This code works perfectly for shorter terms such as antibiotics, emergency ward, insulin, etc. However, longer terms that are split across two lines by a newline are not detected by the function above.

Desired word: potassium chloride slow release
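
A minimal reproduction of the problem, assuming the line break falls inside the phrase (the shortened sample text below is made up for illustration):

```python
def all_occurences(text, word):
    # Yield the start index of every non-overlapping match of word in text.
    initial = 0
    while True:
        initial = text.find(word, initial)
        if initial == -1:
            return
        yield initial
        initial += len(word)


# Hypothetical excerpt: the newline sits between "chloride" and "slow".
sample = "lasix 40 mg along with potassium chloride\nslow release 20 meq"

found_short = list(all_occurences(sample, "lasix"))
found_long = list(all_occurences(sample, "potassium chloride slow release"))
print(found_short, found_long)
```

The short term is found, but the long one comes back empty because the text contains "chloride\nslow" where the search term has "chloride slow".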

Any suggestions on how to solve this?

Upvotes: 3

Views: 95

Answers (1)

Thinh Pham

Reputation: 402

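How about replacing the newlines with spaces before searching? Each newline becomes exactly one space, so the indices still line up with the original text: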

def all_occurences(text, word):
    initial = 0
    # Treat line breaks as ordinary spaces so multi-word terms still match.
    text = text.replace('\n', ' ')
    while True:
        initial = text.find(word, initial)
        if initial == -1:
            return
        yield initial
        initial += len(word)
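
If the line break can come with extra whitespace (for example "\r\n", or indentation at the start of the next line), a one-for-one replacement of '\n' is not enough. Here is a minimal sketch using a regular expression, assuming any run of whitespace may separate the words of a term. The name all_occurrences_ws and the sample narrative are made up for illustration:

```python
import re


def all_occurrences_ws(text, phrase):
    """Yield start indices of phrase in text, allowing any whitespace between its words."""
    # Escape each word so characters such as "." are matched literally,
    # then allow one or more whitespace characters between consecutive words.
    pattern = r"\s+".join(re.escape(part) for part in phrase.split())
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        yield match.start()


# Hypothetical narrative with a newline plus indentation inside the long term.
narrative = "Lasix 40 mg b.i.d., along with potassium chloride\n    slow release 20 mEq b.i.d."
terms = ["lasix", "potassium chloride slow release"]

offset = [list(all_occurrences_ws(narrative, term)) for term in terms]
print(offset)
```

Because the search runs on the original string, the returned indices refer to positions in the unmodified text, and re.IGNORECASE removes the need to call lower() first.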

Upvotes: 3
