Finding a long word broken by a new line

Question

I'm trying to search a text for every word in a list, so I wrote this code:

narrative = "Lasix 40 mg b.i.d., for three days along with potassium chloride slow release 20 mEq b.i.d. for three days, Motrin 400 mg q.8h"

meds_name_final_list = ["lasix", "potassium chloride slow release", ...]


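def all_occurences(file, str):
    # Yield the start index of every non-overlapping match of str in file.
    initial = 0
    while True:
        initial = file.find(str, initial)
        if initial == -1:
            return
        yield initial
        initial += len(str)


# Collect the offsets for each medication name (outside the generator).
offset = []
for item in meds_name_final_list:
    number = list(all_occurences(narrative.lower(), item))
    offset.append(number)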

Desired output: a list of the starting indices in the corpus for each word being searched for, e.g.:

offset = [[1], [3, 10], [5, 50].....]

This code works perfectly for shorter terms such as antibiotics, emergency ward, insulin, etc. However, longer terms that are split across two lines by a newline are not detected by the function above.

Desired word: potassium chloride slow release

Any suggestions for solving this?

Thinh Pham · Accepted Answer

How about this?

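def all_occurences(file, str):
    initial = 0
    # Swap each newline for a space so phrases split across lines still match.
    # This is a one-for-one character swap, so the offsets stay the same.
    file = file.replace('\n', ' ')
    while True:
        initial = file.find(str, initial)
        if initial == -1:
            return
        yield initial
        initial += len(str)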
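
If the line break can come with extra whitespace (for example "\r\n", or a newline followed by indentation), a single-character replace will no longer line up with the search phrase. A possible variation, shown here only as a sketch, is to build a regular expression in which every space of the phrase matches any run of whitespace. The name all_occurrences_ws and the shortened narrative are made up for this illustration and are not from the question. It assumes the phrase breaks only between words, not in the middle of a word.

```python
import re


def all_occurrences_ws(text, phrase):
    """Yield the start index of each non-overlapping match of phrase in text.

    Hypothetical helper: any run of whitespace (spaces, tabs, newlines)
    in the text may stand in for a space between the words of the phrase.
    """
    # Escape each word so characters like "." are matched literally,
    # then join the words with \s+ so a line break between them still matches.
    pattern = r"\s+".join(re.escape(word) for word in phrase.split())
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        yield match.start()


# Shortened sample text with the long name broken by a newline (assumed input).
narrative = ("Lasix 40 mg b.i.d., for three days along with potassium\n"
             "chloride slow release 20 mEq b.i.d. for three days")
meds_name_final_list = ["lasix", "potassium chloride slow release"]

offset = [list(all_occurrences_ws(narrative, item))
          for item in meds_name_final_list]
print(offset)  # [[0], [46]]
```

Because the search runs on the original text, the indices refer to positions in the unmodified narrative, and re.IGNORECASE removes the need to call lower() first.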
