Function works for small samples but not larger ones (Python)

Question

I'm trying to write a function that checks whether two words appear within a certain distance of one another. My code is as follows:



file_cont = [['man', 'once', 'upon', 'time', 'love', 'princess'],
             ['python', 'code', 'cool', 'uses', 'java'],
             ['man', 'help', 'test', 'weird', 'love']]  # words I want to measure the 'distance' between

dat = [{ind: val for val, ind in enumerate(el)} for el in file_cont]

def myfunc(w1, w2, dist, dat):
    arr = []
    for x in dat:
        i1 = x.get(w1)
        i2 = x.get(w2)
        if (i1 is not None) and (i2 is not None) and (i2 - i1 <= dist):
            arr.append(list(x.keys())[i1:i2+1])
    return arr

It works in this instance:

myfunc("man", "love", 4, dat) returns [['man', 'once', 'upon', 'time', 'love'], ['man', 'help', 'test', 'weird', 'love']], which is what I want.

The problem is that when I use a much bigger dataset (each element of file_cont becomes thousands of words), it outputs odd results.

For example, I know the words 'jon' and 'snow' appear together at least once in one of the elements of file_cont.

When I call myfunc('jon', 'snow', 6, dat), I get:

[[], [], ['castle', 'ward'], [], [], []]

This is completely out of context: it doesn't mention 'jon' or 'snow'.

What is the problem here and how would I go about fixing it?
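
A likely cause, assuming words repeat in the longer texts: a dict can hold each word only once, so every element of dat keeps a single index per distinct word (the last one seen), and list(x.keys()) ends up shorter than the original word list. The stored indices then no longer line up with positions in the key list, which would explain a slice like ['castle', 'ward']. In addition, the check i2 - i1 <= dist also passes when w2 occurs before w1 (a negative difference), and slicing with i1 > i2 gives the empty lists. Below is a minimal sketch of one way around this, not a definitive fix: it works on the original lists rather than dat and checks every occurrence. The name find_within and the sample list with repeated words are made up for illustration, and it assumes w1 must come before w2.

```python
def find_within(w1, w2, dist, docs):
    """Collect every slice that starts at w1 and ends at w2, at most `dist` words later.

    Hypothetical helper: `docs` is a list of word lists (like file_cont),
    not the list of dicts (dat) from the question.
    """
    matches = []
    for words in docs:
        # Keep *all* positions of w2; a dict would keep only the last one.
        w2_positions = [j for j, w in enumerate(words) if w == w2]
        for i, w in enumerate(words):
            if w != w1:
                continue
            for j in w2_positions:
                # Assumption: w2 must come after w1. Use abs(j - i) to allow either order.
                if 0 < j - i <= dist:
                    matches.append(words[i:j + 1])
    return matches


file_cont = [['man', 'once', 'upon', 'time', 'love', 'princess'],
             ['python', 'code', 'cool', 'uses', 'java'],
             ['man', 'help', 'test', 'weird', 'love']]

# Same result as the original function on the small sample.
print(find_within('man', 'love', 4, file_cont))
# [['man', 'once', 'upon', 'time', 'love'], ['man', 'help', 'test', 'weird', 'love']]

# Made-up sample with a repeated word ('the'), where a word -> index dict would lose positions.
sample = [['the', 'castle', 'ward', 'the', 'jon', 'snow', 'the']]
print(find_within('jon', 'snow', 6, sample))
# [['jon', 'snow']]
```

Slicing the original list keeps indices and words aligned even when words repeat, and the lower bound 0 < j - i rules out the empty results.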
