Reputation: 656

Basic Python string processing?

So I was attempting to process a whole paragraph from a random article. The details are tedious, but one thing keeps confusing me.

Here is my code:

def prevword_ave_len(word):    
    count = 0
    wordlength = 0
    mystr = "Call me Ishmael. Some years ago - never mind how long precisely - having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen and regulating the circulation. Whenever I find myself growing grim about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requires a strong moral principle to prevent me from deliberately stepping into the street, and methodically knocking people's hats off - then, I account it high time to get to sea as soon as I can. This is my substitute for pistol and ball. With a philosophical flourish Cato throws himself upon his sword; I quietly take to the ship. There is nothing surprising in this. If they but knew it, almost all men in their degree, some time or other, cherish very nearly the same feelings towards the ocean with me."
    l1 = mystr.split()
    s1= list()
    #print(l1)

    if word in l1:
        if l1.index(word) == 0:
            return 0
        else:
            for element in l1:                
                s1.append(l1[l1.index(word) - 1]) #get that word to s1 list for future use
                l1.pop(l1.index(word)) # delete the occurrence so that it will not mess up later on in this loop. 
                #print(s1)
    else:
        return False

My goal is to determine whether a word exists in that huge list of words. However, when I tested it, something seemed to be wrong, and I could not figure it out after about two hours of painfully reviewing my code.

My error is when I try this:

prevword_ave_len('the')

Python returns False instead of the actual index of 'the'. As you can see, I am trying to get that index and then find the remaining indices, so that I can get the word before each of them and process it further. But that's not the point, because I am stuck right now. Can someone point out what I am doing wrong?

ERROR

    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    File "program.py", line 14, in prevword_ave_len
    s1.append(l1[l1.index(word)]) 
    ValueError: 'the' is not in list

Upvotes: 0

Answers (3)

Jamie Bull

Reputation: 13519

This seems a simpler way of doing things:

def prevword_ave_len(word):    
    mystr = "Call me Ishmael. Some years ago - never mind how long precisely - having little or no money in my purse, and nothing particular to interest me on shore, I thought I would sail about a little and see the watery part of the world. It is a way I have of driving off the spleen and regulating the circulation. Whenever I find myself growing grim about the mouth; whenever it is a damp, drizzly November in my soul; whenever I find myself involuntarily pausing before coffin warehouses, and bringing up the rear of every funeral I meet; and especially whenever my hypos get such an upper hand of me, that it requires a strong moral principle to prevent me from deliberately stepping into the street, and methodically knocking people's hats off - then, I account it high time to get to sea as soon as I can. This is my substitute for pistol and ball. With a philosophical flourish Cato throws himself upon his sword; I quietly take to the ship. There is nothing surprising in this. If they but knew it, almost all men in their degree, some time or other, cherish very nearly the same feelings towards the ocean with me."
    l1 = mystr.split()
    s1 = list()

    if word not in l1:
        return False

    while word in l1:
        i = l1.index(word)
        if i > 0: # an occurrence at index 0 has no previous word
            s1.append(l1[i - 1]) # get the previous word into the s1 list for future use
        l1.pop(i) # remove that instance of word

    if not s1: # word only occurs at the very start of the text
        return 0

    return sum(len(w) for w in s1) / len(s1) # remember to use float(len(s1)) for Python 2.x

print(prevword_ave_len('the'))
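
For comparison, here is a sketch that avoids modifying the list at all by pairing each word with the one before it. The name prev_word_avg_len, the text parameter, and the sample sentence are illustrative assumptions, not part of the original post; it also returns None (rather than False or 0) when there is nothing to average.

```python
def prev_word_avg_len(word, text):
    """Average length of the words that come directly before `word` in `text`.

    Hypothetical helper: returns None when `word` never follows another word.
    """
    words = text.split()
    # zip(words, words[1:]) yields (previous, current) pairs.
    prev_lengths = [len(prev) for prev, cur in zip(words, words[1:]) if cur == word]
    if not prev_lengths:
        return None
    return sum(prev_lengths) / len(prev_lengths)


sample = "see the watery part of the world and off the spleen"
# The words before 'the' are 'see', 'of' and 'off', so this averages 3, 2 and 3.
print(prev_word_avg_len("the", sample))
```

Because nothing is popped from the list, there is no risk of the search word disappearing while the loop is still running.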

Upvotes: 0

Antwane

Reputation: 22588

This code returns 0 when the word is found at first position, False when the word is not found in the paragraph, and nothing in all other cases. There is no return statement that actually returns an index.

Try this:

def prevword_ave_len(word):    
  mystr = "Call me Ishmael. [...] ocean with me."
  # Split the string into a list of words
  l1 = mystr.split()

  # 'word' has been found in 'mystr'
  if word in l1:
    # return the index of 'word' in 'l1'
    return l1.index(word)
  else:
    return False

In addition, you loop over every element of the list, and on each iteration you append the word before the searched word to s1 and then remove one occurrence of the searched word from l1 (with list.pop()). Once every occurrence has been removed, the loop is still running, so the next call to l1.index(word) fails because 'word' is no longer in the list. That's why you get an error like "ValueError: 'the' is not in list".
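
If you need every index rather than just the first one, one option is to collect them with enumerate() instead of calling index() and pop() repeatedly. This is only a sketch; word_indices and the sample sentence are made-up names for illustration.

```python
def word_indices(word, words):
    """Return every index at which `word` occurs in the list `words`."""
    return [i for i, w in enumerate(words) if w == word]


words = "the cat saw the dog chase the bird".split()
positions = word_indices("the", words)
print(positions)  # [0, 3, 6]

# Words before each occurrence, skipping an occurrence at index 0.
previous = [words[i - 1] for i in positions if i > 0]
print(previous)  # ['saw', 'chase']
```

The list is never modified, so the indices stay valid for the whole loop.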

Upvotes: 0

joel goldstick

Reputation: 4483

Use if x in y:

paragraph = "Now is the time for all good men to come to the aid of their country"

words = paragraph.split()

if 'time' in words:
    print("time is there")
else:
    print("not found")

You may want to first replace certain punctuation characters (like , - -- : ;) with a space, so that a word followed by punctuation still matches.

or you can use

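i = paragraph.find(word)

This returns the character index of the first place word occurs in the string, or -1 if it is not found. Note that find() matches substrings, so 'the' would also be found inside 'their'.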
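
Two caveats here can be checked quickly: punctuation stays attached to words after split(), and str.find() searches characters rather than whole words. A small sketch follows; split_words is a hypothetical helper, and treating a word as a run of letters, digits, or apostrophes is an assumption you may want to adjust.

```python
import re


def split_words(text):
    """Split text into words, dropping punctuation (hypothetical helper).

    Assumption: a word is a run of letters, digits, or apostrophes.
    """
    return re.findall(r"[A-Za-z0-9']+", text)


paragraph = "Call me Ishmael. Some years ago - never mind how long precisely."

# split() keeps punctuation attached, so the bare word is not found.
print("Ishmael" in paragraph.split())       # False
print("Ishmael" in split_words(paragraph))  # True

# str.find() works on characters: it returns -1 when nothing matches,
# and it also matches inside longer words.
print(paragraph.find("whale"))  # -1
print(paragraph.find("Call"))   # 0
print(paragraph.find("ear") != -1)  # True: 'ear' occurs inside 'years'
```

For whole-word checks, testing membership in the cleaned word list is usually the safer of the two approaches.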

Upvotes: 1
