Minu

Reputation: 480

List index out of range - Index Error Python

I am writing a function that iterates through a list of text items, parses each item, and appends the parsed result to a list. The code is below:

clean_list = []

def to_words( list ):
    i = 0
    while i <= len(list):
        doc = list[i]
        # 1. Remove HTML
        doc_text = BeautifulSoup(doc).get_text() 
        # 2. Remove non-letters (not sure if this is advisable for all documents)       
        letters_only = re.sub("[^a-zA-Z]", " ", doc_text) 
        # 3. Convert to lower case, split into individual words
        words = letters_only.lower().split()                                               
        # 4. Remove stop words
        stops = set(stopwords.words("english"))
        meaningful_words = [w for w in words if not w in stops]   
        # 5. Join the words back into one string separated by space, and return the result.
        clean_doc = ( " ".join( meaningful_words ))   
        i = i+1
        clean_list.append(clean_doc)

But when I pass the list into this function, to_words(list), I get this error: IndexError: list index out of range

I tried running the same steps without defining the to_words function, i.e. without the loop, setting i manually to 0, 1, 2, etc. and following each step of the function; this works fine.

Why am I facing this error when I use the function (and loop)?

Upvotes: 1

Views: 1382

Answers (1)

Rahul K P

Reputation: 16081

Change while i <= len(list) to while i < len(list)

List indexing starts at 0, so the valid indices run from 0 to len(list) - 1. With i <= len(list), the loop body also runs once with i equal to len(list), and list[i] then raises the IndexError.
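
A minimal sketch of the off-by-one, using a made-up three-item list (the names `items`, `visited` and `raised` are only for illustration and are not from the question):

```python
items = ["a", "b", "c"]  # hypothetical data; valid indices are 0, 1, 2

visited = []
i = 0
while i < len(items):  # stops before i reaches len(items)
    visited.append(items[i])
    i += 1

# This is the extra access that `i <= len(items)` would attempt.
try:
    items[len(items)]
    raised = False
except IndexError:
    raised = True
```

With `<`, the loop visits every element exactly once; the index len(items) itself is never valid.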

1. Prefer a for loop to a while loop with a manual counter; a list can be iterated over directly. For example:

for elem in list_:
    # Do your operation here

2. Don't use list as a variable name; it shadows the built-in list type.
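
Putting both suggestions together, the function could look roughly like the sketch below. To keep it runnable without third-party packages, it makes two assumptions that differ from the question: the BeautifulSoup HTML-stripping step is left out, and a small hand-written `STOPS` set stands in for NLTK's stopwords.words("english"). The parameter name `docs` is also just an illustrative choice.

```python
import re

# Stand-in stop-word set (assumption) so the sketch runs without NLTK;
# the question uses stopwords.words("english") instead.
STOPS = {"a", "an", "the", "is", "of", "and"}

def to_words(docs):
    """Return one cleaned string per document in docs."""
    clean_list = []
    for doc in docs:  # no index, so no IndexError is possible here
        # The question first strips HTML with BeautifulSoup(doc).get_text();
        # that step is omitted in this sketch.
        letters_only = re.sub("[^a-zA-Z]", " ", doc)
        words = letters_only.lower().split()
        meaningful_words = [w for w in words if w not in STOPS]
        clean_list.append(" ".join(meaningful_words))
    return clean_list

result = to_words(["The cat is on the mat!", "An apple a day"])
```

Returning the list, rather than appending to a module-level clean_list, also means repeated calls do not accumulate results from earlier calls.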

Upvotes: 1
