How can I write these nested if statements more elegantly?

Question

I'm writing a Python program that removes duplicate words from a file. A word is any sequence of characters without spaces, and duplicates are case-insensitive: duplicate, Duplicate, DUPLICATE, and dUplIcaTe all count as the same word. I read the original file and store it as a list of strings. I then create a new empty list and populate it one element at a time, checking whether the current string already exists in the new list. I run into problems with the case handling, where I check for each specific case variant of the word. I've tried writing the if statement as:

    if elem and capital and title and lower not in uniqueList:
        uniqueList.append(elem)

I've also tried writing it with or:

    if elem or capital or title or lower not in uniqueList:
        uniqueList.append(elem)

However, I still get duplicates. The only way the program works properly is if I write the code like so:

def remove_duplicates(self):

    """
    self.words is a class variable, which stores the original text as a list of strings    
    """

    uniqueList = []

    for elem in self.words: 

        capital = elem.upper()
        lower = elem.lower()
        title = elem.title()

        if elem == '\n':
            uniqueList.append(elem)

        else:

            if elem not in uniqueList:
                if capital not in uniqueList:
                    if title not in uniqueList:
                        if lower not in uniqueList:
                            uniqueList.append(elem)

    self.words = uniqueList

Is there any way I can write these nested if statements more elegantly?

Barmar · Accepted Answer

Combine the tests with and, repeating the full "not in uniqueList" test for each form:

if elem not in uniqueList and capital not in uniqueList and title not in uniqueList and lower not in uniqueList:
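The original attempts fail because not in binds more tightly than and/or, so "elem and capital and title and lower not in uniqueList" only tests lower against the list; the other three names are just checked for being non-empty strings. A minimal sketch of the difference follows. The helper name is_new_word and the sample data are made up for illustration, and all() is used here as an equivalent way to spell the chained and:

```python
def is_new_word(elem, unique_list):
    """Return True if none of the four case forms of elem is in unique_list.

    Hypothetical helper: equivalent to chaining four "not in" tests with and.
    """
    forms = (elem, elem.upper(), elem.title(), elem.lower())
    return all(form not in unique_list for form in forms)


unique_list = ["Hello"]
elem = "HELLO"

# The asker's first attempt: parsed as
#   elem and capital and title and (lower not in unique_list)
# so only "hello" is actually looked up in the list.
buggy = bool(elem and elem.upper() and elem.title()
             and elem.lower() not in unique_list)

# Each form is tested against the list individually.
fixed = is_new_word(elem, unique_list)

print(buggy)  # True: wrongly reports "HELLO" as a new word
print(fixed)  # False: "Hello" (the title form) is already in the list
```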

You can also use set operations:

if set((elem, capital, title, lower)).isdisjoint(uniqueList):
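For the set-based test, the word should be appended only when none of its forms appears in uniqueList, that is, when the set of forms is disjoint from the list. Here is a rough sketch of the whole loop written that way, assuming a standalone function (named remove_duplicates_disjoint here for illustration) rather than the asker's method:

```python
def remove_duplicates_disjoint(words):
    """Keep the first occurrence of each word, comparing four case forms.

    Newline entries are always kept, as in the question's code.
    """
    unique_list = []
    for elem in words:
        forms = {elem, elem.upper(), elem.title(), elem.lower()}
        # isdisjoint() is True when no form is already in unique_list.
        if elem == "\n" or forms.isdisjoint(unique_list):
            unique_list.append(elem)
    return unique_list


result = remove_duplicates_disjoint(
    ["duplicate", "Duplicate", "DUPLICATE", "other", "\n", "\n"]
)
print(result)  # ['duplicate', 'other', '\n', '\n']
```

Like the nested if version, this only compares the four generated forms, so a mixed-case first occurrence such as dUplIcaTe followed by duplicate would not be caught.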

But instead of testing all the different forms of elem, it would be simpler to store only lowercase words in self.words in the first place.

If you also make self.words a set instead of a list, duplicates will be removed automatically.
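One way this could look is sketched below. It is a variation on the suggestion, not exactly what is described above: a plain set does not keep the words in file order, so this version keeps the output list and uses a separate set of lowercase keys only for the membership test. The function name remove_duplicates_lower and the choice to keep every newline are assumptions based on the question's code:

```python
def remove_duplicates_lower(words):
    """Keep the first occurrence of each word, ignoring case entirely."""
    seen = set()        # lowercase forms of the words kept so far
    unique_list = []    # output, in original order
    for elem in words:
        if elem == "\n":
            # Newlines are always kept, as in the question's code.
            unique_list.append(elem)
            continue
        key = elem.lower()
        if key not in seen:
            seen.add(key)
            unique_list.append(elem)
    return unique_list


result = remove_duplicates_lower(
    ["duplicate", "Duplicate", "\n", "dUplIcaTe", "\n", "word"]
)
print(result)  # ['duplicate', '\n', '\n', 'word']
```

Because every word is compared by its lowercase form, mixed-case variants such as dUplIcaTe are caught too, and the set lookup avoids scanning the whole list for each word.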
