Yuxuan Chen

Reputation: 1

Python dictionary comprehension with if-statement, but the if-statement contains the dictionary itself

I'd like to build a dictionary from a string of letters, whose keys are the unique letters and whose values are the indices where each letter first appears. For example, for word = 'mississippi' the correct answer is {'m': 0, 'i': 1, 's': 2, 'p': 8}.

I tried to write 'pythonic' code:

dict = {word[i]:i  for i in range(len(word))  if word[i] not in dict.keys()}

However, what I got was {'i': 10, 'm': 0, 'p': 9, 's': 6}, as if the dict had not been updated and was still empty each time the if-clause was evaluated.

A normal for loop did the right thing:

first_appearance_dict = {}
for i in range(len(word)):
    # Only record the index the first time a letter is seen.
    if word[i] not in first_appearance_dict:
        first_appearance_dict[word[i]] = i

Output: {'i': 1, 'm': 0, 'p': 8, 's': 2}

So why is that? Is there an elegant, pythonic way to solve this problem? In general, should I only use "static" variables in the if-clause of list/dictionary comprehensions?

Upvotes: 0

Views: 658

Answers (2)

R.A.Munna

Reputation: 1709

@BrenBarn has already given a good explanation. I'll just show an alternative using the str.find() method, which returns the index of the first occurrence of a character in the given string.

>>> word = 'mississippi'
>>> {w:word.find(w) for w in set(word)}
{'i': 1, 'p': 8, 's': 2, 'm': 0}
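As a side note, find() and index() agree whenever the character is present, which is always the case here because the characters come from the word itself. They only differ for a missing character. A small sketch of that difference (the variable names are just for illustration):

```python
word = 'mississippi'

# For a character that is present, both return the first index.
present = (word.find('s'), word.index('s'))

# For a missing character, find() returns -1 ...
missing_find = word.find('z')

# ... while index() raises ValueError.
try:
    word.index('z')
    missing_index_raised = False
except ValueError:
    missing_index_raised = True

print(present, missing_find, missing_index_raised)  # (2, 2) -1 True
```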

Upvotes: 0

BrenBarn

Reputation: 251408

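You can't use the comprehension to refer to the dict it's creating, because the result isn't assigned to the name until after the comprehension finishes. While the comprehension runs, the name still refers to whatever it was bound to before.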

In general, you can't use comprehensions to do the kind of thing you're doing here. In a comprehension, each value should depend on just one value from the iteration. But in your computation each value depends on the result of all previous values (via their effect on the dict being created).
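To make that concrete, here is a minimal sketch that reproduces the output from the question. It assumes the name already referred to an empty dict before the comprehension ran; `seen` is a hypothetical stand-in for that earlier dict.

```python
word = 'mississippi'

# Hypothetical stand-in for the pre-existing (empty) dict in the question.
seen = {}

# The filter checks `seen`, which the comprehension never modifies,
# so every character passes and later indices overwrite earlier ones.
result = {word[i]: i for i in range(len(word)) if word[i] not in seen}

print(result)  # {'m': 0, 'i': 10, 's': 6, 'p': 9}
print(seen)    # {}
```

Each key ends up with the index of its last occurrence, which matches what the question observed.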

For your specific example here there is a simpler way, because the result you want doesn't actually depend on previous values. You just want the first occurrence of each character in the word, which can be done directly:

>>> {char: word.index(char) for char in word}
{'i': 1, 'm': 0, 'p': 8, 's': 2}

In this version later occurrences of a character "overwrite" the value from earlier ones, but they overwrite with the same value, so it has no effect. An even nicer version would be:

>>> {char: word.index(char) for char in set(word)}
{'i': 1, 'm': 0, 'p': 8, 's': 2}

This iterates only over the unique characters in the word, not over all characters.
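One caveat: word.index(char) rescans the string for every character, and set(word) does not preserve the order in which characters appear. If either matters, a plain loop with dict.setdefault does it in a single pass. This is only a sketch, and `first_indices` is a name made up for the example:

```python
def first_indices(word):
    """Map each character to the index of its first occurrence."""
    first = {}
    for i, char in enumerate(word):
        # setdefault only stores the index if char is not already a key.
        first.setdefault(char, i)
    return first


print(first_indices('mississippi'))  # {'m': 0, 'i': 1, 's': 2, 'p': 8}
```

On Python 3.7+ the keys come out in order of first appearance, since dicts preserve insertion order.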

Upvotes: 4
