Westworld

Reputation: 300

Translating for loops and if statements into dictionary comprehension

I have been told that list comprehension is advantageous over nested for loops.

On the Python course I did, we learnt for and while loops but not list or dictionary comprehensions. Now I am trying to improve my code. I usually do this by first writing what I want as nested loops and then "translating" it into a comprehension.

In this case, I am trying to build a dictionary structured like this:

{chemical_name_1 :
    {field_1: xxxx,
     field_2: yyyy,
     field_3: zzzz},
 chemical_name_2 :
    {field_1: xxxy,
     field_2: yyyz,
     field_3: zzzx},
 ...}

This is the for loop with if statements that works fine.

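wantedfields = ["ADI", "General human health issues", "CAS RN"]

results = {}
for p in rand_links.keys():
    # Stripped text of every table cell on this chemical's page
    tableText = [td.text.strip() for td in pesticideSoup[p].find_all("td", recursive=True)]
    resd = {}
    for n, t in enumerate(tableText):
        for field in wantedfields:
            if field in t:
                # The value sits in the cell right after the matching label cell
                resd[t] = tableText[n+1]
    # Store the per-chemical dict once, after all cells have been scanned
    results[p] = resd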

Here pesticideSoup is a dictionary of pages parsed with Beautiful Soup, keyed by chemical name. It is structured like this:

{chemical_name_1 :
<!DOCTYPE html>
<html>
<head>some HTML code</head>,
chemical_name_2 :
<!DOCTYPE html>
<html>
<head>some HTML code</head>
}

And rand_links is a dictionary too, simply with the chemical name as key, and a URL as the value.

This is my attempt at translating the nested for loops into a dictionary comprehension:

results = {p: {t:tableText[n+1] if field in t} for field in wantedfields for t in [td.text.strip() for td in pesticideSoup[p].find_all("td", recursive=True)] for p in rand_links.keys()}

I want the same result that I am getting with the nested for loops, but I am getting a syntax error.

Please can you:

Any tips and help appreciated!

Upvotes: 0

Views: 125

Answers (2)

h4z3

Reputation: 5459

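You have to go from the big thing towards the small one: the for clauses in a comprehension are written in the same order as the nested loops, outermost first, so rearrange the "for" parts of your dict comprehension to be the other way around. ;)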

Also, for the nested dict, do another comprehension, otherwise you will get at most single-element dictionaries if your code ever works.
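To make those two points concrete, here is a minimal sketch on made-up data. The names pages, flat_loop, flat_comp, single and nested are hypothetical and only there for illustration; they are not from your code.

```python
# Hypothetical sample data: each name maps to a list of (label, value) pairs
pages = {"x": [("a", 1), ("b", 2)], "y": [("c", 3)]}

# Nested loops: the outermost loop comes first
flat_loop = []
for name in pages:
    for label, value in pages[name]:
        flat_loop.append((name, label, value))

# Same thing as a comprehension: the for clauses keep the loop order,
# so the inner clause can use the name bound by the outer one
flat_comp = [(name, label, value)
             for name in pages
             for label, value in pages[name]]

# A plain {label: value} as the value holds one pair per iteration,
# and each later pair for the same name overwrites the earlier one
single = {name: {label: value}
          for name in pages
          for label, value in pages[name]}

# A second comprehension as the value collects every pair for that name
nested = {name: {label: value for label, value in pages[name]}
          for name in pages}
```

With this data, single keeps only the last pair for "x", while nested keeps both.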

Remember that you can use normal functions inside comprehensions! You can do some parsing for a single thing as a function and then put fun(pesticideSoup[p]) as your value!

And code can be broken across multiple lines for readability.

results = {p: {t:tableText[n+1] 
               for t in [td.text.strip() for td in pesticideSoup[p].find_all("td", recursive=True)]
               for field in wantedfields if field in t} 
           for p in rand_links.keys()}

^That's basically your code with those modifications.

It will break as written because tableText and n are not defined inside the comprehension, and I had no idea what tableText would actually mean in this case. I don't really understand your original for n,t in enumerate(tableText) either: aren't you filling your sub-dict with tableText[n]:tableText[n+1] there (if tableText[n] is wanted)?

But this version is actually readable and it's possible you'll be better at spotting any bugs you may have made. (Or do the function thing I mentioned.)
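For the function route, a rough sketch could look like the following. It assumes the cell texts have already been pulled out of the soup into plain lists: cell_text, its sample labels and values, and the helper name extract_fields are all hypothetical stand-ins, not your real data or any library API. Each cell is paired with the one after it using zip, which replaces the tableText[n+1] lookup.

```python
def extract_fields(cells, wanted):
    """Map each cell that contains a wanted field name to the cell after it."""
    return {
        label: value
        # zip pairs every cell with its successor and stops before the last cell
        for label, value in zip(cells, cells[1:])
        if any(field in label for field in wanted)
    }


wantedfields = ["ADI", "General human health issues", "CAS RN"]

# Hypothetical stand-in for the stripped <td> texts of each page
cell_text = {
    "chemical_name_1": ["CAS RN", "xxxx", "ADI (mg/kg)", "yyyy", "Other", "zzzz"],
    "chemical_name_2": ["ADI", "xxxy"],
}

# The outer comprehension stays short because the helper does the parsing
results = {name: extract_fields(cells, wantedfields)
           for name, cells in cell_text.items()}
```

One difference from the loop version: because zip stops one cell early, a wanted label in the very last cell is skipped instead of raising an IndexError.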

Upvotes: 1

Tom Dalton

Reputation: 6190

"I have been told that list comprehension is advantageous over nested for loops."

As with anything style-related, it's somewhat a matter of preference. For code, readability is a significant benefit.

foo = []
for thing in things:
    foo.append(thing + 5)

I'd argue that the above is less readable or clean than

foo = [thing + 5 for thing in things]

So in this case, the comprehension would be my preference. However, for your example, the comprehension

results = {p: {t:tableText[n+1] if field in t} for field in wantedfields for t in [td.text.strip() for td in pesticideSoup[p].find_all("td", recursive=True)] for p in rand_links.keys()}

is (IMO) a horrendous unreadable mess. Would you want someone to write that code for you to read? The "unpacked" version of that, using loops over multiple lines, is likely to be significantly more readable, so in spite of it taking up more lines of code, it would be my preference.

'Better' and 'worse' in code are subjective: it depends on what you want. For the tasks Python is well suited for, code readability is usually more useful than any advantages gained by 'optimising' the code in some way.

Upvotes: 1
