Reputation: 5243

Python dictionary generation

I am trying to make a dictionary of array (list) values in Python, but since I am new to Python I seem to be missing something simple. Someone kick me in the butt please.

So data like

col1, col2
asd, foo
asd, bar
dsa, baz

should become

{
    'asd': ['foo', 'bar'],
    'dsa': ['baz']
}

But

rows = run_query('SELECT * FROM query_datasets')

datasets = {}

for row in rows:

    if not datasets[row['col1']]:
        datasets[row['col1']] = []

    datasets[row['col1']].append(row['col2'])

return datasets

Gives foo.

I can see the rows if I print them so data is there.

Upvotes: 2

Answers (3)

Quentin

Reputation: 700

Close! This should fix it.

rows = run_query('SELECT * FROM query_datasets')

datasets = {}

for row in rows:
    key = row['col1']
    val = row['col2']
    if not datasets.get(key):  # .get() returns None for a missing key instead of raising KeyError
        datasets[key] = [val]
    else:
        datasets[key].append(val)

return datasets

Upvotes: 0

wkl

Reputation: 79893

Your original code should raise a KeyError, because `datasets[row['col1']]` looks up a key that does not exist yet in the datasets dict.
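To see the failure in isolation, here is a minimal sketch. It uses an empty dict and the key 'asd' from your sample data; the variable name missing_key is only for illustration.

```python
datasets = {}

try:
    # Subscript access on a missing key raises KeyError;
    # it does not return None or an empty list.
    datasets['asd']
except KeyError as exc:
    missing_key = exc.args[0]
    print(f"KeyError for key: {missing_key!r}")
```

That is why the `if not datasets[...]` test never gets a chance to run for a key that has not been seen yet.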

Using a `defaultdict`

For your use case this is my preferred method.

You can simplify a lot using the defaultdict convenience collection.

from collections import defaultdict

# Make datasets a defaultdict that automatically initializes
# an empty list for a key
datasets = defaultdict(list)

rows = run_query('SELECT * FROM query_datasets')

for row in rows:
    datasets[row['col1']].append(row['col2'])
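Since I can't run your query, here is a self-contained sketch of the same idea. The sample_rows list is a hypothetical stand-in for whatever run_query returns, and it assumes each row can be indexed by column name, as in your code.

```python
from collections import defaultdict

# Hypothetical stand-in for run_query(); assumes dict-like rows.
sample_rows = [
    {'col1': 'asd', 'col2': 'foo'},
    {'col1': 'asd', 'col2': 'bar'},
    {'col1': 'dsa', 'col2': 'baz'},
]

datasets = defaultdict(list)

for row in sample_rows:
    # A missing key is created with an empty list on first access.
    datasets[row['col1']].append(row['col2'])

# Convert to a plain dict so that later lookups of unknown keys
# raise KeyError instead of silently adding empty lists.
result = dict(datasets)
print(result)  # {'asd': ['foo', 'bar'], 'dsa': ['baz']}
```

The conversion at the end is optional. It only matters if other code reads from the dict and should not create entries by accident.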

Using EAFP (easier to ask forgiveness than permission)

Here we rely on catching the KeyError raised when accessing a non-existent key in a dictionary, and initialize that key with a new list when it happens. If the key already exists, we just append to the list we've already created.

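datasets = {}

rows = run_query('SELECT * FROM query_datasets')

for row in rows:
    try:
        # Succeeds when the key has been seen before
        datasets[row['col1']].append(row['col2'])
    except KeyError:
        # First time we see this key: start a new list
        datasets[row['col1']] = [row['col2']]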

Using key existence check

Here we check whether row['col1'] is already a key in datasets. The logic is similar to the previous example.

datasets = {}

rows = run_query('SELECT * FROM query_datasets')

for row in rows:
    if row['col1'] in datasets:
        datasets[row['col1']].append(row['col2'])
    else:
        datasets[row['col1']] = [row['col2']]

Upvotes: 2

tobias_k

Reputation: 82889

Your condition `if not datasets[row['col1']]:` is wrong. It fetches the value for that key from the dict and checks whether it is "truthy", but the dict might not have that key yet, so the lookup raises an exception. Instead, it should probably be `if row['col1'] not in datasets:`

for row in rows:
    if row['col1'] not in datasets:
        datasets[row['col1']] = []
    datasets[row['col1']].append(row['col2'])

Alternatively, you could use dict.setdefault to set and return a default value in case the key is not yet in the dict.

for row in rows:
    datasets.setdefault(row['col1'], []).append(row['col2'])
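As a rough sketch, the setdefault version can be wrapped in a small function and tried on the sample data from the question. The function name group_by_col1 and the sample_rows list are made up for illustration, and the rows are assumed to be indexable by column name.

```python
def group_by_col1(rows):
    """Group col2 values into lists keyed by col1."""
    datasets = {}
    for row in rows:
        # setdefault returns the existing list for the key, or stores
        # and returns the new empty list if the key is missing.
        datasets.setdefault(row['col1'], []).append(row['col2'])
    return datasets


# Hypothetical stand-in for the query result.
sample_rows = [
    {'col1': 'asd', 'col2': 'foo'},
    {'col1': 'asd', 'col2': 'bar'},
    {'col1': 'dsa', 'col2': 'baz'},
]

grouped = group_by_col1(sample_rows)
print(grouped)  # {'asd': ['foo', 'bar'], 'dsa': ['baz']}
```

One small cost of setdefault is that the `[]` argument is built on every iteration, even when the key already exists. For lists that is cheap, so it rarely matters.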

Upvotes: 1
