sula7ifa

Reputation: 95

Program to construct list of tuples

I am trying to write a program that takes in a dictionary of frequencies and an integer, and returns a list of tuples containing all the words that appear more than min_times.

def words_often(freqs, min_times):
    tuple_list = []
    for key in freqs:
        word_list = []
        if freqs[key] > min_times:
            store_value = freqs[key]
            for key2 in freqs:
                if freqs[key2] == store_value:
                    word_list += [key2]
        if freqs[key] not in tuple_list:
            tuple_list += [(word_list, store_value)]
    return tuple_list


#test program
freqs = {'yeah':15, 'one': 1, 'crazy': 3, 'lonely': 1}

print(words_often(freqs, 0))

However, something is wrong. The return value for the test above is:

[(['yeah'], 15), (['one', 'lonely'], 1), (['crazy'], 3), (['one', 'lonely'], 1)]

This return value shouldn't have the last entry, because it's a duplicate.

How can I make my code simpler? A lot is going on and I can't pinpoint the problem.

Edit: I need the words inside the tuples to be grouped in lists. For example, the first entry should be (['yeah'], 15), and words that have the same value (one and lonely) should be grouped like (['one', 'lonely'], 1).

Upvotes: 2

Views: 87

Answers (6)

Transhuman

Reputation: 3547

Use collections.defaultdict to group the words by frequency:

from collections import defaultdict

freqs = {'yeah':15, 'one': 1, 'crazy': 3, 'lonely': 1}

def words_often(freqs, min_times):
    # map each frequency to the list of words that have it
    d_dict = defaultdict(list)
    for word, freq in freqs.items():
        d_dict[freq].append(word)
    return [(words, freq) for freq, words in d_dict.items() if freq > min_times]

print(words_often(freqs, 0))

Output:

[(['yeah'], 15), (['one', 'lonely'], 1), (['crazy'], 3)]

Upvotes: 0

andrew_reece

Reputation: 21274

You can use Pandas:

import pandas as pd

[(word, freq) for freq, word in (
    pd.Series(freqs)
      .reset_index()
      .groupby(0, as_index=False)
      .agg(lambda x: list(x))
      .values
)]

# [(['lonely', 'one'], 1), (['crazy'], 3), (['yeah'], 15)]

Whatever solution you end up going with, it may be helpful to consider that this is a reduce operation on word frequency, and frequency occupies the value slot in your word:freq key:value pairs.

reduce and/or groupby operations work by collapsing on keys, and then creating some aggregation of the associated values. That's why you see a number of these answers reversing freqs at some point, to get things in shape for the reduce operation.
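
As a rough sketch of that idea in plain Python (separate from the Pandas approach above; the function name group_words_by_freq is made up for illustration), the word:freq pairs can be flipped so that the frequency becomes the key and the words are collected under it:

```python
# Hypothetical helper: collapse word -> freq into freq -> [words].
def group_words_by_freq(freqs, min_times):
    by_freq = {}
    for word, freq in freqs.items():
        if freq > min_times:
            # setdefault creates the list the first time a frequency is seen
            by_freq.setdefault(freq, []).append(word)
    # flip back into (words, freq) tuples, as the question asks
    return [(words, freq) for freq, words in by_freq.items()]

freqs = {'yeah': 15, 'one': 1, 'crazy': 3, 'lonely': 1}
result = group_words_by_freq(freqs, 0)
print(result)
# On Python 3.7+ (dicts keep insertion order):
# [(['yeah'], 15), (['one', 'lonely'], 1), (['crazy'], 3)]
```

Each frequency appears only once as a key, so the duplicate entry from the question cannot occur.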

Upvotes: 0

not_python

Reputation: 902

freqs = {'yeah':15, 'one': 1, 'crazy': 3, 'lonely': 1}
m = 0
from collections import defaultdict
def answer(d, m):
    out = defaultdict(list)
    for e, i in d.items():
        if i > m:
            out[i].append(e)
    return [(e, i) for i, e in out.items()]

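This will work: it groups the words by frequency and keeps only those that appear more than m times. Call it as answer(freqs, m).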

Upvotes: 0

13aal

Reputation: 1674

This will return a list of tuples from a given dict, keeping only the entries whose value is greater than the given minimum:

def convert(items, min_times):
    # iterating a dict yields its keys (iterkeys() exists only on Python 2)
    return [(key, items[key]) for key in items if items[key] > min_times]

For example, with your shown dict:

freqs = {'yeah': 15, 'one': 1, 'crazy': 3, 'lonely': 1}
convert(freqs, 0)
# [('yeah', 15), ('one', 1), ('crazy', 3), ('lonely', 1)]  (order may differ before Python 3.7)

This is a list comprehension, which is essentially a for loop written on a single line. Read about them; they will save your life.
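
To make the connection concrete, here is a small side-by-side sketch. The variable names pairs_loop and pairs_comp are only illustrative, and it iterates the dict directly so it runs on Python 3:

```python
freqs = {'yeah': 15, 'one': 1, 'crazy': 3, 'lonely': 1}
min_times = 1

# Explicit for loop with a filter
pairs_loop = []
for key in freqs:
    if freqs[key] > min_times:
        pairs_loop.append((key, freqs[key]))

# The same loop and filter written as a list comprehension
pairs_comp = [(key, freqs[key]) for key in freqs if freqs[key] > min_times]

print(pairs_loop == pairs_comp)  # True
```

Both forms visit the keys in the same order and apply the same condition, so they build the same list.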


If you want the first value in the tuple to be a list, the simplest way is to wrap the key in []:

def convert(items, min_times):
    # [key] wraps each word in its own one-element list
    return [([key], items[key]) for key in items if items[key] > min_times]

And another example with your given dict:

freqs = {'yeah': 15, 'one': 1, 'crazy': 3, 'lonely': 1}
convert(freqs, 0)
# [(['yeah'], 15), (['one'], 1), (['crazy'], 3), (['lonely'], 1)]  (order may differ before Python 3.7)

Upvotes: 0

Eric Duminil

Reputation: 54233

Since you want to group keys by values, you could use itertools.groupby:

from itertools import groupby
data = {'yeah':15, 'one': 4, 'crazy': 3, 'lonely': 4}
min_times = 3

get_value = lambda kv: kv[1]
sorted_data = sorted(data.items(), key=get_value, reverse=True)
print(sorted_data)
# [('yeah', 15), ('one', 4), ('lonely', 4), ('crazy', 3)]


print([([v[0] for v in vs], k) for k, vs in groupby(sorted_data, key=get_value) if k > min_times])
# [(['yeah'], 15), (['one', 'lonely'], 4)]
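
The sort matters here because itertools.groupby only merges consecutive items that share a key. A minimal sketch of the difference, using a made-up pairs list rather than the data above:

```python
from itertools import groupby

# Illustrative data: the two words with value 4 are not adjacent
pairs = [('one', 4), ('yeah', 15), ('lonely', 4)]
get_value = lambda kv: kv[1]

# Without sorting, the value 4 ends up in two separate groups
unsorted_groups = [(k, [w for w, _ in g]) for k, g in groupby(pairs, key=get_value)]
print(unsorted_groups)
# [(4, ['one']), (15, ['yeah']), (4, ['lonely'])]

# After sorting by value, equal values are adjacent and merge into one group
sorted_pairs = sorted(pairs, key=get_value)
sorted_groups = [(k, [w for w, _ in g]) for k, g in groupby(sorted_pairs, key=get_value)]
print(sorted_groups)
# [(4, ['one', 'lonely']), (15, ['yeah'])]
```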

Upvotes: 2

kgf3JfUtW

Reputation: 14928

A list comprehension may make your code simpler.

from collections import defaultdict

def words_often(freqs, min_times):
    # keep only the words that appear more than min_times
    words = [(key, freqs[key]) for key in freqs if freqs[key] > min_times]
    # words = [('yeah', 15), ('one', 1), ('crazy', 3), ('lonely', 1)]

    d = defaultdict(list)
    for word, freq in words:
        d[freq].append(word)
    # d = {15: ['yeah'], 1: ['one', 'lonely'], 3: ['crazy']}

    return [(d[freq], freq) for freq in d]

# Test
freqs = {'yeah':15, 'one': 1, 'crazy': 3, 'lonely': 1, 'zero':0}
print(words_often(freqs, 0))
# [(['yeah'], 15), (['one', 'lonely'], 1), (['crazy'], 3)]

Upvotes: 0
