Omkar

Reputation: 2433

(key, value) pair using Python Lambdas

I am working on a simple word count problem and trying to figure out whether it can be done using only map, filter and reduce.

Following is an example of a wordRDD (the list used for Spark):

myLst = ['cats', 'elephants', 'rats', 'rats', 'cats', 'cats']

All I need is to count the words and present the result in tuple format:

counts = [('cats', 1), ('elephants', 1), ('rats', 1), ('rats', 1), ('cats', 1), ('cats', 1)]

I tried a simple map() with a lambda:

counts = myLst.map(lambda x: (x, <HERE IS THE PROBLEM>))

I might be wrong with the syntax or maybe confused. P.S.: This isn't a duplicate question, as the other answers suggest using if/else or list comprehensions.

Thanks for the help.

Upvotes: 1

Views: 2758

Answers (4)

Abhay

Reputation: 5

You can use map() to get this result:

myLst = ['cats', 'elephants', 'rats', 'rats', 'cats', 'cats']

list(map(lambda x: (x, 1), myLst))

Upvotes: 0

Matt Mahowald

Reputation: 11

If you don't want the full reduce step done for you (which aggregated the counts in UltraInstinct's answer), you can use map this way:

    >>> myLst = ['cats', 'elephants', 'rats', 'rats', 'cats', 'cats']
    >>> counts = list(map(lambda s: (s,1), myLst))
    >>> print(counts)
    [('cats', 1), ('elephants', 1), ('rats', 1), ('rats', 1), ('cats', 1), ('cats', 1)]

Upvotes: 0

zengr

Reputation: 38899

This doesn't use a lambda, but it gets the job done.

from collections import Counter
c = Counter(myLst)
result = list(c.items())

And the output:

In [21]: result
Out[21]: [('cats', 3), ('rats', 2), ('elephants', 1)]
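
The order of list(c.items()) depends on the Python version (insertion order on 3.7+), so if the pairs should come out sorted by count, Counter.most_common() is one option. A short sketch; the names ranked and top_two are only for illustration:

```python
from collections import Counter

myLst = ['cats', 'elephants', 'rats', 'rats', 'cats', 'cats']
c = Counter(myLst)

# most_common() returns (word, count) pairs sorted by count, highest first
ranked = c.most_common()
print(ranked)  # [('cats', 3), ('rats', 2), ('elephants', 1)]

# most_common(n) keeps only the n highest counts
top_two = c.most_common(2)
print(top_two)  # [('cats', 3), ('rats', 2)]
```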

Upvotes: 1

UltraInstinct

Reputation: 44454

You don't need map(..) at all. You can do it with just reduce(..):

>>> def function(obj, x):
...     obj[x] += 1
...     return obj
...
>>> from collections import defaultdict
>>> from functools import reduce
>>> reduce(function, myLst, defaultdict(int)).items()
dict_items([('elephants', 1), ('rats', 2), ('cats', 3)])

You can then iterate over the result.
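
Iterating might look something like this. It is a sketch that repeats the fold above under the hypothetical name count_word and sorts the pairs so the most frequent word comes first:

```python
from collections import defaultdict
from functools import reduce

myLst = ['cats', 'elephants', 'rats', 'rats', 'cats', 'cats']

def count_word(obj, x):
    # same fold as above: bump the tally for x and hand the dict back
    obj[x] += 1
    return obj

word_counts = reduce(count_word, myLst, defaultdict(int))

# walk the (word, count) pairs, highest count first
for word, n in sorted(word_counts.items(), key=lambda kv: kv[1], reverse=True):
    print(word, n)
```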


However, there's a better way of doing it: look into collections.Counter.

Upvotes: 2
