Li Haoyi

Reputation: 15802

Elegantly splitting a list (or dict) into two via some arbitrary function in Python

Is there an elegant way of splitting a list/dict into two lists/dicts in Python, given some arbitrary splitter function?

I could easily have two list comprehensions, or two selects, but it seems to me there should be some better way of doing it that avoids iterating over every element twice.

I could do it easily with a for loop and if statement, but that takes something like 7 lines of code for what should be a very simple operation.

Any ideas?

Edit:

Just for reference, my two solutions would be,

# given dict cows, mapping cow names to weight
# fast solution
fatcows = {}
thincows = {}
for name, weight in cows.items():
    if weight < 100:
        thincows[name] = weight
    else:
        fatcows[name] = weight

# double-dict-comprehension solution would be
# (>= so that a cow weighing exactly 100 is not dropped, matching the loop above)
fatcows = {name: weight for name, weight in cows.items() if weight >= 100}
thincows = {name: weight for name, weight in cows.items() if weight < 100}

I was thinking there must be something more elegant than this that I never thought of, something like:

thincows, fatcows = ??? short expression involving cows ???

I know it's possible to write higher-order functions to do this for me, and I know how to do it manually. I was just wondering if there was some super-elegant language feature for it.

It's like you can write your own subroutines and whatnot to do a SELECT on a list, or you can just say

thincows = select(cows, lambda c: c.weight < 100)

I was hoping there would be some similarly elegant way of splitting the list in one pass.

Upvotes: 10

Views: 283

Answers (6)

senderle

Reputation: 151097

How about 3 lines?

fatcows, thincows = {}, {}
for name, weight in cows.items():
    (fatcows if weight > 50 else thincows)[name] = weight

Tested:

>>> cows = {'bessie':53, 'maud':22, 'annabel': 77, 'myrna':43 }
>>> fatcows, thincows = {}, {}
>>> for name, weight in cows.items():
...     (fatcows if weight > 50 else thincows)[name] = weight
... 
>>> fatcows
{'annabel': 77, 'bessie': 53}
>>> thincows
{'maud': 22, 'myrna': 43}
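
The same loop can be wrapped in a small helper to get the one-line call the question asks for. This is only a sketch: the name split_dict and its (key, value) predicate signature are made up for illustration, not a built-in or library function.

```python
def split_dict(mapping, predicate):
    """Split `mapping` in one pass into (matching, rest) by predicate(key, value)."""
    matching, rest = {}, {}
    for key, value in mapping.items():
        # Pick the destination dict first, then store the item in it.
        (matching if predicate(key, value) else rest)[key] = value
    return matching, rest


cows = {'bessie': 53, 'maud': 22, 'annabel': 77, 'myrna': 43}
fatcows, thincows = split_dict(cows, lambda name, weight: weight > 50)
```

The predicate is called exactly once per item, so this stays a single pass over the dict.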

Upvotes: 6

Eryk Sun

Reputation: 34280

More fun with cows :)

import random; random.seed(42)
cows = {n:random.randrange(50,150) for n in 'abcdefghijkl'}

thin = {}
for name, weight in cows.items():  # cows.iteritems() on Python 2
    thin.setdefault(weight < 100, {})[name] = weight

>>> thin[True]
{'c': 77, 'b': 52, 'd': 72, 'i': 92, 'h': 58, 'k': 71, 'j': 52}

>>> thin[False]
{'a': 113, 'e': 123, 'l': 100, 'g': 139, 'f': 117}
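
The idea of keying on the boolean result also carries over to plain lists, since True and False can index a pair of lists as 1 and 0. A minimal sketch, assuming a hypothetical helper named partition_list (not part of the standard library):

```python
def partition_list(predicate, items):
    """Return (passing, failing) lists, calling predicate once per item."""
    buckets = ([], [])  # index 0 (False) collects failures, index 1 (True) collects passes
    for item in items:
        buckets[bool(predicate(item))].append(item)
    failing, passing = buckets
    return passing, failing


odds, evens = partition_list(lambda n: n % 2, [1, 2, 3, 4])
```

The bool() call matters here: a predicate that returns an arbitrary truthy value (say, 2) would otherwise be used as an index directly.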

Upvotes: 3

hanung

Reputation: 358

OK, it's about cows :)

cows = {'a': 123, 'b': 90, 'c': 123, 'd': 70}

select = lambda cows, accept: {name: weight for name, weight
                               in cows.items()
                               if accept(weight)}

thin = select(cows, lambda x: x < 100)
fat  = select(cows, lambda x: x > 100)

Upvotes: 1

Adam Rosenfield

Reputation: 400512

Any solution is going to take O(N) time to compute, whether it be through two passes through the list or one pass that does more work per item. The simplest way is to use the tools already available to you: itertools.ifilter and itertools.ifilterfalse (in Python 3, the built-in filter and itertools.filterfalse):

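import itertools

def bifurcate(predicate, iterable):
    """Returns a tuple of two lazy iterators, the first of which yields all of
       the elements x of `iterable` for which predicate(x) is True, and the
       second of which yields all of the elements x of `iterable` for which
       predicate(x) is False. `iterable` is traversed twice, so it must be
       re-iterable (a list or dict, not a one-shot generator)."""
    # Python 2 spells these itertools.ifilter and itertools.ifilterfalse.
    return (filter(predicate, iterable),
            itertools.filterfalse(predicate, iterable))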
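
Because bifurcate reads its input twice, it misbehaves when handed a one-shot iterator such as a generator. A sketch of one way around that, using itertools.tee to duplicate the source first (Python 3 names; the function name partition is chosen here for illustration):

```python
from itertools import filterfalse, tee


def partition(predicate, iterable):
    """Return (passing, failing) lazy iterators over a single source."""
    # tee() buffers items so each copy can be consumed independently.
    passing_src, failing_src = tee(iterable)
    return filter(predicate, passing_src), filterfalse(predicate, failing_src)


numbers = iter([1, 2, 3, 4])  # a one-shot iterator
odds, evens = partition(lambda n: n % 2, numbers)
odd_list, even_list = list(odds), list(evens)
```

The trade-off is that the predicate still runs twice per element, and tee may buffer the whole input if one result is exhausted before the other is touched.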

Upvotes: 5

TorelTwiddler

Reputation: 6166

Pretty simple, without any outside tools:

my_list = [1,2,3,4]
list_a = []
list_b = []

def my_function(num):
    return num % 2

generator = (list_a.append(item) if my_function(item) else list_b.append(item)
        for item in my_list)
for _ in generator:
    pass

Upvotes: 2

Ignacio Vazquez-Abrams

Reputation: 799150

It can be done with a generator expression, sorting, and itertools.groupby(), but the sort makes it O(N log N), so it will not be more efficient than the brute-force solution.

Brute-force solution:

def bifurcate(pred, seq):
  if pred is None:
    pred = lambda x: x
  res1 = []
  res2 = []
  for i in seq:
    if pred(i):
      res1.append(i)
    else:
      res2.append(i)
  return (res2, res1)  # (items failing pred, items passing pred)

Elegant solution:

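import itertools
import operator

def bifurcate(pred, seq):
  if pred is None:
    pred = lambda x: x
  # Tag each item with its truth value, then sort on the tag alone. The sort is
  # stable, so items keep their original order and are never compared directly.
  tagged = sorted(((bool(pred(x)), x) for x in seq), key=operator.itemgetter(0))
  # Caveat: if every item falls on the same side, only one group is produced.
  return tuple([z[1] for z in y[1]] for y in
    itertools.groupby(tagged, operator.itemgetter(0)))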
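
One thing to watch with the groupby version: when every element lands on the same side of the predicate, groupby yields a single group and the result is a 1-tuple, which breaks two-name unpacking. A minimal sketch of a variant that always returns both lists (the name bifurcate_groupby is hypothetical; the return order is failing first, then passing, as in the brute-force version):

```python
from itertools import groupby
from operator import itemgetter


def bifurcate_groupby(pred, seq):
    """Return (failing, passing) lists; both are always present."""
    # Sort on the boolean tag only; the stable sort keeps the original order.
    tagged = sorted(((bool(pred(x)), x) for x in seq), key=itemgetter(0))
    groups = {False: [], True: []}  # defaults cover a side with no members
    for flag, pairs in groupby(tagged, key=itemgetter(0)):
        groups[flag] = [item for _, item in pairs]
    return groups[False], groups[True]


evens, odds = bifurcate_groupby(lambda n: n % 2, [1, 2, 3, 4])
none_fail, all_pass = bifurcate_groupby(lambda n: True, [1, 2])
```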

Upvotes: 3
