lukmac
lukmac

Reputation: 4841

What is the Pythonic way to find the longest common prefix of a list of lists?

Given: a list of lists, such as [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]

Todo: Find the longest common prefix of all sublists.

Exists: In another thread "Common elements between two lists not using sets in Python", it is suggested to use "Counter", which is available above python 2.7. However our current project was written in python 2.6, so "Counter" is not used.

I currently code it like this:

l = [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]
newl = l[0]
if len(l)>1:
    for li in l[1:]:
    newl = [x for x in newl if x in li]

But I find it not very pythonic, is there a better way of coding?

Thanks!

New edit: Sorry to mention: in my case, the shared elements of the lists in 'l' have the same order and alway start from the 0th item. So you wont have cases like [[1,2,5,6],[2,1,7]]

Upvotes: 23

Views: 15064

Answers (5)

Will
Will

Reputation: 75683

It is inefficient as it doesn't early-out as soon as a mismatch is found, but its tidy:

([i for i,(j,k) in enumerate(zip(a,b)) if j!=k] or [0])[0]

Upvotes: 0

user3246749
user3246749

Reputation: 661

os.path.commonprefix() works well for lists :)

>>> x = [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]
>>> import os
>>> os.path.commonprefix(x)
[3, 2, 1]

Upvotes: 66

senderle
senderle

Reputation: 151207

Given your example code, you seem to want a version of reduce(set.intersection, map(set, l)) that preserves the initial order of the first list.

This requires algorithmic improvements, not stylistic improvements; "pythonic" code alone won't do you any good here. Think about the situation that must hold for all values that occur in every list:

Given a list of lists, a value occurs in every list if and only if it occurs in nlist lists, where nlist is the total number of lists.

If we can guarantee that each value occurs only once in every list, then the above can be rephrased:

Given a list of lists of unique items, a value occurs in every list if and only if it occurs nlist times total.

We can use sets to guarantee that the items in our lists are unique, so we can combine this latter principle with a simple counting strategy:

>>> l = [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]
>>> count = {}
>>> for i in itertools.chain.from_iterable(map(set, l)):
...     count[i] = count.get(i, 0) + 1
...     

Now all we have to do is filter the original list:

>>> [i for i in l[0] if count[i] == len(l)]
[3, 2, 1]

Upvotes: 2

Luka Rahne
Luka Rahne

Reputation: 10517

I am not sure how pythonic it is

from itertools import takewhile,izip

x = [[3,2,1], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]

def allsame(x):
    return len(set(x)) == 1

r = [i[0] for i in takewhile(allsame ,izip(*x))]

Upvotes: 18

jterrace
jterrace

Reputation: 67138

Here's an alternative way using itertools:

>>> import itertools
>>> L = [[3,2,1,4], [3,2,1,4,5], [3,2,1,8,9], [3,2,1,5,7,8,9]]
>>> common_prefix = []
>>> for i in itertools.izip(*L):
...    if i.count(i[0]) == len(i):
...       common_prefix.append(i[0])
...    else:
...       break
... 
>>> common_prefix
[3, 2, 1]

Not sure how "pythonic" it might be considered though.

Upvotes: 2

Related Questions