Andrea Armani
Andrea Armani

Reputation: 147

Merge ordered Lists in Python without prior knowledge about the ordering function

I am having a hard time to solve the following issue: I have to merge N lists. Each list contains some string objects. For each list, although I do not know which is the ordering function, I know that it is ordered. Moreover, the final list should respect all the ordering of the child that generated it. For instance:

l1 = ['This','world']
l2 = ['This','is','a','world','!']
l3 = ['a','hello','world']

merged_list = merge_function(l1,l2,l3)

The results I would like to achieve is to receive a list containing

merged_list # ['This','is','a','hello','world','!']

But I cannot figure out the way to do it, as the lists are not following any rule beside the order in which the elements are provided. Any help would be appreciated.

EDIT: The focus of my question is not on how to merge some lists. The problem is that the elements should be merged in a way that they respect all the ordering of the original lists. So I cannot use sets as they have their own ordering policy. I think I have to stick to lists because they are more flexible, but I need to find a way to insert new elements in the correct position within the final list, as I cannot simply append them.

For instance:

l1 = ['This','world']
l2 = ['This','is','a','world','!']
merged_list = l1

If I then simply append the missing element to the merged_list I would obtain:

merged_list # ['This','world','is','a','!']

and this list breaks the ordering of l2. I hope now I explained the issue a bit better.

Upvotes: 4

Views: 110

Answers (2)

gaurav
gaurav

Reputation: 1339

I am simply iterating through combination like l1 as main list & l2 & l3 as sub list , making common a list of words whose words should be present in result list if not i breaking the loop , following next pair like l2 as main and l3 as sub list

def check_conditions(main_lst,sub_lsts,result,remaning):
    for i,main_word in enumerate(main_lst):
        temp = [] # list of word should be present is result before appending
        try:
            for j in sub_lsts:      
                temp.extend(j[:j.index(main_word)] if main_word in j else [])
            
            if main_word not in result:
                for k in temp:
                    if k not in result: # if k is not present just break all loop
                        remaning.extend(main_lst[i+1:])
                        return (result,remaning)
                
                if main_word in remaning:
                    del remaning[remaning.index(i)]

                result.append(main_word) 
        except ValueError as err:
            pass
    return ([],[])


def arrange_function(lsts=[]):
    result_lst,remaning_lst = [],[]
    for i,lst in enumerate(lsts):
        try:
            main_lst,sub_lsts = lst,lsts[i+1:]
            check_conditions(main_lst,sub_lsts,result_lst,remaning_lst)
        except IndexError:
            pass
    return result_lst+remaning_lst


l1 = ['This','world']
l2 = ['This','is','a','world','!']
l3 = ['a','hello','world']
print(arrange_function(lsts=[l1,l2,l3]))
# ['This', 'is', 'a', 'hello', 'world', '!']

l1 = ['a', 'world']
l2 = ['This', 'is']
l3 = ['is', 'a']
print(arrange_function(lsts=[l1,l2,l3]))
#['This', 'is', 'a', 'world']

Upvotes: 1

Olivier Melançon
Olivier Melançon

Reputation: 22314

The main difficulty with your problem is if we have a case like this:

l1 = ['a', 'world']
l2 = ['This', 'is']
l3 = ['is', 'a']

In that case the expected result would be ['This', 'is', 'a', 'world']. Although, the relation 'This' < 'world' must be obtained by transitivity.

Here is a solution which determines your ordering by generating the full table of predecessors using a fixed point algorithm.

def merge_function(*lists):
    predecessors = {w: set() for w in set(w for lst in lists for w in lst)}

    # Add trivial predecessors from the lists
    # For example in ['foo', 'bar', 'baz']:
    # we know 'foo' is a predecessor of 'baz'
    for l in lists:
        tail = l[::-1]
        for word in l:
            tail.pop()
            for successor in tail:
                predecessors[successor].add(word)

    # Use transitive property to update predecessor
    # This is the fixed point part of the algorithm
    change_occured = True
    while change_occured:
        change_occured = False
        for word, word_predecessors in predecessors.items():
            for predecessors_set in predecessors.values():
                if word in predecessors_set and any(w not in predecessors_set for w in word_predecessors):
                    change_occured = True
                    predecessors_set.update(word_predecessors)

    return sorted(predecessors, key=lambda w: predecessors[w])

l1 = ['a', 'world']
l2 = ['This', 'is']
l3 = ['is', 'a']

merged_list = merge_function(l1,l2,l3)

print(merged_list) # ['This', 'is', 'a', 'world']

Upvotes: 1

Related Questions