Is there an algorithm to merge this kind of list?

Question

I have a list like this:

a   a   .   a
b   .   .   a
a   .   a   .
c   .   a   .

or in list format

[['a', 'a', '.', 'a'],
 ['b', '.', '.', 'a'],
 ['a', '.', 'a', '.'],
 ['c', '.', 'a', '.']]

and I want to merge it into [['a','b','c'],['a'],['a'],['a']], or

a,b,c    a    a    a

so that when two consecutive rows share the same letter in any of the four columns, their elements are merged without duplicates. I can do it with pairwise comparisons, but I wonder whether there is a formal algorithm for this.

Thanks.

Paul Sasik · Accepted Answer

You didn't specify the language, but you can create a HashMap / HashTable for each column and populate it with the column values. (Your key and value will be the same thing.) A HashMap cannot hold duplicate keys, so you will end up with a collection of unique values for each column. Then pull the values out of each HashMap into an array, or arrays. If the periods in your sample data are actually periods, you will have to skip them as you loop through the rows; otherwise they will show up in the output.

Take a look at Python dictionaries.

The pseudocode for this solution (Python will look similar ;-):

Create a list of dictionaries (list length = # of columns)
Loop over columns
    Loop over rows
        Insert data into appropriate dictionary *
Loop over list of dictionaries
    Loop over dictionary values
        Create new set of arrays with unique values
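
The pseudocode above could look roughly like this in Python. This is only a sketch: the function name merge_columns and the blank parameter are made up for illustration, and it assumes every row has the same number of columns and that '.' marks an empty cell. Note that it collects the unique values of each column across all rows; it does not check whether consecutive rows actually share a letter.

```python
def merge_columns(rows, blank="."):
    """Collect the unique non-blank values of each column, in first-seen order."""
    if not rows:
        return []
    # One dictionary per column. Keys are unique, and dicts preserve
    # insertion order in Python 3.7+, so the output order is stable.
    seen = [{} for _ in rows[0]]
    for row in rows:
        for col, value in enumerate(row):
            if value != blank:           # skip the placeholder cells
                seen[col][value] = None  # key and value carry the same information
    # Pull the keys back out as plain lists, one list per column.
    return [list(d) for d in seen]


rows = [['a', 'a', '.', 'a'],
        ['b', '.', '.', 'a'],
        ['a', '.', 'a', '.'],
        ['c', '.', 'a', '.']]

merged = merge_columns(rows)
print(merged)  # [['a', 'b', 'c'], ['a'], ['a'], ['a']]
```

A set per column would also work if the order of the merged values does not matter; a dictionary is used here only to keep the letters in the order they first appear.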
