user3689497
user3689497

Reputation: 93

How to compare contents in a list in haskell, to see if they are partially same?

This is very tricky for me.

Given a very long list,

[[100,11,1,0,40,1],[100,12,3,1,22,23],[101,11,1,0,45,1],[101,12,3,1,28,30],[102,11,1,0,50,1],[102,12,3,1,50,50],[103,11,1,0,50,1],[103,12,3,1,50,50],[104,11,1,0,50,25],[104,12,3,1,50,50],[105,11,1,0,50,49],[105,12,3,0,30,30],[106,11,1,0,50,49],[106,12,3,0,25,26],[107,11,1,1,33,20],[107,12,3,0,25,26],[108,11,1,1,2,1],[108,12,3,1,20,24],[109,11,1,1,2,1],[109,12,3,1,28,31],[110,11,1,0,40,1],[110,12,3,1,22,23]..]

Now igore the first number of each list, if two lists are the same in spite of the first number, for example

[101,11,1,0,50,1] alike [102,11,1,0,50,1]

We keep the latter list, until the whole list is all checked. Ideally the result should be like :

[[102,11,1,0,50,1],[103,12,3,1,50,50]..]

My idea is to use map to take the first number away, use nub and \\ to get rid of all repeat results, make it into like

[[11,1,0,50,1],[12,3,1,50,50],[12,3,1,50,50],[11,1,0,50,49],[12,3,0,25,26],[11,1,1,2,1],[11,1,0,40,1],[12,3,1,22,23],[11,1,0,45,1],[12,3,1,28,30]..]

and use Set.isSubsetOf to filter the orignal list out.

However, this idea is too complex and difficult to implenment. Since I am a beginner of Haskell, is there a better way to sort this? Or I can use a recursion function instead(still need efforts though)?

Upvotes: 2

Views: 212

Answers (3)

Evgenij Pashkin
Evgenij Pashkin

Reputation: 51

If you need to compact a list by merging all items with same tail, try this code.

compactList:: [[Int]] -> [[Int]]
compactList list = reverse $ compactList' list [] 

compactList':: [[Int]] -> [[Int]] -> [[Int]]
compactList' [] res = res
compactList' (l:ls) res
    | inList l res
    = compactList' ls res
    | otherwise
    = compactList' ls  (l:res)

inList :: [Int] -> [[Int]] -> Bool
inList [] _ = False
inList _ [] = False
inList val@(x:xs) ((x':xs'):ls)
    | xs == xs'
    = True
    | otherwise
    = inList val ls

If we need "keep the latter list" (I missed that previously), then just change the place of reverse

compactList list = reverse $ compactList' list []

Upvotes: 1

amnn
amnn

Reputation: 3716

From your description, I take it you want to get a list of the last lists to possess each of the unique tails. You can do it like this:

lastUniqueByTail :: Eq a => [[a]] -> [[a]] 
lastUniqueByTail = reverse . nubBy ((==) `on` tail) . reverse

Note this only works for finite lists of non-empty lists

You can find on in the Data.Function module, and you can find nubBy in Data.List.

So here's an explanation on each part, we'll work from the inside out:

  • ((==) `on` tail) This function performs comparisons between two lists, and determines them to be equal if their tails are equal (i.e. it performs an equality check ignoring the first element).

    The on function is what is doing most of the magic here. It has the following type signature:

    on :: (b -> b -> c) -> (a -> b) -> a -> a -> c
    

    And it is essentially defined as (f `on` g) x y = f (g x) (g y), so substituting the functions we provided, we get ((==) `on` tail) x y = (tail x) == (tail y).

  • Then, nubBy is like nub except that you can provide it a comparator to test equality on. We give it the function we defined above so that it discards elements with the same tail.

  • But, like nub, nubBy keeps the first element in each equivalence class it finds. (i.e. if it finds two elements that are equal in the list, it always picks the first). We want the last such elements, which is why we must reverse first (so that the last element that would have been equal becomes the first, and so is kept).
  • Finally, we reverse at the end to get the order back how it was to begin with.

Upvotes: 4

ssm
ssm

Reputation: 5373

You can just find the Euclidian distance between 2 vectors. For example, if you have two lists [a0, a1, ... , an] and [b0, b1, ..., bn], then the square of the Euclidian distance between them would be

sum [ (a - b) ^ 2 | (a, b) <- zip as bs ]

This can give you an idea of how close one list is to another. Ideally you should take the square root, but for what you are doing, I don't know if that is necessary.

Edit:

Sorry I misunderstood your question. According to your definition, alike is defined so:

alike as bs = (tail as) == (tail bs)

If that is the case, then something like the following should do the trick:

xAll [] = []
xAll xa@(x:xaa) = a : xAll b
    where
        a = last $ filter (alike x) xa
        b = filter (\m -> not (alike x m)) xa

But not sure if this is what you are looking for ...

Upvotes: 0

Related Questions