Reputation: 13
I am working on a uni assignment to implement a left outer join in Haskell using only the prelude function definitions. I have tried using list comprehensions:
ljoin :: Eq a => [(a,b)] -> [(a,c)] -> [(a,b, Maybe c)]
ljoin xs ys = [if x1 == y1 then (x1,x2, Just y2) else (x1,x2, Nothing)| (x1, x2) <- xs, (y1,y2) <- ys]
This produces the following output for the example:
Main> ljoin [(2,"S"),(1,"J")] [(2,1),(2,2),(3,4)]
[(2,"S",Just 1),(2,"S",Just 2),(2,"S",Nothing),(1,"J",Nothing),(1,"J",Nothing),(1,"J",Nothing)]
(399 reductions, 834 cells)
Is anyone able to give me any tips to get rid of the duplicate entries in the result? The (2, "S", Nothing) should not be in the results and only 1 of the (1, "J", Nothing) term should be present.
Thanks in advance.
Upvotes: 1
Views: 533
Reputation: 9807
So when you do [f x y | x <- xs, y <- ys]
you are doing a Cartesian product or inner join: every element from xs
will be paired against every value of ys
. That's not what you want.
What you want is to first go over all of the elements of xs
, then get a list of all of the values in ys
which match, then dispatch over whether that latter list is null, to include a Nothing
if you need it, then adjoin x1
and x2
to the elements of that list to make triples. So we start with writing that "general strategy":
ljoin xs ys = concat [map (adjoin x1 x2) $ handleNulls $ getMatches x1 ys
| (x1, x2) <- xs] where
The "concat" is needed because each element of xs
will be producing its own list, and we want to "flatten" them all into one big list of results, that's the only thing that was missing from our general strategy. Now we fill in the details:
ljoin xs ys = concat [map (adjoin x1 x2) $ handleNulls $ getMatches x1 ys
| (x1, x2) <- xs] where
adjoin x1 x2 x3 = (x1, x2, x3)
handleNulls zs | null zs = [Nothing]
| otherwise = map Just zs
getMatches x1 ys = [y2 | (y1, y2) <- ys, y1 == x1]
Now there are two ways that you might mean "using only the prelude function definitions": this could mean that you're not allowed any import
statements but you're allowed to make the expression as complex as you like (sane!), or it could mean that you're not allowed any other definitions at all, including where
(insane!). I don't see any way to do the latter without cheating for the handleNulls
step, but you can sort of "hide" the cheating as a function application because it's not a recursive binding:
ljoin xs ys = concat [ zip3 (repeat x1) (repeat x2) .
(\zs -> if null zs then [Nothing] else map Just zs) .
map snd . filter ((x1 ==) . fst) $ ys | (x1, x2) <- xs]
But that's just "yuck"; the previous version is much clearer.
Upvotes: 2
Reputation: 548
If keys in ys
are unique, then the simplest option would be to use Prelude.lookup
:
ljoin xs ys = [ (kx, x, lookup kx ys) | (kx,x) <- xs ]
Otherwise, it boils down to something like this:
ljoin ((kx,x):xs) ys = (case [ y | (ky, y) <- ys, ky==kx ] of
[] -> [(kx, x, Nothing)]
ys' -> [(kx, x, Just y') | y' <- ys'])
++ ljoin xs ys
ljoin [] _ = []
Or equivalently, if you want to get fancy with comprehensions:
ljoin xs ys = [ (kx, x, my) | (kx, x) <- xs, my <- case [ y | (ky, y) <- ys, kx==ky ] of { [] -> [Nothing]; ys' -> fmap Just ys'} ]
You can, of course, replace case
with pattern-matching function:
ljoin xs ys = [ (kx, x, my) | (kx, x) <- xs, my <- listToMaybe [ y | (ky, y) <- ys, kx==ky ] ]
where listToMaybe [] = [Nothing]
listToMaybe ys' = map Just ys'
Upvotes: 0