Hennes
Hennes

Reputation: 1370

Haskell: How to use a HashMap in a main function

I beg for your help, speeding up the following program:

main = do
  jobsToProcess <- fmap read getLine
  forM_ [1..jobsToProcess] $ \_ -> do
    [r, k] <- fmap (map read . words) getLine :: IO [Int]
    putStrLn $ doSomeReallyLongWorkingJob r k

There could(!) be a lot of identical jobs to do, but it's not up to me modifying the inputs, so I tried to use Data.HashMap for backing up already processed jobs. I already optimized the algorithms in the doSomeReallyLongWorkingJob function, but now it seems, it's quite as fast as C.

But unfortunately it seems, I'm not able to implement a simple cache without producing a lot of errors. I need a simple cache of Type HashMap (Int, Int) Int, but everytime I have too much or too few brackets. And IF I manage to define the cache, I'm stuck in putting data into or retrieving data from the cache cause of lots of errors.

I already Googled for some hours but it seems I'm stuck. BTW: The result of the longrunner is an Int as well.

Upvotes: 1

Views: 1681

Answers (3)

I am just adding this answer since I feel like the other answers are diverging a bit from the original question, namely using hashtable constructs in Main function (inside IO monad).

Here is a minimal hashtable example using hashtables module. To install the module with cabal, simply use

cabal install hashtables

In this example, we simply put some values in a hashtable and use lookup to print a value retrieved from the table.

import qualified Data.HashTable.IO as H

main :: IO ()
main = do
        t <- H.new :: IO (H.CuckooHashTable Int String)
        H.insert t 22 "Hello world"
        H.insert t 5 "No problem"
        msg <- H.lookup t 5
        print msg

Notice that we need to use explicit type annotation to specify which implementation of the hashtable we wish to use.

Upvotes: 0

Daniel Wagner
Daniel Wagner

Reputation: 153152

It's pretty simple to make a stateful action that caches operations. First some boilerplate:

{-# LANGUAGE FlexibleContexts #-}
import Control.Monad.State
import Data.Map (Map)
import qualified Data.Map as M
import Debug.Trace

I'll use Data.Map, but of course you can substitute in a hash map or any similar data structure without much trouble. My long-running computation will just add up its arguments. I'll use trace to show when this computation is executed; we'll hope not to see the output of the trace when we enter a duplicate input.

reallyLongRunningComputation :: [Int] -> Int
reallyLongRunningComputation args = traceShow args $ sum args

Now the caching operation will just look up whether we've seen a given input before. If we have, we'll return the precomputed answer; otherwise we'll compute the answer now and store it.

cache :: (MonadState (Map a b) m, Ord a) => (a -> b) -> a -> m b
cache f x = do
    mCached <- gets (M.lookup x)
    case mCached of
        -- depending on your goals, you may wish to force `result` here
        Nothing -> modify (M.insert x result) >> return result
        Just cached -> return cached
    where
    result = f x

The main function now just consists of calling cache reallyLongRunningComputation on appropriate inputs.

main = do
    iterations <- readLn
    flip evalStateT M.empty . replicateM_ iterations
        $   liftIO getLine
        >>= liftIO . mapM readIO . words
        >>= cache reallyLongRunningComputation
        >>= liftIO . print

Let's try it in ghci!

> main
5
1 2 3
[1,2,3]
6
4 5
[4,5]
9
1 2
[1,2]
3
1 2
3
1 2 3
6

As you can see by the bracketed outputs, reallyLongRunningComputation was called the first time we entered 1 2 3 and the first time we entered 1 2, but not the second time we entered these inputs.

Upvotes: 5

trevor cook
trevor cook

Reputation: 1590

I hope i'm not too far off base, but first you need a way to carry around the past jobs with you. Easiest would be to use a foldM instead of a forM.

import Control.Monad
import Data.Maybe

main = do
  jobsToProcess <- fmap read getLine
  foldM doJobAcc acc0 [1..jobsToProcess] 
  where
    acc0 = --initial value of some type of accumulator, i.e. hash map
    doJobAcc acc _ = do
      [r, k] <- fmap (map read . words) getLine :: IO [Int]
      case getFromHash acc (r,k) of
         Nothing -> do 
           i <-  doSomeReallyLongWorkingJob r k
           return $ insertNew acc (r,k) i
         Just i -> do
            return acc

Note, I don't actually use the interface for putting and getting the hash table key. It doesn't actually have to be a hash table, Data.Map from containers could work. Or even a list if its going to be a small one.

Another way to carry around the hash table would be to use a State transformer monad.

Upvotes: 1

Related Questions