xzhao
xzhao

Reputation: 41

Haskell Program Low Performance Compared with Perl

I am doing a Facebook Hackercup 2015 problem with Haskell and got stuck on this problem.

Input: Begins with an integer T, the number of questions. For each question, there is one line containing 3 space-separated integers:A, B, and K.

Output: For the ith question, print a line containing "Case #i: " followed by the number of integers in the inclusive range [A, B] with a primacity of K.

Primacity of a number X is the number of its prime factors. For example, the primacity of 12 is 2 (as it's divisible by primes 2 and 3), the primacity of 550 is 3 (as it's divisible by primes 2, 5, and 11), and the primacity of 7 is 1 (as the only prime it's divisible by is 7).

1 ≤ T ≤ 100  2 ≤ A ≤ B ≤ 10^7  1 ≤ K ≤ 10^9 

Here is my Haskell solution:

import System.IO
import Data.List
import Control.Monad

incEvery :: Int -> [(Int -> Int)]
incEvery n = (cycle ((replicate (n-1) id) ++ [(+ 1)]))

primes2 :: [Int]
primes2 = sieve 2 (replicate (10^7) 0)
  where
    sieve _ [] = []
    sieve n (a:xs) = (a + (if a == 0 then 1 else 0))
      : if a == 0 then
          sieve (n+1) (zipWith ($) (incEvery n) xs)
        else
          sieve (n+1) xs

process :: (Int, Int, Int) -> Int
process (lo, hi, k) =
  length . filter (\(a, b) -> a >= lo && a <= hi && b == k) . zip [2,3..] $ primes2

readIn :: [String] -> (Int, Int, Int)
readIn =
  (\[x, y, z] -> (x, y, z)) . fmap (read::String->Int) . take 3

lib :: String -> String
lib xs = unlines . fmap (\(i, x) -> "Case #" ++ (show i) ++ ": " ++ x) . zip [1,2..]
  . fmap parse . tail . lines $ xs
  where
    parse = (show . process . readIn . words)

main :: IO ()
main = interact lib

Here is my Perl solution:

use strict;
use warnings;

my $max = 10000010;

my @f = (0) x $max;

for my $i (2 .. $max) {
    if($f[$i] == 0) {
        $f[$i] = 1;
        # print $i . "\n";
        for my $j (2 .. ($max / $i)) {
            $f[$i * $j] ++;
        }
    }
}

my $k =  <STDIN>;
for my $i (1 .. $k) {
    my $line = <STDIN>;
    if($line) {
        chomp $line;
        my ($a, $b, $t) = split(' ', $line);
        my $ans = 0;
        for my $j ($a .. $b) {
            if($f[$j] == $t) {
                $ans ++;
            }
        }
        print "Case #$i: " . $ans . "\n";
    }    
}

Though I am using the same sieving algorithm for both languages, the Haskell version is significantly slower than Perl version on 10^7 scale of data. Basically the following Haskell function is slower than its Perl counterpart:

incEvery :: Int -> [(Int -> Int)]
incEvery n = (cycle ((replicate (n-1) id) ++ [(+ 1)]))

primes2 :: [Int]
primes2 = sieve 2 (replicate (10^7) 0)
  where
   sieve _ [] = []
   sieve n (a:xs) = (a + (if a == 0 then 1 else 0))
   : if a == 0 then
      sieve (n+1) (zipWith ($) (incEvery n) xs)
     else
      sieve (n+1) xs

I think both recursion and (zipWith ($) (incEvery n) xs) are causing the problem. Any ideas?

Upvotes: 4

Views: 566

Answers (2)

user3970496
user3970496

Reputation:

There is absolutely no reason why you need to resort to imperative programming to gain performance. The unique thing about Haskell is you have to learn to think differently if you want to program in a purely functional manner. Exploit laziness to speed things up a bit:

{-# LANGUAGE ScopedTypeVariables #-}

import Control.Applicative ( pure, (<$>) )
import Data.List           ( nub )
import Data.Monoid         ( (<>) )

isPrime :: (Integral i) => i -> Bool
isPrime n = isPrime_ n primes
  where isPrime_ n (p:ps)
          | p * p > n      = True
          | n `mod` p == 0 = False
          | otherwise      = isPrime_ n ps

primes :: (Integral i) => [i]
primes = 2 : filter isPrime [3,5..]

primeFactors :: (Integral i) => i -> [i]
primeFactors n = factors n primes
  where factors n (x:xs)
          | x * x > n      = [n]
          | n `mod` x == 0 = x : factors (n `div` x) (x:xs)
          | otherwise      = factors n xs

primacity :: (Integral i) => i -> Int
primacity = length . nub . primeFactors

user :: IO Int
user = do
  xs <- getLine
  let a :: Int = read . takeWhile (/=' ') . dropN 0 $ xs
  let b :: Int = read . takeWhile (/=' ') . dropN 1 $ xs
  let k :: Int = read . takeWhile (/=' ') . dropN 2 $ xs
  let n = length . filter (== k) . fmap primacity $ [a..b]
  pure n
    where
      dropN 0 = id
      dropN n = dropN (pred n) . drop 1 . dropWhile (/= ' ')

printNTimes :: Int -> Int -> IO ()
printNTimes 0 _ = pure ()
printNTimes n total = do
  ans <- user
  putStr $ "Case #" <> show (total - n + 1) <> ": "
  putStrLn $ show ans
  printNTimes (pred n) total

main :: IO ()
main = do
  n :: Int <- read <$> getLine
  printNTimes n n

This is basically mutual recursion mixed with laziness. Might take a while to understand it, but I can guarantee that it's fast.

Upvotes: 8

Reid Barton
Reid Barton

Reputation: 14999

Yes, of course. You're effectively using two different algorithms. Your Haskell zipWith ($) (incEvery n) xs has to process every entry of your list, while your Perl for my $j (2 .. ($max / $i)) { $f[$i * $j] ++; } only has to process the entries it actually increments, which is a factor of $i faster. This is a prototypical example of a problem where mutable arrays are helpful: in Haskell you can use STUArray, for example.

Upvotes: 6

Related Questions