Reputation: 2129
I have a file with some data in it. This data never changes and I want to make it available outside of the IO monad. How can I do that?
Example (note that this is just an example, my data is not computable):
primes.txt:
2 3 5 7 13
code.hs:
primes :: [Int]
primes = map read . words . unsafePerformIO . readFile $ "primes.txt"
Is this a "legal" use of unsafePerformIO
? Are there alternatives?
Upvotes: 14
Views: 1552
Reputation: 707
This is based on the answer by dflemstr. Given that you want to load a list of integers you
may wish to perform the read
at compile time as well. I'm just writing it out because it would have been useful for me to see the example, and I hope it helps someone else.
import Language.Haskell.TH
import Language.Haskell.TH.Quote
intArray' :: String -> Q Exp
intArray' s = return $ ListE e
where
e = map (LitE . IntegerL . read) $ words s
intArray :: QuasiQuoter
intArray = QuasiQuoter { quoteExp = intArray' }
intArrayFile :: QuasiQuoter
intArrayFile = quoteFile intArray
To use it...
{-# LANGUAGE QuasiQuotes #-}
import TT
primes :: [Int]
primes = [intArrayFile|primes.txt|]
main = print primes
The benefits are
Upvotes: 2
Reputation: 71410
Using unsafePerformIO
in that way isn't great.
The declaration primes :: [Int]
says that primes
is a list of numbers. One particular list of numbers, that doesn't depend on anything.
In fact, however, it depends on the state of file "primes.txt" when the definition happens to be evaluated. Someone could alter this file to alter the value that primes
appears to have, which shouldn't be possible according to its type.
In the presence of a hypothetical optimisation which decides that primes
should be recomputed on demand rather than stored in memory in full (after all, its type says we'll get the same thing every time we recompute it), primes
could even appear to have two different values during a single run of the program. This is the sort of problem that can come with using unsafePerformIO
to lie to the compiler.
In practice, all of the above are probably unlikely to be a problem.
But the theoretically correct thing to do is to not make primes
a global constant (because it's not a constant). Instead, you make the computation that needs it parameterised on it (i.e. take primes
as an argument), and in the outer IO
program you read the file and then call the pure computation by passing the pure value the IO
program extracted from the file. You get the best of both worlds; you don't have to lie to the compiler, and you don't have to put your entire program in IO
. You can use constructs such as the Reader monad to avoid having to manually pass primes
around everywhere, if that helps.
So you can use unsafePerformIO
if you want to just get on with it. It's theoretically wrong, but unlikely to cause issues in practice.
Or you can refactor your program to reflect what's really going on.
Or, if primes
really is a global constant and you just don't want to literally include a huge chunk of data in your program source, you can use TemplateHaskell as demonstrated by dflemstr.
Upvotes: 6
Reputation: 62818
Your program does not define exactly when this file gets loaded. If the file does not exist, this will throw an exception, and there's no telling exactly where that will happen. (I.e., potentially after your program already did some observable real-world stuff.) Similar remarks apply if somebody decides to change the contents of the file; you don't know exactly when it gets read, and so which contents you'll get. (Unlikely to be a problem if the file isn't supposed to change.)
As for alternatives: One possibility is to create a global mutable variable [which in itself is somewhat evil], and insert the contents of the file into this variable from the main I/O thread. This way, the file gets read in at a well-defined moment. [I notice you're using lazy I/O as well, so you'd only be defining when the file gets opened.]
Really, the "correct" thing is to manually thread the data to every function that needs it. I can understand why you might want to not to that; it is a pain. You would perhaps use some kind of state monad to avoid doing this manually though...
Upvotes: 4
Reputation: 26157
You could use TemplateHaskell to read in the file at compile time. The data of the file would then be stored as an actual string in the program.
In one module (Text/Literal/TH.hs
in this example), define this:
module Text.Literal.TH where
import Language.Haskell.TH
import Language.Haskell.TH.Quote
literally :: String -> Q Exp
literally = return . LitE . StringL
lit :: QuasiQuoter
lit = QuasiQuoter { quoteExp = literally }
litFile :: QuasiQuoter
litFile = quoteFile lit
In your module, you can then do:
{-# LANGUAGE QuasiQuotes #-}
module MyModule where
import Text.Literal.TH (litFile)
primes :: [Int]
primes = map read . words $ [litFile|primes.txt|]
When you compile your program, GHC will open the primes.txt
file and insert its contents where the [litFile|primes.txt|]
part is.
Upvotes: 21
Reputation: 38708
Yes, it should be fine. You could add a {-# NOINLINE primes #-}
pragma to be safe — not sure whether GHC would ever inline a CAF.
The only alternative I can think of is to do the same thing during compile time (using Template Haskell), essentially embedding the primes into the binary. However, I prefer your version — note that the primes
list will be actually read & created lazily!
Upvotes: 4