asafc

Reputation: 367

OCaml string length limitation when reading from stdin\file

As part of a Compiler Principles course I'm taking at my university, we're writing a compiler, implemented in OCaml, that compiles Scheme code into CISC-like assembly (which is just C macros). The basic operation of the compiler is as follows:

  1. Read a *.scm file and convert it to an OCaml string.
  2. Parse the string and perform various analyses.
  3. Run a code generator on the AST output from the semantic analyzer, that outputs text into a *.c file.
  4. Compile that file with GCC and run it in the terminal.

Well, all is good and well, except for this: I'm trying to read an input file that's around 4000 lines long and is basically one huge expression that's a mix of Scheme "if" and "and" forms. I'm executing the compiler via utop. When I try to read the input file, I immediately get a stack overflow error message. My initial guess is that the file is just too large for OCaml to handle, but I wasn't able to find any documentation that would support this theory.

Any suggestions?

Upvotes: 2

Views: 1036

Answers (2)

asafc

Reputation: 367

Well, it turns out that the limitation was the maximum stack size that the OCaml runtime is configured to use.

I ran the following command in the terminal to raise that limit (the l parameter of OCAMLRUNPARAM sets the stack limit, in words):

export OCAMLRUNPARAM="l=5555555555"

This worked like a charm - I managed to read and compile the input file almost instantaneously.
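To check that the setting took effect, the current stack limit can be read back from inside OCaml. This is a minimal sketch, assuming the standard Gc module's stack_limit field (reported in words) reflects the l parameter on your runtime. The printed value depends on the installation, so no output is shown.

```ocaml
(* Print the stack limit the runtime is currently using.
   Assumption: Gc.stack_limit mirrors the "l" setting of OCAMLRUNPARAM. *)
let () =
  let limit_words = (Gc.get ()).Gc.stack_limit in
  Printf.printf "stack limit: %d words (%d-bit words)\n"
    limit_words Sys.word_size
```

Running this in the same shell before and after the export should show whether the new value was picked up.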

For reference purposes, this is the code that reads the file:

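let file_to_string input_file =
  let in_channel = open_in input_file in
  (* Not tail-recursive: the recursive call sits inside the try block and
     under (::), so every character read adds a stack frame. *)
  let rec run () =
    try
      let ch = input_char in_channel in ch :: (run ())
    with End_of_file ->
      ( close_in in_channel;
       [] )
  in list_to_string (run ());;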

where list_to_string is:

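let list_to_string s =
  (* Strings are immutable in current OCaml, so the buffer is built as
     Bytes and converted at the end. Not tail-recursive: one stack frame
     per list element. *)
  let rec loop s n =
    match s with
    | [] -> Bytes.make n '?'  (* allocate once the length is known *)
    | car :: cdr ->
       let result = loop cdr (n + 1) in
       Bytes.set result n car;
       result
  in
  Bytes.to_string (loop s 0);;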

Funny thing is, I also wrote file_to_string in tail-recursive form. This prevented the stack overflow, but for some reason went into an infinite loop. Oh, well...
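For a regular file, the recursion can be avoided altogether by reading the whole file in one call. This is only a sketch, not the code used in the compiler: read_whole_file is a hypothetical name, and it assumes OCaml 4.08 or later (for Fun.protect) and an input that is a regular file, since in_channel_length does not work on stdin or pipes.

```ocaml
(* Read an entire regular file into a string with no recursion.
   read_whole_file is a hypothetical name used for illustration. *)
let read_whole_file input_file =
  let ic = open_in_bin input_file in
  Fun.protect
    ~finally:(fun () -> close_in_noerr ic)  (* always close the channel *)
    (fun () -> really_input_string ic (in_channel_length ic))
```

Stack usage here is constant regardless of file size, so no change to OCAMLRUNPARAM should be needed for the reading step.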

Upvotes: 0

Jeffrey Scofield

Reputation: 66803

The maximum string length is given by Sys.max_string_length. For a 32-bit system, it's quite short: 16777211. For a 64-bit system, it's 144115188075855863.
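The limit for a particular installation can be checked directly. A small sketch that prints it together with the word size; the values depend on the system, so no output is shown.

```ocaml
(* Print the string length limit for the running OCaml system. *)
let () =
  Printf.printf "word size: %d bits\n" Sys.word_size;
  Printf.printf "max string length: %d bytes\n" Sys.max_string_length
```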

Unless you're using a 32-bit system, and your 4000-line file is over 16MB, I don't think you're hitting the string length limit.

A stack overflow is not what you'd expect to see when a string is too long.

It's more likely that you have infinite recursion, or possibly just a very deeply nested computation.
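One common source of very deep nesting is a reader that recurses once per input character. As a sketch under assumptions (file_to_string_loop is a hypothetical name, and Fun.protect requires OCaml 4.08 or later), the following keeps the recursive call in tail position by handling End_of_file as a match case rather than wrapping the call in try, and accumulates characters in a Buffer:

```ocaml
(* Character-by-character reader whose recursion runs in constant stack.
   file_to_string_loop is a hypothetical name used for illustration. *)
let file_to_string_loop input_file =
  let ic = open_in_bin input_file in
  let buf = Buffer.create 4096 in
  let rec loop () =
    match input_char ic with
    | c ->
        Buffer.add_char buf c;
        loop ()                       (* tail call: no frame is kept *)
    | exception End_of_file -> ()     (* end of input stops the loop *)
  in
  Fun.protect ~finally:(fun () -> close_in_noerr ic) loop;
  Buffer.contents buf
```

The exception case of match does not enclose the recursive call, which is what lets the compiler treat loop () as a tail call. A try block around the call would prevent that.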

Upvotes: 2
