David
David

Reputation: 331

Golang : fatal error: runtime: out of memory

I trying to use this package in Github for string matching. My dictionary is 4 MB. When creating the Trie, I got fatal error: runtime: out of memory. I am using Ubuntu 14.04 with 8 GB of RAM and Golang version 1.4.2.

It seems the error come from the line 99 (now) here : m.trie = make([]node, max)

The program stops at this line.

This is the error:

fatal error: runtime: out of memory

runtime stack:
runtime.SysMap(0xc209cd0000, 0x3b1bc0000, 0x570a00, 0x5783f8)
    /usr/local/go/src/runtime/mem_linux.c:149 +0x98
runtime.MHeap_SysAlloc(0x57dae0, 0x3b1bc0000, 0x4296f2)
    /usr/local/go/src/runtime/malloc.c:284 +0x124
runtime.MHeap_Alloc(0x57dae0, 0x1d8dda, 0x10100000000, 0x8)
    /usr/local/go/src/runtime/mheap.c:240 +0x66

goroutine 1 [running]:
runtime.switchtoM()
    /usr/local/go/src/runtime/asm_amd64.s:198 fp=0xc208518a60 sp=0xc208518a58
runtime.mallocgc(0x3b1bb25f0, 0x4d7fc0, 0x0, 0xc20803c0d0)
    /usr/local/go/src/runtime/malloc.go:199 +0x9f3 fp=0xc208518b10 sp=0xc208518a60
runtime.newarray(0x4d7fc0, 0x3a164e, 0x1)
    /usr/local/go/src/runtime/malloc.go:365 +0xc1 fp=0xc208518b48 sp=0xc208518b10
runtime.makeslice(0x4a52a0, 0x3a164e, 0x3a164e, 0x0, 0x0, 0x0)
    /usr/local/go/src/runtime/slice.go:32 +0x15c fp=0xc208518b90 sp=0xc208518b48
github.com/mf/ahocorasick.(*Matcher).buildTrie(0xc2083c7e60, 0xc209860000, 0x26afb, 0x2f555)
    /home/go/ahocorasick/ahocorasick.go:104 +0x28b fp=0xc208518d90 sp=0xc208518b90
github.com/mf/ahocorasick.NewStringMatcher(0xc208bd0000, 0x26afb, 0x2d600, 0x8)
    /home/go/ahocorasick/ahocorasick.go:222 +0x34b fp=0xc208518ec0 sp=0xc208518d90
main.main()
    /home/go/seme/substrings.go:66 +0x257 fp=0xc208518f98 sp=0xc208518ec0
runtime.main()
    /usr/local/go/src/runtime/proc.go:63 +0xf3 fp=0xc208518fe0 sp=0xc208518f98
runtime.goexit()
    /usr/local/go/src/runtime/asm_amd64.s:2232 +0x1 fp=0xc208518fe8 sp=0xc208518fe0
exit status 2

This is the content of the main function (taken from the same repo: test file)

var dictionary = InitDictionary()   
var bytes = []byte(""Partial invoice (€100,000, so roughly 40%) for the consignment C27655 we shipped on 15th August to London from the Make Believe Town depot. INV2345 is for the balance.. Customer contact (Sigourney) says they will pay this on the usual credit terms (30 days).")   

var precomputed = ahocorasick.NewStringMatcher(dictionary)// line 66 here
fmt.Println(precomputed.Match(bytes))

Upvotes: 5

Views: 41257

Answers (3)

Set resource limit to unlimited worked for me

if ulimit -a return 0 run ulimit -c unlimited

Maybe set a real size limit to be more secure

Upvotes: 1

T. Claverie
T. Claverie

Reputation: 12256

Your structure is awfully inefficient in terms of memory, let's look at the internals. But before that, a quick reminder of the space required for some go types:

  • bool: 1 byte
  • int: 4 bytes
  • uintptr: 4 bytes
  • [N]type: N*sizeof(type)
  • []type: 12 + len(slice)*sizeof(type)

Now, let's have a look at your structure:

type node struct {
    root bool        // 1 byte
    b []byte         // 12 + len(slice)*1
    output bool      // 1 byte
    index int        // 4 bytes
    counter int      // 4 bytes
    child [256]*node // 256*4 = 1024 bytes
    fails [256]*node // 256*4 = 1024 bytes
    suffix *node     // 4 bytes
    fail *node       // 4 bytes
}

Ok, you should have a guess of what happens here: each node weighs more than 2KB, this is huge ! Finally, we'll look at the code that you use to initialize your trie:

func (m *Matcher) buildTrie(dictionary [][]byte) {
    max := 1
    for _, blice := range dictionary {
        max += len(blice)
    }
    m.trie = make([]node, max)

    // ...
}

You said your dictionary is 4 MB. If it is 4MB in total, then it means that at the end of the for loop, max = 4MB. It it holds 4 MB different words, then max = 4MB*avg(word_length).

We'll take the first scenario, the nicest one. You are initializing a slice of 4M of nodes, each of which uses 2KB. Yup, that makes a nice 8GB necessary.

You should review how you build your trie. From the wikipedia page related to the Aho-Corasick algorithm, each node contains one character, so there is at most 256 characters that go from the root, not 4MB.

Some material to make it right: https://web.archive.org/web/20160315124629/http://www.cs.uku.fi/~kilpelai/BSA05/lectures/slides04.pdf

Upvotes: 13

stonith
stonith

Reputation: 166

The node type has a memory size of 2084 bytes. I wrote a litte program to demonstrate the memory usage: https://play.golang.org/p/szm7AirsDB

As you can see, the three strings (11(+1) bytes in size) dictionary := []string{"fizz", "buzz", "123"} require 24 MB of memory.

If your dictionary has a length of 4 MB you would need about 4000 * 2084 = 8.1 GB of memory.

So you should try to decrease the size of your dictionary.

Upvotes: 5

Related Questions