user5536767

Is there a way to import large (16.5 million entry) data structures as objects quickly?

I realize I might be pushing the limits of a scripting language like AHK, but I would think it should be possible to import a data structure with 16.5 million entries into an object relatively quickly. The JSON file I'm trying to import is only 250 MB, and games load files of that size very quickly, right?

I'm using an AHK JSON library to import this 250 MB JSON file, and it takes 15 minutes. I realize JSON probably isn't designed for loading large data sets, but how can I import the data more quickly?

I'm open to any format or method.

Here's my current code. The commented-out part is the code that generated and exported the object in the first place:

#MaxMem 512
FileDelete, Log.txt

getTimestamp()
{
    DllCall("QueryPerformanceCounter", "Int64*", timestamp)
    DllCall("QueryPerformanceFrequency", "Int64*", frequency)
    return Round(timestamp * 1000 / frequency)
}

splitRGBColor(RGBColor, ByRef red, ByRef green, ByRef blue)
{
    red := RGBColor >> 16 & 0xFF
    green := RGBColor >> 8 & 0xFF
    blue := RGBColor & 0xFF
}

joinRGBColor(red, green, blue)
{
    SetFormat Integer, H
    red += 0
    green += 0
    blue += 0
    SetFormat Integer, D
    StringTrimLeft, red, red, 2
    StringTrimLeft, green, green, 2
    StringTrimLeft, blue, blue, 2
    redLength := StrLen(red)
    greenLength := StrLen(green)
    blueLength := StrLen(blue)
    if (redLength < 2) {
        red = 0%red%
    }
    if (greenLength < 2) {
        green = 0%green%
    }
    if (blueLength < 2) {
        blue = 0%blue%
    }
    hex := "0xff" . red . green . blue
    return hex
}

roundHexColor(color1ARGB, colorChunkSize){
    ;FileAppend, % "Hex: " . color1ARGB . "`n", Log.txt
    splitRGBColor(color1ARGB, red, green, blue)
    ;FileAppend, % "Red: " . red . " Green: " . green . " Blue: " . blue . "`n", Log.txt

    red := Round(red / colorChunkSize) * colorChunkSize
    green := Round(green / colorChunkSize) * colorChunkSize
    blue := Round(blue / colorChunkSize) * colorChunkSize

    color1ARGB := joinRGBColor(red, green, blue)
    ;FileAppend, % "Rounded hex: " . color1ARGB . "`n", Log.txt
    return color1ARGB
}

;condensedColors := {}

;loop, 255
;{
;   r := A_index
;   loop, 255
;   {
;       g := A_index
;       loop, 255
;       {
;           b := A_index
;           rgbHexRaw := joinRGBColor(r, g, b)
;           rgbHexRounded := roundHexColor(rgbHexRaw, 5)
;           condensedColors[rgbHexRaw] := rgbHexRounded
;       }
;   }
;}

;colorsJSON := JSON.Dump(condensedColors)
;FileDelete, condensedColors.json
;FileAppend, % colorsJSON, condensedColors.json

FileRead, condensedColorsJSON, condensedColors.json
condensedColors := JSON.Load(condensedColorsJSON)

testColor := 0xff3f975c

FileAppend, % "Test: " . testColor . " is rounded to " . condensedColors[testColor] . ".`n", Log.txt

runCounter := 160000
start := getTimestamp()
loop, %runCounter%
{
    roundedColor := condensedColors[0xff3f975c]
}
end := getTimestamp()
duration := end - start
average := duration / runCounter

FileAppend, % "We rounded " . runCounter . " colors in " . duration . "ms, or " . average . "ms per rounded color value.`n", Log.txt

Upvotes: 0

Views: 60

Answers (1)

Haskell McRavin

Reputation: 641

You're going to have to pre-parse the file, or cut it up into chunks and process them in threads.

You can do that with a custom piece of software. Java makes threading and JSON parsing easy. If you want it to be really fast, write it in C.

If you want a pure browser-based solution, you'll have to roll a new database and pre-process insertions.
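
As a rough illustration of the pre-parsing idea, here is a minimal sketch that stays in AHK and replaces the JSON file with a flat binary table: one 32-bit slot per 24-bit RGB value, read back with a single RawRead and indexed with NumGet. It assumes AutoHotkey v1.1 (for FileOpen, RawRead/RawWrite and Format). The file name condensedColors.bin and the functions buildColorTable and lookupColor are hypothetical names, not part of any library. It also recomputes the table from the rounding rule in the question rather than converting the existing JSON, and it covers all 256 values per channel, so treat it as one possible approach, not a drop-in replacement. It has not been timed.

```autohotkey
#NoEnv
#MaxMem 512  ; kept from the original script

; Hypothetical names for illustration: condensedColors.bin, buildColorTable, lookupColor.
tableFile  := "condensedColors.bin"
entryCount := 16777216          ; one slot per 24-bit RGB value (0x000000..0xFFFFFF)
tableBytes := entryCount * 4    ; each slot holds one 32-bit ARGB value (64 MB total)

if !FileExist(tableFile)
{
    ; One-time step: compute every rounded color and dump the raw buffer to disk.
    buildColorTable(table, entryCount, 5)
    file := FileOpen(tableFile, "w")
    file.RawWrite(table, tableBytes)
    file.Close()
}
else
{
    ; Normal start-up: read the raw buffer back in one call, no parsing involved.
    VarSetCapacity(table, tableBytes)
    file := FileOpen(tableFile, "r")
    file.RawRead(table, tableBytes)
    file.Close()
}

MsgBox % lookupColor(table, 0xff3f975c)

; Fills "table" with the rounded ARGB value for each RGB index,
; using the same Round(channel / chunkSize) * chunkSize rule as roundHexColor.
buildColorTable(ByRef table, entryCount, chunkSize)
{
    VarSetCapacity(table, entryCount * 4, 0)
    Loop % entryCount
    {
        rgb   := A_Index - 1
        red   := Round(((rgb >> 16) & 0xFF) / chunkSize) * chunkSize
        green := Round(((rgb >> 8) & 0xFF) / chunkSize) * chunkSize
        blue  := Round((rgb & 0xFF) / chunkSize) * chunkSize
        NumPut(0xFF000000 | (red << 16) | (green << 8) | blue, table, rgb * 4, "UInt")
    }
}

; Looks up a color by its low 24 bits and returns the stored value as a hex string.
lookupColor(ByRef table, color)
{
    return Format("0x{:08x}", NumGet(table, (color & 0xFFFFFF) * 4, "UInt"))
}
```

The design choice here is to trade the associative object for a fixed-size array indexed by the color itself, so loading becomes a plain file read and each lookup is a single memory access. If the rounding rule ever produces channel values above 255 (possible with other chunk sizes), the packing in buildColorTable would need clamping.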

Upvotes: 0
