Jongware

Reputation: 22457

Binary processing speed with JS dialect

Working with Adobe's dialect of JavaScript called ExtendScript (reportedly based on ECMA-262 ed. 3 / ISO/IEC 16262), I aim to process binary files. ExtendScript does not support ArrayBuffer, so I read parts of the files into a string and use charCodeAt to access byte, word, and long values.

However, that comes with a severe speed penalty. Reading 215,526 items (mixed byte, word, and long), I get this performance:

charCodeAt: 29,368 ms (note: there is a significant jitter of +/-5% in my timings due to random disk-read fluctuations)

String.prototype.uwordValueAt = function(index)
{
    index = index || 0;
    return this.charCodeAt(index) + (this.charCodeAt(index+1) << 8);
}

(and similar functions for byteValueAt and longValueAt).
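(Those two accessors are not shown in the post; following the same little-endian pattern as uwordValueAt, they would presumably look something like this sketch:)

```javascript
// Hypothetical companions to uwordValueAt, reading 1 and 4 bytes
// little-endian from a "binary" string. Not from the original post.
String.prototype.byteValueAt = function(index)
{
    index = index || 0;
    return this.charCodeAt(index);
}

String.prototype.longValueAt = function(index)
{
    index = index || 0;
    // ">>> 0" forces an unsigned 32-bit result; without it, a set
    // high bit in the top byte would yield a negative number.
    return (this.charCodeAt(index) +
            (this.charCodeAt(index+1) << 8) +
            (this.charCodeAt(index+2) << 16) +
            (this.charCodeAt(index+3) << 24)) >>> 0;
}
```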

I tried replacing charCodeAt with a direct look-up this way:

var ascToBin = {};
var ascToBinH = {};
for (var i = 0; i < 256; i++)
{
    ascToBin[String.fromCharCode(i)] = i;
    ascToBinH[String.fromCharCode(i)] = i<<8;
}

so I could use this instead:

String.prototype.wordValueAt = function(index)
{
    index = index || 0;
    return ascToBin[this[index]] ^ ascToBinH[this[index+1]];
}

with the following result:

ascToBin lookup: 29,528 ms

Hardly significant -- sometimes it is even slightly faster, due to timing jitter. Leaving out the dummy index check does not make a significant impact either.
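(One further thing that could be tried -- a hypothetical sketch, not something measured in the post -- is inlining the word read at the call site and hoisting the index, so each value does not pay the overhead of a prototype-method call:)

```javascript
// Hypothetical micro-optimisation: inline the little-endian 16-bit
// read in a tight loop instead of calling a String.prototype method
// per value. readWords is not a function from the original post.
function readWords(s, count)
{
    var out = [];
    var i = 0;
    for (var n = 0; n < count; n++)
    {
        out.push(s.charCodeAt(i) + (s.charCodeAt(i+1) << 8));
        i += 2;
    }
    return out;
}
```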

The read algorithm itself cannot easily be improved: the data consists of pointers to yet more data, and as far as I can tell all data is read only once. The data is stored non-sequentially, but I take care to read buffers that are as large as possible (short of reading the entire file at once). At any rate, I don't believe disk access is the real bottleneck: the pointers-to-data and their associated data are packed together in 1/2 K chunks (of which there are 284, totalling 11,616 individual data packets, in this particular worst-case file).

A typical large-ish file loads in 3.5 seconds, which is OK, but I'd still like to strip out every possible nanosecond. Is there a better alternative to using String and charCodeAt?

Upvotes: 0

Views: 346

Answers (1)

Esailija

Reputation: 140228

No. It is not the charCodeAt method itself that is slow, but the ExtendScript implementation.

If it is possible to use another language or implementation, you should do that.
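(For illustration: in an engine that does have typed arrays -- which ExtendScript, per the question, lacks -- these reads are native and need no string tricks, for example:)

```javascript
// DataView performs endian-aware reads natively in modern engines.
// Shown only to illustrate the suggestion; unavailable in ExtendScript.
var buf = new ArrayBuffer(4);
var bytes = new Uint8Array(buf);
bytes[0] = 0x78; bytes[1] = 0x56; bytes[2] = 0x34; bytes[3] = 0x12;

var view = new DataView(buf);
var word  = view.getUint16(0, true);  // true = little-endian
var dword = view.getUint32(0, true);
```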

Upvotes: 3
