Reputation: 25930
I wrote a large algorithm that parses a string (5 MB long as tested). After some changes, I noticed a substantial drop in performance, about 30-35%, so I started debugging it with various performance measurements and found something strange.
It turns out that my algorithm slowed down drastically after I removed the following line at the beginning of the algorithm, which was called only once:
text.match(/\n/g);
If I simply put that line back at the top of the algorithm, without ever using its result, performance improves by 30-35%, and the rest of the changes in the algorithm appear to make no difference whatsoever.
It seems that executing this line somehow gives Node.js an internal boost in subsequent string processing, which I cannot explain or analyze any further.
I then started testing it across different versions of Node.js, and found that this happens in the current 10.15.3, but not in v4.x or v12.x.
I don't know what to make of such a huge inconsistency and such a cryptic effect on performance.
Can anybody shed some light on why running an extra regex search like that can suddenly provide a boost in Node.js? Or is it somehow specific to my case?
UPDATE
I have logged an issue against Node.js for this.
Upvotes: 4
Views: 175
Reputation: 85481
Depending on how a string in V8 was created, it is stored either as a flat sequential array or as a tree of pieces (a "cons" string, or rope, which is a binary tree built up by concatenation). A binary tree is much less cache-friendly, and traversing one multiple times incurs a significant performance penalty due to pipeline stalls caused by cache misses.
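To check whether your own input is affected, a rough sketch like the one below can help. It builds a string by repeated concatenation (the usual way a tree-shaped string comes about) and times a character-by-character scan. The helper names `buildByConcat`, `countNewlines` and `timeMs` are made up for this illustration, and the timings will differ between V8 versions, so treat it as a starting point for measurement rather than a reproduction of the original problem.

```javascript
// Hypothetical helpers for illustration; none of these names come from the question.

// Build a string by repeated concatenation. In V8 this typically produces
// a tree of pieces internally rather than one contiguous buffer.
function buildByConcat(lineCount) {
  let text = '';
  for (let i = 0; i < lineCount; i++) {
    text += 'line ' + i + '\n';
  }
  return text;
}

// Scan the string one character at a time, roughly as a parser would.
function countNewlines(text) {
  let count = 0;
  for (let i = 0; i < text.length; i++) {
    if (text.charCodeAt(i) === 10) count++; // 10 is the char code of '\n'
  }
  return count;
}

// Time a single call with process.hrtime() and return milliseconds.
function timeMs(fn) {
  const start = process.hrtime();
  fn();
  const diff = process.hrtime(start); // [seconds, nanoseconds]
  return diff[0] * 1e3 + diff[1] / 1e6;
}

const text = buildByConcat(200000);
const newlines = countNewlines(text);
const elapsed = timeMs(function () { countNewlines(text); });
console.log(newlines + ' newlines, scan took ' + elapsed.toFixed(2) + ' ms');
```

Running the same script under different Node.js versions, with and without a `text.match(/\n/g);` call before the timed scan, should show whether the effect applies to your data.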
A side effect of a regex match() in V8 was a call to String::Flatten. That call sequentializes the string in memory. Unfortunately, that side effect was removed in a later V8 version.
Node 10 exposes a new function, %FlattenString (V8 native syntax, so it requires the --allow-natives-syntax flag), which can be used to sequentialize a string explicitly.
As a version-independent solution you can use the flatstr module. Depending on the V8 version, it calls either %FlattenString(s) or Number(s) (relying on the side effect).
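For illustration only, here is a minimal sketch of the Number(s) fallback described above. The `flatten` helper is a made-up name, not the flatstr API, and whether the coercion really flattens the string is an assumption that depends on the V8 version. It never changes the string's contents, only (possibly) its internal layout.

```javascript
// Hypothetical helper, not the flatstr module itself.
// Assumption: on some V8 versions, coercing a string to a number has the
// side effect of flattening it; on others this may do nothing useful.
function flatten(s) {
  if (typeof s === 'string') {
    Number(s); // result discarded on purpose, only the side effect matters
  }
  return s;
}

// Build a string by concatenation, then ask for a flat representation
// once, before any heavy scanning starts.
let joined = '';
for (let i = 0; i < 1000; i++) {
  joined += 'chunk ' + i + '\n';
}
const flat = flatten(joined);
console.log(flat === joined); // true: the contents are unchanged
```

In real code the flatstr module is probably the better choice, since it picks the mechanism for you. Either way, measure before and after to confirm it helps on your Node.js version.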
Upvotes: 3