CargoMeister

Reputation: 4319

Node.js/Express Netflix Issue

I'm a little confused by this issue Netflix ran into with Express. They started to see a buildup of latency in their APIs. We use Express for everything, and I'd like to avoid any sudden problems.

Here's a link to the article.

http://www.infoq.com/news/2014/12/expressjs-burned-netflix

The way it's written, it sounds like a problem with Express, and how it's handling routing. But in the end, they state the following:

"After dig into their source code the team found out the problem. It resided in a periodic function that was being executed 10 times per hour and whose main purpose was to refresh route handlers from an external source. When the team fixed the code so that the function would stop adding duplicate route handlers, the latency and CPU usage increases went away."

I don't understand what exactly they were trying to do. I don't believe this was something that Express was doing on its own. It sounds like they were doing something a bit oddball, and it didn't work out. I'd think load testing would have revealed this. Can anyone who understands this better comment on what the problem actually was? The entire section at the top of the article talks about how Express iterates through the routes list, but I really don't see how iterating over what should not be a very large array would cause that much of a delay.

Upvotes: 1

Views: 598

Answers (1)

Thomas Beirne

Reputation: 185

The best counterpoint explanation of this I've seen is Eran Hammer's. The comments are also illuminating. Of particular interest are the following excerpts from the comment by Yunong Xiao (the author of the Netflix post):

The specific problem we encountered was not a global handler but the express static file handler with a simple string path. We were adding the same static router handler each time we refreshed our routes. since this route handler was in the global routing array, it meant that every request that was serviced by our app had to iterate though this handler.

It was absolutely our mis-use of the Express API that caused this -- after all, we were leaking that specific handler! However, had Express 1) not stored static handlers with simple strings in the global routing array, and 2) rejected duplicate routing handlers, or 3) not taken 1ms of CPU time to merely iterate through this static handler, then we would not have experienced such drastic performance problems. Express would have masked the fact that we had this leak -- and perhaps this would have bit us down the road in another subtle way.
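To make the leak in the excerpt above concrete, here is a minimal sketch in plain JavaScript. It is not taken from the comment and does not use Express's real internals: `LinearRouter`, `refreshRoutes`, `staticMiss`, and the `/api/vN` paths are hypothetical stand-ins for a routing array that is scanned in order and accepts duplicates.

```javascript
// Hypothetical stand-in for a linear routing array; NOT Express's actual implementation.
class LinearRouter {
  constructor() {
    this.stack = []; // layers are checked in insertion order
  }

  use(matches, handler) {
    // No duplicate check: the same handler can be pushed again and again.
    this.stack.push({ matches, handler });
  }

  // Walk the stack until a handler reports that it served the request.
  // Returns how many layers were visited along the way.
  handle(path) {
    let visited = 0;
    for (const layer of this.stack) {
      visited += 1;
      if (layer.matches(path) && layer.handler(path)) break;
    }
    return visited;
  }
}

// Stand-in for a static file handler that finds no file and falls through.
const staticMiss = () => false;
// Stand-in for an API handler that serves the request.
const apiHandler = () => true;

// Hypothetical periodic refresh: it re-adds the static handler on every run
// (the leak) and then registers the route for the new version.
function refreshRoutes(router, version) {
  router.use(() => true, staticMiss);
  router.use((path) => path === `/api/v${version}`, apiHandler);
}

// Count the layers a request for the newest route has to pass through.
function visitedAfterRefreshes(refreshCount) {
  const router = new LinearRouter();
  for (let i = 1; i <= refreshCount; i += 1) refreshRoutes(router, i);
  return router.handle(`/api/v${refreshCount}`);
}

console.log(visitedAfterRefreshes(1), visitedAfterRefreshes(10)); // prints "2 20"
```

In this toy model the work per request grows with the number of refreshes, not with the number of distinct routes, which is the pattern the excerpt describes.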

 

Our application has over 100 GET routes (and growing), even using the Express's Router feature -- which lets you compose arrays of handlers for each path inside the global route array, we'd still have to iterate through all 100 handlers for each request. Instead, we built our own custom global route handler, which takes in the context of a request (including its path) and returns a set of handlers specific to the request such that we don't have to iterate through handlers we don't need.

This was our implementation, which separated the global handlers that every request needs from handlers specific to each request. I'm sure more optimal solutions are out there.
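The Netflix implementation is not shown in the comment, so the following is only a rough sketch of the idea it describes: keep the handlers every request needs separate from per-route handlers, and look the per-route ones up by request context instead of scanning a list. `IndexedRouter` and its methods are hypothetical names, and this version only handles exact paths (no parameters or wildcards).

```javascript
// Hypothetical sketch; not Netflix's code and not an Express API.
class IndexedRouter {
  constructor() {
    this.globalHandlers = [];        // run for every request
    this.routeHandlers = new Map();  // "METHOD path" -> handlers for that route only
  }

  useGlobal(handler) {
    this.globalHandlers.push(handler);
  }

  setRoute(method, path, handlers) {
    // Map.set overwrites the previous entry, so re-registering a route
    // during a refresh cannot accumulate duplicates.
    this.routeHandlers.set(`${method} ${path}`, handlers);
  }

  // Return only the handlers this request needs: globals plus its own route's.
  handlersFor(method, path) {
    const specific = this.routeHandlers.get(`${method} ${path}`) ?? [];
    return [...this.globalHandlers, ...specific];
  }
}

const router = new IndexedRouter();
router.useGlobal(function logRequest() {});

// Register 100 GET routes, mirroring the size mentioned in the comment.
for (let i = 0; i < 100; i += 1) {
  router.setRoute('GET', `/items/${i}`, [function showItem() {}]);
}

// Simulate ten refreshes that re-register the same route.
for (let n = 0; n < 10; n += 1) {
  router.setRoute('GET', '/items/42', [function showItem() {}]);
}

console.log(router.handlersFor('GET', '/items/42').length); // prints 2
```

The lookup cost here does not depend on how many routes exist or how often they were refreshed, at the price of giving up the pattern matching a real router provides.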

Upvotes: 2
