Tuesday, September 6, 2016

node.js performance optimization and single threaded architecture


I'm running a Node.js app with Express and want to start improving its performance. Several routes are defined. Let's take a basic example:

app.get('/users', function (req, res) {
    User.find({}).exec(function (err, users) {
        res.json(users);
    });
});

Let's assume we have 3 clients A, B and C, who try to use this route. Their requests arrive on the server in the order A, B, C with 1 millisecond difference in between.

1. If I understand the Node.js architecture correctly, every request will be handled immediately, because User.find() is asynchronous and the code is non-blocking?

Let's expand this example with a synchronous call:

app.get('/users', function (req, res) {
    // synchronous call: runs on the main thread before the query is issued
    var parameters = getUserParameters();
    User.find({parameters}).exec(function (err, users) {
        res.json(users);
    });
});

Same requests, same order. getUserParameters() takes 50 milliseconds to complete.

2. A enters the route callback function and blocks the Node.js thread for 50 milliseconds. B and C can't enter the function and have to wait. When A finishes getUserParameters(), it continues with the asynchronous User.find() call, and B now enters the route callback function. C still has to wait another 50 milliseconds. Once B reaches the asynchronous call, C's request can finally be handled. Taken together: C has to wait 50 milliseconds for A, 50 milliseconds for B, and 50 milliseconds for its own call to getUserParameters() (for simplicity, we ignore the time spent in the asynchronous function)?
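The timing described in this question can be checked with a small simulation. This is only a sketch: blockFor() is a hypothetical stand-in for getUserParameters(), and the three "requests" are plain timers rather than real HTTP requests, so the numbers only illustrate how handlers queue up on the single thread.

```javascript
// Hypothetical stand-in for getUserParameters(): a busy-wait that
// holds the single JavaScript thread for the given number of ms.
function blockFor(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}

// Schedules three simulated requests 1 ms apart and records when each
// handler was entered and when its synchronous part was done.
function simulate() {
  return new Promise((resolve) => {
    const log = [];
    const startedAt = Date.now();
    ['A', 'B', 'C'].forEach((name, index) => {
      setTimeout(() => {
        const enteredAt = Date.now() - startedAt;
        blockFor(50); // synchronous work: nothing else can run meanwhile
        const syncDoneAt = Date.now() - startedAt;
        log.push({ name, enteredAt, syncDoneAt });
        if (log.length === 3) resolve(log);
      }, index);
    });
  });
}
```

Calling simulate().then(console.table) should show each handler being entered roughly 50 ms after the previous one, so C only starts its own work after about 100 ms of waiting.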

Now assume we have one more route, which is only accessible by an admin and is called every minute via crontab.

app.get('/users', function (req, res) {
    User.find({}).exec(function (err, users) {
        res.json(users);
    });
});

app.get('/admin-route', function (req, res) {
    blockingFunction(); // this function takes 2 seconds to complete
});

3. When a request X hits /admin-route and blockingFunction() is called, will A, B and C, who call /users right after X's request, have to wait 2 seconds before they even enter the route callback function?

4. Should we make every self-defined function asynchronous with a callback, even if it takes only 4 milliseconds?

3 Answers

Answer 1

The answer to #3 is "yes": blocking means blocking the event loop, so any I/O (like handling an HTTP request) is held up. In this case, the app will seem unresponsive for those 2 seconds.

However, you have to do pretty wild things for synchronous code to take 2 seconds (either very heavy calculations, or using a lot of the *Sync() methods provided by modules like fs). If you really can't make that code asynchronous, you should consider running it in a separate process.
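As a rough sketch of the "separate process" idea, the built-in child_process module can run the heavy work in a second Node.js process while the parent's event loop stays free. The script string and the name runBlockingWorkInChild are made up for illustration; the loop merely stands in for whatever blockingFunction() really does.

```javascript
const { execFile } = require('child_process');

// Hypothetical stand-in for the slow synchronous work. It runs in the
// child process, so it only blocks the child's event loop.
const childScript = `
  let sum = 0;
  for (let i = 0; i < 1e8; i++) sum += i;
  process.stdout.write(String(sum));
`;

// Starts a second Node.js process and reports its result via callback.
// The parent keeps handling other events while the child is busy.
function runBlockingWorkInChild(callback) {
  execFile(process.execPath, ['-e', childScript], (err, stdout) => {
    if (err) return callback(err);
    callback(null, Number(stdout));
  });
}
```

Inside the /admin-route handler, one could call runBlockingWorkInChild() and send the response from its callback. For larger jobs, child_process.fork() with a dedicated worker file is probably the more maintainable variant.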

Regarding #4: if you can easily make it asynchronous, you probably should. However, just having your synchronous function accept a callback doesn't miraculously make it asynchronous. Whether, and how, you can make it async depends on what the function does.
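To illustrate the point about callbacks, here is a minimal sketch with two hypothetical functions. sumSync() takes a callback but still does all its work before returning, so it blocks just like a plain synchronous function. sumInChunks() is one possible way to give the event loop a chance between slices of work, using setImmediate(); the chunk size is an arbitrary assumption.

```javascript
// Accepts a callback, but is still fully synchronous: the callback
// runs before sumSync() returns, and the loop blocks the event loop.
function sumSync(n, callback) {
  let sum = 0;
  for (let i = 0; i < n; i++) sum += i;
  callback(null, sum);
}

// Splits the same loop into chunks and yields to the event loop
// between chunks, so pending I/O callbacks can run in between.
function sumInChunks(n, callback, chunkSize = 100000) {
  let sum = 0;
  let i = 0;
  function step() {
    const end = Math.min(i + chunkSize, n);
    for (; i < end; i++) sum += i;
    if (i < n) {
      setImmediate(step); // let other queued events run first
    } else {
      callback(null, sum);
    }
  }
  setImmediate(step);
}
```

Note that chunking does not make the work any faster. It runs on the same thread and only spreads the blocking out, which is why truly heavy computations are usually better moved elsewhere.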

Answer 2

The basic principle is that anything that locks up the CPU (long-running for loops, for instance) or uses I/O or the network must be asynchronous. You could also consider moving CPU-intensive logic out of Node.js, perhaps into a Java or Python module that exposes a web service which Node.js can call.

As an aside, take a look at this module (it might not be production-ready). It introduces multithreading to Node.js: https://www.npmjs.com/package/webworker-threads

Answer 3

#3 Yes

#4 Node.js is built for asynchronous programming, so it's good to follow this approach to avoid performance surprises.

Meanwhile, you can use the cluster module of Node.js to improve the performance and throughput of your app.
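A minimal sketch of the cluster approach might look like the following. The function name startClustered, the port, and the plain http server (used here instead of the Express app from the question) are assumptions for illustration only.

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Newer Node.js releases expose isPrimary; older ones only have isMaster.
const isPrimary =
  cluster.isPrimary === undefined ? cluster.isMaster : cluster.isPrimary;

// In the primary process: fork one worker per CPU core (by default).
// In each worker: start an HTTP server; the workers share the port.
function startClustered(port, workerCount = os.cpus().length) {
  if (isPrimary) {
    for (let i = 0; i < workerCount; i++) cluster.fork();
    cluster.on('exit', (worker) => {
      console.log(`worker ${worker.process.pid} exited, starting a replacement`);
      cluster.fork();
    });
  } else {
    http.createServer((req, res) => {
      res.end(`handled by worker ${process.pid}\n`);
    }).listen(port);
  }
}

// Example call (port 3000 is an arbitrary assumption):
// startClustered(3000);
```

Each worker still has its own single-threaded event loop, so a 2-second blocking call stalls that one worker. The benefit is that the other workers can keep serving requests in the meantime.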
