How does Meteor scale out vertically and horizontally?


#1

Node uses a single-threaded engine, so a Node process running on a quad-core machine only utilises one core, right? But Node.js also has a Cluster module that runs multiple worker processes, enabling utilisation of all CPU cores. However, the Cluster module only works for vertical scaling (scaling on the same machine). There is a nice article summarising the situation. So, my questions:

  1. What is the preferred method for scaling Meteor apps vertically? What good and bad experiences have you had, briefly if possible?
  2. More interestingly, how can horizontal scalability be achieved in Meteor?

#2

Vertically:

  • nginx in front acting as a reverse proxy and also as a load balancer
  • multiple meteor app instances according to the number of cores available per machine

Horizontally:

  • load balancer in front of each machine
  • multiple instances to handle the load
  • autoscaling using AWS autoscaling group
  • no specific settings required for meteor to support this setup

#3

There are specific settings required on the reverse proxy though: sticky sessions
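To illustrate what sticky sessions mean in practice, here is a minimal sketch in plain Node.js of the idea behind IP-based stickiness: the same client is always routed to the same instance. The backend addresses and the `pickBackend` helper are hypothetical, and in a real setup the reverse proxy does this for you (nginx, for example, has an `ip_hash` directive for its upstream blocks).

```javascript
const crypto = require('crypto');

// Hypothetical backends: one Meteor instance per port on the same machine.
const backends = ['127.0.0.1:3001', '127.0.0.1:3002', '127.0.0.1:3003', '127.0.0.1:3004'];

// Map a client IP to a backend deterministically, so repeated requests
// from the same client always land on the same instance.
function pickBackend(clientIp, pool = backends) {
  const digest = crypto.createHash('md5').update(clientIp).digest();
  const index = digest.readUInt32BE(0) % pool.length;
  return pool[index];
}
```

Calling `pickBackend('203.0.113.7')` twice returns the same entry both times, which is the property the proxy has to guarantee.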


#4

There’s a very important point that so many people overlook when designing their Node.js (and Meteor) apps to ensure vertical scalability:

It is only the Node.js EVENT LOOP that is limited to 1 CPU core.

Node.js also has a pool of background (libuv) threads that asynchronous tasks are delegated to. The operating system’s scheduler distributes these threads amongst all available CPU cores.

The number of threads in the Node.js thread pool is configured with the environment variable UV_THREADPOOL_SIZE. The default value is 4, but it can be increased up to 128 (1024 since libuv 1.30.0).

If you design your app to leverage the thread pool for its computationally intensive tasks, this frees up the CPU time available to the event loop to handle more requests.
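As a small illustration of the difference, the sketch below runs the same CPU-heavy key derivation in two ways. PBKDF2 is used only as a stand-in for "some expensive task", because its asynchronous variant in the built-in crypto module is documented to run on the libuv thread pool. The function names and parameter choices are my own assumptions, not something from the app described in this post.

```javascript
const crypto = require('crypto');

// Asynchronous variant: the derivation runs on a libuv pool thread,
// so the event loop stays free to serve other requests meanwhile.
function hashPasswordAsync(password, salt, iterations = 200000) {
  return new Promise((resolve, reject) => {
    crypto.pbkdf2(password, salt, iterations, 64, 'sha512', (err, key) => {
      if (err) reject(err);
      else resolve(key.toString('hex'));
    });
  });
}

// Synchronous variant for comparison: same result, but it runs on the
// event loop thread and blocks it until the derivation finishes.
function hashPasswordSync(password, salt, iterations = 200000) {
  return crypto.pbkdf2Sync(password, salt, iterations, 64, 'sha512').toString('hex');
}
```

If many such tasks run concurrently, the pool size can be raised by setting the variable before the process starts, e.g. `UV_THREADPOOL_SIZE=8 node main.js`.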

The typical ways to use the Node.js thread pool for your own app’s functionality are:

  1. Implementing an asynchronous C++ addon for Node.js using Nan or N-API

  2. Running JavaScript code within a Webworker thread

Number 1 generally gives you the best performance and most flexibility, as you can integrate virtually any C/C++ library into your app.
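For option 2, a rough sketch using Node's built-in `worker_threads` module (experimental from Node 10.5, stable in later releases) might look like the following. Note the assumptions: this may not be the same Webworker library the post has in mind, each worker gets its own thread rather than one borrowed from the libuv pool, and the `sumInWorker` helper and its toy workload are purely illustrative.

```javascript
const { Worker } = require('worker_threads');

// Source of the worker, passed as a string and evaluated in the new thread.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  // Deliberately CPU-bound: sum the integers from 1 to workerData.n.
  let total = 0;
  for (let i = 1; i <= workerData.n; i++) total += i;
  parentPort.postMessage(total);
`;

// Run the computation off the main thread and resolve with its result.
function sumInWorker(n) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { n } });
    worker.once('message', resolve);
    worker.once('error', reject);
    worker.once('exit', (code) => {
      if (code !== 0) reject(new Error(`Worker stopped with exit code ${code}`));
    });
  });
}
```

The main thread only awaits the promise, so the event loop keeps handling requests while the loop runs in the worker.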

Case Study:

Our Meteor webapp uses SHA-512 crypt password authentication.

Originally, our password authentication code relied on the SHA-512 hashing function provided by the Node.js crypto module. To this day, all the crypto module’s hashing functions are synchronous and execute in the event loop, hogging its CPU time.

As a result, whenever a user logged in, the event loop would be stalled for 2-3 seconds (most of which was the time taken to perform 200,000 rounds of SHA-512 hashing).

It was clear that it would be best to make the password authentication code asynchronous and execute in the thread pool.

Node Webworker threads were not a practical implementation option because calling Node.js native modules (e.g. crypto) within a Webworker thread is not currently supported.

Although there exists a pure JavaScript implementation of SHA-512 crypt that could have been called from within a Webworker thread, it is incredibly slow (during testing, I remember it taking 15 seconds to perform 200,000 rounds of SHA-512 hashing).

Instead, I created a fork of an existing Node.js C++ addon, shacrypt, and enhanced it to support asynchronous execution within a libuv thread. I also added support for compiling it under Windows, which a couple of our programmers used for Meteor development at the time.

The SHA-512 crypt password authentication now takes about 0.25 seconds and doesn’t stall the event loop.

The only minor inconvenience when using Node C++ addons is that you need to have C++ development tools installed on your system in order to build them. For Linux/OSX users, this is straightforward. For Windows users, you can download Microsoft’s Build Tools for Visual Studio 2017.

I also note that when I built this addon, I utilised the Native Abstractions for Node.js (Nan) API.

People who are writing Node C++ addons today should consider using the more modern replacement, N-API, which is intended to provide better compatibility with future versions of Node.js.

For More Info

These are the reference materials that helped me implement the asynchronous C++ add-on using Nan:



Official Node.js website page on creating C/C++ Addons:

https://nodejs.org/docs/latest-v8.x/api/addons.html
https://nodejs.org/docs/latest-v8.x/api/n-api.htm


#5

Also, Node 10 introduces experimental support for threads. Fortunately, it seems to be a pragmatic (and familiar) way of using them, which doesn’t appear to introduce the nasty complications that Java has, for example.


#6

  • MongoDB database with sharding.
  • redis-oplog

#7

“multiple meteor app instances according to the number of cores available per machine”

You mean multiple instances of the same app running on different ports, right?


#8

Also, a single instance of a Meteor app may be able to achieve vertical scalability through the use of the Node.js cluster module.
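As a rough sketch of the general pattern in plain Node.js (not Meteor-specific, with a bare HTTP server standing in for the app and `startClustered` being a name I made up), the primary process forks one worker process per core and the workers share the listening port:

```javascript
const cluster = require('cluster');
const http = require('http');
const os = require('os');

// Fork one worker per CPU core from the primary process; each worker
// runs its own event loop and shares the listening port.
function startClustered(port, workerCount = os.cpus().length) {
  // isPrimary replaced isMaster in Node 16; fall back for older versions.
  const isPrimary = cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;
  if (isPrimary) {
    for (let i = 0; i < workerCount; i++) cluster.fork();
    // Replace a worker if it dies so capacity stays constant.
    cluster.on('exit', () => cluster.fork());
  } else {
    http.createServer((req, res) => {
      res.end(`handled by process ${process.pid}\n`);
    }).listen(port);
  }
}
```

The entry file would end with a call such as `startClustered(3000)`, since each forked worker re-runs that file and takes the worker branch. How this is wired into a Meteor bundle is not shown here.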

I previously posted about its use here:


#9

Yes, different ports.
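For illustration, one way to start several instances of the same built app on consecutive ports is sketched below. It assumes a bundle produced by `meteor build` that is started with `node main.js` and reads the `PORT` environment variable. The helper names and the bundle path are hypothetical, and in production a process manager would normally do this job.

```javascript
const { spawn } = require('child_process');

// Build the environment for each instance: same settings, consecutive ports.
function instanceEnvs(basePort, count, shared = {}) {
  return Array.from({ length: count }, (_, i) => ({
    ...shared,
    PORT: String(basePort + i),
  }));
}

// Start one `node main.js` per environment from a built Meteor bundle.
// bundleDir is a hypothetical path to the extracted bundle directory.
function launchInstances(bundleDir, envs) {
  return envs.map((env) =>
    spawn('node', ['main.js'], {
      cwd: bundleDir,
      env: { ...process.env, ...env },
      stdio: 'inherit',
    })
  );
}
```

For example, `instanceEnvs(3001, require('os').cpus().length, { ROOT_URL: 'https://example.com', MONGO_URL: '...' })` gives one environment per core, and the reverse proxy then balances across those ports.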