Thanks for sharing, this is helpful.
I am curious, can you share a bit about the nature of the task being performed in those 2%?
App is in Beta (small group of users) and schedules tasks into a person's calendar. It only needs to do the scheduling if something changes (tasks change or the calendar changes), but when it does, there is a lot of processing that occurs for a short period of time.
You can check it out here if you are interested https://yomez.com
Yes, got it. I was curious about the CPU-intensive use case; I will definitely check out the app.
Have you thought about using worker threads for the CPU-intensive task?
I did something similar with an image processing function. Images were uploaded to storage directly from the client (to prevent eating the server RAM) and the processing (compression and manipulation) was done using Google Cloud Functions to protect the server CPU.
I usually try to offload any CPU- or RAM-intensive tasks off the server machine (especially Node.js servers) and restrict the server to serving results.
Thanks for sharing again. I think it's common; I had my share of those.
No, I did not think of using worker threads. I'm assuming you mentioned that for parallel execution? (CPU was already pegged on Galaxy, so it would not have helped.)
I was using a separate instance from the user-facing instance to prevent impact (which I think is what you mentioned), but it was still too slow, and scaling would have been a challenge plus additional cost.
Lambda was a win for this use case, much better on all fronts (much faster, lower cost, automatic scaling/parallel execution).
Yes, you're right.
I personally think this is the ideal case for function as a service, and it's a good thing we have those cloud functions nowadays.
I have a lot to say. Consider this post to be part 1.
Scaling/performance tips that DO NOT require changes to your code
Use a dedicated server instead of a VPS. I refer to my previous post An Enemy of Scalability - Hypervisor (Virtualization) Overhead.
If you are doing lots of file I/O or if you are making heavy use of the database, make sure your dedicated server has an SSD, preferably an NVMe SSD. Hetzner and OVH offer dedicated servers with NVMe SSDs. There are a few other providers as well.
Use Nginx as a reverse-proxy in front of your Meteor app. Nginx will do what it is most efficient at - terminating your HTTPS connection and serving static assets. You want to avoid the node process having its time needlessly wasted.
Our Meteor deployment script also compresses static assets on disk using the brotli compressor. Nginx's brotli module has an option brotli_static that enables Nginx to automatically serve the pre-compressed version of assets from disk by looking for files with the .br extension, instead of having to waste its CPU time compressing the asset on-the-fly.
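As an illustration, a minimal Nginx location block using brotli_static might look like this (paths are hypothetical; this assumes Nginx was built with the ngx_brotli module):

```nginx
# Serve pre-compressed assets from disk if they exist.
location /static/ {
    brotli_static on;   # serve e.g. app.js.br instead of compressing app.js on-the-fly
    gzip_static on;     # fall back to .gz files for clients without brotli support
}
```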
If you can, make all subsystems running on the same server communicate with each other using UNIX sockets instead of TCP sockets. Communication over UNIX sockets incurs significantly less overhead.
In our deployments, Nginx passes requests on to Meteor via Meteor's UNIX socket file (e.g. /var/run/meteor/meteor.sock) and Meteor passes queries on to MySQL via MySQL's UNIX socket (e.g. /var/lib/mysql/mysql.sock).
To configure Meteor to listen on a UNIX socket, specify the UNIX_SOCKET_PATH environment variable.
Other common services like MongoDB, PostgreSQL, Redis and Memcached can be configured to listen on UNIX sockets as well.
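To sketch how the Nginx side of this fits together, here is a hedged example of proxying to a Meteor app listening on a UNIX socket (socket paths are illustrative; adjust them to your deployment):

```nginx
upstream meteor_app {
    # Meteor listening on a UNIX socket (set via UNIX_SOCKET_PATH)
    server unix:/var/run/meteor/meteor.sock;
}

server {
    listen 443 ssl;
    location / {
        proxy_pass http://meteor_app;
        proxy_http_version 1.1;
        # WebSocket upgrade headers, needed for Meteor's DDP connection
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```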
Cloudflare supports proxying WebSockets so it works well with Meteor apps. It is useful for:
Preventing the IP address of your origin server from being exposed,
Providing some protection against DDoS attacks,
Serving static assets from Cloudflare's edge locations closest to the user and reducing your origin server's data usage.
If you have a server outage, you can use Cloudflare's API to quickly divert traffic to a hot standby Meteor app server without any DNS propagation delay.
Cloudflare also offers a load balancer if you have a cluster of Meteor app servers and need to distribute traffic between them.
If you are using Cloudflare with Meteor, you should modify your Meteor startup script to set the environment variable HTTP_FORWARDED_COUNT=1.
If you are using Cloudflare with Nginx in front of Meteor, set HTTP_FORWARDED_COUNT=2 instead, since there are then two proxies in the chain.
Scaling/performance tips that DO require changes to your code
Do everything you can to avoid running CPU intensive code in the Node.js event loop. Instead, such code should run asynchronously in the thread pool.
Up until recently, this often had to be done by writing a Node Addon in C++ using Native Abstractions for Node (NaN) or N-API. This approach is commonly used by number-crunching code that has to run at top speed, e.g. high performance crypto packages like bcrypt and shacrypt.
However, thanks to the introduction of the Node.js worker_threads module combined with the SharedArrayBuffer data type, it has become more practical to write thread-pool code in JavaScript and get acceptable performance.
For more info:
A previous post of mine on the thread "How does Meteor scale out vertically and horizontally?"
Nodejs.org's article Don't Block the Event Loop (or the Worker Pool)
There is also Node's built-in cluster module, which allows you to spawn multiple Node.js (Meteor) worker processes. It has its uses and I have commented on it in the past. Today, I would advise people to first try to solve their problems using the thread pool and only use the cluster module as a last resort.
A few days ago I discovered the NPM package threads.js, which claims to "make web workers & worker threads as simple as a [JavaScript] function call".
I haven't had to use it yet, but I will definitely give it a try with Meteor at some point.
Great tips. I also want to remind all of us (myself included, because even after 26+ years of doing this I still forget): profile your code before deciding what to optimize (i.e. find the root cause of why things are slow).
For me, the root cause of many slow things was the code (fetching or updating one record at a time in the DB). CPU was also a limit, and no amount of threading would help with that, so I went to AWS Lambda to run some of the code for the specific situation I had.
I hadn't realized Cloudflare could work with Meteor. This is great info!
That's been the case for quite a while. If you're looking for a guide, here's an older discussion: Simple guide for optimising your Meteor app with Cloudflare (Cache, TTFB, Firewall, etc)
Two more tips for running Meteor apps behind Cloudflare:
In Cloudflare's Caching settings, ensure you disable Always Online.
If your Meteor app goes down, or if there is a brief interruption to Internet connectivity between Cloudflare and the origin server, having Always Online disabled avoids a prolonged delay before Cloudflare recognises that your Meteor app is online again.
If you have enabled Content Security Policy (CSP) for your Meteor app, ensure that you add a Cloudflare Page Rule to bypass Cloudflare's cache for the URL that accesses the Meteor runtime settings file meteor_runtime_config.js.
This is how the page rule would look on the Cloudflare dashboard once configured:
https://mymeteorapp.com/meteor_runtime_config.js
Cache Level: Bypass
If your Meteor app is located in a subfolder, then your page rule would look something like this:
https://mymeteorapp.com/*/meteor_runtime_config.js
Cache Level: Bypass
If you don't add this bypass rule, Cloudflare will automatically add a 14,400-second (4-hour) HTTP Expires header for the meteor_runtime_config.js file.
This will cause a problem when the time comes to update Meteor to a more recent version or make any other significant changes that affect Meteor's runtime configuration settings. The user's web browser will keep retrieving a stale cached version of meteor_runtime_config.js and go into a crazy reload loop.
This is avoided by adding the above Cloudflare Page Rule, which ensures that meteor_runtime_config.js is always served fresh.
So many helpful tips and insights here!
I'd like to echo the importance of eliminating unnecessary pub/sub to avoid costly overheads. Shameless plug: I recently introduced pub-sub-lite, a package addressing this very issue.
Thank you so much @npvn
It ain't shameless if it helps with performance/scaling; it should be there. But could you please briefly explain how it reduces the costly overhead of pub/sub? I think that would be great and would drive more adoption, myself included.
Thanks @alawi. The package's main goal is to make it very easy to convert an existing pub/sub (that you've identified as unnecessary) into a Method, by simply replacing Meteor.publish with Meteor.publishLite and Meteor.subscribe with Meteor.subscribeLite. Under the hood your data will be sent from server to client via a Method invocation. It's very similar to a traditional Method invocation, with some added benefits:
Meteor.subscribeLite provides a handle that reactively returns true once data has arrived (similar to the behaviour of a real subscription handle), so your existing client-side rendering logic won't need any modification. Besides that, the package also provides Meteor.methodsEnhanced and Meteor.callEnhanced that work in the same way as Meteor.methods and Meteor.call, with some extra features:
In essence, the package helps you quickly "fix" existing unnecessary pubs/subs (by converting them to Methods) and provides an enhanced version of Methods that is more convenient to use.
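Based on the description above, the conversion would look something like the following sketch (Meteor framework code, not runnable standalone; the collection and publication names are made up for illustration):

```js
// Before: a traditional publication
Meteor.publish('userTasks', function () {
  return Tasks.find({ userId: this.userId });
});
// client
const handle = Meteor.subscribe('userTasks');

// After: same shape, but data travels via a one-off Method call
Meteor.publishLite('userTasks', function () {
  return Tasks.find({ userId: this.userId });
});
// client: handle.ready() still becomes true once the data has arrived,
// so existing rendering logic keeps working
const handleLite = Meteor.subscribeLite('userTasks');
```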
That sounds like a really elegant solution.
Could you please tell me which of the AWS Lambda triggering methods you have chosen in your specific case? Did you use the Amazon API Gateway REST API?
Yes, I used the API Gateway, so there is a simple HTTP interface to trigger the code I wanted. I used common code from my Meteor project (without Meteor-specific libraries) and added a little bit on top so it worked in Lambda. I used the "SAM" tooling from AWS to make it really easy to test locally and deploy to AWS.
We've now scaled Meteor to 25,000 monthly active users, with daily events peaking at 25,000 a day, and with over 400,000 documents per collection.
We ended up replacing all publish/subscribe with Methods. We were running into memory-related issues with subscriptions on large collections.
We expect that by the end of the year we will have over 25,000 daily users. We are planning on moving memory-intensive APIs to AWS Lambda for external integrations.
We have one large monolith (admin app) which runs the Meteor installation, as well as three standalone React apps that connect via custom DDP.
How many documents in total? We have two collections with over 700,000 docs in them; in total we're at 2,100,000, but with a low double-digit number of users.
We haven't seen any memory problems as our collections grew; it's all pretty stable, and our backend app can serve 6-7 users running complicated queries on the smallest AWS configuration. Both CPU and memory are balanced then.
Wow, impressive! What is your secret?
Secret for what? Having that many docs in MongoDB with so few users? As I explained, we're using DNA data, and as you can imagine, the number of connections between people is basically infinite.
Or what secret do you want to know?