We are using Galaxy together with MongoDB Cloud (Atlas) as our hosting solution for Meteor. Deployment and so on is pretty easy, but unfortunately we experience lag and delay on the server that is much bigger than the delay on our localhost setup. We also tested the different Meteor method calls and publications with meteor-down, and the data shows response times on the Galaxy server up to 10 times slower than on localhost. We then tried connecting localhost to the MongoDB cloud cluster, and there we got the same slow speed as on Galaxy.
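(For anyone who wants to reproduce the comparison: a minimal meteor-down script looks roughly like the sketch below, assuming the meteorhacks meteor-down API; the method and publication names are placeholders, not our real ones.)

```js
// load-test.js — run with: node load-test.js
// Sketch only; 'tasks.all' and 'tasks.insert' are placeholder names.
const meteorDown = require('meteor-down');

meteorDown.init(function (Meteor) {
  // Each simulated client subscribes and times one method round trip.
  Meteor.subscribe('tasks.all');

  const start = Date.now();
  Meteor.call('tasks.insert', { text: 'load test' }, function (error) {
    console.log('method round trip (ms):', Date.now() - start, error || '');
    Meteor.kill(); // close this simulated client's DDP connection
  });
});

meteorDown.run({
  url: 'http://localhost:3000', // switch to the Galaxy URL to compare
  concurrency: 100              // simultaneous simulated users
});
```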
We have already updated MongoDB on MongoDB Cloud to version 4.0.x (the latest stable release). We have also added indexes wherever possible. We are using an Atlas M10 instance with AWS as the cloud provider.
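(For illustration, the indexes are created on server startup along these lines; the collection and field names below are only placeholders.)

```js
// server/indexes.js — illustrative only; collection and field names are placeholders
import { Meteor } from 'meteor/meteor';
import { Mongo } from 'meteor/mongo';

export const Tasks = new Mongo.Collection('tasks');

Meteor.startup(async () => {
  // rawCollection() exposes the underlying Node MongoDB driver collection
  await Tasks.rawCollection().createIndex({ ownerId: 1, createdAt: -1 });
});
```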
So our concern is with the MongoDB cloud cluster. It is simply too slow and we don't know how to make it faster. We can't be the only ones using this setup, so why do we experience such lag in the MongoDB connection? How can we make it work faster? Do you have any suggestions?
You should probably make sure that both the Mongo and Meteor servers reside in the same AWS region. When the servers are physically close to each other, the lag is usually smaller.
If that doesn't work for you, we (the Astraload perf team) can weigh in and solve your performance issues one way or another.
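If you want to quantify the round trip from the app server to the cluster, a rough sketch like the following will print a handful of ping timings at startup (any collection's rawDatabase() handle works; the collection name here is a placeholder):

```js
// server/db-latency.js — rough round-trip check; 'tasks' is a placeholder collection
import { Meteor } from 'meteor/meteor';
import { Mongo } from 'meteor/mongo';

// In a real app, reuse an existing collection instead of declaring a new one.
const Tasks = new Mongo.Collection('tasks');

Meteor.startup(async () => {
  const db = Tasks.rawDatabase(); // underlying Node MongoDB driver Db instance
  const samples = [];
  for (let i = 0; i < 10; i++) {
    const start = Date.now();
    await db.command({ ping: 1 }); // cheap no-op round trip to the cluster
    samples.push(Date.now() - start);
  }
  console.log('Mongo ping round trips (ms):', samples);
});
```

Single-digit milliseconds usually means the servers are co-located; tens of milliseconds or more points at cross-region traffic.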
Besides what @eluck has said, the M10 is one of the smallest DB instances. So if you have large queries or heavier load, you might want to consider upgrading to a larger instance with more resources. Also check that you have properly configured the oplog. Another option to increase performance would be adding redis-oplog, but that is a scaling improvement (guessing that since you are on an M10, scale is not the issue here).
To me it sounds like the physical distance between the DB and Galaxy is the issue. I’d double check to make sure both reside in the same region/datacenter.
I thought redis-oplog was more of a scaling tool that reduces the impact of high load on app servers by replacing oplog tailing with a more fine-grained system based on Redis. I'm not sure it can help you when, at the end of the day, you still have to fetch the data from a DB sitting on the opposite side of the planet (…in the worst case scenario).
@drew Thanks for your advice. We set up an EC2 instance and installed Redis as described. However, it doesn't seem to improve our average pub/sub or method response times in the APM stats. I am running a meteor-down test with 100 simultaneous users to load our site. Any thoughts?
I had to do a lot of research on the forums to set up redis-oplog correctly.
In .meteor/packages, ensure cultofcoders:redis-oplog and disable-oplog are listed above all the other third-party Meteor packages (order matters), as in the excerpt below.
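(A sketch of what the top of the file can look like; the packages after the first two are just examples of third-party packages an app might have.)

```
# .meteor/packages (excerpt) — order matters
cultofcoders:redis-oplog
disable-oplog
aldeed:collection2
ostrio:files
```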
Have you also checked that a connection is being made in the console when the server starts? With default settings it should print a message with "Redis connected" or similar.
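For reference, the Redis connection is configured through Meteor settings; a minimal settings.json along these lines (host and port are placeholders for your EC2 Redis instance) should be enough for the package to pick it up:

```json
{
  "redisOplog": {
    "redis": {
      "host": "10.0.0.5",
      "port": 6379
    }
  }
}
```

Then start the app with `meteor run --settings settings.json` (or pass the same settings when deploying to Galaxy).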
@drew see this discussion here https://github.com/cult-of-coders/redis-oplog/issues/339
From what I understand, redis-oplog is only meant to improve scalability, i.e. when using multiple Meteor instances. It is not expected to improve DB response times on a single instance.
Which seems to contradict what you said earlier in this discussion:
> Implemented redis-oplog (made the most impact)
I am wondering why you saw so much improvement on your system. Are you using multiple instances?
We use a sharded cluster (across 3 geographic regions), which does not support oplog tailing easily. We had huge latency as we waited for each action to propagate across each geographic region (i.e. London to Sydney to the US). By replacing oplog tailing with redis-oplog as a centralised source, we saw huge performance increases. This allowed us to support roughly 600 concurrent connections without degrading performance (previously, at that load, methods & pub/subs were unusably slow).
You may be able to test modifying the subscription polling times listed here: https://docs.meteor.com/api/collections.html. Before we implemented redis-oplog, tuning the polling times (i.e. how long before a subscription 'refreshes' once there is new data) gave a small performance improvement on our subscriptions. But that was down to the same fundamental problem of oplog tailing across a sharded cluster.
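For illustration, those options go on the cursor returned by a publication; the collection, publication name, and values below are only placeholders:

```js
// server/publications.js — illustrative polling tuning; names and values are placeholders
import { Meteor } from 'meteor/meteor';
import { Mongo } from 'meteor/mongo';

const Tasks = new Mongo.Collection('tasks');

Meteor.publish('tasks.mine', function () {
  return Tasks.find(
    { ownerId: this.userId },
    {
      disableOplog: true,        // force the poll-and-diff observe driver for this cursor
      pollingIntervalMs: 20000,  // re-poll every 20s (default is 10s when polling)
      pollingThrottleMs: 1000    // wait at least 1s between re-polls (default is 50ms)
    }
  );
});
```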