Mongo Scaling Issues

Hi,

Tonight we’re having peak traffic on our site. We’re running 40 DO droplets, but the issue seems to be that the Mongo database is taking far too long to respond, both according to Kadira and because it takes a long time to do things even on a server that has no users connected to it.

What are common strategies to deal with such issues?


Are you hosting it yourself on those droplets, or via a DBaaS?

We’re hosting on Compose.io

Do you have indexes set up?
Inspect the ops in Kadira: does it say it’s using oplog tailing?
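
If you want to double-check outside Kadira, here is a rough diagnostic. It pokes at a Meteor internal (_oplogHandle), so treat it as a hack rather than an official API:

```js
// Run this in server code. Meteor keeps an internal oplog handle on the default
// Mongo driver when oplog tailing is active, i.e. MONGO_OPLOG_URL was set and the
// tailing connection succeeded. _oplogHandle is undocumented and may change
// between Meteor versions.
Meteor.startup(function () {
  var driver = MongoInternals.defaultRemoteCollectionDriver();
  if (driver.mongo._oplogHandle) {
    console.log('oplog tailing is active');
  } else {
    console.log('falling back to poll-and-diff (no oplog tailing)');
  }
});
```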


Yeah, indexes are set up. Is there an easy way to find important indexes in case we’re missing any?


Not sure about Mongo specifically, but you can often run a query against the db itself to spot ‘missed indexes’. Also, did you follow the instructions in this post: http://info.meteor.com/blog/tuning-meteor-mongo-livedata-for-scalability
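
For finding slow queries and likely missing indexes, the MongoDB profiler and explain() are the usual starting points. A minimal sketch for the mongo shell follows; whether you can change the profiling level on a hosted Compose deployment depends on the permissions they give you, and the collection and query below are just placeholders:

```js
// 1) Log any operation slower than 100 ms to the system.profile collection
db.setProfilingLevel(1, 100);

// 2) After some traffic, list the slowest recent operations
db.system.profile.find().sort({ millis: -1 }).limit(10).pretty();

// 3) Explain a suspect query: a COLLSCAN stage usually means a missing index
db.posts.find({ userId: 'abc123' }).explain('executionStats');
```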

Finally, MongoDB is really not a reactive db. Oplog tailing is a design mistake because it doesn’t scale: if you are sharing your DB with all your DO instances, each additional instance only offers an incremental improvement, since every instance has to watch the activity of ALL users just to detect the changes relevant to its own.

We are migrating Meteor to RethinkDB which has built in reactivity.


799 active connections is a problem with MongoDB because it creates a thread per socket, and those connections are effectively all active, since the drivers send ping and isMaster commands so regularly. The amount of context switching is just insane.

Also, does Compose actually give its users the machine specs, i.e. how many cores are you actually running on? This is very important, because if your instance is pinned to just one core (worst case, a shared vCPU) and you have 799 active threads, this will not go well for you.
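
If you want to check those numbers yourself, the mongo shell can report both; whether Compose lets you run these commands depends on the roles they grant:

```js
// Current, available and total connections as mongod sees them
db.serverStatus().connections;

// How many cores the mongod host actually has (usually needs the hostManager role)
db.hostInfo().system.numCores;
```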

DM me if you still continue to face issues.


I have no idea why we have 799 active connections with only 40 instances running. Sent you a DM. Thanks

This is because the Node.js driver typically has a connection pool limit of 100 connections per process.

I am not sure if Meteor allows you to set this, but you can set the driver’s pool size down to as few as 5.


Any idea how I’d go about doing that?

So taking down instances may help too right?

Take down about 20 instances and watch how many connections are dropped. I don’t develop with Meteor myself, so I’m not sure whether the framework provides a way for you to pass the Mongo options down to the driver, but it should, otherwise it wouldn’t make much sense.
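
In case Meteor does pass the connection string straight through to the driver, here is a sketch of what capping the pool could look like; the exact option name depends on the driver version bundled with your Meteor release, so treat this as a starting point rather than a recipe:

```js
// Option 1: cap the pool in the connection string itself, i.e. in MONGO_URL:
//   mongodb://user:pass@host:27017/dbname?maxPoolSize=5

// Option 2: newer Meteor releases expose Mongo.setConnectionOptions; it has to run
// in server code before any collection is created
Mongo.setConnectionOptions({ maxPoolSize: 5 });
```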

I may run into problems with not having enough instances running if I take too many down, but I’ll try a few.

How many RPS are you seeing across your cluster at the moment?

I don’t understand the question

Requests per second at the load balancer. I assume you have some sort of proxy routing requests to the 40 instances, and I’m curious what the RPS is at the moment. Typically an idle Meteor instance opens about 14 connections to MongoDB; in your case that is 560 connections already, and that is with zero traffic, so the 799 adds up. Unfortunately it is too much.


Using Nginx as the load balancer, but I’m not sure how to find the RPS. We had about 1000 connected users at the time, and now it’s far fewer, but the issues have persisted.

There is one essential worker instance that is doing a lot of writes to the database.

Things have settled a little right now, but it’s been two hours of hundreds of complaints and things are still slow.

What sort of number is acceptable?

Thanks for the help. Pretty certain the issue was as you said and I had far too many connections to the database. I’ve reduced that number now and also increased the RAM for the deployment.

I’ll need to look into connection pooling too.


Try mongodb://<username>:<password>@<ip>:27017/<dbname>?maxPoolSize=200

The default is 100. You can increase it to whatever number you want.
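
For reference, here is roughly where that setting ends up if you talk to the plain Node MongoDB driver directly (this assumes a recent 4.x+ driver; the URL and credentials are placeholders, and Meteor normally does this for you via MONGO_URL):

```js
const { MongoClient } = require('mongodb');

// maxPoolSize caps how many sockets this process will open to the server;
// recent drivers default to 100
const client = new MongoClient('mongodb://username:password@host:27017/dbname', {
  maxPoolSize: 200,
});

async function main() {
  await client.connect();
  await client.db().command({ ping: 1 }); // forces the pool to actually open a connection
  await client.close();
}

main();
```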
