MongoDB running slow with Meteor app

Hi, can you please explain why it’s better to use an external server for the database?

The mLab free tier seems to be intended for sandboxing. Would it be suitable for production data that I care about?

https://mlab.com/plans/pricing/

You need to get onto NodeChef to use Kadira. It will help you solve your problems. $10 a month.

I’m hosted with mLab. They’re great.

Sure, you can host it yourself, but good luck getting that sorted out.

There are a number of performance reasons why you wouldn’t want to share a server between your app code and your db – you ideally want each of those to be able to scale independently. Also, per https://12factor.net/backing-services, you want to be able to quickly/easily swap out pieces of your app. Co-locating your db and code means you can’t switch out that server without addressing both needs.

Didn’t know you were in production, so yeah, a sandbox might not be a good fit, then. The $15/mo shared tier has been good for me on smaller apps, and the price point is easy to justify once you realize that anything more than 15 min/mo spent on mongodb devops is covered by that cost.

Just my two cents, though =)

Thanks, that’s interesting to know and it would be great to have more visibility.

I’m currently paying $6 per month for a Digital Ocean droplet on which I have several prototype apps and my one ‘live’ one, which I think is never going to be huge as it’s quite niche. It’s so nearly working that I hate to give up; other folks must have got the same setup working?

@Ingaborg the fact you’re still asking these questions indicates you don’t have the skills to run your own professional setup.

Use Galaxy + mLab and focus on developing your app.

Anything else is a huge waste of time lol. Trust me…

I’ll answer your question for you.

First, to get started with the oplog, take a look at my notes here:

It’s hard to say exactly what the performance limitations of a Digital Ocean droplet are, but the hosting must be at least part of the problem, since the app runs fine on your laptop. Cheap instances are extremely slow, roughly as slow as a $35 Raspberry Pi.

Judging from your logs, your documents are very large, which is what’s slowing things down so badly. I/O on instances like these is very slow; it could conceivably take 100 ms just to read that 15 MB of document data.

Your index doesn’t help you here. It doesn’t matter that you indexed “private”; you’re scanning a lot of documents anyway, because it seems like you actually need all of them.
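If you want to double-check that, the mongo shell can tell you how many documents a query examines versus how many it returns (a sketch; the collection and field names below are placeholders for your own):

```js
// Run in the mongo shell against your actual query (names are placeholders).
db.documents.find({ private: false }).explain("executionStats")
// Compare executionStats.totalDocsExamined with executionStats.nReturned:
// if both numbers are large, the index isn't saving you work,
// because the query really does touch most of the collection.
```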

If you switch to a dedicated database provider, it will probably get even slower; you’re transferring a lot of data over the network.

The oplog will help, but it really depends on your application: are you making small changes to large documents? If so, it will help a lot. Large changes to large documents will still be pretty slow, though not this slow.

It looks like you’re probably storing image or file data. Consider using S3 instead; it’s better suited to that kind of data.

Thank you. Image data is already on S3; the body of each document is in MongoDB, but I don’t think my documents are very large, maybe 40 KB for the main text field. And on the listings page I have been careful to request only the fields that are required, which are small (e.g. document id, id of creator, name).

So I wouldn’t think I’m loading an excessive amount of data?

I originally deployed using mup-legacy a couple of years ago and that was fine. The problem started when I upgraded from Meteor 1.2 to Meteor 1.6 and couldn’t get mupx to work. So I switched to Phusion Passenger, and now I have these problems.

That’s interesting. I suspected the document sizes were unusual because your logs show 15 MB responses (the reslength part) for 100-some documents, which is consistent with what you’re saying, but the responses are still “large” in the sense that there are many documents in them.

I’m pretty confident that if you, for example, changed your code to retrieve just the _id fields, you’d see everything is zippity quick. The other things you’re pointing out aren’t as big a red flag as the logs, so I’d look there first. My interpretation may be wrong, though.
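A minimal version of that test might look like this (assuming a Documents collection; the names are purely illustrative):

```js
// Same selector as the slow query, but project only _id and time it.
const ids = Documents.find({}, { fields: { _id: 1 } }).fetch();
console.log("fetched", ids.length, "ids");
```

If that comes back instantly, the bottleneck is the volume of data per document, not the query itself.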

Stop wasting your time. Get Kadira on Galaxy or NodeChef and fix your problem.

I wasted 40 hours goofing around; ten minutes after I bit the bullet with NodeChef, the problem was solved, and the cause was very obvious.

Until you do this, no one can help you.

Thanks, I’m looking into a dedicated Meteor hosting service. Scalingo also looks reasonable; does anybody know a reason to prefer either NodeChef or Scalingo?

That part of the log is saying that you transferred 15 MB from your database to get 101 documents; that’s roughly 150 KB per document. Without the oplog enabled, you’ll see that transfer every five seconds, which is exactly what appears in your logs. You’ll probably see significantly reduced latency if you enable the oplog.

I’m not sure this is wasting time per se; your logs are telling you exactly what the issue is. You should probably enable the oplog and use limit in your queries. Don’t page on the client. I don’t think just moving to an external service will help you here, beyond the fact that it creates an oplog user and a replica set for you (which amounts to exactly two commands).
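For reference, those two commands are roughly the following, run in the mongo shell on the database host (the user name, password, and host are placeholders, and mongod has to be started with --replSet for the first step to work):

```js
// 1. Turn the standalone mongod into a single-member replica set so it keeps an oplog.
rs.initiate({ _id: "rs0", members: [{ _id: 0, host: "127.0.0.1:27017" }] })

// 2. Create a user that is allowed to read the oplog in the "local" database.
db.getSiblingDB("admin").createUser({
  user: "oplogger",           // placeholder
  pwd: "CHANGE_ME",           // placeholder
  roles: [{ role: "read", db: "local" }]
})
```

After that you point Meteor at the oplog by starting the app with MONGO_OPLOG_URL set to something like mongodb://oplogger:CHANGE_ME@127.0.0.1:27017/local?authSource=admin, alongside your usual MONGO_URL.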

Thanks! Yes, maybe pagination is part of the problem; I am using alethes:pages. I expected the package would be smart enough to page on the server, not the client, but maybe it is not. I will check that out.

I now have several things to look at and try out, thank you everybody!

Can anybody explain to me why there are so many connections per user and whether that is an issue? Thank you!

Regarding alethes:pages, looking at their code…

The package uses skip and limit to do paging. The problem with skip is that MongoDB still has to scan all the documents before the skip point. A skip-and-limit paging implementation is inherently inefficient, and it’s probably part of the reason your database queries return a hundred documents (I assume all the products) even though a user may be looking at just 12.

As an aside, the correct way to do paging is to precompute a sort-order number for each document (i.e., product), add an index on that field, and use $gte and $lt in your query to select the page (see the sketch below). It’s generally smart to preload the next page, and this approach gives the best possible UX for paging. For the most robust setup, you generally have to cache pages (i.e., save a mongo document representing a whole page of search results) for each query (a combination of search terms, categories, and page number), at the cost of a brief delay for users who make uncommon queries. This is more or less what Amazon does, and it isn’t nearly as complicated as it sounds, especially in an architecture like Meteor.
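Here is a sketch of what that can look like in Meteor, assuming each product document carries a precomputed numeric rank field and a fixed page size (all names here are illustrative, not from alethes:pages):

```js
import { Meteor } from 'meteor/meteor';
import { check } from 'meteor/check';
import { Products } from '/imports/api/products'; // assumed collection

const PAGE_SIZE = 12;

// One page of products, selected by rank range instead of skip.
Meteor.publish('products.page', function (fromRank) {
  check(fromRank, Number);
  return Products.find(
    { rank: { $gte: fromRank, $lt: fromRank + PAGE_SIZE } },
    { fields: { name: 1, price: 1, rank: 1 }, sort: { rank: 1 } }
  );
});

// Index the rank field so the range query stays cheap.
Meteor.startup(() => {
  Products.rawCollection().createIndex({ rank: 1 });
});
```

The client subscribes with fromRank = page * PAGE_SIZE, and can subscribe to the next page in the background to preload it.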

The number of connections per user is a bit more surprising. If you mean subscriptions, alethes:pages does indeed create at most 20 subscriptions (one for each page, before it starts recycling them):

If you mean connections according to mongo, that I’m not sure about.

Thank you so much for all your help and advice. Now that you have helped me to understand the logs, I’ve checked my code more carefully and found that in fact I was not limiting certain queries. I thought I was, but I was mistakenly setting limits only on the client. Fixing that has made the database much happier! I am still looking at other issues and the possibility of oplog tailing, but setting limits alone seems to have made the difference between “OK” and “not OK”. Thanks again!
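For anyone reading later, the mistake looked roughly like this (collection and field names invented for illustration; assumes a Documents collection defined elsewhere):

```js
import { Meteor } from 'meteor/meteor';

// Before: no limit in the publication, so every document was published.
Meteor.publish('documents.all', function () {
  return Documents.find();                 // sends everything to the client
});
// The limit only lived in client code, which affects rendering, not transfer:
Documents.find({}, { limit: 12 });

// After: the limit (and a field whitelist) belongs in the publication itself.
Meteor.publish('documents.recent', function () {
  return Documents.find({}, {
    limit: 12,
    fields: { name: 1, createdBy: 1, createdAt: 1 },
  });
});
```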

Awesome. That would make a lot of sense.

I have a question. If I were to limit all my queries on the server to (say) 12 results, how would I be able to retrieve an arbitrary document by ID? It seems like the limit would prevent me from returning any document that was further down the list?

You can do this in two ways (both sketched below):

  1. Use a Meteor method to retrieve single documents by ID. This may be the easier solution, as long as you can call the method at the “right time”.
  2. Leverage Meteor’s pub/sub. Publication #1 limits the result set to 14. Publication #2 publishes a single document by ID. On the client, subscribe to both publications. Meteor will ensure you get the merged result.
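A minimal sketch of both options (publication, method, and collection names are assumptions, not established in this thread):

```js
import { Meteor } from 'meteor/meteor';
import { check } from 'meteor/check';
import { Documents } from '/imports/api/documents'; // assumed collection

// Option 1: a method that returns a single document by _id.
Meteor.methods({
  'documents.getById'(docId) {
    check(docId, String);
    return Documents.findOne(docId);
  },
});

// Option 2: a dedicated publication for one document; on the client,
// Meteor merges its result with the limited list publication.
Meteor.publish('documents.single', function (docId) {
  check(docId, String);
  return Documents.find(docId);
});
```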

Thank you, those sound like just what I need. I’ll have a go.

Here are some notes on the changes I’ve made recently, in case it helps anybody else. All these things are pretty basic but I hadn’t thought about them enough in my focus on the front end.

Pagination: Alethes Pages
It seems this package publishes ALL fields of paginated collections. Not only does this hit the database harder, it also means that if you show a paginated list of users, all of their sensitive information, such as email addresses, is published to the client. I have removed the Users page while I try to find out whether there is any way to limit the published fields, or a better pagination package to use.

I’ve also been much more careful about where the pagination is used. My Home page shows the first few documents in each category (New Documents, My Documents, All Documents, Recently Viewed) and I’ve created a separate publish function on the server for each, taking care to return only as many documents as required and only the fields that will be displayed.

Pagination is now used only for the subsidiary pages that show a paginated view of New Documents (and so on). These pages are still expensive, but cleaning up the Home page (the most commonly viewed page) should help.

Search
The search index has to access all documents and users, but I’ve limited the published fields to those that are searchable.

Generally I’ve trawled through all my publish and subscribe functions, trying to make sure they are as parsimonious as possible.

Oplog tailing would probably still be a good idea, but I think it was well worth cleaning up my database use first. I’ll run the app for a bit and see how it goes before making more changes.

Thanks again for all the help!
