Database design to prevent performance degradation


#1

Let’s say I have an app that allows users to create ingredients.

I want each user to have 200 ingredients already in the system, ready for them to use and edit upon joining the app. Because I want to allow them the possibility of editing their ingredients, I believe I need to add the 200 ingredients to the database with the createdBy field set to their user id upon account creation. This way a user will only see and edit their own set of ingredients and not someone else’s.
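To make the model concrete, here is a minimal sketch of the seeding step. The helper name (`seedIngredients`) and the document shape are illustrative assumptions, not from any particular library; in Meteor you would call something like this from an account-creation hook.

```javascript
// Hypothetical seeding helper: copy a set of template ingredient
// documents so each new user gets their own editable copies.
// In Meteor this might be called from Accounts.onCreateUser.
function seedIngredients(templateIngredients, userId) {
  return templateIngredients.map((tmpl) => ({
    ...tmpl,
    _id: undefined,      // let the database assign a fresh _id
    createdBy: userId,   // scope this copy to the new user
  }));
}

// Example: two template ingredients copied for user "u1"
const templates = [
  { _id: "t1", name: "Salt" },
  { _id: "t2", name: "Pepper" },
];
const copies = seedIngredients(templates, "u1");
```

Each user then queries only documents where `createdBy` matches their own id, so edits never touch anyone else's copies.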

My question is in regards to performance…

If I have 10 users join, the ingredients collection now has 2,000 documents. If in 6 months’ time I have 5,000 users, the ingredients collection now has 1,000,000 documents (plus the ingredients users create on their own). 5,000 users would be great, but what if there are even more?

Correct me if I am wrong, but I am pretty sure this would degrade performance as time goes by, wouldn’t it? If the query has to search through millions of documents, a user would most likely be waiting quite some time before their ingredients are displayed on the page, right?

Is there any other way I can achieve what I want without this potential problem?

Thanks


#2

MongoDB is more than capable of performing well in this scenario - if you ensure you add indexes to the appropriate fields and construct your queries sensibly.


#3

Wow really? That’s impressive.

I’ve come across indexes (within the last 6 months of learning about Meteor) and will look into them in more detail, but can you give me an example of a query that would be more sensible than another? What do you mean by this?

Thanks


#4

By a “sensible” query, I mean one which you know will perform well given your database/collection design - which includes how you normalise/denormalise relationships and use indexes (basic MongoDB design). It also includes decisions which are more Meteor-specific - like whether you should deeply nest objects or pull them up to the top of the document structure. You will also be making UX/coding decisions which may influence how you get the data - with how much reactivity you need (or not) being a fundamental one.

Without knowing more about what you want to do I can only answer in general terms, but you need to examine your queries (or potential queries) and look for optimisable patterns.

For example, if you always retrieve documents belonging to the user, you might add an index to the user id. If there are 100,000 users, each with 200 ingredients, the index ensures that only 200 documents are scanned by the database engine, even though there are 20,000,000 documents in the collection.
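In MongoDB itself this would be an index created with something like `db.ingredients.createIndex({ createdBy: 1 })`. The runnable sketch below only simulates the effect, using a plain `Map` as a stand-in for the index, to show why the scan count drops from the whole collection to just one user's documents (the collection size here is scaled down so it runs instantly).

```javascript
// Build a small fake collection: 100 users × 200 ingredients = 20,000 docs.
const docs = [];
for (let u = 0; u < 100; u++) {
  for (let i = 0; i < 200; i++) {
    docs.push({ createdBy: `user${u}`, name: `ingredient${i}` });
  }
}

// Without an index: a full collection scan examines every document.
let scanned = 0;
const viaScan = docs.filter((d) => {
  scanned++; // count how many documents the "engine" touches
  return d.createdBy === "user42";
});

// With an "index" (a Map from createdBy to that user's documents),
// the lookup touches only the 200 matching documents.
const byUser = new Map();
for (const d of docs) {
  if (!byUser.has(d.createdBy)) byUser.set(d.createdBy, []);
  byUser.get(d.createdBy).push(d);
}
const viaIndex = byUser.get("user42");
```

The full scan touches all 20,000 documents to return 200; the indexed lookup goes straight to those 200, which is the same proportional saving you get at 20,000,000 documents.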

Similarly, if you want to retrieve those ingredients in alphabetical order, you might use a compound index on the user id and the name. Mongo is then able to retrieve the documents in the correct order without an extra sorting pass.
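Conceptually, a compound index like `{ createdBy: 1, name: 1 }` keeps its entries pre-sorted by user and then by name, so a user's slice of it is already alphabetical. The sketch below stands in a sorted array for that index; it is an illustration of the idea, not MongoDB's actual implementation.

```javascript
// Documents in insertion order (not alphabetical).
const docs = [
  { createdBy: "u1", name: "Pepper" },
  { createdBy: "u2", name: "Basil" },
  { createdBy: "u1", name: "Cumin" },
  { createdBy: "u1", name: "Salt" },
];

// Build the "compound index" once: ordered by (createdBy, name).
const index = [...docs].sort(
  (a, b) =>
    a.createdBy.localeCompare(b.createdBy) || a.name.localeCompare(b.name)
);

// Reading one user's range of the index yields names already in order,
// with no separate sorting pass over the results.
const u1Names = index
  .filter((d) => d.createdBy === "u1")
  .map((d) => d.name);
// u1Names is ["Cumin", "Pepper", "Salt"]
```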


#5

You might find this article interesting (because we all aspire to have our business processing 2 billion documents a day!)

http://blog.mongodb.org/post/79557091037/processing-2-billion-documents-a-day-and-30tb-a


#6

Thanks! You have been very helpful.

I just found this article that explains the normalization/denormalization side of things quite simply. I basically knew about these concepts already, but I didn’t know that this is what normalization/denormalization referred to.

I still need to wrap my head around indexing, but it is great to know that Mongo can handle billions of documents :slight_smile: