Database design to prevent performance degradation


#1

Let’s say I have an app that allows users to create ingredients.

I want each user to have 200 ingredients already in the system, ready for them to use and edit upon joining the app. Because I want to allow them the possibility of editing their ingredients, I believe I need to add the 200 ingredients to the database with the createdBy field set to their user id upon account creation. This way a user will only see and edit their own set of ingredients and not someone else’s.
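To make the model concrete, here is a minimal sketch of the seeding step. The helper name (`seedIngredients`) and the document shape are illustrative assumptions, not from any particular library; in Meteor you would call something like this from an account-creation hook.

```javascript
// Hypothetical seeding helper: copy a set of template ingredient
// documents so each new user gets their own editable copies.
// In Meteor this might be called from Accounts.onCreateUser.
function seedIngredients(templateIngredients, userId) {
  return templateIngredients.map((tmpl) => ({
    ...tmpl,
    _id: undefined,      // let the database assign a fresh _id
    createdBy: userId,   // scope this copy to the new user
  }));
}

// Example: two template ingredients copied for user "u1"
const templates = [
  { _id: "t1", name: "Salt" },
  { _id: "t2", name: "Pepper" },
];
const copies = seedIngredients(templates, "u1");
```

Each user then queries only documents where `createdBy` matches their own id, so edits never touch anyone else's copies.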

My question is in regards to performance…

If I have 10 users join, the ingredients collection now has 2,000 documents. If in 6 months’ time I have 5,000 users, the ingredients collection now has 1,000,000 documents (plus the ingredients users create on their own). 5,000 users would be great, but what if there are even more?

Correct me if I am wrong, but I am pretty sure this would degrade performance as time goes by, wouldn’t it? If the query has to search through millions of documents, a user would most likely be waiting quite some time before their ingredients are displayed on the page, right?

Is there any other way I can achieve what I want without this potential problem?

Thanks


#2

MongoDB is more than capable of performing well in this scenario - if you ensure you add indexes to the appropriate fields and construct your queries sensibly.


#3

Wow really? That’s impressive.

I’ve come across indexes (within the last 6 months of learning about Meteor) and will look into them in more detail, but can you give me an example of a query that would be more sensible than another? What do you mean by this?

Thanks


#4

By a “sensible” query, I mean one which you know will perform well given your database/collection design - which includes how you normalise/denormalise relationships and use indexes (basic MongoDB design). It also includes decisions which are more Meteor-specific - like whether you should deeply nest objects or pull them up to the top of the document structure. You will also be making UX/coding decisions which may influence how you get the data - with how much reactivity you need (or not) being a fundamental one.

Without knowing more about what you want to do I can only answer in general terms, but you need to examine your queries (or potential queries) and look for optimisable patterns.

For example, if you always retrieve documents belonging to the user, you might add an index to the user id. If there are 100,000 users, each with 200 ingredients, the index ensures that only 200 documents are scanned by the database engine, even though there are 20,000,000 documents in the collection.
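In MongoDB itself this would be an index created with something like `db.ingredients.createIndex({ createdBy: 1 })`. The runnable sketch below only simulates the effect, using a plain `Map` as a stand-in for the index, to show why the scan count drops from the whole collection to just one user's documents (the collection size here is scaled down so it runs instantly).

```javascript
// Build a small fake collection: 100 users × 200 ingredients = 20,000 docs.
const docs = [];
for (let u = 0; u < 100; u++) {
  for (let i = 0; i < 200; i++) {
    docs.push({ createdBy: `user${u}`, name: `ingredient${i}` });
  }
}

// Without an index: a full collection scan examines every document.
let scanned = 0;
const viaScan = docs.filter((d) => {
  scanned++; // count how many documents the "engine" touches
  return d.createdBy === "user42";
});

// With an "index" (a Map from createdBy to that user's documents),
// the lookup touches only the 200 matching documents.
const byUser = new Map();
for (const d of docs) {
  if (!byUser.has(d.createdBy)) byUser.set(d.createdBy, []);
  byUser.get(d.createdBy).push(d);
}
const viaIndex = byUser.get("user42");
```

The full scan touches all 20,000 documents to return 200; the indexed lookup goes straight to those 200, which is the same proportional saving you get at 20,000,000 documents.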

Similarly, if you want to retrieve those ingredients in alphabetical order, you might use a compound index on the user id and the name. Mongo is then able to retrieve the documents in the correct order without an extra sorting pass.
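Conceptually, a compound index like `{ createdBy: 1, name: 1 }` keeps its entries pre-sorted by user and then by name, so a user's slice of it is already alphabetical. The sketch below stands in a sorted array for that index; it is an illustration of the idea, not MongoDB's actual implementation.

```javascript
// Documents in insertion order (not alphabetical).
const docs = [
  { createdBy: "u1", name: "Pepper" },
  { createdBy: "u2", name: "Basil" },
  { createdBy: "u1", name: "Cumin" },
  { createdBy: "u1", name: "Salt" },
];

// Build the "compound index" once: ordered by (createdBy, name).
const index = [...docs].sort(
  (a, b) =>
    a.createdBy.localeCompare(b.createdBy) || a.name.localeCompare(b.name)
);

// Reading one user's range of the index yields names already in order,
// with no separate sorting pass over the results.
const u1Names = index
  .filter((d) => d.createdBy === "u1")
  .map((d) => d.name);
// u1Names is ["Cumin", "Pepper", "Salt"]
```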


#5

You might find this article interesting (because we all aspire to have our business processing 2 billion documents a day!)

http://blog.mongodb.org/post/79557091037/processing-2-billion-documents-a-day-and-30tb-a


#6

Thanks! You have been very helpful.

I just found this article that explains the normalization/denormalization side of things quite simply. I basically knew about these concepts already, but I didn’t know that this is what normalization/denormalization referred to.

I still need to wrap my head around indexing, but it is great to know that Mongo can handle billions of documents :slight_smile: