I’m new to NoSQL databases (MongoDB), and I’m wondering if I could get some specifics on best practices for when to build collections, vs. when to add more arrays to a document. I think the best way to do this is I’ll explain what I’m doing, and you guys can let me know what I’m doing wrong. Thanks in advance!
Here are my collections (as of now)
- Users
- trainers
- clients
My assumptions are that trainers and clients should both be in the same collection (users), with a ‘field’ differentiating them (like ‘role: client’), which also controls their rights.
- Tasks
- workouts
- reminders
- task_data
– lots of fields
Here, tasks are settings that a personal trainer sets for each individual client. I assume that because there’s no limit to how many tasks a trainer can assign, these should be their own collection, related to clients and trainer via ‘client’ and ‘trainer’ attributes (that store their _id). Tasks have types, and to further complicate things, if a task is a workout, it will have child documents. My assumption is that the best way to do this is with a self-join, where the child document is set to ‘type: workout-child’, and then parent_id: [parents _id]
.
Now here’s where it gets hairy… these tasks trigger the daily creation of task_data, notifications that are sent to the client, which they can update can say whether or not they completed the task (default = not completed), so their progress can be monitored. Now, technically, these could also be tasks, just with a different type (type: 'workout-child-task-data'
). The ability to create nested attributes opens up another can of worms, as I could do something like this: data: {5/1/15: 'completed', 5/2/15: 'not completed', etc...}
Everything in the above paragraph seems wrong to me, which is why I have a separate task_data collection to handle this stuff, but thinking this through raised questions like:
- When should I create a new collection vs. create attributes and sub-attributes
- Are there performance issues between different ways of storing collections?
- Is there a point when you would say “that’s too many attributes, create more collections”
Finally, task_data could also be something like a graph, which could have hundreds or thousands of data points, so… do I create task_data_data, or do I create task_data that are children of other task_data’s, or do I create hundreds of attributes for some task_data documents?
Thanks for your help!