Downsides of denormalizing data?


Let’s say I am modeling YouTube videos in MongoDB. These are the things needed to show a video page:

author (avatar, name, username),
like counts,
view counts,

Is it better to store everything except the comments inside the video document? Like so:

   author: {avatar, username, name},

What are the downsides of this approach compared with reactive joins (with a package like publish-composite)? What about just publishing the count obtained from Collection.find().count(); is that still slow? For the author field, does embedding really improve performance that much compared with joining?

I am trying to figure out how much RAM I could save by denormalizing, because subscribing to many collections at once can be costly for Meteor.


The main downside is that when the user changes their name, you’ll have to manually update every video, and if you’re not careful your data will get out of sync.

Denormalizing things like counts is OK because you don’t really have a choice, but I personally try to avoid denormalization if I can help it.
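
To make the counts case concrete, here is a minimal sketch of a denormalized like counter, written as plain JavaScript over in-memory data rather than real collections. The names `videos`, `likes`, `likeCount`, and `likeVideo` are hypothetical and only for illustration. In MongoDB the increment would typically be done with the `$inc` update operator instead of `+= 1`.

```javascript
// In-memory stand-ins for a videos collection and a likes collection
// (hypothetical data, not taken from the original posts).
const videos = new Map([["v1", { _id: "v1", likeCount: 0, viewCount: 0 }]]);
const likes = []; // one { videoId, userId } record per like

// Record a like and keep the denormalized counter on the video in step.
function likeVideo(videoId, userId) {
  const video = videos.get(videoId);
  if (!video) throw new Error(`Unknown video: ${videoId}`);

  // Ignore duplicate likes so the counter cannot drift from the likes list.
  const alreadyLiked = likes.some(
    (like) => like.videoId === videoId && like.userId === userId
  );
  if (alreadyLiked) return video.likeCount;

  likes.push({ videoId, userId });
  video.likeCount += 1; // stored count: no need to count the likes on every read
  return video.likeCount;
}

likeVideo("v1", "u1");
likeVideo("v1", "u1"); // duplicate, ignored
likeVideo("v1", "u2");
console.log(videos.get("v1").likeCount); // 2
```

The video page then reads `likeCount` directly from the video document, at the price of having to update it in every code path that adds or removes a like.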


I see, thank you Sacha. Could you explain why it would get out of sync, or is there anything I can read on this topic?


If a user changes their name, you have to update the name in the users collection and also in each video belonging to that user (usually with a Meteor method, since the videos are probably not all published). The risk is that you forget to do this in one of the operations that changes the user.
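
A rough sketch of that double update, using plain JavaScript arrays in place of the real collections. Everything here is an assumption for illustration: the sample data, the `renameUser` helper, and in particular the `author._id` field, which is not in the original example but is needed to find the videos to update. In a Meteor method the second step would more likely be a single multi-document update on the videos collection.

```javascript
// In-memory stand-ins for the users and videos collections (hypothetical data).
const users = [
  { _id: "u1", name: "Old Name", username: "alice", avatar: "alice.png" },
  { _id: "u2", name: "Bob", username: "bob", avatar: "bob.png" },
];
const videos = [
  { _id: "v1", author: { _id: "u1", name: "Old Name", username: "alice", avatar: "alice.png" } },
  { _id: "v2", author: { _id: "u1", name: "Old Name", username: "alice", avatar: "alice.png" } },
  { _id: "v3", author: { _id: "u2", name: "Bob", username: "bob", avatar: "bob.png" } },
];

// Rename a user and propagate the new name to every embedded copy.
// Returns the number of videos that were touched.
function renameUser(userId, newName) {
  const user = users.find((u) => u._id === userId);
  if (!user) throw new Error(`Unknown user: ${userId}`);

  // Step 1: update the source of truth.
  user.name = newName;

  // Step 2: update each denormalized copy. Skipping this step is
  // exactly how the data gets out of sync.
  let updated = 0;
  for (const video of videos) {
    if (video.author._id === userId) {
      video.author.name = newName;
      updated += 1;
    }
  }
  return updated;
}

console.log(renameUser("u1", "New Name")); // 2
```

Any other code path that changes the name without going through something like `renameUser` leaves the embedded copies stale, which is the out-of-sync risk described above.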


For a Meteor app, at what scale or userbase size should I start thinking about denormalizing?


That depends on the features available to a user and the actions they can perform, which together determine their impact on your collections.


The official MongoDB documentation covers this topic pretty well.