Rethinking Meteor – managing subscription state

If Meteor were built from scratch today, would it still opt to keep all client subscription state on the server and use merge box to process diffs?

When Meteor was created, relying on stateful servers for subscriptions was possibly the only solution that made sense, but this decision came with two serious tradeoffs:

  1. Limitations on scaling.

There are two big factors that place limits on scaling Meteor: oplog tailing and merge box. Oplog tailing should be replaced with change streams, which will hopefully remove the current restriction on scaling horizontally, though they bring their own issues to solve. Merge box eats up server resources by storing all subscription state on the server and diffing data.
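
As a rough sketch of what a change-stream-based observer has to do, the function below maps a single change event (the shape emitted by the MongoDB Node.js driver's `collection.watch()`) onto a DDP-style `added` / `changed` / `removed` message. `toDdpMessage` is a hypothetical name used only for illustration; it is not part of Meteor or of any package mentioned here.

```javascript
// Sketch only: translate one MongoDB change stream event into a DDP-style
// message. `toDdpMessage` is a hypothetical helper, not a Meteor API.
function toDdpMessage(collection, event) {
  const id = event.documentKey?._id;
  switch (event.operationType) {
    case 'insert': {
      // Insert events carry the full document; DDP sends the id separately.
      const { _id, ...fields } = event.fullDocument;
      return { msg: 'added', collection, id, fields };
    }
    case 'update': {
      // Note: updatedFields can contain dotted paths such as 'address.city'.
      // This sketch passes them through unchanged.
      const { updatedFields = {}, removedFields = [] } = event.updateDescription;
      return { msg: 'changed', collection, id, fields: updatedFields, cleared: removedFields };
    }
    case 'delete':
      return { msg: 'removed', collection, id };
    default:
      // replace, drop, invalidate, etc. are left out of this sketch.
      return null;
  }
}
```

Nothing here needs a per-client copy of the data, but it also does not decide which clients should receive the message. That routing is one of the issues change streams leave to solve.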

  2. Modeling data is at odds with Mongo.

Sounds ridiculous, right? Meteor was made for Mongo. But there’s a big caveat: Mongo encourages embedding data when possible and mostly discourages joins (though they have definitely made strides here), whereas Meteor encourages normalization because its diffing is top-level-only, which necessitates joining. I believe deep diffing was ruled out for being even more resource intensive than top-level diffing already is.
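
To make the top-level-only point concrete, here is a minimal sketch of a top-level diff. It is not Meteor's actual merge box code: `diffTopLevel` is a hypothetical helper, and `JSON.stringify` stands in for a proper deep-equality check.

```javascript
// Compare two versions of a document one top-level field at a time.
// Returns the fields to resend and the names of fields that disappeared.
function diffTopLevel(oldDoc, newDoc) {
  const fields = {};
  const cleared = [];
  for (const key of Object.keys(newDoc)) {
    // JSON.stringify is a simplification; it is sensitive to key order.
    if (JSON.stringify(oldDoc[key]) !== JSON.stringify(newDoc[key])) {
      fields[key] = newDoc[key];
    }
  }
  for (const key of Object.keys(oldDoc)) {
    if (!(key in newDoc)) cleared.push(key);
  }
  return { fields, cleared };
}

const before = { name: 'Ada', address: { city: 'London', zip: 'N1', notes: 'long text' } };
const after = { name: 'Ada', address: { city: 'Paris', zip: 'N1', notes: 'long text' } };

// Only `city` changed, but the whole embedded `address` object is resent.
const patch = diffTopLevel(before, after);
console.log(Object.keys(patch.fields));
```

With embedded documents, one small nested change resends the entire top-level field, which is why normalized, flatter documents work better with this kind of diffing.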

Resolving these issues is key to solving Meteor’s notorious scalability problems. As I mentioned in another forum thread, if Meteor had a better realtime scaling story, it would be much more sought after, especially with the current trend of syncing over fetching and collaborative, local-first apps. Meteor is poised to be a go-to solution here when an authoritative server is needed (which is basically all B2B apps and non personal-use apps).

Could Meteor subscriptions benefit from a paradigm shift?

Things are much different today with clients that are crazy powerful relative to 13 years ago.

What if state and diffing were moved to the client and the server kept only a limited amount of state? The different publication strategies opened the door somewhat to this line of thinking. What if we took it to an extreme?
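
One way to picture the extreme version is below: the server sends full result sets and remembers nothing per client, while the client compares each snapshot with what it already holds. This is a sketch under those assumptions only. `ClientStore` and `applySnapshot` are hypothetical names, it handles a single subscription, and overlapping subscriptions would need extra bookkeeping.

```javascript
// Hypothetical client-side store: the client, not the server, works out
// what was added, changed, or removed between two snapshots.
class ClientStore {
  constructor() {
    this.docs = new Map(); // _id -> document
  }

  // Merge a full snapshot for one subscription and report what changed.
  applySnapshot(snapshot) {
    const seen = new Set();
    const changes = { added: [], changed: [], removed: [] };
    for (const doc of snapshot) {
      seen.add(doc._id);
      const existing = this.docs.get(doc._id);
      if (!existing) {
        changes.added.push(doc._id);
      } else if (JSON.stringify(existing) !== JSON.stringify(doc)) {
        // JSON.stringify stands in for a real deep-equality check.
        changes.changed.push(doc._id);
      }
      this.docs.set(doc._id, doc);
    }
    // Anything not in the new snapshot has left the result set.
    for (const id of [...this.docs.keys()]) {
      if (!seen.has(id)) {
        this.docs.delete(id);
        changes.removed.push(id);
      }
    }
    return changes;
  }
}
```

The tradeoff is visible in the signature: every snapshot travels in full, so the server saves RAM and CPU at the cost of bandwidth.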

The upsides of this shift:

  1. Distribute more work to today’s powerful clients for true edge computing
  2. Less resource usage on the server (RAM and CPU)
  3. Reduction in costs for your business
  4. Scale horizontally
  5. Potentially open the door for deep diffing where Meteor could align itself more with Mongo in its encouragement of embedding and denormalization (where it makes sense of course)

The potential downsides:

  1. Increased bandwidth
  2. Increased latency (maybe)
  3. If you’re serving users with underpowered devices, this might be problematic.
  4. ???

But if we take a local-first approach with optimistic UI to an extreme then there is zero latency. Bandwidth might also be a negligible issue if you’re careful with your projections and caching your subscriptions.

Now, you might be saying: “we already do this, we just use methods and avoid pub sub like the plague” but when you do this, you give up one of the things that makes Meteor great — syncing data rather than fetching — and you introduce more complexity for you to build and maintain. You’ll forego Minimongo and likely be re-inventing the wheel with your own client-side stores.

This was one of the impetuses for jam:pub-sub – keep the things that are great about syncing data into Minimongo and eliminate the drawbacks of traditional pub sub. To that end, it enables subscription caching and provides an easy way to use mongo change streams, meaning that oplog tailing could be disabled. In the most recent version of the package, you can also significantly reduce state on the server with the serverState config. You can essentially opt-in to what I’m describing above, today. I believe it could be a huge improvement for many Meteor apps and its functionality would be a great addition to Meteor core. For really large-scale realtime apps, maybe a solution involving Redis / Valkey or a new service that sits in-between your app and mongo, similar to ElectricSQL, will be required.
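
For readers unfamiliar with the idea of subscription caching, here is a minimal sketch of the concept. It is not jam:pub-sub's implementation or API: `SubscriptionCache` and its methods are hypothetical, and the only thing assumed about a subscription handle is that it has a `stop()` method.

```javascript
// Hypothetical cache: reuse a subscription handle for the same name and
// arguments until it expires, instead of resubscribing every time.
class SubscriptionCache {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, handy for testing
    this.entries = new Map(); // key -> { handle, expiresAt }
  }

  static key(name, args) {
    return `${name}:${JSON.stringify(args)}`;
  }

  // Return a fresh cached handle if there is one, otherwise subscribe.
  get(name, args, subscribe) {
    const key = SubscriptionCache.key(name, args);
    const hit = this.entries.get(key);
    if (hit && hit.expiresAt > this.now()) return hit.handle;
    if (hit) hit.handle.stop(); // expired: release the old subscription
    const handle = subscribe(name, args);
    this.entries.set(key, { handle, expiresAt: this.now() + this.ttlMs });
    return handle;
  }
}
```

In a Meteor app the `subscribe` callback could simply wrap `Meteor.subscribe`, for example `cache.get('thing', { foo: 1 }, (name, args) => Meteor.subscribe(name, args))`.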

Curious to hear what people’s thoughts are in 2025. Have you experimented with something along these lines? What advantages and challenges have you experienced? What other ideas do you have to solve Meteor’s scalability concerns?

8 Likes

I was about to ask that this should be made into code or contributed to the core instead of it being mere thoughts then I scrolled down to serverState :sweat_smile:

EDIT: I think folks would be more inclined to try out your package if you can pair it with metrics illustrating the performance gains you were able to achieve. Maybe something akin to the meteor/performance repository on GitHub.

2 Likes

For sure. To be clear though, this thread is meant to discuss ideas to help Meteor solve some fundamental limitations and reach its full potential not to convince people to try the package. :slightly_smiling_face:

It would be great to have help from interested community or core team members to put the package through different scenarios.

I think this is an area of Meteor where it should be optional. The developer should be able to toggle managing subscription state on the server or the client.

Considering Meteor is a full-stack framework, the “common” state logic is all JS, agnostic to if it runs on the client or the server.

Mobile apps would benefit from having subscription state stored on the server, where more traditional desktop/business applications could utilize client performance.

2 Likes

I still haven’t gotten to trying out the .once publications (though I am very much planning to shortly), if I understand though that’s almost having the best of both worlds too, as long as having a peer trigger an update isn’t necessary, right?

The one thing I’m guessing (though I still might have a misunderstanding on my end) is that for .once publications you have to resubscribe when you expect a change to be ready to pull, which has some similarities with TanStack Query’s mutate feature, or (more passively) passing a query a key that includes reactive state (e.g. Vue refs).

Meteor’s already pretty groovy with reactivity; maybe it’s trivial to mimic that TanStack Query behaviour in Meteor with jam:pub-sub just by finding a nice place to hook a resubscribe in?

I have to learn more about mergebox and change streams to appreciate the rest; for now I would blindly take what you recommend for granted, test it in practice then note if the behaviour makes sense per my intuition :sweat_smile:

Hello,
We have a very large application with a lot of subscriptions.
Is it possible to try the jam:pub-sub package without breaking anything?
For example, by adding it, progressively converting some publications to the new publish.once, and checking whether that works well?

I have simply added the package and it seems that it is not working well with the way I use ostrio:flow-router

subscribe.js:68 Uncaught (in promise) TypeError: Cannot read properties of null (reading 'onInvalidate')
    at Object.subscribe (subscribe.js:68:5)
    at Route.waitOn [as _waitOn] (routes.js:25:14)
    at route.js:272:27
    at Array.forEach (<anonymous>)
    at Route.waitOn (route.js:271:15)
    at route._actionHandle (router.js:183:13)
    at ostrio_flow-router-extra.js?hash=e274ee3a7a230c7975fd5c1d21d415a217558352:3209:52
    at nextEnter (ostrio_flow-router-extra.js?hash=e274ee3a7a230c7975fd5c1d21d415a217558352:3033:7)
    at page.dispatch (ostrio_flow-router-extra.js?hash=e274ee3a7a230c7975fd5c1d21d415a217558352:3039:7)
    at page.replace (ostrio_flow-router-extra.js?hash=e274ee3a7a230c7975fd5c1d21d415a217558352:3004:34)

The line 25 :

Hmm I haven’t tried it with flow-router but feel free to open an issue and provide a minimal repo. With that in place, should be straightforward to figure out the root cause.

Also double check you’re using the latest version of the package, v0.4.2.

yep

To be fair I’ve only looked a bit at the Tanstack stuff. I’m sure it’s great for people that like that style but it’s a bit too react-brained for my tastes and I’d rather things “just work” without me needing to wire up a bunch of boilerplate. I think there’s a few parts to this:

  1. Update the writer’s client state after a mutation – .once will take care of this without you needing to do anything. No need for invalidating or refetching.
  2. Refetch reactively when the subscription arguments change – .once will do this too. e.g.
// I'm partial to svelte so I'll use it to illustrate but the same applies for any frontend

let foo = 1;
$m: Meteor.subscribe('thing', { foo });

// when foo is updated, .once will refetch and merging the result into minimongo is handled automatically
  3. Update the client state when another user makes a write – .once will not handle this scenario. You’d need to use .stream or a regular publication. I don’t think TanStack handles this scenario either, but it offers an invalidateQueries, which I’m assuming triggers a re-fetch, but that leaves out the important bit. Maybe there is a way to trigger a refetch in this scenario :thinking: but I wonder if you’d basically be re-inventing a more brittle change stream or redis-oplog? Do you have something else in mind?

2 Likes

Oh! I misunderstood and thought .once wasn’t continuously updating! Whoops, thank you for the clarifications! That’s really cool!

I agree that TanStack Query adds a bit of ceremony; the trade off is that it’s basically “stateless” from a server perspective, and the client has more control on when to invalidate data and refetch.

That has helped me avoid some user experience issues I once had with Meteor subscriptions. I had otherwise addressed them with a version UUID sent from the client, used to confirm when a write was successfully reflected from the server side, in a non-isomorphic code branch. That is when I made the switch towards using TanStack Query so much.

I haven’t figured out a more elegant solution for this problem with just using Meteor; the UUID synchronisation technique was my best idea so far.

(Digressing: The problem was basically, user has a document, there’s a calculations section, when the user updates some fields some calculations happen automatically and update the calculations section; users weren’t sure if the “realtime” calculations on screen reflected their inputs because there was a lag up to around 1-3 seconds; we had a buggy spinner to show something was happening but it sometimes got stuck endlessly spinning or didn’t spin at all; now with Query we can just use the helper reactive variables like “isLoading”.)
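
For anyone curious, the UUID synchronisation idea can be sketched roughly as follows. This is a reconstruction under assumptions, not the code from that app: `PendingWrites` and the `lastWriteToken` field are hypothetical names, and the server is assumed to copy the token it received into the document once its calculations are saved.

```javascript
// Hypothetical helper: track whether the synced document already reflects
// the client's most recent write.
class PendingWrites {
  constructor() {
    this.latest = new Map(); // docId -> token of the last write sent
    this.counter = 0;
  }

  // Call when sending a write; pass the returned token along with it.
  begin(docId) {
    const token = `${docId}-${++this.counter}-${Date.now()}`;
    this.latest.set(docId, token);
    return token;
  }

  // Call whenever the synced document changes on the client.
  // Assumes the server stores the token it received in `doc.lastWriteToken`.
  observe(doc) {
    if (this.latest.get(doc._id) === doc.lastWriteToken) {
      this.latest.delete(doc._id);
    }
  }

  // True while the newest write has not yet come back from the server.
  isPending(docId) {
    return this.latest.has(docId);
  }
}
```

Because only the newest token per document is tracked, a stale update arriving after a second write keeps the indicator on, and the matching update always turns it off. That avoids both failure modes of the spinner described above.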

1 Like

This is also a great chance to rethink the design in order to support other merge algorithms, such as OT or CRDT, which are used to support conflict free collaboration.

1 Like

CRDTs are fascinating but I think only truly needed if doing something like collaborative document editing. An interesting read on this is “Architectures for Central Server Collaboration” by Matthew Weidner, along with the follow-up discussion.

I’m guessing that getting y.js or automerge to play nicely with minimongo is possible. Has anyone tried? Maybe it’s even possible to do things all under-the-hood so it feels like you’re just using Minimongo as you’re accustomed, seems like a worthy goal. But my opinion at the moment is this should be a package you pull in for the specific scenario when it’s needed.

1 Like

I completely agree. The initial design was interesting but very problematic, and in 2025 it doesn’t make sense anymore (I mean mergebox). It’s very complicated and sets a very low, hard ceiling on potential scaling. The only thing mergebox saves is bandwidth, and that is a very niche benefit.
I already changed to NO_MERGE_NO_HISTORY and will look into your package too. Did you do profiling? What are the results?
For improved performance I think redis streams+redis pub/sub+redis cache between client and mongo would be huge. We already have redis pub/sub with redis oplog but maybe there can be more to be done.

1 Like

Has anybody tried out chatr/redpubsub on GitHub (a custom pub/sub system for Meteor on top of Redis) before?

Is redpubsub similar to a manual/non-monkey-patched version of redis-oplog? Maybe we need a comparison table of Meteor pub/sub scaling options now, to know which approaches are compatible or conflict with one another, which ones have overlap, etc.