I have been running it in production with overridePublish: true. Almost everything seems to be working fine, but I am getting spikes in method response time that I wasn't seeing before using Redis:
@jamesgibson14 I need more details about that method and about the spikes. How long does it usually take to run? How are these "spikes" affecting it? How much bigger is the response time than before?
@jamesgibson14 it is most likely related to the way we handle optimistic UI. It's the only thing that could cause those spikes, and it makes sense, since it requires additional processing. I currently have some ideas about how to fix it; I also posted a question on Meteor's GitHub, so maybe someone with better knowledge than me can guide me to a better approach.
Does this package essentially store all of the session state in redis? If I use it can I remove the requirement for sticky sessions / session affinity on my servers?
Check how we compensate for latency. If you clone the repo into your packages folder and simply comment out those lines, you should not have any more spikes. Are you able to test this?
But isn’t the reason Meteor needs session affinity because Meteor stores all of a user’s subscription state on the server?
I like that this removes a lot of load from the application server, but I'd love to see it go one step further and push all the session state into Redis so that we can remove the requirement for session affinity. Is that beyond the scope of this package?
How about using Redis as an async cache? The strategy would look something like this:
1. First check the server's local SessionCollectionView for the user's session.
2. On a local miss, fetch the session from Redis; if it exists there, merge it into the local cache.
3. After each operation, update both the local cache and the Redis cache.
This way, if a user is using a websocket they’ll tend to hit the same server. If they don’t, though, for whatever reason, they can reconnect to a different server and have their session restored.
When there’s a local cache miss it would take a bit more time but we would end up with a much more robust / horizontally scalable solution.
It would also help avoid issues when deploying, since a user who gets connected to a new server after a deploy would have their session restored.
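The read-through / write-back flow described above could be sketched like this. This is only an illustration of the proposed strategy, not redis-oplog code: `localSessions` stands in for the server's SessionCollectionView store, and `redis` is a hypothetical stand-in for a real Redis client.

```javascript
// Local per-server cache (stands in for the SessionCollectionView store).
const localSessions = new Map();

// Minimal in-memory stand-in for a Redis client (get/set by key).
const redis = {
  store: new Map(),
  async get(key) { return this.store.get(key) || null; },
  async set(key, value) { this.store.set(key, value); },
};

async function getSession(userId) {
  // 1. Check the local cache first.
  let session = localSessions.get(userId);
  if (session) return session;

  // 2. On a local miss, fall back to Redis and merge into the local cache.
  const cached = await redis.get(`session:${userId}`);
  session = cached ? JSON.parse(cached) : { userId, subscriptions: {} };
  localSessions.set(userId, session);
  return session;
}

async function updateSession(userId, changes) {
  // 3. After an operation, write both the local cache and Redis.
  const session = await getSession(userId);
  Object.assign(session.subscriptions, changes);
  localSessions.set(userId, session);
  await redis.set(`session:${userId}`, JSON.stringify(session));
}
```

With this shape, a user reconnecting to a different server misses the local cache but recovers their session from Redis, which is the robustness the steps above are after.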
The package has been causing an issue for a Meteor Toys customer. I'm going to look into it, but I figured you might want to know about it, and/or have a solution, since it could probably happen in other cases too.
@msavin I looked at it. Does it happen only with the pro version? If not, I may need the full version to test. Ping me privately; I think I may know what's up, and I need to ask you some questions about Meteor Toys internals.
@raskal @bmustata yes. I recently discussed this with @nadeemjq; he had some issues understanding synthetic mutations, and he told me that just by switching to redis-oplog he went from 1s load times to near-instant. I don't have all the details, but there are people who had the courage to go to production with it. (You the real MVP)
Sorry about my reduced involvement over the past months, but I had to put my priorities in order. By the end of March I will clean up the issue board.
@clayne some good ideas there. However, I think we're going to open a can of worms with this. First of all, SessionCollectionView is per connection, not per user.
Imagine this scenario:
I'm logged in as the same user in 2 tabs. In one tab I'm hitting Server 1, in the other I'm hitting Server 2. Properly maintaining a per-user SessionCollectionView is going to be very costly, both time-wise and CPU-wise, because there are many details involved.
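To make the per-connection vs. per-user distinction concrete, here is a small illustrative sketch (not Meteor or redis-oplog internals; all names are hypothetical). One user with two open tabs means two DDP connections, each with its own view, possibly on different servers:

```javascript
// Two connections for the same user, each with its own per-connection view.
const sessions = [
  { id: 'conn-1', userId: 'alice', server: 'server-1', collectionViews: new Map() },
  { id: 'conn-2', userId: 'alice', server: 'server-2', collectionViews: new Map() },
];

// Grouping by user shows the fan-out a per-user view would have to
// keep consistent across servers on every write.
const byUser = new Map();
for (const s of sessions) {
  if (!byUser.has(s.userId)) byUser.set(s.userId, []);
  byUser.get(s.userId).push(s);
}
```

A per-user store in Redis would have to merge and invalidate every one of these per-connection views on each mutation, which is where the time and CPU cost mentioned above comes from.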
I still don't know how this works, sadly; I couldn't get anyone to give me some pointers. I will look into it in depth. The code is open-source, so it shouldn't be hard to reverse-engineer; I did the same for most parts of redis-oplog.
I have some crazy ideas to solve this issue with elegance; I need to check them.