Seems to be surviving pretty well, even without oplog.
Getting curious how many hits you are getting!
It takes a really long time to load the site and another really long time to actually load the content.
3000+ connections at the moment.
Do you mind sharing the reddit thread?
I would like to see people's comments.
Tried to find it. It can't be these threads, can it? https://www.reddit.com/domain/mostexclusivewebsite.com
I wonder if I can get unhug of deathed if I get the site back up.
haha great concept, …362 people in line right now
64GB?! How did you know you needed that much? And how are you running your Meteor app? (forever, PM2, etc) And did you deploy manually or via a script like mup?
Using MUP… I'd opt for a high-compute instance, but I'm running on DigitalOcean and just scaled everything up to the largest box. Set up Compose.io for oplog support and now things are running awesome; the reddit hug is coming back, 3k people in line.
Hope you planned for tonight’s leap second!
Wow, I'd love to see some post-event analytics for this, i.e. what was the limiting factor and where were the bottlenecks? Did you try running it with https://github.com/meteorhacks/cluster and use export CLUSTER_WORKERS_COUNT=auto to load balance across all the CPU cores?
This is pure genius, I'm waiting for 37503 people now.
I did use meteorhacks:cluster.
CLUSTER_WORKERS_COUNT=auto seemed to cause some issues: it ran too many workers, which would eventually die and cause lots of connections to get lost. I had 4 8GB DO boxes running in the cluster; CPU hovered around 35-40% with 4000 connections. I'm also using compose.io for mongo oplog.
The update of the waiting list count is messing with that countdown clock.
that looks unprofessional :DDDDD
If you run into issues with cluster like that, let @arunoda know. (if you haven’t already)
He was probably just exhausting memory and/or using up too much CPU on oplog parsing, because CLUSTER_WORKERS_COUNT=auto spawns a worker for every virtual CPU core rather than every physical one, i.e. usually twice as many workers as there are physical cores on HT-enabled CPUs. If that was really what was happening, halving the worker count would have been a better configuration.
So I'm not sure that needed to have been an issue with the cluster package, though of course it may have been.
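For anyone wanting to try the "halve the worker count" idea, here is a rough sketch and not a tested recipe. It assumes hyper-threading is on with two logical CPUs per physical core, that `getconf _NPROCESSORS_ONLN` reports logical CPUs on your box, and that the cluster package accepts a plain number in CLUSTER_WORKERS_COUNT instead of `auto`. The `half_workers` helper is a made-up name for illustration.

```shell
#!/bin/sh
# Hypothetical helper: halve a logical CPU count, never going below 1.
half_workers() {
  logical="$1"
  workers=$((logical / 2))
  if [ "$workers" -lt 1 ]; then
    workers=1
  fi
  echo "$workers"
}

# Logical (online) CPUs; assumed here to be 2x the physical cores with HT enabled.
LOGICAL_CPUS="$(getconf _NPROCESSORS_ONLN)"

# Assumption: a fixed number is accepted in place of "auto".
CLUSTER_WORKERS_COUNT="$(half_workers "$LOGICAL_CPUS")"
export CLUSTER_WORKERS_COUNT

echo "CLUSTER_WORKERS_COUNT=$CLUSTER_WORKERS_COUNT"
```

Run it (or source it) before starting the app so the variable is in the app's environment. If HT is off on your instance, halving will under-use the box, so check the real core count first.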
And my app wants 1GB of RAM just to wait for connections… ARGH!
I knowwww! Send me a PR! I’ve been quite busy tending to other things…