Container health checks fail on Galaxy deploy

My site is still down… It’s been hours, since 2:34am PST. This sucks.

My app is back, fingers crossed it doesn’t die again

Meteor APM is currently catching up on a backlog of APM stats. It will recover, and stats were being collected the whole time (though it’s worth pointing out that non-high-availability apps that were down will have no stats for their downtime).

Apps with existing “high-availability” deployments (those that had their number of containers set to 3 or higher before the incident and were not affected by this outage) will have their Meteor APM data points aggregated and logged soon.

It is my understanding that all affected apps should be back online.

We have published a postmortem about this issue. http://status.meteor.com/incidents/tf630kbt1x2n

This issue is now back on Meteor Galaxy… our site keeps going up and down because of it. :frowning:

smw6v 2018-05-07 11:36:13+08:00 The container is being stopped because it has failed too many health checks.
smw6v 2018-05-07 11:36:17+08:00 Application exited with signal: terminated
rcnbk 2018-05-07 11:36:18+08:00 Application process starting, version 43

Any possible fixes?
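
Not a fix, but if you want to see how often containers are being cycled, a small script can tally the health-check stops from a pasted log. This is only a rough sketch in Python: it assumes the log text looks like the lines quoted above (container ID, timestamp with offset, then the message), and the function name and regex are made up for illustration, not anything Galaxy provides.

```python
import re
from collections import Counter

# Assumed layout: container ID, then a timestamp with UTC offset, then the
# message. The whitespace between the parts may be a space or a newline, and
# the message may be fused directly onto the timestamp.
ENTRY = re.compile(
    r"(?P<container>\w+)\s+"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}[+-]\d{2}:\d{2})\s*"
    r"(?P<msg>[^\n]*)"
)


def count_health_check_stops(log_text):
    """Return a Counter mapping container ID -> number of health-check stops."""
    stops = Counter()
    for entry in ENTRY.finditer(log_text):
        if "failed too many health checks" in entry.group("msg"):
            stops[entry.group("container")] += 1
    return stops


# The log excerpt from this thread, as it was pasted.
sample = """smw6v
2018-05-07 11:36:13+08:00The container is being stopped because it has failed too many health checks.
smw6v
2018-05-07 11:36:17+08:00Application exited with signal: terminated
rcnbk
2018-05-07 11:36:18+08:00Application process starting, version 43
"""

print(count_health_check_stops(sample))
```

Run over a longer log, this should at least show whether it is always the same containers failing or whether every new container fails in turn.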

I’m having the same issue. Deploys randomly fail health checks or just stay stuck in “deploying” for a long time. Even deployments of exactly the same source will succeed one time and fail the next.

Literally nothing shows up in the logs.

Seriously guys, this is no state we can stay in for long!

Have you raised a Galaxy support ticket?

Hi folks, came across this old thread looking for more information on what constitutes a health check on Galaxy. Is anyone aware of more documentation / insight? Thanks!