Auto-scaling on Galaxy?

Attention Galaxy users, how about sending an email to Galaxy asking:

Any plans to make Galaxy handle auto-scaling, or should we look at other hosting providers for this feature?

Maybe if they get this message from enough of us, they’ll act on it.

Cheers!

Sent.
(20 characters)

+1 One of my recent client apps was on a subreddit and I was very lucky to catch that the app was not responding due to 250 concurrent connections on a single container.

At a minimum the ability to get some kind of notification would be nice.

Agreed. They need 1) an API to create/destroy containers and 2) an API for on-demand host-info checks.

Once that is in place, then the community can create their own solutions, i.e. services which allow us to configure a monitoring policy which will then check host-info on some schedule and act accordingly, creating new containers based on our criteria.

Doesn’t seem like it would be too hard to expose those two APIs. Galaxy already provides these functions in the UI. They just need to create an API endpoint which triggers the same function i.e. add container, or return some statistical data about an existing container/app.
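To make the monitoring-policy idea concrete, here is a rough sketch of the decision step only. Galaxy has no such API today, so the `HostInfo` shape, the `Policy` fields and both function names are made up for illustration; a real service would fetch `HostInfo` on a schedule and turn the returned action into create/destroy calls.

```typescript
// Hypothetical shape of what a host-info endpoint might return.
// Galaxy exposes no such API; every field here is an assumption.
interface HostInfo {
  containers: number;  // containers currently running
  connections: number; // open client connections across all containers
}

// Hypothetical user-configured policy.
interface Policy {
  targetConnectionsPerContainer: number;
}

type Action =
  | { kind: "add"; count: number }
  | { kind: "remove"; count: number }
  | { kind: "none" };

// How many containers the policy wants for the observed load (never below 1).
function desiredContainers(info: HostInfo, policy: Policy): number {
  const needed = Math.ceil(info.connections / policy.targetConnectionsPerContainer);
  return Math.max(1, needed);
}

// Compare desired against actual and say what to do about the difference.
function planAction(info: HostInfo, policy: Policy): Action {
  const desired = desiredContainers(info, policy);
  if (desired > info.containers) {
    return { kind: "add", count: desired - info.containers };
  }
  if (desired < info.containers) {
    return { kind: "remove", count: info.containers - desired };
  }
  return { kind: "none" };
}
```

With a target of 100 connections per container, a single container holding 250 connections would yield a request to add two more.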

Hmm, food for thought:

The whole Galaxy UI backend, being a Meteor app itself, using DDP, Meteor accounts, methods, GraphQL and whatnot, could in theory be invoked from a third-party app. One simply needs to monitor the developer console for incoming/outgoing WebSocket messages to see what goes on for each UI action.

Furthermore, an app can keep track of the number of connected clients (and perhaps some OS metrics depending on the environment) and at least conservatively decide for itself if it is sufficient to service the incoming load.

And then - oh and this would be so very meta :slight_smile: - it could itself spin up a new instance or spin itself down.

Now I know it’s best if there’s an official API of sorts for this, but hey, this could still be a very fun community hack weekend project :slight_smile:
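The self-monitoring part could look something like the sketch below. It is plain TypeScript with no Meteor dependency; `ConnectionTracker`, its method names and the 80% headroom default are all made up for illustration. In a Meteor app I would expect `opened`/`closed` to be fed from `Meteor.onConnection` and the connection's `onClose` callback, but that wiring is an assumption and is not shown here.

```typescript
// Tracks open client connections by id and makes a conservative
// "am I overloaded?" call. All names here are hypothetical.
class ConnectionTracker {
  private open: Set<string> = new Set<string>();

  opened(id: string): void {
    this.open.add(id);
  }

  closed(id: string): void {
    this.open.delete(id);
  }

  get count(): number {
    return this.open.size;
  }

  // Reports overload once the count reaches `headroom` (default 80%)
  // of the capacity the operator believes one container can serve.
  isOverloaded(capacity: number, headroom: number = 0.8): boolean {
    return this.open.size >= capacity * headroom;
  }
}
```

The capacity number is something you would have to measure for your own app; the 250 connections mentioned earlier in the thread is one data point, not a rule.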

Anything like this is going to need some safeguards on Galaxy to prevent the runaway spinning up (or down) of containers. Until those exist you risk massive cost (at its most spun up) or no app (at its most spun down).
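As a minimal sketch of what such safeguards might look like (nothing Galaxy actually offers; `Limits`, `safeTarget` and all parameter names are hypothetical): clamp every request between a floor and a ceiling, cap how far a single decision can move, and refuse to change anything during a cooldown window.

```typescript
// Hypothetical guard rails for any scaling decision.
interface Limits {
  minContainers: number; // never scale below this (avoids "no app")
  maxContainers: number; // never scale above this (avoids runaway cost)
  maxStep: number;       // largest change allowed in one decision
  cooldownMs: number;    // minimum time between two changes
}

// Returns the container count that is actually safe to apply.
function safeTarget(
  current: number,
  requested: number,
  limits: Limits,
  lastChangeAt: number, // timestamp (ms) of the previous change
  now: number           // current timestamp (ms)
): number {
  // Still cooling down from the previous change: hold steady.
  if (now - lastChangeAt < limits.cooldownMs) {
    return current;
  }
  // Limit how far one decision may move in either direction.
  const step = Math.max(-limits.maxStep, Math.min(limits.maxStep, requested - current));
  // Finally clamp into the allowed range.
  return Math.max(limits.minContainers, Math.min(limits.maxContainers, current + step));
}
```

With limits of 1 to 5 containers and a step of 2, a buggy request for 50 containers from a starting point of 2 would only move to 4, and the next decision after the cooldown would stop at 5.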

I’ve actually had good mileage with https://atmospherejs.com/konecty/multiple-instances-status in the past for tracking app instances.

The source is very small and simple so it should be trivial to add some safeguards in there.

Other cloud computing platforms already offer these kinds of things
https://support.rackspace.com/how-to/rackspace-auto-scale-tips-and-how-tos/

You could burn money with all kinds of APIs: an Amazon Redshift dc1.8xlarge dense compute node will cost you more than $4200 per month in the Tokyo region.

Absolutely. I was just commenting that unless you can predefine upper and lower limits on numbers of containers there’s an opportunity to get unexpected bills or drop your app. I’m 100% for Galaxy providing a mechanism for this. Just that sometimes we need protecting from ourselves :wink:

It’s hard to say. If your ecommerce site went ape-shit viral on Reddit, Product Hunt, StumbleUpon and Twitter, you may not want an upper limit if you’re doing $10K in sales per hour. Some Shopify stores had their products picked up on Good Morning America and they did millions of USD in sales in a single day. Thank goodness for elastic scaling!

I asked about this sometime in October '16 and got this response:

Thanks for writing in with your feedback. We have no plans for automated scaling in our roadmap, though our logging (in a more general sense) is something we plan to refactor; with more informative logging, or with logs exported to something like ElasticSearch, you would be able to achieve a part of that functionality. I understand how an API would help your use case, however, and I’ll be sure to submit this as a feature request to our Product Team; for now, please let me know if there’s anything else you’d like to know.

Nothing has been changed or added since though, so no idea what’s coming. We really would like at least some form of monitoring and alerts, so I’m thinking about rolling something very basic myself to get a (Slack) message when connections per instance > X for Y time.
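The "> X for Y time" check could be sketched roughly like this. `SustainedThresholdAlert` and its method are hypothetical names, and how the samples are collected and how the message reaches Slack are left out on purpose; this only covers the part that decides when to fire.

```typescript
// Fires once when a value stays above `threshold` (X) for at least
// `holdMs` (Y), then stays quiet until the value drops back down.
class SustainedThresholdAlert {
  private threshold: number;
  private holdMs: number;
  private breachStartedAt: number | null = null;
  private alerted: boolean = false;

  constructor(threshold: number, holdMs: number) {
    this.threshold = threshold;
    this.holdMs = holdMs;
  }

  // Feed one sample (connections on one instance, timestamp in ms).
  // Returns true exactly once per sustained breach.
  observe(connections: number, now: number): boolean {
    if (connections <= this.threshold) {
      // Back under the limit: reset so a later breach can alert again.
      this.breachStartedAt = null;
      this.alerted = false;
      return false;
    }
    if (this.breachStartedAt === null) {
      this.breachStartedAt = now;
    }
    if (!this.alerted && now - this.breachStartedAt >= this.holdMs) {
      this.alerted = true;
      return true;
    }
    return false;
  }
}
```

You would keep one instance of this per container and send the notification whenever `observe` returns true, so a short spike does not page anyone and a long one only pages once.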

https://console.ng.bluemix.net/catalog/services/auto-scaling

Maybe this is where I should move my app once my contract at Galaxy ends.

I have also asked MDG a couple of times about the roadmap for autoscaling on Galaxy. We deploy “auto-magically” with CI scripts on git push to our production branch. Before Galaxy was available we were using Heroku. Heroku is definitely ahead of Galaxy in terms of devops features (https://www.heroku.com/platform/opex), with autoscaling, threshold alerts, etc. We may have to move back to Heroku if MDG does not make any progress soon. Deployment to Heroku with a buildpack is very easy.