Scaling Meteor on AWS for a short burst

Hi folks,

We’re currently running on a single t2.medium instance on AWS, since we don’t have thousands of concurrent users. We’re using mup for deployment to that single instance.

However, we expect a major user burst tomorrow. We can’t estimate how many of them might access the app at the same time, but we want to take some precautions.

I am thinking about the following strategy:

  • Spin up a larger instance tonight, e.g. a t2.large, as a replacement for the t2.medium
  • Have an even larger instance as a backup, e.g. a t2.xlarge
  • If the load exceeds our expectations, hot-swap the instances by re-assigning the Elastic IP to the larger one
  • Both servers will use the same Mongo DB instance on ATLAS
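For the hot-swap step, here is a sketch of the Elastic IP re-association using the AWS SDK for JavaScript (v2). The IDs are placeholders, and the actual call assumes `aws-sdk` is installed and credentials are configured; building the request parameters in a helper keeps the intent explicit:

```javascript
// Sketch only: parameters for EC2 AssociateAddress with placeholder IDs.
function eipSwapParams(allocationId, newInstanceId) {
  return {
    AllocationId: allocationId,
    InstanceId: newInstanceId,
    // Lets the address move even though it is still attached to the old box.
    AllowReassociation: true,
  };
}

// With aws-sdk and credentials in place, the swap itself would be:
//   const ec2 = new AWS.EC2({ region: 'eu-central-1' });
//   await ec2.associateAddress(eipSwapParams('eipalloc-...', 'i-...')).promise();
console.log(eipSwapParams('eipalloc-example', 'i-example').AllowReassociation); // true
```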

I wanted to ask if there are any caveats to this approach. In particular, what will happen to users who are currently signed in?

Will Meteor just re-establish the WebSocket connection (as it would normally do if the connection was down for a moment, like on mobile), or will the users be logged out completely?
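As far as I know, the client keeps its resume token in the browser’s localStorage (under keys like `Meteor.loginToken`) and re-sends it when the DDP connection is re-established, so as long as both servers talk to the same database (where the hashed tokens live), users should stay logged in. A small sketch of the resume check, with a plain object standing in for the browser’s localStorage and made-up values:

```javascript
// Hypothetical contents of localStorage on a logged-in Meteor client.
const fakeLocalStorage = {
  'Meteor.loginToken': 'abc123',
  'Meteor.loginTokenExpires': '2099-01-01T00:00:00.000Z',
  'Meteor.userId': 'user42',
};

// On reconnect, the client can resume only while a non-expired token exists.
function canResume(storage, now = new Date()) {
  const token = storage['Meteor.loginToken'];
  const expires = new Date(storage['Meteor.loginTokenExpires']);
  return Boolean(token) && expires > now;
}

console.log(canResume(fakeLocalStorage)); // true
```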

And is there a way to handle this better, without changing the overall infrastructure too much?

3 Likes

I’d rather prepare to scale horizontally, especially if you don’t know how many users to expect. But that will require changes to your infra.

2 Likes

Yes, I know. But that’s risky on such a short time-frame. What would be the easiest way to accomplish this with mup?

I don’t use mup, but there should be a load-balancing option there with sticky sessions.

I’m using the load balancer and auto-scaling features from Google Cloud, and they work well with Meteor. Normally I run only one Meteor instance, but when the load increases, it automatically adds more instances.
The only bottleneck here is the MongoDB database.

1 Like

Atlas autoscales by default

2 Likes

I’m using a local database. Our database is a little bit large, so it’s cheaper to set up a replica set.

1 Like

Use mup with mup-aws-beanstalk and add a load balancer, which IMHO is the right way to go with AWS.

1 Like

Thanks, that looks pretty straightforward.

Questions here:

  • You mention setting up a load balancer. Is this a separate step, or is it done automatically by mup / mup-aws-beanstalk?
  • How do I connect my domain to this setup? Mine is not hosted at Amazon, so I currently have an A record that points to the Elastic IP of the single instance I am using.
  • How are the instances configured on AWS anyway? How does AWS know which OS, disk space, etc. I need? I don’t see a way to define a “template instance” for this.
  • Am I able to ssh into these instances? I have to do that once in a while to clean up and regain disk space.
  • The guide mentions that the default instance is a t2.micro. That seems a bit small to me, or is the horizontal scaling really that effective? Our app does not do complex stuff in the background, but we have a couple of subscriptions.

Are there any other steps that have to be done in the AWS console to setup Beanstalk itself, that are not mentioned in this guide?

2 Likes
  • You mention setting up a load balancer. Is this a separate step, or is it done automatically by mup / mup-aws-beanstalk?

Yes, Beanstalk will create it for you.

  • How do I connect my domain to this setup? Mine is not hosted at Amazon, so I currently have an A record that points to the Elastic IP of the single instance I am using.

Point your DNS at the load balancer. Note that a load balancer exposes a DNS name rather than a fixed IP, so use a CNAME (or a Route 53 alias record) instead of an A record.

  • How are the instances configured on AWS anyway? How does AWS know which OS, disk space, etc. I need? I don’t see a way to define a “template instance” for this.

mup-aws-beanstalk will set this up for you with settings appropriate for Meteor. Since you don’t have a lot of time, I wouldn’t worry about this.

  • Am I able to ssh into these instances? I have to do that once in a while to clean up and regain disk space.

I don’t remember if that is an option, but are you storing files locally? If you are going to see a burst of users, it may be better to store them in S3 or something similar.

  • The guide mentions that the default instance is a t2.micro. That seems a bit small to me, or is the horizontal scaling really that effective? Our app does not do complex stuff in the background, but we have a couple of subscriptions.

As you are only going to need this for a short time, I would spend some money on extra capacity, so I wouldn’t go with a micro.
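For reference, the instance type and fleet size live in the mup config. A minimal fragment, not a full config; the option names are as I recall them from the mup-aws-beanstalk docs, so double-check against your version of the plugin:

```javascript
// Minimal mup.js fragment: bump the default t2.micro and allow the fleet to grow.
module.exports = {
  app: {
    type: 'aws-beanstalk',   // managed by the mup-aws-beanstalk plugin
    name: 'app_name',
    path: '../',
    region: 'eu-central-1',
    instanceType: 't2.large', // larger than the t2.micro default
    minInstances: 2,
    maxInstances: 6,
    // ...env, auth, etc. go here as in a full config
  },
  plugins: ['mup-aws-beanstalk'],
};
```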

One question: how many users are you expecting?

One problem that horizontal scaling is not going to help with is the oplog: these users are going to cause a lot of writes, and as each server tries to catch up with all the changes happening in the oplog, this could bring your app down.

2 Likes

Thanks for the quick response!

How do I find this address? Normally, I would use the Elastic IP I have set up. But if mup does everything automagically, where should I look for the load balancer address?

My files are in S3. Still, I’ve had to ssh in quite a few times to free up space using docker prune; old images used to eat my disk space over time until deploys stopped working. If this cleanup is no longer necessary with mup-aws-beanstalk, that would be amazing.

Tbh, I just don’t know. It’s a collaboration with a large music brand, so quite a few people could access the app once their social posts go out. After that, the spike will quickly die down again, because we’re doing an outdoors experience and people have to go out first.

Yeah, I know I should have implemented Redis. But I have to rely on Atlas now, since I can’t change the infrastructure that much on such short notice. I hope it will perform well enough.

I don’t think you need an IP; read more here: Route53 + Beanstalk config


You shouldn’t need to clean the disk, yes, you are good.


The problem is not Atlas; the problem is your application nodes (containers) catching up with the oplog. Even horizontal scaling doesn’t solve this, so in this case I would encourage you to go with a large instance instead of a small one, as the oplog is consumed in full by every container individually (that is the big drawback of the MongoDB oplog).
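To make that fan-out concrete, a tiny back-of-the-envelope sketch (the numbers are invented): because every app instance tails the entire oplog, the total tailing work grows linearly with the number of instances.

```javascript
// Each app instance processes every oplog entry, so the fleet-wide work is
// a simple product of write rate and instance count (hypothetical numbers).
function oplogEntriesProcessedPerSecond(writesPerSecond, appInstances) {
  return writesPerSecond * appInstances;
}

// 500 writes/s across 4 instances = 2000 entries processed per second
// fleet-wide; adding instances multiplies the work instead of dividing it.
console.log(oplogEntriesProcessedPerSecond(500, 4)); // 2000
```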

1 Like

You’re right. I need to use Redis oplog instead.

1 Like

Whoa, I wasn’t aware of this. I knew it would be better to have Redis, but that horizontal scaling is not even possible with the MongoDB oplog is news to me.

I didn’t read what @filipenevola said that way – just that each app instance needs to fully tail the oplog, which must be taken into account when calculating the load on MongoDB.

I’ve never had such a popular app, but I get the impression that the scaling wall with tailing MongoDB is reached when there is so much activity in the database that the app instances can’t keep up with the oplog firehose.

Also: congratulations, and good luck! =)

3 Likes

Thanks :slight_smile:

A big shout-out to Filipe who I had a live session with today and who is supporting me on this.

3 Likes

Here is a sample mup.js I use that you can start from; it covers a couple of the things you asked about.

module.exports = {
  app: {
    // Tells mup that the AWS Beanstalk plugin will manage the app
    type: 'aws-beanstalk',
    name: 'app_name',
    path: '../',
    region: 'eu-central-1', // your preferred region
    forceSSL: true,
    env: {
      METEOR_PROFILE: 2000,
      ROOT_URL: 'https://www.your_domain.com',
      MONGO_URL: 'xxxx',
      MONGO_OPLOG_URL: 'xxxx',
      MONGO_SOMETHING_ELSE: 'xxxxx',
      PRERENDER_SERVICE_URL: 'xxxxxxx',
      CDN_URL: 'xxxxx',
      MONTI_APP_ID: 'xxxxx',
      MONTI_APP_SECRET: 'xxxxx',
      // MONTI_EVENT_STACK_TRACE: true,
      AWS_ACCESS_KEY_ID: 'xxxxxx',
      AWS_SECRET_ACCESS_KEY: 'xxxxxx',
      AWS_S3_REGION: 'eu-central-1',
      MAIL_URL: 'smtps://xxxxxxx@email-smtp.eu-central-1.amazonaws.com:465',
      PASSWORDLESS_VALIDITY_MINUTES: 3,
      LOGIN_EXPIRATION_IN_DAYS: 30
    },
    auth: {
      id: 'xxxxx',
      secret: 'xxxxxx',
    },
    customBeanstalkConfig: [
      {
        namespace: 'aws:autoscaling:trigger',
        option: 'LowerThreshold',
        value: '1'
      },
      {
        namespace: 'aws:elasticbeanstalk:cloudwatch:logs',
        option: 'StreamLogs',
        value: 'false'
      },
      {
        namespace: 'aws:elasticbeanstalk:command',
        option: 'DeploymentPolicy',
        value: 'AllAtOnce'
      },
      {
        namespace: 'aws:ec2:instances',
        option: 'InstanceTypes',
        value: 't2.nano' // your preferred size
      },
      {
        namespace: 'aws:ec2:instances',
        option: 'EnableSpot',
        value: 'false'
      },
      {
        namespace: 'aws:autoscaling:updatepolicy:rollingupdate',
        option: 'RollingUpdateEnabled',
        value: 'false'
      },
      {
        namespace: 'aws:elasticbeanstalk:environment',
        option: 'LoadBalancerIsShared',
        value: 'true'
      },
      {
        namespace: 'aws:elbv2:loadbalancer',
        option: 'SharedLoadBalancer',
        value: 'arn:aws:elasticloadbalancing:eu-central-1:.......add it here once you have it built.'
      },
      {
        namespace: 'aws:autoscaling:launchconfiguration',
        option: 'RootVolumeSize',
        value: '8' // your disk size
      }
    ],
    minInstances: 1,
    maxInstances: 1,
    oldVersions: 2 // will store this number of previous versions of the app so you can revert to from the AWS console
  },
  plugins: ['mup-aws-beanstalk']
}

When you have the balancer running, there are some policies that need to be updated so that your health checks stay green. If you get there and you are on red, let me know.

3 Likes

Thank you so much for sharing!

3 Likes

I’ve read it in the same way as you. Would be good if Filipe could clarify what he meant exactly.

Same strategy here. We currently set the maximum number of instances to 4, and it scales down accordingly as well (minimum is 1 instance).

The right server size largely depends on the usage you observe, as memory and CPU don’t increase at the same rate when more users are added (at least not for me). I observe that memory increases only a little, but CPU spikes will happen, as there’s still one file-reading function left that is quite CPU-intense and needs to be ported to Lambda as well (like the others).

We have now set up Redis oplog on our servers. However, most likely under the load of our campaign with Linkin Park that just went live, a user reported this error message:

[screenshot of an error message mentioning “IDBDatabase”]

This must have happened on the landing page when we retrieve data for it. I’ve never seen this type of error message before, so I assume it comes from Redis. We’re not using any “IDBDatabase” ourselves.

The error was not persistent, and the app works just fine right now. But still, I’m a bit worried about this, as this campaign will run for quite some time and got us quite some traction.

Is anybody familiar with this message who could point me to the root cause?

BTW: I absolutely loved how smoothly the Redis oplog setup went with the code from the cultofcoders guys and @zodern’s mup-redis. That was just amazing. Thank you so much!

2 Likes