Scaling Meteor on AWS for a short burst

I didn’t read what @filipenevola said that way – just that each app instance needs to fully tail the oplog, which must be taken into account when calculating the load on MongoDB.

I’ve never had such a popular app, but I get the impression that the scaling wall with oplog tailing comes when there is so much activity in the database that the app instances can’t keep up with the oplog firehose.
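To make that concrete, here is a rough sketch (plain Node MongoDB driver; the 'myapp' database-name filter is made up) of what tailing means in practice; every app instance opens a cursor like this and has to read every single write:

const { MongoClient } = require('mongodb');

async function tailOplog() {
  const client = await MongoClient.connect(process.env.MONGO_OPLOG_URL);
  const oplog = client.db('local').collection('oplog.rs');
  // Tailable cursor over the replica set's oplog, the same mechanism Meteor uses.
  const cursor = oplog.find(
    { ns: /^myapp\./ },
    { tailable: true, awaitData: true }
  );
  for await (const entry of cursor) {
    // Every instance receives every entry, even for data it never publishes.
    console.log(entry.op, entry.ns);
  }
}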

Also: congratulations, and good luck! =)

3 Likes

Thanks :slight_smile:

A big shout-out to Filipe who I had a live session with today and who is supporting me on this.

3 Likes

Here is a sample of the mup.js I use that you can start from, to set up a couple of the things you asked about.

module.exports = {
  app: {
    // Tells mup that the AWS Beanstalk plugin will manage the app
    type: 'aws-beanstalk',
    name: 'app_name',
    path: '../',
    region: 'eu-central-1', // your preferred region
    forceSSL: true,
    env: {
      METEOR_PROFILE: 2000,
      ROOT_URL: 'https://www.your_domain.com',
      MONGO_URL: 'xxxx',
      MONGO_OPLOG_URL: 'xxxx',
      MONGO_SOMETHING_ELSE: 'xxxxx',
      PRERENDER_SERVICE_URL: 'xxxxxxx',
      CDN_URL: 'xxxxx',
      MONTI_APP_ID: 'xxxxx',
      MONTI_APP_SECRET: 'xxxxx',
      // MONTI_EVENT_STACK_TRACE: true,
      AWS_ACCESS_KEY_ID: 'xxxxxx',
      AWS_SECRET_ACCESS_KEY: 'xxxxxx',
      AWS_S3_REGION: 'eu-central-1',
      MAIL_URL: 'smtps://xxxxxxx@email-smtp.eu-central-1.amazonaws.com:465',
      PASSWORDLESS_VALIDITY_MINUTES: 3,
      LOGIN_EXPIRATION_IN_DAYS: 30
    },
    auth: {
      id: 'xxxxx',
      secret: 'xxxxxx',
    },
    customBeanstalkConfig: [
      {
        namespace: 'aws:autoscaling:trigger',
        option: 'LowerThreshold',
        value: '1'
      },
      {
        namespace: 'aws:elasticbeanstalk:cloudwatch:logs',
        option: 'StreamLogs',
        value: 'false'
      },
      {
        namespace: 'aws:elasticbeanstalk:command',
        option: 'DeploymentPolicy',
        value: 'AllAtOnce'
      },
      {
        namespace: 'aws:ec2:instances',
        option: 'InstanceTypes',
        value: 't2.nano' // your preferred size
      },
      {
        namespace: 'aws:ec2:instances',
        option: 'EnableSpot',
        value: 'false'
      },
      {
        namespace: 'aws:autoscaling:updatepolicy:rollingupdate',
        option: 'RollingUpdateEnabled',
        value: 'false'
      },
      {
        namespace: 'aws:elasticbeanstalk:environment',
        option: 'LoadBalancerIsShared',
        value: 'true'
      },
      {
        namespace: 'aws:elbv2:loadbalancer',
        option: 'SharedLoadBalancer',
        value: 'arn:aws:elasticloadbalancing:eu-central-1:.......add it here once you have it built.'
      },
      {
        namespace: 'aws:autoscaling:launchconfiguration',
        option: 'RootVolumeSize',
        value: '8' // your disk size
      }
    ],
    minInstances: 1,
    maxInstances: 1,
    oldVersions: 2 // keeps this number of previous app versions so you can revert to one from the AWS console
  },
  plugins: ['mup-aws-beanstalk']
}

When you have the load balancer running, there are some policies that need to be updated so that your health checks stay green. If you get there and you are on red, let me know.
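For reference, here is a sketch of the kind of thing I mean (the namespace/option names are from the AWS Elastic Beanstalk docs; the /health endpoint is hypothetical): point the load balancer’s health check at a cheap endpoint and answer it from Meteor.

// extra entry for customBeanstalkConfig
{
  namespace: 'aws:elasticbeanstalk:environment:process:default',
  option: 'HealthCheckPath',
  value: '/health'
},

// server/health.js
import { WebApp } from 'meteor/webapp';

WebApp.connectHandlers.use('/health', (req, res) => {
  res.writeHead(200);
  res.end('ok');
});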

3 Likes

Thank you so much for sharing!

3 Likes

I’ve read it the same way as you. It would be good if Filipe could clarify what exactly he meant.

Same strategy here. I’ve currently set the maximum number of instances to 4, and it scales down accordingly as well (so the minimum is 1 instance).
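In mup-aws-beanstalk terms, a sketch of that setup could look like this (the trigger option names are from the Elastic Beanstalk docs; the thresholds are made-up examples):

minInstances: 1,
maxInstances: 4,
customBeanstalkConfig: [
  // Scale on CPU: add an instance above 75%, remove one below 25%.
  { namespace: 'aws:autoscaling:trigger', option: 'MeasureName', value: 'CPUUtilization' },
  { namespace: 'aws:autoscaling:trigger', option: 'Unit', value: 'Percent' },
  { namespace: 'aws:autoscaling:trigger', option: 'UpperThreshold', value: '75' },
  { namespace: 'aws:autoscaling:trigger', option: 'LowerThreshold', value: '25' }
]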

The size of your server depends largely on the usage you observe, as memory and CPU don’t increase at the same rate when more users are added (at least not for me). I observe that memory increases only a little, but CPU spikes still happen, as there’s one file-reading function left that is quite CPU-intensive and needs to be ported to Lambda as well (like the others).
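The offload itself is simple; a minimal sketch with the aws-sdk, where 'parse-uploaded-file' is a hypothetical Lambda function name:

const AWS = require('aws-sdk');
const lambda = new AWS.Lambda({ region: process.env.AWS_S3_REGION });

// Instead of parsing the file on the Meteor instance (CPU spike),
// hand the S3 key to a Lambda function and await the result.
async function parseFileViaLambda(s3Key) {
  const res = await lambda.invoke({
    FunctionName: 'parse-uploaded-file', // hypothetical
    Payload: JSON.stringify({ key: s3Key })
  }).promise();
  return JSON.parse(res.Payload);
}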

We have now set up Redis oplog on our servers. However, most likely under the load of our campaign with Linkin Park that just went live, a user reported this error message:

[screenshot: browser error “Failed to execute ‘transaction’ on ‘IDBDatabase’”]

This must have happened on the landing page when we retrieve data for it. I’ve never seen this type of error message before, so I assume it comes from Redis. We’re not using any “IDBDatabase” ourselves.

The error was not persistent, and the app works just fine right now. But I’m still a bit worried, as this campaign will run for quite some time and has gotten us quite some traction.

Is anybody familiar with this message who could point me to the root cause?

BTW: I absolutely loved how smoothly the Redis oplog setup went with the code from the cultofcoders guys and @zodern’s mup-redis. That was just amazing! Thank you so much.
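In case it helps anyone else: the settings.json shape for cultofcoders:redis-oplog is tiny (host and port below are placeholders for wherever mup-redis put your Redis):

{
  "redisOplog": {
    "redis": {
      "host": "127.0.0.1",
      "port": 6379
    }
  }
}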

2 Likes

How about this: failed to execute transaction on idbdatabase - Google Search

If I google this, I get info about a lot of other frameworks, but none that we are using. So I am wondering if this might be related to the recently added Redis oplog?

However, I found this now:

But this didn’t give me a real clue either.

We do use dynamic imports, which are mentioned at the beginning of that thread. Yet in this case, it will most likely be something Meteor-related.

We’re not using the Firebase SDK, though.
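One speculative angle: as far as I understand, Meteor’s dynamic-import package caches fetched modules in IndexedDB on the client, so a transient IndexedDB failure could surface through it. A cheap guard would be a retry wrapper around the import (sketch only; './Chart.js' is a placeholder module):

// Retry a dynamic import once if the first attempt fails, e.g. because
// the browser's IndexedDB cache threw a transient error.
async function importWithRetry(importFn, retries = 1) {
  try {
    return await importFn();
  } catch (err) {
    if (retries > 0) return importWithRetry(importFn, retries - 1);
    throw err;
  }
}

const { default: Chart } = await importWithRetry(() => import('./Chart.js'));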

1 Like

It’s possible, but it has limitations. If you have 2-3 Meteor instances it’s still okay, but you can see the CPU load increase on the MongoDB database instances. It’s better to use Redis oplog.
I think it would be much better if we could use the change streams feature from MongoDB; then we wouldn’t need the redis-oplog package to scale horizontally.
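For illustration, a change stream in the plain Node driver looks like this ('messages' is a placeholder collection). Unlike oplog tailing, each watcher only receives changes for what it watches, and the server does the filtering:

const { MongoClient } = require('mongodb');

async function watchMessages() {
  const client = await MongoClient.connect(process.env.MONGO_URL);
  // Change streams also require a replica set, but no access to the local db.
  const changeStream = client.db().collection('messages').watch();
  changeStream.on('change', (change) => {
    console.log(change.operationType, change.documentKey);
  });
}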

1 Like

It is not. It is related to the local storage being used by the browser. We cannot replicate it, but it still happens once or twice a month.

1 Like

Interesting. I have never seen this before. But we also did not have such a high load before. :sweat_smile:

This is not paid advertising, but I am really, really happy with Scalingo and would just like to support those guys.

It basically manages all that DevOps stuff for you, and scaling works like a charm.

I tried to get AWS Elastic Beanstalk running, but for me it was a pain; setting up Scalingo was super easy, and deploying now is just a matter of merging a branch on GitHub, with no downtime ever. :slight_smile:

1 Like

Meteor apps with many changes in the database will suffer with horizontal scaling if Redis oplog is not used, as all the containers (or servers) have to keep up with all the changes.

In this case (a lot of changes), horizontal scaling is probably not going to help much, as more instances only give you more power to change the database even more.

Note 1: this varies a lot from app to app, but it holds in all cases; what I mean is that some apps can scale horizontally much further than others, depending on how they publish their data.

Note 2: most apps will never suffer with the MongoDB oplog. The only way to discover these issues in advance is by load testing your app. This was the main topic I was helping Tom with.

1 Like

I’m using a local database. Our database is a little bit large, so it’s cheaper to set up a replica set ourselves.

For those kinds of one-off events, it can be very useful to have an additional (semi-)static or cached landing page that handles the first load. That filters out the many quick views from people who just look and go away, which instantly reduces the number of real “session” users. Serving a static or cached page is about as cheap as it gets.
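As a sketch of the cheapest variant, assuming a prebuilt landing.html in the app’s private/ directory, you can serve it with long cache headers straight from Meteor and let a CDN in front absorb the first-load traffic:

import { WebApp } from 'meteor/webapp';

// Assets reads files from the private/ directory of the app bundle.
const landingHtml = Assets.getText('landing.html');

WebApp.connectHandlers.use('/landing', (req, res) => {
  res.writeHead(200, {
    'Content-Type': 'text/html',
    // Let the browser/CDN cache it; tune max-age to your release cadence.
    'Cache-Control': 'public, max-age=3600'
  });
  res.end(landingHtml);
});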

I’ve noticed in the Showcase section of the Meteor website that many projects use a simple landing page, with the Meteor app under https://app.xxxxxx.com. This helps a bit with crawlers and general visitors, but it is completely useless if you share a link from the Meteor app itself, something like an event or a post, etc. This is why we use CDNs to deliver the Meteor bundle files.
At one point I was looking into getting dynamic imports to work with CDNs too, but as usual, I got caught up in other things and dropped it. This is one of those things with a massive impact on performance and cost savings.
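For the bundle part, the webapp package exposes a rewrite hook; a minimal sketch wired to the CDN_URL environment variable used in the mup config earlier in this thread:

import { WebAppInternals } from 'meteor/webapp';

// Prefix the bundled JS/CSS URLs so browsers fetch them from the CDN.
if (process.env.CDN_URL) {
  WebAppInternals.setBundledJsCssUrlRewriteHook(
    (url) => process.env.CDN_URL + url
  );
}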

I’m using loadbalancer and auto scaling feature from Google Cloud and it works well with Meteor. Normally I run only one Meteor instance but when the load increased, it will auto add more instances.


That’s exactly how I do it. The landing page is plain vanilla HTML and scores very high in Lighthouse. If a visitor wants to sign up for my app (which is the whole purpose of the landing page), then indeed the URL is yourDNA.family - Discover your DNA family, and while it takes a bit longer on mobile (6.4s FCP / 10.8s TTI / 46 performance score; it’s delayed by blocking 3rd-party URLs like Zendesk), it’s much better on desktop (2.6s FCP / 2.6s TTI / 68 performance score; there it’s more FontAwesome that delays it).

Memory consumption (39%) and CPU are very low on the smallest AWS EC2 server, and as this server is only used during the sign-up process (and for forgotten passwords), it never breaks a sweat or has to scale up (though it can).

So IMO it’s still best practice for Meteor to split up the landing pages (ideally optimized with vanilla HTML or e.g. Svelte) and the actual Meteor app.

I still get close to 100 in Lighthouse, both mobile and desktop, with a self-hosted Prerender. In cases such as the AWS website, where the site is massive even before a login, I feel a “landing page” is useful. Otherwise, I prefer to give the user the whole bundle “in advance”, delivered from a CDN close to the user. Try to imagine what these would look like: app.facebook.com or app.linkedin.com, etc.

1 Like