I've just written a review of the Meteor framework. I've been using it for the past several years, and I thought I'd share my thoughts and ideas on it.
I'd love to open a dialogue about the current state of Meteor, and of course if I got something wrong, please feel free to point it out so I can correct it.
That's comparing apples with oranges: just remove the pub/sub, convert them to methods, and you'll end up with similar scaling characteristics. Real-time systems are harder to scale.
I've heard this statement before. Back when NoSQL was being hyped, people were saying that tables are not the right structures, and in fact there is truth to that; that's why ORM was a big industry: you have a whole layer converting objects and their relations to tables. I've worked with both, and I prefer NoSQL; I worked with Google Bigtable and it was a relief from SQL tables.
I don't understand how this should be part of a framework. Can you refer me to a NodeJS framework that does search out of the box?
An interesting read. I’ve been using meteor for about 4 years now, and while I do have issues with a few meteor pieces, I don’t agree with many of your negative points:
regarding dynamic imports, you only need to refactor your code if you didn't build it this way to start with, so migrating from a 1.2 app does require effort, but starting a new app is pretty simple. I would like to see better documentation on using dynamic imports with routers, or at least some best-practice info here. What we're doing works, but I've no idea if it is the right approach.
Scaling is almost always hard. The only time it's "easy" is if you're building something RESTful that requires no state, and there you trade the benefits of sharing resources (observers, etc.) for the benefits of trivial scaling. I will say that we're running a B2B SaaS product with 25k users on 6 t2.medium instances (worth noting we never have all 25k users online at the same time!). Our CPU spends 99% of its time at < 10% usage. We use publications for almost everything; we've got maybe 50 publications in use, with relatively low observer reuse (since users from different accounts cannot share observers). Your comment about "solutions coming out like redis-oplog…" is a little odd, considering it has been available for quite some time now. We're actually just making the switch to redis now: our users have recently requested the ability to copy large amounts of content (> 5GB of mongo data for each copy), and this is the first time we're seeing issues with tailing the oplog that require a redis-oplog type approach.
The shrinking ecosystem is an odd one. Kadira was sold to MDG, who open-sourced the code where it was previously private, so you've actually gained something there. Additionally, there are variants of Kadira that are still maintained, with new features being added (though you do have to pick which variant you want, and they aren't compatible with each other). The meteorchef project was for tutorials, not a package (unless I'm mistaken), so it's not really part of the meteor ecosystem. I'd also say there has been a shift away from Meteor-specific packages to NPM packages which can be used in any project; this follows Meteor's general shift away from its own package manager.
While I agree that being locked into any DB technology is bad, I disagree with most of your complaints about Mongo. Mongo doesn't stop you from modelling relational data; it just gives you additional tools so you aren't forced to flatten everything. Modelling many:many relationships, particularly when there is an application-level limit, is much easier, and allowing per-user (or per-client) schemas was essentially impossible in relational databases but is trivial in Mongo. I do agree that denormalization is a pain, but this is NOT something specific to NoSQL: querying a single collection and then querying other related collections is not massively different from issuing a join in SQL; it still queries multiple sources. Most of the complaints I see about Mongo are people making the wrong schema decisions, because they previously had no choice. For example, forcing Mongo to use a fully relational model where that isn't where it excels, or, in the other direction, shoehorning your data into nested documents/arrays when that isn't the best solution either. In short, you need to spend some time thinking about your schemas.
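To make the many:many point concrete, here's a minimal sketch of the kind of modelling I mean. The collection, field names and the limit are all made up for illustration (and it assumes the accounts packages, i.e. Meteor.users, are in use), not taken from a real app:

```js
import { Meteor } from 'meteor/meteor';
import { Mongo } from 'meteor/mongo';
import { check } from 'meteor/check';

// Hypothetical collections: users can belong to many groups, groups have many users.
const Groups = new Mongo.Collection('groups');

const MAX_GROUPS_PER_USER = 10; // illustrative application-level limit

Meteor.methods({
  'groups.join'(groupId) {
    check(groupId, String);
    if (!this.userId) throw new Meteor.Error('not-authorized');

    // The many:many relation lives as an array of ids on each side,
    // so enforcing the per-user limit is a single read, not a count over a join table.
    const user = Meteor.users.findOne(this.userId, { fields: { groupIds: 1 } });
    if ((user.groupIds || []).length >= MAX_GROUPS_PER_USER) {
      throw new Meteor.Error('limit-reached');
    }

    Meteor.users.update(this.userId, { $addToSet: { groupIds: groupId } });
    Groups.update(groupId, { $addToSet: { memberIds: this.userId } });
  },
});
```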
I think you're making some assumptions here about "common web app tasks". I've built 8 different applications with Meteor, and perhaps 20 in other frameworks. I've never needed search engine functionality; the closest I've come is needing to search specific columns across multiple collections, and this is easy enough to implement. Similarly, I probably wouldn't implement a REST API in Meteor; it's not the right tool for the job. In the cases where I've done that, it's almost always easier to build it outside of a framework, so you can easily make use of infrastructure like AWS Lambda. I don't think this is specific to Meteor. I've also had great success using Slingshot with AWS: we see around 5k uploads per day of files of various sizes and it works beautifully. My one complaint is that it doesn't support multipart uploads, so you're limited to 5GB. It turns out this is a really hard limit to work around.
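For what it's worth, the "search specific columns across multiple collections" case usually ends up as a couple of regex queries behind a method. A rough sketch, with hypothetical collections and field names:

```js
import { Meteor } from 'meteor/meteor';
import { Mongo } from 'meteor/mongo';
import { check } from 'meteor/check';

// Hypothetical collections for illustration.
const Posts = new Mongo.Collection('posts');
const Comments = new Mongo.Collection('comments');

Meteor.methods({
  'search.simple'(term) {
    check(term, String);
    // Escape the user input so it is matched as a literal string, not a pattern.
    const re = new RegExp(term.replace(/[.*+?^${}()|[\]\\]/g, '\\$&'), 'i');

    // Query each collection on its searchable field(s) and cap the results.
    return {
      posts: Posts.find({ title: re }, { limit: 20 }).fetch(),
      comments: Comments.find({ body: re }, { limit: 20 }).fetch(),
    };
  },
});
```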
My biggest issues with meteor are the "weird" behaviours - due to the "magic" happening, debugging these can be a nightmare.
I mentioned that 99% of the time our CPU is < 10%; the problem is the other 1% of the time. Meteor seemingly randomly bumps to 100% CPU, and the memory usage doubles. In fact this just happened: it's 6:30 in the morning, there are 20 users online, and one of our servers hit 80% CPU and 1GB RAM (normally 300MB) for 10 minutes, then returned to normal. No reason I can see :(.
Recent versions of meteor brought some interesting bugs with them. I had to skip 1.7 entirely and go to the 1.8 beta due to a Babel bug. HCP still isn't working fully with iOS - this has been an issue for months. A recent change to the way meteor handles undefined values in mongo queries led to a security bug in my app that took a long time to find all occurrences of, and caused crazy behaviour where "meteor and mongo disagree on the number of results" - which pretty much kills your app, but in a way that requires a manual restart, because it's not actually dead.
I would like to see Blaze deliver on its promise of publishing its code to NPM. I published an NPM package with Blaze in it, but it's not the same as an official version that doesn't depend on other bespoke Meteor extractions.
Apart from these issues, Meteor has been a dream to work with - there is no way we would have been able to build the system we have in the time we did in any other framework.
These are all good points. Let me just give you my opinion on your points.
Yes, I can understand this. The newer versions of Meteor have the imports directory structure, which should make dynamic imports easier. I did in fact start my app around 1.2 and did not use the imports directory structure then, which made things much harder to change down the line. It is also my understanding that you need to use packages to make use of dynamic imports; correct me if I'm wrong.
I made this point comparing it to other non-realtime frameworks. With Django for example, I’ve just been able to throw more servers at it, put it behind a load balancer, and it’s been the end of that, whereas I had to do a fair bit more engineering to scale Meteor. I was able to scale it to thousands of concurrent users, but it was definitely harder than scaling Django for the same user load.
I've been using Monti APM myself, so I'm well aware of the developments with regard to Kadira. It's been a while since I looked into the original package, but the last time I hosted it myself, it would stop working every few days because of data pile-up, and there was no documentation on resolving that issue. I'd say tutorials, meetups, etc. all form part of the ecosystem of any project, not just libraries. I did point out the NPM support, however.
I still think MongoDB is overused and it's often the wrong tool for the job, as most data fits the SQL model better. JOINs don't require you to fetch the ids and then fire another query, which saves the multiple network round trips that Mongo requires.
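To illustrate the round-trip point, here's a hedged sketch of what a single SQL JOIN typically turns into on the application side with Mongo; the collections and fields are hypothetical:

```js
import { Mongo } from 'meteor/mongo';

// Hypothetical collections for illustration.
const Orders = new Mongo.Collection('orders');
const Customers = new Mongo.Collection('customers');

// In SQL this is one round trip:
//   SELECT o.*, c.name
//   FROM orders o JOIN customers c ON c.id = o.customerId
//   WHERE o.status = 'open';

// With Mongo the application usually issues two queries...
const orders = Orders.find({ status: 'open' }).fetch();                   // round trip 1
const customerIds = [...new Set(orders.map(o => o.customerId))];
const customers = Customers.find({ _id: { $in: customerIds } }).fetch();  // round trip 2

// ...and then stitches the results together in memory.
const byId = new Map(customers.map(c => [c._id, c]));
const joined = orders.map(o => ({ ...o, customer: byId.get(o.customerId) }));
```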
I'll give you the search engine point; that's not an ultra-common requirement. But it's still something most other frameworks have some package for and Meteor doesn't. I'm personally not a fan of microservice architecture; I've been burned by that before. I'd much rather use a framework where I can implement my REST API along with all my other code, in one single place. I tried Slingshot, and I use S3 myself, but I decided to just proxy the uploads through a service I created myself, as it was easier.
Thanks for the reply. Regarding (1), you do require the dynamic-import package, but the imports themselves are fine-grained, on a per-file basis. We've used it at the view level, so we have a routes file that defines all our routes (for a specific section), and the triggersEnter function loads the view file, waits for it to be available and renders it, like this:
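Something along these lines - I've swapped in placeholder route, file and template names, and it assumes FlowRouter with BlazeLayout, which may differ from your setup:

```js
import { FlowRouter } from 'meteor/kadira:flow-router';
import { BlazeLayout } from 'meteor/kadira:blaze-layout';

FlowRouter.route('/reports', {
  name: 'reports',
  // The trigger pulls in the view bundle on demand and only renders
  // once the dynamic import has resolved.
  triggersEnter: [
    () => {
      import('/imports/ui/pages/reports/reports.js').then(() => {
        BlazeLayout.render('mainLayout', { content: 'reportsPage' });
      });
    },
  ],
  action() {
    // No-op: rendering happens in the trigger, after the import resolves.
  },
});
```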
Things become a little trickier if you want to dynamically import a package; it requires that the package has declared its assets to be "lazy", and this is not something I've looked into too much.
(2) Hmm, perhaps when we hit many thousands of concurrent users we'll have the same problems, but as of right now the workflow sounds similar to your Django example: we use an AWS application load balancer and just register new servers with the relevant target group. I will grant you that deployment is not trivial; we modified the MUP beanstalk package to work directly with AWS balancers (our URL structure requires that two distinct applications share one load balancer).
(3) Most likely your problem is a lack of indexes. By default the self-hosted Kadira didn't define any indexes, and I 100% agree, this took ages to figure out. But we've been using it to host data for 12 apps (4 dev, 4 staging, 4 production) for nearly 2 years, and we've had no problems after about the first month. If you're interested, I'd be happy to share the indexes we defined.
(4) The round-trip point is noted; we've had a little fun deciding where to trade off network delays vs overloading the DB, for example when aggregating large sets of values.
I think that is not a fair comparison; you're comparing apples with oranges. Just remove pub/sub and use Meteor RPC methods and you have very similar scaling characteristics, if not better ones thanks to Node's event-loop advantage with I/O. But even scaling real-time Meteor horizontally was not that hard; I also just placed it behind a load balancer and spawned more instances.
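To make that concrete, here's a rough sketch of the same data exposed both ways, with a hypothetical tasks collection; the method version drops the live observer and behaves like any other request/response Node endpoint:

```js
import { Meteor } from 'meteor/meteor';
import { Mongo } from 'meteor/mongo';

const Tasks = new Mongo.Collection('tasks'); // hypothetical collection

// Real-time: the server keeps an observer open per subscription,
// which is the part that makes horizontal scaling more involved.
Meteor.publish('tasks.mine', function () {
  return Tasks.find({ ownerId: this.userId });
});

// RPC-style: a plain request/response method, no observer kept around,
// so it scales like any other stateless Node endpoint.
Meteor.methods({
  'tasks.mine'() {
    return Tasks.find({ ownerId: this.userId }).fetch();
  },
});
```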
The hype is tilting back to SQL, but a few years back, when NoSQL was at the peak of its hype, people made the exact opposite arguments. I personally don't miss the days of using ORMs and writing complex SQL migration queries. The first NoSQL DB I used was Bigtable by Google, and I've never looked back since.
Which other NodeJS framework offers search out of the box?
I didn’t use the imports directory, so I needed to use dynamic imports on packages instead.
I used Heroku with sticky sessions, MUP was a hassle to get working and ultimately I decided it wasn’t worth it.
I just ended up paying someone else to host Kadira for me, but thanks for the offer. Perhaps others in the community might benefit from your knowledge though.
Yes, I agree, with just RPC the scaling characteristics would be more in line with any other Node framework. With realtime users subscribing to a lot of data, it did become a challenge. It’s not impossible to scale it, just more expensive in my experience. I ran some benchmarks with SocketCluster and Meteor pub/sub and there were significant differences.
Yes, I have seen people going back to SQL these days. I am not a fan of migrations either. What other benefits of NoSQL make you stick with it?
I don't know of any Node framework that provides search out of the box. Meteor used to have a package for it a couple of years ago, but it was abandoned long ago.
I'm not sure why the imports directory would be necessary. If you're not using the imports directory, are you manually specifying the entry points? I didn't think that behaviour was still supported after 1.7, with the preference being to specify your own entry points so you can control the load order.
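For reference, the entry points I mean are declared via meteor.mainModule in package.json (available since 1.7); a minimal sketch, where the paths are whatever your project actually uses:

```json
{
  "meteor": {
    "mainModule": {
      "client": "client/main.js",
      "server": "server/main.js"
    }
  }
}
```

With this in place, only the modules reachable from those two files are loaded eagerly, and everything else has to be imported (statically or dynamically).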
Yes, that's exactly it; the way to look at Meteor scaling is the cost of hardware versus the cost of developing the pub/sub system. At the beginning you want to minimize the cost and time of development, and since you're at a small scale you can get away with little hardware; but once you scale and your idea gains traction, you start wanting to optimize for the hardware.
I came from Java, and initially I was really frustrated that I had to convert JSON from the UI to Java objects and then convert those to SQL through an ORM; I thought that was really inefficient. The other thing is I used to spend tons of time writing queries to update the schema when the table design was in flux, so I thought that was inefficient as well. Lastly, migrations with SQL were always dreaded. Because I like speed and quick iteration, I really enjoyed managing the schema at the application level. With Mongo, I'm dealing with JSON all the way from the UI to the DB, which seems very natural to me. However, SQL does have advantages in that RDBMSs are very well researched and understood, the query language is standardized, and the schema is enforced by the DB. It really depends on what your priorities are, but so far managing relationships has not been an issue for me with NoSQL.
Sounds promising, but as I said, I had such bad experiences (defects) with ORMs (Hibernate in Java) that I decided to avoid this layer of abstraction whenever possible. Right now, when I inspect in Chrome I see JSON, when I inspect the socket I see JSON, when I log on the server I see the same JSON, then I store it in the DB and it's still the same JSON, and even when I export to a file it is the exact same thing! Furthermore, to query in MongoDB I use JSON criteria! That is one less thing to think about. No wonder Amazon re-implemented the MongoDB API in DocumentDB.
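A tiny example of what I mean by the criteria themselves being JSON, with a hypothetical collection and fields:

```js
import { Mongo } from 'meteor/mongo';

const Orders = new Mongo.Collection('orders'); // hypothetical collection

// The selector is itself just a JSON-style object, the same shape of data
// that travelled from the UI over the socket and into the DB.
const openGermanOrders = Orders.find(
  { status: 'open', 'customer.country': 'DE' },
  { fields: { status: 1, customer: 1 } }
).fetch();
```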