Is MeteorJS good for writing an analytics platform?

striletskyy · July 21, 2015, 12:47pm

Is MeteorJS good for writing an analytics platform such as:

Thanks,
Mykola

SkinnyGeek1010 · July 21, 2015, 12:57pm

You would need to use another database suited to high velocity data to capture the data calls. Mongo isn’t (likely) going to work well for this.

However, I think it would make a good platform as long as you don’t need realtime data (otherwise you’ll need to copy data over to Mongo from the high-velocity database).

ffxsam · July 22, 2015, 2:21am

Just curious, why is Mongo not a good choice here?

SkinnyGeek1010 · July 22, 2015, 3:31pm

I don’t have any experience with this directly, but there are other databases that can handle high-velocity writes much faster. Meteor’s realtime data would be problematic with all the updates.

For example, if a customer has 500 writes a second, Meteor would try to push every one of those updates out in realtime. In reality, one update every 10 seconds would be more than enough for most use cases, and one every second would be sufficient on a live stream page. You might be able to do something like copy data over every second for all logged-in users. At scale this would be more performant than reading the DB in realtime.
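As a rough sketch of the throttling idea above: incoming writes are only counted in memory, and one summary is published per interval. This is plain JavaScript, not Meteor API; the `WriteCoalescer` class, the `publish` callback, and the `customer-a` id are hypothetical names for illustration. In a Meteor app, `publish` could plausibly be a single write to a published Mongo collection.

```javascript
// Hypothetical helper: accumulate high-frequency events in memory and
// publish one summary per interval instead of one update per write.
class WriteCoalescer {
  constructor(publish) {
    this.publish = publish;   // callback that receives one summary per flush
    this.counts = new Map();  // customerId -> events since the last flush
    this.timer = null;
  }

  // Called for every incoming write; no database or client update here.
  record(customerId) {
    this.counts.set(customerId, (this.counts.get(customerId) || 0) + 1);
  }

  // Publish the accumulated counts once, then start a fresh window.
  flush() {
    if (this.counts.size === 0) return;
    const summary = Object.fromEntries(this.counts);
    this.counts.clear();
    this.publish(summary);
  }

  // Flush on a fixed interval (1 second by default).
  start(intervalMs = 1000) {
    this.timer = setInterval(() => this.flush(), intervalMs);
  }

  stop() {
    clearInterval(this.timer);
    this.timer = null;
  }
}

// Usage: 500 writes collapse into a single published update.
const published = [];
const coalescer = new WriteCoalescer((summary) => published.push(summary));
for (let i = 0; i < 500; i++) coalescer.record('customer-a');
coalescer.flush(); // in production, start(1000) would call this every second
```

The trade-off is the one described above: clients see data that is up to one interval old, in exchange for far fewer reactive updates.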

For example here’s a benchmark for Cassandra (from their site):

@arunoda would be able to give you more concrete reasons as he’s doing something similar with Kadira.

arunoda · July 22, 2015, 3:40pm

I don’t think Meteor is a good fit for the data collection side of such a platform. You need another lightweight process written in Node/Java/Go to collect the data.

Then start with a DB you are already familiar with, and try to scale later. First identify your data load patterns, then decide on a DB that fits your needs.
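To make the "lightweight collector process" idea concrete, here is a minimal Node sketch using only the built-in `http` module. The `/collect` path, the `createCollector` and `startServer` names, and the batch size are assumptions for illustration; `writeBatch` stands in for whatever database insert you end up choosing.

```javascript
const http = require('http');

// Hypothetical buffer: hold events in memory and hand them to the
// database layer in batches rather than one insert per request.
function createCollector(writeBatch, batchSize = 100) {
  let buffer = [];
  function flush() {
    if (buffer.length === 0) return;
    const batch = buffer;
    buffer = [];
    writeBatch(batch);
  }
  function add(event) {
    buffer.push(event);
    if (buffer.length >= batchSize) flush();
  }
  return { add, flush };
}

// Minimal HTTP endpoint: POST a JSON event to /collect.
function startServer(collector, port) {
  const server = http.createServer((req, res) => {
    if (req.method !== 'POST' || req.url !== '/collect') {
      res.writeHead(404);
      res.end();
      return;
    }
    let body = '';
    req.on('data', (chunk) => { body += chunk; });
    req.on('end', () => {
      try {
        collector.add(JSON.parse(body));
        res.writeHead(202); // accepted; stored later as part of a batch
      } catch (err) {
        res.writeHead(400); // body was not valid JSON
      }
      res.end();
    });
  });
  return server.listen(port);
}

// Usage without a network: three events with a batch size of two.
const batches = [];
const collector = createCollector((batch) => batches.push(batch), 2);
collector.add({ type: 'pageview' });
collector.add({ type: 'click' });
collector.add({ type: 'pageview' });
collector.flush(); // push out the remaining partial batch
// To serve real traffic: startServer(collector, 3000);
```

Because the storage layer is just a callback, the DB behind `writeBatch` can be swapped later once the load patterns are known.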

**All these benchmarks are marketing tricks; don’t ever trust them.** We tested more than 10 DBs with our own data loads and all had issues in one way or another. So, it’s a game of what you get and what you sacrifice.

awatson1978 · July 22, 2015, 4:52pm

Am I using the same Mongo platform as you and Arunoda? Because this is totally at odds with my experience developing an IoT Google Analytics clone for over a year.

Arunoda makes a good point that the realtimeness of the Meteor stack is overkill for the data collection. But that’s also exactly why MDG created the meteor-platform and webapp packages… so people could create their own lighter-weight alternatives without Blaze, Tracker, etc.

We had a lighter-weight Express process collecting our data. The issue that eventually caused the company to migrate from Meteor to MEAN was that these were separate architectures, and they preferred rewriting the front end in Angular to rewriting the backend as a Meteor REST applet.

But otherwise, Mongo was a dream for creating aggregation buckets, writing map/reduce pipelines, and handling our NFC/IoT data. Mongo was the one part of our stack that nobody had a problem with. And I’m in total agreement with Arunoda… Don’t trust those metrics; they’re all fuzzy. The important metrics are how Mongo compares to an SQL clustering solution like MSSQL or Oracle!
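For readers unfamiliar with "aggregation buckets", here is a small in-memory sketch of the idea: events are grouped per sensor and per minute, and each bucket keeps a count, sum, and average. The event shape and the `bucketByMinute` name are hypothetical, and this is plain JavaScript rather than the actual pipeline described above; in Mongo itself, the same grouping would typically be expressed with a `$group` stage in an aggregation pipeline.

```javascript
// Hypothetical event shape: { sensorId, timestamp (ms since epoch), value }.
function bucketByMinute(events) {
  const buckets = new Map();
  for (const { sensorId, timestamp, value } of events) {
    const minute = Math.floor(timestamp / 60000) * 60000; // start of the minute
    const key = `${sensorId}:${minute}`;
    const bucket = buckets.get(key) || { sensorId, minute, count: 0, sum: 0 };
    bucket.count += 1;
    bucket.sum += value;
    buckets.set(key, bucket);
  }
  // Derive the average once per bucket after all events are counted.
  return [...buckets.values()].map((b) => ({ ...b, avg: b.sum / b.count }));
}

// Usage: two readings fall in the first minute, one in the second.
const sample = [
  { sensorId: 'nfc-1', timestamp: 0, value: 10 },
  { sensorId: 'nfc-1', timestamp: 30000, value: 20 },
  { sensorId: 'nfc-1', timestamp: 60000, value: 30 },
];
const result = bucketByMinute(sample);
console.log(result);
```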

Meteor for an analytics front end is… well, for certain use cases involving pre-fetched data, data-splicing, data-recombination, and data visualization on the client… well, there’s nothing else on the market like it. The Clinical Meteor Track has recently got funding and sponsorship to take ownership of a tool called Data Fusion, which will mash up genomics and biometrics data. We couldn’t create this product on any other platform, because it relies heavily on document-oriented unstructured data and client-side replica sets. But if you don’t need data fusion or data splicing, minimongo may be overkill.

One note: we found that programming the realtime analytic pipelines involved something like 8+ windows open at any time… observer.js, pub.js, sub.js, component.js, graph.js, server console, browser console, browser, robomongo, tests. Plus research, api docs, etc, could easily put us at 10 or 12 windows, and we were pushing 6 megapixels just to program it. Dual Thunderbolt monitors worked well; but a single Thunderbolt made development difficult. There is workstation equipment overhead for developing such an app. It can’t be developed on laptops.

khamoud · July 22, 2015, 5:06pm

The issue with high-velocity data and Meteor that I have personally faced is livedata. If you have high-velocity data along with publications, Meteor is not horizontally scalable. Your CPU usage will spike on all of your instances, because each Meteor server will try to keep up with the oplog, and eventually every instance will crash. If you don’t have oplog tailing, this process will be slower but will ultimately lead to the same outcome. You can get rid of publications and move the high-velocity data to a separate database, as we had to do, but then you lose the realtime aspect of Meteor. If the point of your application is to be realtime, then you would probably be better off using a different framework/db/stack.