Meteor Data graphing app with HTTP REST system. Need help

The Git repository for this project is at echo-project-work echo-project. Alright, I will dive right in. I am creating a website that will host data from portable weather stations. Each station will send its data with an Arduino. The Arduino will have many sensors hooked up, plus a second shield: a 3G/GPRS shield that can send HTTP/FTP/FTPS/TCP/UDP/HTTPS/SMS/email. It all seems to be driven by AT commands, which I have no idea about, but my job is the website. Although I do need help sending the data as well, my two main questions are:

  1. How do I receive this data, save it into one specific array in my collection, and save the time it came in?
    Here you can see example code of what each Node will hold. I was also wondering if I can set an _id here as well.

    if (Nodes.find().count() === 0) {
      Nodes.insert({
        name: "OSU Node 1",
        gps: '134.234.78.545',
        humidity: [80,81,82,83,85,86,87,88,89,88],
        temp: [25,56,67,65,65,66,45,50],
        dew: [80,87,89,86,87,89,80,84,79],
        speed: [100,101,103,110,150,134,112,123,124],
        direction: [120,90,180,275,360,120,123,124,150],
        pressure: [33,32,33,34,35,32,32,31,32,31,33,35]
      });
    }
    The data will be coming in every 4 seconds, but only one data point for each array. I know if I do
    Nodes.update({_id: Router.current().params._id},{$push: {humidity: {$each: [#],$position: 0}}});

That I will get a new number at the front of the array. I saw in someone else’s code a limit param and was wondering how I could limit my arrays to only be up to 7500 points of data each. And as new data comes in the oldest data is deleted.

2. I have been looking into Highcharts but can't seem to figure out all the small coding nuances. I need to graph each of these arrays on its own page, on its own Node's page. I have seen a lot of work from @robfallows and have questions to ask on other forum pages, but since this is larger than a reply to another forum question, I figured I would ask a more complete question.

You can see what the site currently looks like at Echo-project-website. If it is not up, I just have it down for a bit for a meteor reset. I have been testing array limitations.

You could use the mqtt protocol for this. With mqtt-collections you can save, for example, only the last entry, but I'm not sure if you can limit it to something like 7500 … actually I don't think so.

Here's a blog post about my experience with mqtt. There's also a second part and one about graphing the data.

That I will get a new number at the front of the array. I saw in someone
else’s code a limit param and was wondering how I could limit my arrays
to only be up to 7500 points of data each. And as new data comes in the
oldest data is deleted.

Actually MongoDB has something built in for this called expire-data (TTL). @robfallows talked about this in mentor sessions 3 and 4, so this is perhaps worth a look.
https://docs.mongodb.org/manual/tutorial/expire-data/

@sakulstra The TTL is a fine idea, except I do not want the data to be deleted after any amount of time. I need it to stay there until the array reaches the 7500 limit. I will just go with my original plan of editing the collection's array when new data is introduced.
Also, mqtt seems like a great idea but is not what I am looking for. I am running this project on an actual server on DigitalOcean, on Ubuntu 14.04 with nginx, although I have not deployed it yet. I need the data saved so website visitors can download it. But thank you for the suggestion.
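
For reference, the "original plan" of trimming the array whenever new data arrives can be sketched with MongoDB's standard $push modifiers: $each, $position and $slice. This is only a sketch under assumptions: the helper names and nodeId are made up for illustration, and the update is assumed to run on the server.

```javascript
// Hypothetical helper (not from the project): builds the update modifier
// for one new value on one array field of a Nodes document.
function cappedPushModifier(field, value, maxPoints) {
  return {
    $push: {
      // $each is required whenever $position or $slice is used.
      [field]: { $each: [value], $position: 0, $slice: maxPoints }
    }
  };
}

// Plain-JavaScript picture of what that modifier does to the stored array:
// insert at the front, then keep only the first maxPoints elements.
function pushFrontCapped(points, value, maxPoints) {
  return [value, ...points].slice(0, maxPoints);
}

// Assumed usage on the server (nodeId is hypothetical):
// Nodes.update({ _id: nodeId }, cappedPushModifier('humidity', 84, 7500));
```

With the new value at position 0 and a positive $slice, the oldest values fall off the end of the array once it reaches the limit.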

Is it a strict limit of 7500? If not you can

  • calculate an approximate expiry time, if you are certain that data comes every 4 seconds (yeah, this would be a really ugly workaround)
  • do it via some cronjob

I don't see why DO is an argument against mqtt. I don't want to convince you, as I'm not experienced enough with mqtt, but running mosquitto on DigitalOcean seems like the easiest thing on earth.

Like I said, I am not trying to send just messages holding pieces of data. I need it to accept an ID, name, GPS value, and one integer for each array in each send and receive. Also using this I would have to rework my whole project and debug any issues which I already have 100 hrs in. I am also trying to learn industry-applicable information, and I don't see mqtt as something I would use while working for a company making their website. Yes, this is possible, but I am going for a more general approach to this project. I don't want to get into an argument over this, but I am not going to use mqtt for this project.


I sent a PM regarding this. I’ve been tied up in meetings for much of today, but am working on some code suggestions which are almost ready to share. However, I will say up front that it requires an about turn in the way you’re currently architecting this which will make it smaller, simpler and more performant.

However, if you take my feedback on board you …

would have to rework my whole project and debug any issues which I already have 100 hrs in.

If that’s not an option for you, then I’ll stop now.

Is there any way you could send a small snippet of how the architecture would work? What exactly would it be changing? I so far do not have a great understanding of Sessions, and Tracker. Or maybe if you could just explain what you would be restructuring.

I’m struggling to make time to structure my current ramblings about code and templates. However, the data part is fairly straightforward, so I thought I’d update with this for now.

Caveat: I don’t know your application requirements, so much of this is speculation on my part.

Database

I’d approach the data architecture from the requirements. For example, if you intend that the detail graphs from a weather station are to be static (they render when you instantiate them, but don’t update reactively), then an array of data points per document may sound reasonable. It will almost certainly be marginally quicker to get one well structured document than to get 7500 documents and generate the Highcharts data.

However, if the charts are to update reactively every 4 seconds then it’s way more efficient to send only the changes (new data points) than to resend a new fully populated document. In addition, Meteor does not send differences to subdocuments (one element change would resend the entire array). Also, MongoDB doesn’t make it easy to work with arrays (especially with the circular buffer concept which you seem to be proposing). That may mean updating a weather station’s document every 4 seconds with a new array. Doing upserts that frequently on large-ish documents sounds expensive.

My rule of thumb is that it’s generally much better to put database functionality into the database, rather than in the code. So, I would have one document per weather station per timestamp and set up my publications to return the data I want. If this is done right we’ll get our (up to) 7500 documents when we instantiate the graph, as well as a new document every 4 seconds to keep it up to date. This also makes it trivial to hold the time (look at simple-schema and collection2 for methods to do much of the heavy lifting for you) and deal with sparse populations and pruning. My suggested document structure would be something like this:

{
  _id: 'xxxxxxx',
  station_id: 'OSU Node 1',
  createdAt: someTimestamp,
  humidity: 88,
  temp: 50,
  dew: 79,
  speed: 124,
  direction: 150,
  pressure: 35
}
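
As a rough sketch of how an incoming reading could be turned into a document of this shape, with the server adding the arrival time: the helper name, the payload field names and the way the HTTP request reaches the server are all assumptions, not something taken from the project.

```javascript
// Sensor fields taken from the document structure above.
const SENSOR_FIELDS = ['humidity', 'temp', 'dew', 'speed', 'direction', 'pressure'];

// Hypothetical helper: turns a parsed request body into a Readings document.
// `payload.station_id` and the sensor field names are assumptions about what
// the Arduino sends; `now` is injectable so the function is easy to test.
function buildReading(payload, now = new Date()) {
  if (typeof payload.station_id !== 'string' || payload.station_id === '') {
    throw new Error('station_id is required');
  }
  // The timestamp is set by the server, not by the device.
  const doc = { station_id: payload.station_id, createdAt: now };
  for (const field of SENSOR_FIELDS) {
    const raw = payload[field];
    if (raw === undefined || raw === null || raw === '') continue;
    const value = Number(raw);
    // Non-numeric values are skipped, so a sparse reading is still stored.
    if (Number.isFinite(value)) doc[field] = value;
  }
  return doc;
}

// Assumed usage in whatever server route receives the request:
// Readings.insert(buildReading(requestBody));
```

The _id is left out on purpose so that the database generates it.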

In addition, we need to ensure the collection has indexes for the fields we will be finding/sorting on (station_id and createdAt in the above example).
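
To see why those two fields matter, here is an in-memory stand-in for the two queries this design relies on: "newest N readings for one station" and "everything older than the newest N" (the pruning candidates). The function names are made up for illustration; the real queries would run in MongoDB, where an index covering station_id and createdAt lets the database answer them without scanning every station's documents.

```javascript
// Newest `limit` readings for one station, newest first.
function latestReadings(docs, stationId, limit) {
  return docs
    .filter(doc => doc.station_id === stationId)
    .sort((a, b) => b.createdAt - a.createdAt)
    .slice(0, limit);
}

// Readings for one station that fall outside the newest `keep` samples,
// i.e. the documents a pruning job could delete.
function staleReadings(docs, stationId, keep) {
  return docs
    .filter(doc => doc.station_id === stationId)
    .sort((a, b) => b.createdAt - a.createdAt)
    .slice(keep);
}
```

Both functions filter before sorting, so the input array is left untouched.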

I would probably use a separate collection for fixed details about each weather station. It seems a bit “light” at the moment, but if it included other details (description, date commissioned, type, etc.) it would be more useful. If the weather stations move frequently and it’s worth tracking that over a 7500 sample set (8:20 hrs) then adding the co-ordinates to each sample may be better. Otherwise, it would look something like:

{
  _id: 'OSU Node 1',
  gps: '134.234,78.545'
}

I used the name as the _id - not necessary (I could have used a separate name property), but convenient for the purpose.

Graphing

I need to restructure my notes on this to make it more coherent. Watch this space!


This is an absolutely phenomenal response and exactly the feedback I have been looking for. I was thinking of putting my data in that structure but figured it would cause problems with database size; from what you are saying, it sounds like the better choice. Also, I only want to graph the most recent 20 data points, and the GPS will be sent with each reading, because I want the data to be held on the server and the next time a station starts to transmit it may be at a new location.
I have actually graphed all the data points and created a button for a random input, but this was just to test the graphing. I do want the graphs to update with each point. The new code is on the same GitHub under each graph's js file.
And the name is just used to have a better-looking sidebar menu, but if it is unnecessary I can scrap the idea. I like the time created because then I can have that in the graph.
Ultimately I am trying to have a button to download the graph's data into an Excel file for any user who wants to download it from the database. I'm glad you see I want these nodes to hold 8 hrs of data haha.
One last question that I don't understand: how would I create a new collection if a new node wanted to connect to the website? Or will it always take an admin to add a new collection and the publish/subscribe? If that is the case, that is fine; I was just curious.

If I understand you correctly, ie as “How would I create a new collection if a new weather station wanted to connect to the application?”

You wouldn’t need to. As long as the weather station identifies itself uniquely (stationId), it just gets added to the existing collection(s).

Oh okay, so it would just be a very large collection. Wouldn't this mean long search times for finding the correct data point, since I want to search for a particular Node's most recent 20 points? I am also a little confused about how to do that, which is the main reason I switched to having arrays.

I was away for a bit with a migraine. It's still not completely gone, but I am awake now.

All right, now that the migraine is gone I was able to read everything over more carefully and read up on the links. Sorry my other reply was a block of text; I was on the phone.

I have a few other things to present and ask.
I only want the graphs to update when new data is present. Also, the Arduino is not sending the time sent, so the website will have to add the time arrived; that way users who download the data will have a time element.

I read about the schema, collection2, and autoform, although I still do not quite grasp what a schema is. It looks like a key for each element of a document.

When you say

does this mean if a stationId is the same as an existing stationId in the collection will it overwrite this one? I do not think this is the case and would hope not because I need the data for download.

And when you say Doc you mean like an SQL row. Would the data IE sensor inputs be held in the fields(SQL columns)?

I only want the graphs to update when new data is present.

That was an assumption I made.

the website will have to add the time arrived

That’s expected: the last thing you need is inconsistent timestamps between devices.

I still do not quite grasp what a schema is

It’s a description of the data: it describes what fields a document contains, and for each field it describes what it contains and any properties it has which may be used to validate it. You don’t need a schema to use MongoDB, but at the very least it encourages you to think about what and how you store your data.
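
To make that concrete, here is a hand-rolled sketch of a schema and a validator for it. It is for illustration only and is not the SimpleSchema API; the rule format (type, optional) and the function names are assumptions chosen to keep the example short.

```javascript
// Hand-rolled schema for illustration only; this is NOT the SimpleSchema API.
const readingSchema = {
  station_id: { type: 'string', optional: false },
  createdAt: { type: 'date', optional: false },
  humidity: { type: 'number', optional: true },
  temp: { type: 'number', optional: true }
};

// typeof cannot tell a Date from any other object, so handle it explicitly.
function typeOf(value) {
  if (value instanceof Date) return 'date';
  return typeof value;
}

// Returns a list of human-readable problems; an empty list means valid.
function validate(doc, schema) {
  const errors = [];
  for (const [field, rule] of Object.entries(schema)) {
    if (!(field in doc)) {
      if (!rule.optional) errors.push(field + ' is required');
    } else if (typeOf(doc[field]) !== rule.type) {
      errors.push(field + ' must be a ' + rule.type);
    }
  }
  // Fields the schema does not describe are rejected as well.
  for (const field of Object.keys(doc)) {
    if (!(field in schema)) errors.push(field + ' is not in the schema');
  }
  return errors;
}
```

A schema package does the same kind of checking before each insert or update, so badly shaped readings never reach the collection.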

does this mean if a stationId is the same as an existing stationId in the collection will it overwrite this one?

No - it will add a new document to the collection.

And when you say Doc you mean like an SQL row. Would the data IE sensor inputs be held in the fields(SQL columns)?

Collection == Table
Document == Row
Field == Column

(approximately)

Okay, this all makes sense now. I also ran into the problem with pulling the whole array for graphing, and it seems like quite the pain to deal with data this way. I am very interested to see what you have come up with.
On GitHub, I added a spline graph. This is the exact graph I will be trying to use. I will start working on how to show the GPS coordinates with Google Maps.

Huge apologies for not getting back with code sample and whatnot, but this week has been frantic and I haven’t been able to give the forums the attention they deserve! Hopefully, I’ll get back to this by Monday.

Absolutely fine. I have not been able to work on the project much; we have a presentation next week and we are writing a long paper, so don't worry.

Just wanted to say that this is a very cool project, and kudos to @robfallows for weighing in with what seems to be a couple of awesome ideas!


OK, so here’s what I have so far. I’ve taken the (possibly incorrect) opinionated stance that all charts are (say) line charts with time along the x-axis and value on the y. I haven’t given you any chart creation stuff, other than the basic instantiation and a note about where the data is coming from.

With only 20 points per chart, it’s perfectly OK to just give Highcharts another complete array of data for each chart and it will do the time shifting for you. An alternative would be to observe the collection and use the Highcharts addPoint and removePoint methods each time a new data point arrives. I’ll leave all that to you!
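
A minimal sketch of the addPoint route, assuming `series` is a Highcharts Series object (for example chart.series[0]). Series.addPoint takes the point, a redraw flag and a shift flag; when shift is true, Highcharts drops the first point as it appends the new one, so no separate removal call is needed. The helper name and maxPoints are made up for illustration.

```javascript
// Append one point and keep the series at no more than maxPoints points.
// `point` is an [x, y] pair, e.g. [timestampInMs, value].
function pushPoint(series, point, maxPoints) {
  // Only start shifting once the window is full.
  const shift = series.data.length >= maxPoints;
  series.addPoint(point, true, shift);
}

// Assumed usage when a new reading document arrives:
// pushPoint(chart.series[0], [doc.createdAt.valueOf(), doc.temp], 20);
```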

Templates and Charts

For this use case I’d try to get my charting code as generic as possible - write one template that handles all types (temp, pressure, etc). My thinking goes like this:

Start with a rough functional nesting of templates:

main
  stations
    station
      temp
      pressure
      ...
    station
      temp
      pressure
      ...
    ...

The stations template lets us choose which station to look at (“contains” a bunch of stations). The station template lets us see a bunch of graphs for that station.

Highcharts requires that the DOM be available when the chart is defined. This means we should put each chart in its own dedicated template, even if we want multiple charts on a page (as above). So maybe something like this, where we re-use the chart type as its id:

<template name="showchart">
  <div id="{{type}}"></div>
</template>

Which ensures that the Template.showchart.onRendered will correctly run when the div is available. In which case for our parent template (where we iterate over all available chart types - temp, pressure, etc.) we might have:

<template name="station">
  {{#each chart}}
    {{> showchart type=type}}
  {{/each}}
</template>

Giving a station helper which returns a list of valid chart types from the document structure. Maybe something like:

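Template.station.helpers({
  chart() {
    // findOne() returns a single document (or undefined), not a cursor,
    // so there is nothing to fetch(); fall back to {} until data arrives.
    const doc = Readings.findOne() || {};
    // Every key that isn't bookkeeping is treated as a chart type.
    return Object.keys(doc).filter(type => {
      return !['_id', 'station_id', 'createdAt'].includes(type);
    }).map(type => {
      return {type};
    });
  }
});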

Note that as a rule, it’s good practice to only request the data we need (for example temp if that’s all we’re plotting), but if we want to render all related graphs on one page, it’s probably more efficient to subscribe to all data for this weather station one level up from the chart template. In my example, that’s the station template. (Note: I’m not showing all code here - like “where does stationId come from?” - I’ll also leave that as an exercise for you :wink:).

Template.station.onCreated(function() {
  this.subscribe('readings', stationId)
});

The corresponding server-side publication then looks something like:

Meteor.publish({
  readings: function(stationId) {
    // Note the sort is descending, so here we get the latest 20 data points,
    // although they're currently in reverse order.
    // The field is called station_id in the document structure above.
    return Readings.find({station_id: stationId}, {sort:{createdAt:-1}, limit: 20});
  }
});

The showchart template code then becomes generic:

Template.showchart.onCreated(function() {
  this.chartdata = new ReactiveVar([]);
});

Template.showchart.onRendered(function() {
  const template = this;
  // Setup chart. The id for the chart's div is in template.data.type
  // The title and axis labels may be derived given the chart type (temp, pressure, etc.)
  // and the series data comes from template.chartdata.get()

  template.$('#' + template.data.type).highcharts({
    ...
  });

  // Set up an autorun to refresh the data on a new data point
  
  template.autorun(() => {
    // Note this time the sort is ascending, so the data points are in the right order for the chart
    template.chartdata.set(Readings.find({station_id: stationId}, {sort:{createdAt:1}, limit: 20}).fetch().map(doc => {
      // Look the value up by this chart's type name (temp, pressure, etc.)
      return [doc.createdAt.valueOf(), doc[template.data.type]];
    }));
  });
});

Template.showchart.helpers({
  type() {
    const template = Template.instance();
    return template.data.type;
  }
});
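
The mapping done inside the autorun can also be written as a standalone function, which makes it easy to check outside Meteor. This is a sketch: the function name is made up, and it assumes each document has a createdAt Date and stores the value under the chart's type name, so the lookup is doc[type].

```javascript
// Convert reading documents into the [x, y] pairs Highcharts accepts for a
// series on a datetime axis: x is milliseconds since the epoch, y the value.
function toSeries(docs, type) {
  return docs
    // Skip documents that have no numeric value for this chart type.
    .filter(doc => typeof doc[type] === 'number')
    .map(doc => [doc.createdAt.valueOf(), doc[type]])
    // Ascending time order, whatever order the documents arrived in.
    .sort((a, b) => a[0] - b[0]);
}
```

Because the function sorts its own output, it works whether the documents come back newest first or oldest first.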

All right, this is pretty informative haha. I have a question though: is Readings the key in the schema for the collection's type, or is Readings the actual collection?

 `Schema.Readings= new SimpleSchema({});`

As a matter of fact I understood more than I thought I would. I don't think that reworking the front and back end will take very long. I just need to read over the collection2, simple-schema, and Highcharts docs to see exactly what is going to be going on.

I like this approach a lot more but it will take some time to reformat my code. Thank you again @robfallows for all the help. I had seen you post on other feeds and saw you really knew what you are talking about when it comes to Meteor and even more. I will most likely be back with some more questions but not for a while.


Readings is the collection.

Actually, if you’re using a schema you could simplify the Template.station.helpers by interrogating the schema for its keys, rather than actually getting a document from the collection.
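
A sketch of that simplification, assuming the schema's top-level key names are already available as an array of strings (how they are read from the schema object is not shown here, and the function name is made up):

```javascript
// Keys that describe the reading itself rather than a chartable value.
const NON_CHART_KEYS = ['_id', 'station_id', 'createdAt'];

// Turn a list of schema keys into the [{type}, ...] list that the station
// template iterates over.
function chartTypesFromKeys(keys) {
  return keys
    .filter(key => !NON_CHART_KEYS.includes(key))
    .map(type => ({ type }));
}
```

Unlike the helper that inspects a document, this still returns every chart type when the collection is empty or a reading is sparse.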

You’re most welcome :smile: