Best practice / pattern for building Collections from external APIs?

Hi all,

I’m building an app that seeds Collections with data from various external APIs. I’ve wired it all up and it works – but it feels dirty and I’m wondering if there’s a better approach. Maintenance and extensibility are my key concerns.

1) My initial pattern

  • get starter data from a local file
    – iterate over it and Insert a new Document for each item
  • get API_1 data
    – match on starter data, Update the matched Document
  • get API_2 data
    – match on starter data, Update the matched Document
  • … rinse & repeat

2) Alternative pattern

  • Create a local container Object, e.g. var dataContainer = {obj1, obj2, obj3}
  • Build these objects up locally (don’t make calls to the db)
  • When the container Object is fully built up, iterate over it and make an Insert call for each item to seed the Collection
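Pattern #2 can be sketched in plain JavaScript (no Meteor calls, so it stands alone): merge every source into a local container keyed by country_code, then do all the inserts in one pass at the end. The field names and the shape of the API rows here are hypothetical placeholders.

```javascript
// Build a container keyed by country_code from the starter data plus
// any number of API result sets. Purely local – no db calls until the end.
function buildContainer(starter, apiResults) {
  const container = {};

  // 1. seed the container from the starter data
  starter.forEach(function (country) {
    container[country.country_code] = Object.assign({}, country);
  });

  // 2. merge each API's rows into the matching entry
  apiResults.forEach(function (rows) {
    rows.forEach(function (row) {
      const entry = container[row.country_code];
      if (entry) Object.assign(entry, row); // skip rows with no match
    });
  });

  return container;
}

// Once the container is fully built, a single pass seeds the Collection:
// Object.keys(container).forEach(function (code) {
//   Destinations.insert(container[code]);
// });
```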

Looking at my code I’m starting to think #2 makes more sense. Thoughts? Or maybe there’s a completely different, more optimal, pattern?

Thanks!

In case it’s clarifying, here’s pseudo code for my approach right now:

// grab starter data from local file
// iterate over and insert new Documents into Collection
function initDestinationsCollection() {
  var countries_starter = JSON.parse(Assets.getText("countries_starter.json"));
  _.each(countries_starter, function(country) {
    Destinations.insert({
      country_code: country.country_code, // this is used as a foreign_key
      prop1: country.prop1
      // ...
    });
  });
}

// get data from API
// iterate over & match on foreign_key (country_code)
// then update Collection with new data 
function API1() {
  var response = HTTP.get('url'),
      data = response.data;
  _.each(data, function(countryData) {

    // match on country_code, then add new data
    Destinations.update(
      { "country_code": countryData.country_code },
      { $set: { /* new data */ } }
    );
  });
}

// rinse & repeat with various 3rd party data sources 
function API2_And_Beyond() {}
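Since every API function repeats the same match-and-$set shape, the per-API logic could collapse into a transform plus one generic helper – which speaks directly to the maintenance/extensibility concern. A rough, Meteor-free sketch (`updateFn` stands in for `Destinations.update`; the names are illustrative, not from any specific library):

```javascript
// Generic "match on country_code, then $set new fields" step.
// Each API then only needs to supply a transform function.
function applyApiData(rows, transform, updateFn) {
  rows.forEach(function (row) {
    updateFn(
      { country_code: row.country_code },          // selector
      { $set: transform(row) }                     // modifier
    );
  });
}

// Hypothetical usage for one API:
// applyApiData(response.data, function (row) {
//   return { gdp: row.gdp };
// }, Destinations.update.bind(Destinations));
```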

I guess my high level thought is this:

Is the data valuable if it’s not complete?

If not, I’m not sure how much sense it makes to insert/update piecemeal. Though this doesn’t really speak to performance/efficiency so much as what makes sense to actually persist in your db.
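One way to make the "only valuable if complete" rule concrete is a completeness check on the merged document before persisting it. A minimal sketch – `REQUIRED_FIELDS` is a made-up example list, not from the original post:

```javascript
// Only persist a document once every expected property is present.
const REQUIRED_FIELDS = ['country_code', 'population', 'currency']; // assumed

function isComplete(doc) {
  return REQUIRED_FIELDS.every(function (field) {
    return doc[field] !== undefined && doc[field] !== null;
  });
}

// if (isComplete(mergedDoc)) Destinations.insert(mergedDoc);
```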


Good point. The first time I “seed” the app with data it’s not valuable unless it’s complete - my app is entirely dependent on this data set. But thereafter, I’m only updating this data every X days.

So here’s what I’m thinking:

  • an initCollection fn that builds the Collection w/ all of its expected properties. It only runs once in the app’s lifecycle.
  • update fns for each API I’m getting data from. These run every X days.

Looks something like this:

// call once in the App's lifecycle
function initApp() {
  initDestinationsCollection();
  updateApi1();
  updateApi2();
  updateApi3();
}

// call every X days
function updateDestinationsCollection() {
  updateApi1();
  updateApi2();
  updateApi3();
}
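The "every X days" part could be wired up with a timer. In a Meteor app this would likely be Meteor.setInterval (or a cron-style package); plain setInterval is shown here so the snippet stands alone. X and the update functions are placeholders:

```javascript
// Run every update fn in order; scheduling is just a timer around it.
function runAllUpdates(updateFns) {
  updateFns.forEach(function (fn) { fn(); });
}

const DAY_MS = 24 * 60 * 60 * 1000;

// setInterval(function () {
//   runAllUpdates([updateApi1, updateApi2, updateApi3]);
// }, 3 * DAY_MS); // X = 3 days, for example
```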

alt:replication might do what you want.

Example with http source:

  let httpConn = {
    query(url, callback) {
      let resp = Meteor.http.get(url)
      if (resp && resp.data && resp.data.responseStatus == 200)
        callback(null, resp.data.responseData.feed.entries)
      else
        callback(new Error('Invalid response received'))
    }
  }

  let hds = Meteor.Replication.DataSource(httpConn, 'query', 60*60)
  let url = 'https://ajax.googleapis.com/ajax/services/feed/load?v=1.0&q=http://www.google.com/trends/hottrends/atom/feed?pn=p1'
  let hot = Meteor.Replication('trends', hds.id('title'), url)

  Meteor.publish('hot-trends', function () {
    return hot.find({}, {fields: {title: 1, link: 1, publishedDate: 1}})
  })