Best approaches for files as datastore?

canadaduane · April 23, 2015, 4:58am

I have about a half TB of files (books from archive.org, actually) on disk that I would like to make available to a Meteor app. Additionally, when requests are made for a file that doesn’t exist on disk, behind the scenes, I’d like to reach out to archive.org, retrieve the file, cache it locally, and then send it off to the original requester.

Now, I’m new to Meteor and I’m struggling to think about this architecturally. My initial idea was to write a Node app that knows how to serve files and retrieve missing ones, and then implement DDP on top. I started down this road, but the DDP docs mention that if you’re building a Meteor app and find yourself implementing DDP for it, you might be doing it wrong.

What’s the right approach to this problem? Should I build a Meteor app that serves files and does background requests to archive.org, or should I build a node proxy layer that speaks DDP?

And if I write a (non-Meteor) Node layer that ferries data from disk to the Meteor app, would that approach fit within Meteor’s design patterns?

Sanjo · April 23, 2015, 8:21am

My suggestions:

Store metadata about all your files in MongoDB. You can query this data quickly and easily.
For serving the files you can use AWS S3, Nginx or any other service or tool that can serve files.
Retrieving files that don’t exist locally would be a background job. To make it easily scalable you can use AWS Lambda. If you don’t have a lot of requests you can also do it inside a Meteor method that uses this.unblock.
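
To make the cache-miss step concrete, here is a minimal TypeScript sketch of a "fetch on miss, then cache" helper. It is only an illustration under stated assumptions, not a definitive implementation: `ensureCached`, `Fetcher`, `cacheDir` and `fetchRemote` are hypothetical names, the download function is passed in rather than tied to any real archive.org API, and nothing here is Meteor-specific. Concurrent requests for the same missing file share a single download.

```typescript
import { promises as fs } from "fs";
import * as path from "path";

// Hypothetical type: downloads the bytes of a missing file (e.g. from a remote archive).
type Fetcher = (id: string) => Promise<Uint8Array>;

// Downloads currently in progress, keyed by file id, so concurrent misses share one fetch.
const inFlight = new Map<string, Promise<string>>();

// Returns the local path of the file, downloading and caching it first if needed.
async function ensureCached(cacheDir: string, id: string, fetchRemote: Fetcher): Promise<string> {
  // Assumption: ids are plain file names; reject anything that could escape cacheDir.
  if (!/^[\w-][\w.-]*$/.test(id)) {
    throw new Error(`unsafe file id: ${id}`);
  }
  const target = path.join(cacheDir, id);

  try {
    await fs.access(target);
    return target; // cache hit: the file is already on disk
  } catch {
    // cache miss: fall through and fetch
  }

  // If another request is already downloading this file, wait for that download.
  const pending = inFlight.get(id);
  if (pending) return pending;

  const job = (async () => {
    const data = await fetchRemote(id);
    // Write to a temporary name, then rename, so readers never see a partial file.
    const tmp = `${target}.part`;
    await fs.writeFile(tmp, data);
    await fs.rename(tmp, target);
    return target;
  })().finally(() => inFlight.delete(id));

  inFlight.set(id, job);
  return job;
}
```

Inside a Meteor method you would presumably call this.unblock() first and then wait on a helper like this, so that one slow download does not hold up the same client's other method calls. Once the file is on disk, Nginx (or whatever serves the files) can take over.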

canadaduane · April 23, 2015, 2:56pm

That’s very helpful, thank you @Sanjo. I like the separation of file serving concern from metadata. That actually makes the task a lot easier. I don’t expect to have a lot of “cache misses”, so I will learn more about this.unblock.