Redis Oplog and AWS ElastiCache

@peterfkruger, thank you for your kind words.

I think that’s worth considering (variable / controlled TTL). We played with it a bit before settling on this approach.

Let me share our rationale first so you can see where we are coming from: the user is the one who dictates what to cache. If they access the data, you cache it; otherwise, clear the cache.

If the data is accessed rarely, the DB hits to reload it are few and negligible (if you only pull the data back every 3 hours, why cache it in the first place?).

This approach ensures, without too much thought or optimizing, an almost ideal scenario.
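As a rough illustration of this access-driven policy, here is a toy in-memory sketch (not redis-oplog’s actual implementation, which keeps entries in Redis): every read refreshes the entry’s expiry, so hot documents stay cached while rarely accessed ones simply fall out.

```javascript
// Toy model of access-driven caching: each read resets the TTL,
// so frequently used documents never expire while idle ones do.
class AccessCache {
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;           // injectable clock, handy for testing
    this.entries = new Map(); // _id -> { doc, expiresAt }
  }
  set(id, doc) {
    this.entries.set(id, { doc, expiresAt: this.now() + this.ttlMs });
  }
  get(id) {
    const entry = this.entries.get(id);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) { // expired: evict and miss
      this.entries.delete(id);
      return undefined;
    }
    entry.expiresAt = this.now() + this.ttlMs; // access refreshes the TTL
    return entry.doc;
  }
}
```

With a real Redis-backed store, the same effect comes from resetting the key’s TTL on each read (e.g. with the EXPIRE command).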

As they say, “Perfect is the enemy of good.” When is it good enough? Is it worth adding more code and CPU cycles to optimize this?


@ramez this sounds awesome. I came from Laravel, where caching like this was just built in, and it was nice. I’m going to start trying this fork and, later this week, move it to our beta environment if we have no issues.

Thanks for the work. If we end up using the fork, I’ll be interested in getting involved, as we are starting to scale.


We have, for example, a collection with documents that store configuration parameters for various subsystems. These will only be read in Meteor methods, so this data will never make it to the client. We know that these documents are updated infrequently but accessed much more often (whenever those methods are called). We could cache them on the Meteor server as well (meaning: in each of the instances), but such double caching is not the smartest thing in general. Ideal here would be to set a relatively long TTL and to programmatically invalidate the cache whenever these configuration documents are updated.

In another example, we have a document with currency exchange data which we update daily at exactly the same time. This document is accessed frequently by clients: we could eliminate all DB access to it (save a single one per day) by setting a TTL of about 24 hours.

In other cases we have relatively large documents that are updated very rarely, with a frequency ranging from once per day to once per month. Ideally they should be cached indefinitely, until we invalidate the Redis cache upon an update of the document.

In general, there is no doubt that it makes sense to have a default TTL for all collections subject to caching; but I think it would be equally meaningful to allow for individual TTL settings, for this is not necessarily a one-size-fits-all situation. Of course I don’t insist; it’s just a humble suggestion for your consideration.

Just a note: if the data is accessed, even in a method, it gets cached, so you don’t have to worry.

How about this:
collection.setCacheTimer(timeInMS)

Would that satisfy your needs?
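If a per-collection cache timer like the one proposed above were added, one way it might work under the hood is a simple TTL registry with a global fallback. This is purely a hypothetical sketch — the names mirror this thread’s suggestion, not a published API:

```javascript
// Hypothetical per-collection TTL registry with a global default.
// (Sketch only; redis-oplog's real implementation may differ.)
const cacheTimers = new Map(); // collection name -> TTL in ms

function setCacheTimer(collectionName, timeInMs) {
  cacheTimers.set(collectionName, timeInMs);
}

function getCacheTimer(collectionName, defaultTtlMs = 60 * 1000) {
  // Use the collection's own TTL if set, else fall back to the default.
  return cacheTimers.has(collectionName)
    ? cacheTimers.get(collectionName)
    : defaultTtlMs;
}
```

For the currency-exchange example above, one could then set a ~24-hour timer on that collection while leaving everything else on the default.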


Yes, thank you, I would appreciate that!

Is there any chance for an API such as…

collection.invalidateCache(selector)

…where all documents of collection satisfying selector would get deleted from the cache, or, if selector is not provided, the entire collection’s cache gets invalidated? Again, it’s just for your consideration. This is the proverbial situation where they say “appetite comes with eating” :slight_smile:
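To make the suggestion concrete, here is a toy version of such an invalidation over an in-memory map, supporting only equality selectors (real Mongo selectors are far richer; this only illustrates the intended semantics):

```javascript
// Toy invalidateCache: drop every cached document matching a simple
// equality selector, or clear everything if no selector is given.
// (Hypothetical sketch of the API suggested in this thread.)
function invalidateCache(cache, selector) {
  if (!selector) {
    cache.clear(); // no selector: invalidate the whole collection's cache
    return;
  }
  for (const [id, doc] of cache) {
    const matches = Object.keys(selector).every(k => doc[k] === selector[k]);
    if (matches) cache.delete(id);
  }
}
```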


Ok, it’s easy enough.
I should be pushing a new release soon.

Once it’s ready, I’ll send it your way for testing.


Awesome!!! You totally rock!!


@peterfkruger
A new version was just pushed; please check the API for the new methods.
Please let me know how it works for you.

Specifically, take a look at the __getCollectionStats method; I would love to hear your results.
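For anyone turning such stats into a hit rate, a minimal helper might look like this. The field names `hits` and `misses` are assumptions for illustration — check the actual shape that __getCollectionStats returns in the new release:

```javascript
// Compute a cache hit rate from assumed { hits, misses } counters.
// Returns 0 when no accesses have been recorded yet.
function hitRate(stats) {
  const total = stats.hits + stats.misses;
  return total === 0 ? 0 : stats.hits / total;
}
```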


Thanks a bunch, I definitely owe you one! I may not be able to test it tonight, but I will very soon and I’ll give you feedback right after.