This is a bit of a head scratcher. Any ‘out there’ theories appreciated.
For various reasons, we have migrated our production app to and from Galaxy twice over the past 12 months. Without fail, we get the warning below from Mongo Atlas whenever we are running on Galaxy. The exact same code running directly on AWS does not trigger it.
OPEN
Query Targeting: Scanned Objects / Returned has gone above 1500
app-shard-00-00-p6urt.mongodb.net:27017
Created: 2019/09/12 18:08 AWST
Replica Set: app-shard-0
Type: Primary
Current Value: 2,927.4
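For anyone not familiar with the metric: as I understand it, the alert tracks the ratio of documents scanned to documents returned. A rough per-query equivalent can be computed from the `totalDocsExamined` and `nReturned` fields of `explain("executionStats")` output. The sketch below is only an illustration under that assumption: the helper names and sample numbers are made up, treating zero returned documents as one is my own choice, and Atlas aggregates across all operations on the host rather than per query.

```typescript
// The two fields read from MongoDB's explain("executionStats") output.
interface ExecutionStats {
  nReturned: number;         // documents returned by the query
  totalDocsExamined: number; // documents scanned to produce them
}

// Hypothetical helper: scanned / returned ratio for a single query.
// Assumption: a query returning nothing is counted as returning one
// document, purely to avoid dividing by zero.
function targetingRatio(stats: ExecutionStats): number {
  return stats.totalDocsExamined / Math.max(stats.nReturned, 1);
}

// Hypothetical helper: names of queries whose ratio exceeds the threshold
// (1500 matches the threshold in the alert above).
function flagPoorlyTargeted(
  samples: { [name: string]: ExecutionStats },
  threshold: number = 1500
): string[] {
  const flagged: string[] = [];
  for (const name of Object.keys(samples)) {
    if (targetingRatio(samples[name]) > threshold) {
      flagged.push(name);
    }
  }
  return flagged;
}

// Made-up numbers, not taken from our logs.
const samples: { [name: string]: ExecutionStats } = {
  indexedLookup: { nReturned: 20, totalDocsExamined: 20 },
  collectionScan: { nReturned: 1, totalDocsExamined: 60000 },
};

console.log(flagPoorlyTargeted(samples));
```

With these sample values only `collectionScan` is flagged, which is the kind of query I was looking for (and could not find) in our codebase.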
The spikes (and related alerts) happen throughout the day, opening and closing 3-4 times per hour, during the business day. Some notes:
- This is 100% related to being on Galaxy. After two migrations back and forth, there is no doubt about this.
- I had, at the start, thought it might be an oplog issue. But we can rule that out now as we migrated to Redis Oplog a couple of months back (or do we need to explicitly ‘switch’ something off on Mongo?)
- I have spoken to the Atlas team at length. Between all those chats, the hours spent trying to decipher Mongo logs, and going through every single db query in our codebase, I genuinely do not think this is the result of an unindexed query. The Galaxy vs AWS difference further makes me think this is something specific to Galaxy.
- The latest migration to Galaxy happened a couple of hours back (and we immediately started getting these alerts). This migration happened after an AWS server crash due to something related to the node garbage collector. I do not know if there is a memory leak (to be investigated) or whether this has any bearing at all on these Atlas warnings…
- Have any of you seen anything similar to this warning before?
- Anyone have any theories about what might be causing this? Or ideas about the relationship with where the application is hosted?
Thanks in advance!