Upgrade from 1.6 to 1.7 causing mismatch between number of docs returned by client and server error

Hi all,

After upgrading from 1.6.0 to 1.7.0.5 we’ve been noticing a severe slowdown in some of our publications. Here is the list of Atmosphere packages we are using:

accounts-base@1.4.2
accounts-password@1.5.1
alanning:roles@1.2.16
aldeed:collection2@2.10.0
aldeed:collection2-core@1.2.0
aldeed:schema-deny@1.1.0
aldeed:schema-index@1.1.1
aldeed:simple-schema@1.5.4
allow-deny@1.1.0
autoupdate@1.4.1
babel-compiler@7.1.1
babel-runtime@1.2.5
base64@1.0.11
binary-heap@1.0.10
blaze@2.3.3
blaze-tools@1.0.10
boilerplate-generator@1.5.0
browser-policy@1.1.0
browser-policy-common@1.0.11
browser-policy-content@1.1.0
browser-policy-framing@1.1.0
caching-compiler@1.1.12
caching-html-compiler@1.1.3
callback-hook@1.1.0
check@1.3.1
chfritz:easycron@0.0.4
coffeescript@1.0.17
dburles:collection-helpers@1.1.0
dburles:factory@1.1.0
ddp@1.4.0
ddp-client@2.3.3
ddp-common@1.4.0
ddp-rate-limiter@1.0.7
ddp-server@2.2.0
deps@1.0.12
dferber:prerender@2.2.2_3
diff-sequence@1.1.0
dispatch:twilio@1.1.0
dynamic-import@0.4.2
ecmascript@0.11.1
ecmascript-runtime@0.7.0
ecmascript-runtime-client@0.7.2
ecmascript-runtime-server@0.7.1
edgee:slingshot@0.7.1
ejson@1.1.0
email@1.2.3
emgee:libphonenumber@1.0.15
es5-shim@4.8.0
force-ssl@1.1.0
force-ssl-common@1.1.0
fortawesome:fontawesome@4.7.0
fourseven:scss@4.9.0
geojson-utils@1.0.10
hot-code-push@1.0.4
html-tools@1.0.11
htmljs@1.0.11
http@1.4.1
id-map@1.1.0
jquery@1.11.11
juliancwirko:postcss@1.3.0
kadira:flow-router@2.12.1
lai:collection-extensions@0.2.1_1
launch-screen@1.1.1
livedata@1.0.18
localstorage@1.2.0
logging@1.1.20
matb33:collection-hooks@0.8.4
maximum:computed-fields@0.2.1
mdg:meteor-apm-agent@3.1.2
mdg:seo@1.1.0
mdg:validated-method@1.1.0
mdg:validation-error@0.5.1
meteor@1.9.2
meteor-base@1.4.0
meteorhacks:async@1.0.0
meteorhacks:meteorx@1.4.1
meteorhacks:picker@1.0.3
meteortoys:toykit@3.0.4
miktam:loggly@2.0.0
minifier-css@1.3.1
minifier-js@2.3.5
minimongo@1.4.4
mizzao:timesync@0.5.0
mobile-experience@1.0.5
mobile-status-bar@1.0.14
modern-browsers@0.1.2
modules@0.12.2
modules-runtime@0.10.2
momentjs:moment@2.22.2
mongo@1.5.1
mongo-dev-server@1.1.0
mongo-id@1.0.7
mongo-livedata@1.0.12
msavin:mongol@2.0.1
natestrauser:publish-performant-counts@0.1.2
npm-bcrypt@0.9.3
npm-mongo@3.0.11
observe-sequence@1.0.16
okgrow:router-autoscroll@0.1.8
ordered-dict@1.1.0
patrickml:braintree@1.32.0
peerlibrary:blocking@0.5.2
percolate:migrations@0.9.8
practicalmeteor:chai@2.1.0_1
practicalmeteor:loglevel@1.2.0_2
practicalmeteor:mocha@2.4.5_6
practicalmeteor:mocha-core@1.0.1
practicalmeteor:sinon@1.14.1_2
promise@0.11.1
raix:eventemitter@0.1.3
random@1.1.0
rate-limit@1.0.9
react-meteor-data@0.2.16
reactive-dict@1.2.1
reactive-var@1.0.11
reload@1.2.0
retry@1.1.0
reywood:publish-composite@1.7.0
routepolicy@1.0.13
service-configuration@1.0.11
session@1.1.8
sha@1.0.9
shell-server@0.3.1
socket-stream-client@0.2.2
softwarerero:accounts-t9n@1.3.11
spacebars@1.0.15
spacebars-compiler@1.1.3
srp@1.0.12
standard-minifier-js@2.3.4
static-html@1.2.2
templating@1.3.2
templating-compiler@1.3.3
templating-runtime@1.3.2
templating-tools@1.1.2
tmeasday:check-npm-versions@0.3.2
tmeasday:test-reporter-helpers@0.2.1
tracker@1.2.0
underscore@1.0.10
url@1.2.0
webapp@1.6.2
webapp-hashing@1.0.9
xolvio:cleaner@0.3.3
zimme:collection-behaviours@1.1.3
zimme:collection-softremovable@1.0.5
zimme:collection-timestampable@1.0.9

This is definitely related to the upgrade: whenever I’ve rolled it back, the perf issues have disappeared. It was actually worse when I tried 1.7.0. Things got better with 1.7.0.5, but we’re still seeing issues, and only in our production environment. Would anyone have any clues as to why this could be?

Thanks.

Some extra context: we use Atlas as our cloud service and we’re on Mongo 3.4.16.

I’ve narrowed down the root cause to a maxTimeMS option that I had set as a default for all our queries. Some change in how this works seems to have surfaced the issue for some of our slower subscriptions: the query would constantly re-run for some of our users with slower connections. Upping the timeout has since resolved the problem.

Another update: that seems to have been only a partial fix. We’re still seeing some subscriptions never signal that they are ready, which causes all other subscriptions to slow down.

The real issue was a mismatch between the number of docs returned by the client and the server, which caused publications to break and subscriptions to hang. Calling Mongo.setConnectionOptions({ ignoreUndefined: false }) in a Meteor package, and putting that package at the top of our .meteor/packages file, fixed the issue.
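For anyone else who lands here, this is a rough sketch of why an undefined value in a selector can matter. It is plain JavaScript, not the driver's code: stripUndefined is a hypothetical helper that only mimics my understanding of ignoreUndefined: true, where undefined fields are dropped before the query is sent. The field names and dates are made up for illustration.

```javascript
// Hypothetical helper: mimics dropping undefined fields during serialization.
// Illustration only; this is NOT the MongoDB driver's implementation.
function stripUndefined(value) {
  if (Array.isArray(value)) {
    return value.map(stripUndefined);
  }
  // Leave primitives, null, and Dates untouched.
  if (value === null || typeof value !== "object" || value instanceof Date) {
    return value;
  }
  const out = {};
  for (const [key, inner] of Object.entries(value)) {
    if (inner !== undefined) {
      out[key] = stripUndefined(inner);
    }
  }
  return out;
}

const start = new Date("2018-01-01T00:00:00Z");
const end = undefined; // a bound that was never supplied

// What the application code builds:
const range = { $gte: start, $lt: end };
// What would go over the wire if undefined fields are dropped:
const sent = stripUndefined(range);

console.log(Object.keys(range)); // [ '$gte', '$lt' ]
console.log(Object.keys(sent));  // [ '$gte' ]
```

If one side evaluates the selector with the $lt key and the other evaluates it without, the two can disagree about which docs match, which would be consistent with the mismatch described above.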

I understand you found a workaround but can you share what the publication code that causes the issue looks like?

It looked something like this:

Meteor.publish("MY_PUBLICATION", function(userId, userEmail, queryStart, queryEnd) {
  return Tickets.find(
    {
      "options.edits": {
        $elemMatch: {
          deliveredAt: { $gte: queryStart, $lt: queryEnd },
          editedBy: userEmail,
        },
      },
    }
  );
});

The problem was that queryEnd wasn’t actually being passed in, so the $lt bound in the selector was undefined.
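One way to catch this earlier is to validate the arguments before building the selector. Below is a minimal sketch in plain JavaScript, assuming both bounds should be Date objects (the thread doesn't say what type they are). buildEditsSelector is a hypothetical helper name, not something from our codebase or from Meteor.

```javascript
// Hypothetical helper: builds the selector used by the publication above and
// refuses to run when either date bound is missing or invalid.
// Assumption: queryStart and queryEnd are expected to be Date objects.
function buildEditsSelector(userEmail, queryStart, queryEnd) {
  for (const [name, value] of Object.entries({ queryStart, queryEnd })) {
    if (!(value instanceof Date) || Number.isNaN(value.getTime())) {
      throw new TypeError(`${name} must be a valid Date`);
    }
  }
  if (typeof userEmail !== "string" || userEmail.length === 0) {
    throw new TypeError("userEmail must be a non-empty string");
  }
  return {
    "options.edits": {
      $elemMatch: {
        deliveredAt: { $gte: queryStart, $lt: queryEnd },
        editedBy: userEmail,
      },
    },
  };
}

// A missing queryEnd now fails loudly instead of silently changing the query.
try {
  buildEditsSelector("user@example.com", new Date("2018-01-01T00:00:00Z"), undefined);
} catch (err) {
  console.log(err.message); // queryEnd must be a valid Date
}
```

Inside the publication itself, calling check(queryEnd, Date) from the check package (already in the list above) should serve the same purpose, and the publication would then return Tickets.find(buildEditsSelector(userEmail, queryStart, queryEnd)).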