[SOLVED] Montiapm agent crashing with Meteor 3.4.1

I’ve upgraded to Meteor 3.4.1 and also updated montiapm:agent to version 3.0.0-beta.16

But this does not seem to stop a nasty crash that I get (sporadically):

Monti APM: trace has not started yet
meteor://💻app/packages/montiapm_agent.js:605
logger(‘ERROR:’, session.id, sub._subscriptionId);
^
TypeError: Cannot read properties of null (reading ‘id’)
at PubsubModel._trackError (packages/montiapm:agent/lib/models/pubsub.js:110:28)
at Subscription.subscriptionProto.error (packages/montiapm:agent/lib/hijack/wrap_subscription.js:71:28)
at Subscription. (packages/ddp-server/livedata_server.js:970:14)
at Generator.throw ()
at asyncGeneratorStep (/Users/dominicthwaites/WebstormProjects/mmm/.meteor/local/build/programs/server/packages/ddp-server.js:255:28)
at _throw (/Users/dominicthwaites/WebstormProjects/mmm/.meteor/local/build/programs/server/packages/ddp-server.js:276:17)
Node.js v22.22.1
Exited with code: 1
Your application is crashing. Waiting for file change.

The above happened on my dev machine but I also get the same issue once deployed to Galaxy - so I had to revert my 3.4.1 deploy.

Any ideas?

Looks like the session argument is not being passed:

monti-apm-agent/lib/models/pubsub.js at master · monti-apm/monti-apm-agent

Clearly, yes.

What I’m actually asking is what has changed in 3.4.1 that has brought about the possibility that the session parameter to this method can now be null.

A quick fix would be to check if it is null and return. But what is the root cause and can Galaxy/Meteor fix the issue?

This is the main PR I suspect could be related. @dupontbertrand, does that sound plausible?

Meteor core now appears to treat subscription._session as nullable during/after deactivation, and was updated to guard for that case. Monti APM is reaching into this internal field (this._session ) and assuming it is always non-null, then reading session.id , which matches the crash we’re seeing.

Since this relies on Meteor internals, we can’t really guarantee those invariants long-term. It might make sense for Monti APM to add a small compatibility guard, e.g.:

if (!session) return;

inside _trackError (and possibly similar places).

1 Like

@nachocodoner is correct, I can confirm that this crash is very likely caused by the lifecycle change introduced there.The important change in Meteor 3.4.1 is that publication onStop callbacks are now properly awaited before the subscription references are cleaned up.
Before that PR, onStop callbacks were effectively synchronous from the subscription lifecycle point of view. If an onStop callback did async work, Meteor did not wait for it before continuing cleanup

After PR #14236, the flow is now roughly:

_callStopCallbacks()  
.then(() => {    
this._session = null;    
this._documents.clear();  
});

So after a subscription has been deactivated, subscription._session can now legitimately be null :+1:

Meteor core was updated for that new behavior. For example, isReady() now guards against a missing session:

return this._deactivated || !this._session || this._session.inQueue === null;

So yes, this explains the Monti APM crash: montiapm:agent is reading session.id in PubsubModel._trackError without checking whether session is still defined

The new behavior is intentional: async onStop cleanup is now awaited correctly :+1:

The fix for montiapm-agent should be very small, probably something like:

xxxxxx._trackError = function (session, ...) {
  if (!session) return; 
 // existing logic
};

The temporary null-check workaround mentioned above is safe in practice: at that point the subscription is already being torn down, so there is no useful active session tracking to preserve.

Hope that clarifies where the change comes from :blush:

1 Like

I published version 3.0.0-beta.17 with this fixed.

While trying to reproduce this, I found an interesting gap in how Meteor handles errors during publications.

These both work correctly:

Meteor.publish('test1', function () {
  this.stop();
  throw new Error('test');
});

// Meteor handles this correctly too, though this is the case that causes the error in Monti APM
Meteor.publish('test3', async function () {
  await 0;
  this.stop();
  throw new Error('error after stop');
});

But this one causes an unhandled promise rejection in Meteor:

Meteor.publish('test2', async function () {
  this.stop();
  await 0;
  throw new Error('test');
});

I’ll work on a PR to fix this.

2 Likes

Fantastic response, guys - thanks so much for such a quick response.

I do have a policy of try…catch inside my publications so as not to throw an error from them, but clearly not everywhere!

1 Like

Sorry guys, one more thing…

My galaxy log is now getting filled with rather distracting entries like so:

Monti APM: trace has not started yet

Any chance we can suppress these or is it indicative of something else not working?

Thanks

I’ll look into it. It sounds like it is related to the same issue.

1 Like

If you update to 3.0.0-beta.18, this and a few other bugs with Meteor 3.4.1 have been fixed.

2 Likes

That’s awesome @zodern - really appreciate this