How to debug slowdowns?

Hi,

We have a rather big app (kind of a CRM) where some users have reported slowdowns (to the point of being unusable) after 1-2 hours of use. They say that after logging out and back in, or opening a new tab, it’s fast again. The server doesn’t show any performance problems. It doesn’t seem to be tied to any particular clicks or subscriptions. A single user has a subscription size of about 10 MB.

Any ideas?

thanks

Kadira? https://kadira.io

Sorry, forgot to mention that Kadira is in use. But nothing in Kadira indicates an error or slowdown.

For example, a user adds an appointment or something else, then does it again for another customer. Every time the page loads it takes longer, until the app isn’t usable any more.

The strange thing is that it only affects certain users. The browser is the same, and it doesn’t make a difference if they switch from Chrome to Firefox, for example.

I’m having a similar issue with a Dice roller. The method doesn’t do much, but it does load a rather large Document (the entire Game document - with all the players, and move history, etc.) - I’m thinking it’ll need to be refactored and normalized, but that’s a lot of work - and I’ve heard you shouldn’t have to normalize with MongoDB (I’ve always been skeptical though).

I’m going to first look at refactoring my lazy update method (it just shoves the entire document into the update method of my collection, rather than selectively updating just the one new field that I add when a roll is made - lazy, I know). But there’s not a lot of insight to be gleaned from kadira.io to be honest (unless I’m not using it right - is there a line-by-line profiler somewhere in there?)
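
For the lazy-update point above, here is a minimal sketch of what a targeted update could look like. The `rolls` and `lastRollAt` field names and the shape of a roll are assumptions for illustration, not the actual schema; `$push` and `$set` are standard MongoDB update operators.

```javascript
// Hypothetical helper: builds the arguments for a targeted update instead of
// passing the whole Game document back to the collection's update method.
// The `rolls` and `lastRollAt` field names are assumptions for illustration.
function buildRollUpdate(gameId, roll) {
  const selector = { _id: gameId };
  const modifier = {
    $push: { rolls: roll },              // append only the new roll
    $set: { lastRollAt: roll.rolledAt }, // touch one small field
  };
  return { selector, modifier };
}

// Inside a server-side method this might be used as:
//   const { selector, modifier } = buildRollUpdate(gameId, roll);
//   Games.update(selector, modifier); // `Games` is a hypothetical collection
const example = buildRollUpdate('game1', { playerId: 'p1', value: 4, rolledAt: 1000 });
console.log(JSON.stringify(example.modifier));
```

The idea is that only the new roll goes over the wire and into the write, rather than the full document with all players and move history.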

Let me know if you figure anything out. :slight_smile:

Also wanted to mention, as each player who has access to the game (which is only updated server side behind a method) plays their independent rolls, it gets slower.

It’s strange, though, that playing a turn is actually significantly faster than making a single dice roll. The code is similar, though latency compensation is probably hiding the lag during play. I don’t use latency compensation on the roll method because I’m not sure how I’d get consistent random dice rolls - I’d probably need to implement some kind of token-based system and pre-roll random values or something. The security (anti-cheat) problem is difficult, but I might still do that, since I really can’t afford lag on this important user interface operation.
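
On getting consistent rolls with latency compensation: one possible approach is a seeded generator, where the server hands out a seed (the "token") and both the client stub and the server method derive the same roll from it. Below is a minimal sketch using the well-known mulberry32 generator. `rollDie` and the seed handling are hypothetical, and a client that knows the seed can predict the roll, so this alone doesn't solve the anti-cheat problem.

```javascript
// mulberry32: a small deterministic 32-bit PRNG. Not cryptographically secure.
function mulberry32(seed) {
  let a = seed | 0;
  return function next() {
    a = (a + 0x6D2B79F5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // value in [0, 1)
  };
}

// Hypothetical helper: the n-th roll for a given seed is always the same,
// so a client stub and the server method would agree on the result.
function rollDie(seed, rollIndex, sides = 6) {
  const next = mulberry32(seed);
  for (let i = 0; i < rollIndex; i++) next(); // skip earlier rolls
  return 1 + Math.floor(next() * sides);
}

// Both "sides" compute the same value for the same seed and roll index.
const clientRoll = rollDie(12345, 3);
const serverRoll = rollDie(12345, 3);
console.log(clientRoll === serverRoll);
```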

Are you using any router in your app? How are you managing subscriptions and minimongo?
Are you observing any cursors?
My suggestion would be to start looking at those things first.

I’d also look at how you organized your folders in your project, to avoid some unwanted server stuff from loading in the client.

We’re using FlowRouter. Subscriptions are managed in FlowRouter. We observe the customers collection for added and changed events (about 1000 a day).

It would be easier if every user had the slowdowns. But some users, subscribed to the same collections, don’t.
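
Since the customers collection is being observed: one thing worth ruling out is observers being started again on each page visit without the old ones being stopped, which could fit "slower with every page load". In Meteor, `cursor.observe()` returns a handle with a `stop()` method. Here is a rough sketch of collecting handles and stopping them when leaving a route. `HandleBag` is a hypothetical helper, and the FlowRouter wiring in the comment is only an assumption about how it might be hooked up.

```javascript
// Hypothetical helper: remembers every handle it is given and stops them all
// in one go, so nothing started for a page keeps running after leaving it.
class HandleBag {
  constructor() {
    this.handles = [];
  }
  add(handle) {
    this.handles.push(handle);
    return handle;
  }
  stopAll() {
    const stopped = this.handles.length;
    for (const handle of this.handles) handle.stop();
    this.handles = [];
    return stopped; // handy for logging how many observers were alive
  }
}

// Assumed wiring in a real app (not runnable here):
//   bag.add(Customers.find().observe({ added, changed }));
//   ...and in the route's triggersExit: () => bag.stopAll()

// Stand-in for a live query handle so the sketch runs without Meteor.
function fakeHandle(log, name) {
  return { stop() { log.push(name); } };
}
const stoppedLog = [];
const bag = new HandleBag();
bag.add(fakeHandle(stoppedLog, 'customers-observer'));
console.log(bag.stopAll(), stoppedLog);
```

Logging the return value of `stopAll()` on an affected client would also show whether the number of live observers grows over a session.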

Kadira can track pretty much everything regarding DDP.
And anything that doesn’t use DDP to communicate has nothing to do with Meteor :smiley:
Good luck with debugging.

Based on your comments, it seems like the issue is a lot of data coming into the client, with the client struggling to handle it.

Maybe try Kadira Debug on those slow clients: https://kadira.io/platform/kadira-debug/overview
(I know you said it’s hard to identify the slow clients right away.)
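
To check whether the client really is accumulating data over a session, one option is to log the size of each client-side collection every few minutes on an affected machine and see whether it keeps growing. A minimal sketch follows. `snapshotCache` and `growth` are hypothetical helpers that take plain arrays of documents (for example, what `find().fetch()` returns), and the size figure is only a rough estimate based on JSON length.

```javascript
// Hypothetical helper: record document count and a rough size per collection.
function snapshotCache(docsByCollection) {
  const snapshot = {};
  for (const [name, docs] of Object.entries(docsByCollection)) {
    snapshot[name] = {
      count: docs.length,
      approxChars: JSON.stringify(docs).length, // rough size, not exact bytes
    };
  }
  return snapshot;
}

// Hypothetical helper: how many documents each collection gained between
// two snapshots (collections missing from `before` count from zero).
function growth(before, after) {
  const diff = {};
  for (const name of Object.keys(after)) {
    const previous = before[name] ? before[name].count : 0;
    diff[name] = after[name].count - previous;
  }
  return diff;
}

// Example with made-up data: the customers cache grew by one document.
const early = snapshotCache({ customers: [{ _id: 'a' }] });
const later = snapshotCache({ customers: [{ _id: 'a' }, { _id: 'b' }] });
console.log(growth(early, later));
```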

I ran into a similar slowdown in a project: references to objects were preventing them from being cleaned up. Here’s an article I found helpful in debugging:

What about extreeeeme slowness?

I have a conversion of The Meteor Chef base React version to Material-UI that is behaving really oddly. It works fine locally, but on meteor.com it can take three minutes between clicking sign up, log in or log out before it responds. But only those actions. Everything else is fine.

You can see it here: http://gswr-material-ui.meteor.com/ (admin@admin.com:password)

Source code is here: https://github.com/mbrookes/getting-started-with-react

Any suggestions appreciated! (And sorry to hijack the thread!)

Edit: Looks like it may be related to whatever’s wrong with meteor.com this week. It’s gone from not starting after deploy, to barely logging in, to logging in fine, to barely being able to load:

Back to your scheduled discussion :slight_smile:

Hmm, I don’t know… From what you describe, it looks like the slowdown starts right after the user adds something to the database, then maybe changes pages, comes back and adds something again. There must be something in the particular “flow” of certain users that’s messing up this cycle “add/observe/update subscription/change route/repeat”.
Second suggestion: It could have something to do with how your app and database are deployed/scaled, e.g. users from China accessing a US East instance? Maybe the higher latency is causing some processes to misbehave on the client?
Third: Check for any sort of location or account specific thing associated with users and how they access their subscriptions. i.e. Users subscribe to data based on their company name and role. Maybe if these select parameters are altered by side effects, they could cause problems with the publication and subscription.

Anyway, I don’t have much experience so this is just some brainstorming, but I hope it helps.

It seems that a complex history table pushes the CPU on some clients to 100% for several seconds. I’m not sure if this is the reason.
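
To confirm whether the history table is the culprit, it might help to time the suspect code on an affected client. A small sketch: `timed` is a hypothetical wrapper, `buildHistoryRows` stands in for whatever prepares the table data, and the 50 ms default threshold is arbitrary.

```javascript
// Use the high-resolution timer where available, Date.now() otherwise.
const now = () =>
  (typeof performance !== 'undefined' ? performance.now() : Date.now());

// Hypothetical wrapper: returns a function that behaves like `fn` but
// reports every call that takes at least `thresholdMs` milliseconds.
function timed(label, fn, report = console.warn, thresholdMs = 50) {
  return function (...args) {
    const start = now();
    try {
      return fn.apply(this, args);
    } finally {
      const ms = now() - start;
      if (ms >= thresholdMs) report(`${label} took ${ms.toFixed(1)} ms`);
    }
  };
}

// Stand-in for the real table preparation code.
function buildHistoryRows(entries) {
  return entries.map((entry, index) => ({ index, text: String(entry) }));
}

const timedBuild = timed('history table', buildHistoryRows);
console.log(timedBuild(['created', 'updated']).length);
```

If the reported times climb as the session goes on, that would point at the table (or the data feeding it) rather than at the network.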