Users unable to log in with Meteor.loginWithPassword()

Hey guys I’m at my wits’ end for this one. Really appreciate some help here!

On production, sometimes we get into the situation when Meteor.loginWithPassword() is successful on the server but the callback is not triggered on the client, so the user does not know that they have successfully logged in, and they are stuck at the login screen with a loading button. It used to happen intermittently but I’m getting it once every 1 or 2 days now with an increase in user load.

There have been similar cases but there doesn’t seem to be a consensus:

For my case, when it happens to the first user, subsequent users cannot log in anymore, until I restart the server, after which Meteor automatically reconnects all users. However, those who are already logged in are not affected and are still able to interact with the server. The server is still responsive before the restart.

I’m using MontiAPM and it’s not showing errors or an increased CPU or memory load. There are no exceptions thrown as well. My accounts-base and accounts-password packages are all 1.6.0.

I’m wondering if:

  • an infinite loop somewhere is able to cause this?
  • socket is blocked?
  • I should create my own user/password/login function?

If anyone has any leads, ideas, I would love to hear them. If you are interested to help us debug the problem (for a price, even) let me know too. Thanks in advance!

I have seen similar behavior when load testing and it was due to a database bottleneck, specifically, due to the users collection being locked under heavy load (older MongoDB version).

Did you check your database logs? Which MongoDB version are you using? And where are you hosting?

1 Like

Hi Alawi, thank you for your tips and no I haven’t considered that!

I’m using version 3.4.1 for Mongo. I just pulled some logs through MUP:

[52.220.238.102]2020-06-03T12:20:57.565+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:60614 #87 (5 connections now open)
[52.220.238.102]2020-06-03T12:20:57.566+0000 I NETWORK  [conn87] received client metadata from 172.17.0.5:60614 conn87: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T12:21:02.087+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:60676 #88 (6 connections now open)
[52.220.238.102]2020-06-03T12:21:02.133+0000 I NETWORK  [conn88] received client metadata from 172.17.0.5:60676 conn88: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T12:21:02.159+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:60678 #89 (7 connections now open)
[52.220.238.102]2020-06-03T12:21:02.161+0000 I NETWORK  [conn89] received client metadata from 172.17.0.5:60678 conn89: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T12:29:46.263+0000 I -        [conn87] end connection 172.17.0.5:60614 (7 connections now open)
[52.220.238.102]2020-06-03T12:29:46.265+0000 I -        [conn85] end connection 172.17.0.5:60552 (6 connections now open)
[52.220.238.102]2020-06-03T12:29:46.266+0000 I -        [conn84] end connection 172.17.0.5:60550 (5 connections now open)
[52.220.238.102]2020-06-03T12:29:46.266+0000 I -        [conn89] end connection 172.17.0.5:60678 (5 connections now open)
[52.220.238.102]2020-06-03T12:29:46.267+0000 I -        [conn88] end connection 172.17.0.5:60676 (3 connections now open)
[52.220.238.102]2020-06-03T12:29:46.267+0000 I -        [conn83] end connection 172.17.0.5:60546 (3 connections now open)
[52.220.238.102]2020-06-03T12:29:46.267+0000 I -        [conn86] end connection 172.17.0.5:60600 (3 connections now open)
[52.220.238.102]2020-06-03T12:29:48.450+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:34920 #90 (1 connection now open)
[52.220.238.102]2020-06-03T12:29:48.466+0000 I NETWORK  [conn90] received client metadata from 172.17.0.5:34920 conn90: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T12:29:48.474+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:34924 #91 (2 connections now open)
[52.220.238.102]2020-06-03T12:29:48.474+0000 I NETWORK  [conn91] received client metadata from 172.17.0.5:34924 conn91: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T12:29:48.476+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:34926 #92 (3 connections now open)
[52.220.238.102]2020-06-03T12:29:48.476+0000 I NETWORK  [conn92] received client metadata from 172.17.0.5:34926 conn92: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T12:29:55.722+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:34992 #93 (4 connections now open)
[52.220.238.102]2020-06-03T12:29:55.725+0000 I NETWORK  [conn93] received client metadata from 172.17.0.5:34992 conn93: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T12:29:55.730+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:34994 #94 (5 connections now open)
[52.220.238.102]2020-06-03T12:29:55.733+0000 I NETWORK  [conn94] received client metadata from 172.17.0.5:34994 conn94: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T12:29:55.912+0000 I COMMAND  [conn90] command practicle.transactions command: find { find: "transactions", filter: { state: { $in: [ "pending" ] } }, sort: { lastModified: 1 }, returnKey: false, showRecordId: false } planSummary: COLLSCAN keysExamined:0 docsExamined:197595 hasSortStage:1 cursorExhausted:1 numYields:1546 nreturned:0 reslen:110 locks:{ Global: { acquireCount: { r: 3094 } }, Database: { acquireCount: { r: 1547 } }, Collection: { acquireCount: { r: 1547 } } } protocol:op_query 186ms
[52.220.238.102]2020-06-03T12:29:59.431+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:35130 #95 (6 connections now open)
[52.220.238.102]2020-06-03T12:29:59.432+0000 I NETWORK  [conn95] received client metadata from 172.17.0.5:35130 conn95: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T12:30:06.450+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:35266 #96 (7 connections now open)
[52.220.238.102]2020-06-03T12:30:06.462+0000 I NETWORK  [conn96] received client metadata from 172.17.0.5:35266 conn96: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T12:34:56.143+0000 I -        [conn95] end connection 172.17.0.5:35130 (7 connections now open)
[52.220.238.102]2020-06-03T12:34:56.144+0000 I -        [conn91] end connection 172.17.0.5:34924 (6 connections now open)
[52.220.238.102]2020-06-03T12:34:56.145+0000 I -        [conn90] end connection 172.17.0.5:34920 (5 connections now open)
[52.220.238.102]2020-06-03T12:34:56.146+0000 I -        [conn93] end connection 172.17.0.5:34992 (4 connections now open)
[52.220.238.102]2020-06-03T12:34:56.146+0000 I -        [conn92] end connection 172.17.0.5:34926 (5 connections now open)
[52.220.238.102]2020-06-03T12:34:56.147+0000 I -        [conn96] end connection 172.17.0.5:35266 (2 connections now open)
[52.220.238.102]2020-06-03T12:34:56.147+0000 I -        [conn94] end connection 172.17.0.5:34994 (1 connection now open)
[52.220.238.102]2020-06-03T12:34:58.352+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:37220 #97 (1 connection now open)
[52.220.238.102]2020-06-03T12:34:58.369+0000 I NETWORK  [conn97] received client metadata from 172.17.0.5:37220 conn97: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T12:34:58.377+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:37224 #98 (2 connections now open)
[52.220.238.102]2020-06-03T12:34:58.377+0000 I NETWORK  [conn98] received client metadata from 172.17.0.5:37224 conn98: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T12:34:58.379+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:37226 #99 (3 connections now open)
[52.220.238.102]2020-06-03T12:34:58.380+0000 I NETWORK  [conn99] received client metadata from 172.17.0.5:37226 conn99: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T12:35:05.502+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:37304 #100 (4 connections now open)
[52.220.238.102]2020-06-03T12:35:05.527+0000 I NETWORK  [conn100] received client metadata from 172.17.0.5:37304 conn100: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T12:35:05.533+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:37306 #101 (5 connections now open)
[52.220.238.102]2020-06-03T12:35:05.545+0000 I NETWORK  [conn101] received client metadata from 172.17.0.5:37306 conn101: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T12:35:05.736+0000 I COMMAND  [conn97] command practicle.transactions command: find { find: "transactions", filter: { state: { $in: [ "pending" ] } }, sort: { lastModified: 1 }, returnKey: false, showRecordId: false } planSummary: COLLSCAN keysExamined:0 docsExamined:197614 hasSortStage:1 cursorExhausted:1 numYields:1547 nreturned:0 reslen:110 locks:{ Global: { acquireCount: { r: 3096 } }, Database: { acquireCount: { r: 1548 } }, Collection: { acquireCount: { r: 1548 } } } protocol:op_query 207ms
[52.220.238.102]2020-06-03T12:35:09.158+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:37320 #102 (6 connections now open)
[52.220.238.102]2020-06-03T12:35:09.159+0000 I NETWORK  [conn102] received client metadata from 172.17.0.5:37320 conn102: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T12:35:12.446+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:37404 #103 (7 connections now open)
[52.220.238.102]2020-06-03T12:35:12.452+0000 I NETWORK  [conn103] received client metadata from 172.17.0.5:37404 conn103: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }

[52.220.238.102]2020-06-03T12:52:22.361+0000 I NETWORK  [thread1] connection accepted from 127.0.0.1:51824 #104 (8 connections now open)
[52.220.238.102]2020-06-03T12:52:22.363+0000 I NETWORK  [thread1] connection accepted from 127.0.0.1:51826 #105 (9 connections now open)
[52.220.238.102]2020-06-03T12:52:22.400+0000 I NETWORK  [thread1] connection accepted from 127.0.0.1:51828 #106 (10 connections now open)
[52.220.238.102]2020-06-03T12:52:22.410+0000 I NETWORK  [thread1] connection accepted from 127.0.0.1:51830 #107 (11 connections now open)
[52.220.238.102]2020-06-03T12:52:22.535+0000 I COMMAND  [conn105] command practicle.LogsDB command: getMore { getMore: 153112738302, collection: "LogsDB" } originatingCommand: { find: "LogsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:153112738302 keysExamined:0 docsExamined:107972 numYields:844 nreturned:107971 reslen:16777124 locks:{ Global: { acquireCount: { r: 1690 } }, Database: { acquireCount: { r: 845 } }, Collection: { acquireCount: { r: 845 } } } protocol:op_query 134ms
[52.220.238.102]2020-06-03T12:52:22.558+0000 I COMMAND  [conn106] command practicle.SkillNodesDB command: getMore { getMore: 218601341304, collection: "SkillNodesDB" } originatingCommand: { find: "SkillNodesDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:218601341304 keysExamined:0 docsExamined:62150 numYields:488 nreturned:62149 reslen:16777236 locks:{ Global: { acquireCount: { r: 978 } }, Database: { acquireCount: { r: 489 } }, Collection: { acquireCount: { r: 489 } } } protocol:op_query 154ms
[52.220.238.102]2020-06-03T12:52:22.570+0000 I COMMAND  [conn104] command practicle.RecordsDB command: getMore { getMore: 198194919416, collection: "RecordsDB" } originatingCommand: { find: "RecordsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:198194919416 keysExamined:0 docsExamined:78979 numYields:618 nreturned:78978 reslen:16777250 locks:{ Global: { acquireCount: { r: 1238 } }, Database: { acquireCount: { r: 619 } }, Collection: { acquireCount: { r: 619 } } } protocol:op_query 168ms
[52.220.238.102]2020-06-03T12:52:22.609+0000 I COMMAND  [conn107] command practicle.ActionsDB command: getMore { getMore: 39364000696, collection: "ActionsDB" } originatingCommand: { find: "ActionsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:39364000696 keysExamined:0 docsExamined:85399 numYields:668 nreturned:85398 reslen:16777204 locks:{ Global: { acquireCount: { r: 1338 } }, Database: { acquireCount: { r: 669 } }, Collection: { acquireCount: { r: 669 } } } protocol:op_query 185ms
[52.220.238.102]2020-06-03T12:52:24.089+0000 I COMMAND  [conn105] command practicle.LogsDB command: getMore { getMore: 153112738302, collection: "LogsDB" } originatingCommand: { find: "LogsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:153112738302 keysExamined:0 docsExamined:108330 numYields:848 nreturned:108330 reslen:16777246 locks:{ Global: { acquireCount: { r: 1698 } }, Database: { acquireCount: { r: 849 } }, Collection: { acquireCount: { r: 849 } } } protocol:op_query 158ms
[52.220.238.102]2020-06-03T12:52:25.520+0000 I COMMAND  [conn107] command practicle.ActionsDB command: getMore { getMore: 39364000696, collection: "ActionsDB" } originatingCommand: { find: "ActionsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:39364000696 keysExamined:0 docsExamined:77508 numYields:607 nreturned:77508 reslen:16777232 locks:{ Global: { acquireCount: { r: 1216 } }, Database: { acquireCount: { r: 608 } }, Collection: { acquireCount: { r: 608 } } } protocol:op_query 153ms
[52.220.238.102]2020-06-03T12:52:25.530+0000 I COMMAND  [conn105] command practicle.LogsDB command: getMore { getMore: 153112738302, collection: "LogsDB" } originatingCommand: { find: "LogsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:153112738302 keysExamined:0 docsExamined:108337 numYields:846 nreturned:108337 reslen:16777231 locks:{ Global: { acquireCount: { r: 1694 } }, Database: { acquireCount: { r: 847 } }, Collection: { acquireCount: { r: 847 } } } protocol:op_query 152ms
[52.220.238.102]2020-06-03T12:52:26.782+0000 I COMMAND  [conn105] command practicle.LogsDB command: getMore { getMore: 153112738302, collection: "LogsDB" } originatingCommand: { find: "LogsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:153112738302 keysExamined:0 docsExamined:105990 numYields:828 nreturned:105990 reslen:16777221 locks:{ Global: { acquireCount: { r: 1658 } }, Database: { acquireCount: { r: 829 } }, Collection: { acquireCount: { r: 829 } } } protocol:op_query 120ms
[52.220.238.102]2020-06-03T12:52:27.979+0000 I COMMAND  [conn106] command practicle.SkillNodesDB command: getMore { getMore: 218601341304, collection: "SkillNodesDB" } originatingCommand: { find: "SkillNodesDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:218601341304 keysExamined:0 docsExamined:59617 cursorExhausted:1 numYields:466 nreturned:59618 reslen:15288022 locks:{ Global: { acquireCount: { r: 934 } }, Database: { acquireCount: { r: 467 } }, Collection: { acquireCount: { r: 467 } } } protocol:op_query 164ms
[52.220.238.102]2020-06-03T12:52:31.488+0000 I COMMAND  [conn107] command practicle.ActionsDB command: getMore { getMore: 39364000696, collection: "ActionsDB" } originatingCommand: { find: "ActionsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:39364000696 keysExamined:0 docsExamined:72792 numYields:569 nreturned:72792 reslen:16777238 locks:{ Global: { acquireCount: { r: 1140 } }, Database: { acquireCount: { r: 570 } }, Collection: { acquireCount: { r: 570 } } } protocol:op_query 102ms
[52.220.238.102]2020-06-03T12:52:31.500+0000 I COMMAND  [conn105] command practicle.LogsDB command: getMore { getMore: 153112738302, collection: "LogsDB" } originatingCommand: { find: "LogsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:153112738302 keysExamined:0 docsExamined:105370 numYields:824 nreturned:105370 reslen:16777156 locks:{ Global: { acquireCount: { r: 1650 } }, Database: { acquireCount: { r: 825 } }, Collection: { acquireCount: { r: 825 } } } protocol:op_query 131ms
[52.220.238.102]2020-06-03T12:52:32.783+0000 I COMMAND  [conn107] command practicle.ActionsDB command: getMore { getMore: 39364000696, collection: "ActionsDB" } originatingCommand: { find: "ActionsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:39364000696 keysExamined:0 docsExamined:73652 numYields:577 nreturned:73652 reslen:16777261 locks:{ Global: { acquireCount: { r: 1156 } }, Database: { acquireCount: { r: 578 } }, Collection: { acquireCount: { r: 578 } } } protocol:op_query 174ms
[52.220.238.102]2020-06-03T12:52:32.789+0000 I COMMAND  [conn105] command practicle.LogsDB command: getMore { getMore: 153112738302, collection: "LogsDB" } originatingCommand: { find: "LogsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:153112738302 keysExamined:0 docsExamined:71450 cursorExhausted:1 numYields:559 nreturned:71451 reslen:11506945 locks:{ Global: { acquireCount: { r: 1120 } }, Database: { acquireCount: { r: 560 } }, Collection: { acquireCount: { r: 560 } } } protocol:op_query 133ms
[52.220.238.102]2020-06-03T12:52:37.377+0000 I NETWORK  [thread1] connection accepted from 127.0.0.1:51860 #108 (12 connections now open)
[52.220.238.102]2020-06-03T12:52:50.955+0000 I -        [conn105] end connection 127.0.0.1:51826 (12 connections now open)
[52.220.238.102]2020-06-03T12:52:50.955+0000 I -        [conn106] end connection 127.0.0.1:51828 (11 connections now open)
[52.220.238.102]2020-06-03T12:52:50.956+0000 I -        [conn107] end connection 127.0.0.1:51830 (11 connections now open)
[52.220.238.102]2020-06-03T12:52:50.956+0000 I -        [conn104] end connection 127.0.0.1:51824 (11 connections now open)
[52.220.238.102]2020-06-03T12:52:50.955+0000 I -        [conn108] end connection 127.0.0.1:51860 (12 connections now open)
[52.220.238.102]2020-06-03T14:35:25.769+0000 I -        [conn103] end connection 172.17.0.5:37404 (7 connections now open)
[52.220.238.102]2020-06-03T14:35:25.771+0000 I -        [conn102] end connection 172.17.0.5:37320 (6 connections now open)
[52.220.238.102]2020-06-03T14:35:25.771+0000 I -        [conn99] end connection 172.17.0.5:37226 (5 connections now open)
[52.220.238.102]2020-06-03T14:35:25.771+0000 I -        [conn101] end connection 172.17.0.5:37306 (5 connections now open)
[52.220.238.102]2020-06-03T14:35:25.772+0000 I -        [conn100] end connection 172.17.0.5:37304 (3 connections now open)
[52.220.238.102]2020-06-03T14:35:25.772+0000 I -        [conn97] end connection 172.17.0.5:37220 (2 connections now open)
[52.220.238.102]2020-06-03T14:35:26.225+0000 I -        [conn98] end connection 172.17.0.5:37224 (1 connection now open)
[52.220.238.102]2020-06-03T14:35:30.650+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:53346 #109 (1 connection now open)
[52.220.238.102]2020-06-03T14:35:30.669+0000 I NETWORK  [conn109] received client metadata from 172.17.0.5:53346 conn109: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T14:35:30.681+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:53350 #110 (2 connections now open)
[52.220.238.102]2020-06-03T14:35:30.681+0000 I NETWORK  [conn110] received client metadata from 172.17.0.5:53350 conn110: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T14:35:30.683+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:53352 #111 (3 connections now open)
[52.220.238.102]2020-06-03T14:35:30.684+0000 I NETWORK  [conn111] received client metadata from 172.17.0.5:53352 conn111: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T14:35:38.867+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:53394 #112 (4 connections now open)
[52.220.238.102]2020-06-03T14:35:38.868+0000 I NETWORK  [conn112] received client metadata from 172.17.0.5:53394 conn112: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T14:35:39.204+0000 I COMMAND  [conn109] command practicle.transactions command: find { find: "transactions", filter: { state: { $in: [ "pending" ] } }, sort: { lastModified: 1 }, returnKey: false, showRecordId: false } planSummary: COLLSCAN keysExamined:0 docsExamined:197615 hasSortStage:1 cursorExhausted:1 numYields:1545 nreturned:0 reslen:110 locks:{ Global: { acquireCount: { r: 3092 } }, Database: { acquireCount: { r: 1546 } }, Collection: { acquireCount: { r: 1546 } } } protocol:op_query 334ms
[52.220.238.102]2020-06-03T14:35:42.742+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:53408 #113 (5 connections now open)
[52.220.238.102]2020-06-03T14:35:42.743+0000 I NETWORK  [conn113] received client metadata from 172.17.0.5:53408 conn113: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T14:37:13.351+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:53820 #114 (6 connections now open)
[52.220.238.102]2020-06-03T14:37:13.357+0000 I NETWORK  [conn114] received client metadata from 172.17.0.5:53820 conn114: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T14:37:13.368+0000 I NETWORK  [thread1] connection accepted from 172.17.0.5:53822 #115 (7 connections now open)
[52.220.238.102]2020-06-03T14:37:13.375+0000 I NETWORK  [conn115] received client metadata from 172.17.0.5:53822 conn115: { driver: { name: "nodejs", version: "3.5.4" }, os: { type: "Linux", name: "linux", architecture: "x64", version: "4.4.0-1020-aws" }, platform: "'Node.js v12.16.1, LE (legacy)" }
[52.220.238.102]2020-06-03T15:22:28.449+0000 I NETWORK  [thread1] connection accepted from 127.0.0.1:37704 #116 (8 connections now open)
[52.220.238.102]2020-06-03T15:22:28.453+0000 I NETWORK  [thread1] connection accepted from 127.0.0.1:37706 #117 (9 connections now open)
[52.220.238.102]2020-06-03T15:22:28.507+0000 I NETWORK  [thread1] connection accepted from 127.0.0.1:37708 #118 (10 connections now open)
[52.220.238.102]2020-06-03T15:22:28.524+0000 I NETWORK  [thread1] connection accepted from 127.0.0.1:37710 #119 (11 connections now open)
[52.220.238.102]2020-06-03T15:22:28.655+0000 I COMMAND  [conn119] command practicle.SkillNodesDB command: getMore { getMore: 217347501431, collection: "SkillNodesDB" } originatingCommand: { find: "SkillNodesDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:217347501431 keysExamined:0 docsExamined:62150 numYields:485 nreturned:62149 reslen:16777236 locks:{ Global: { acquireCount: { r: 972 } }, Database: { acquireCount: { r: 486 } }, Collection: { acquireCount: { r: 486 } } } protocol:op_query 123ms
[52.220.238.102]2020-06-03T15:22:28.658+0000 I COMMAND  [conn116] command practicle.LogsDB command: getMore { getMore: 154553564694, collection: "LogsDB" } originatingCommand: { find: "LogsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:154553564694 keysExamined:0 docsExamined:107972 numYields:843 nreturned:107971 reslen:16777124 locks:{ Global: { acquireCount: { r: 1688 } }, Database: { acquireCount: { r: 844 } }, Collection: { acquireCount: { r: 844 } } } protocol:op_query 155ms
[52.220.238.102]2020-06-03T15:22:28.666+0000 I COMMAND  [conn117] command practicle.ActionsDB command: getMore { getMore: 40175217692, collection: "ActionsDB" } originatingCommand: { find: "ActionsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:40175217692 keysExamined:0 docsExamined:85399 numYields:668 nreturned:85398 reslen:16777204 locks:{ Global: { acquireCount: { r: 1338 } }, Database: { acquireCount: { r: 669 } }, Collection: { acquireCount: { r: 669 } } } protocol:op_query 156ms
[52.220.238.102]2020-06-03T15:22:28.710+0000 I COMMAND  [conn118] command practicle.RecordsDB command: getMore { getMore: 199594201574, collection: "RecordsDB" } originatingCommand: { find: "RecordsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:199594201574 keysExamined:0 docsExamined:78979 numYields:617 nreturned:78978 reslen:16777250 locks:{ Global: { acquireCount: { r: 1236 } }, Database: { acquireCount: { r: 618 } }, Collection: { acquireCount: { r: 618 } } } protocol:op_query 189ms
[52.220.238.102]2020-06-03T15:22:29.808+0000 I COMMAND  [conn117] command practicle.ActionsDB command: getMore { getMore: 40175217692, collection: "ActionsDB" } originatingCommand: { find: "ActionsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:40175217692 keysExamined:0 docsExamined:76766 numYields:600 nreturned:76766 reslen:16777031 locks:{ Global: { acquireCount: { r: 1202 } }, Database: { acquireCount: { r: 601 } }, Collection: { acquireCount: { r: 601 } } } protocol:op_query 167ms
[52.220.238.102]2020-06-03T15:22:29.832+0000 I COMMAND  [conn118] command practicle.RecordsDB command: getMore { getMore: 199594201574, collection: "RecordsDB" } originatingCommand: { find: "RecordsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:199594201574 keysExamined:0 docsExamined:79649 numYields:622 nreturned:79649 reslen:16777109 locks:{ Global: { acquireCount: { r: 1246 } }, Database: { acquireCount: { r: 623 } }, Collection: { acquireCount: { r: 623 } } } protocol:op_query 129ms
[52.220.238.102]2020-06-03T15:22:30.015+0000 I COMMAND  [conn116] command practicle.LogsDB command: getMore { getMore: 154553564694, collection: "LogsDB" } originatingCommand: { find: "LogsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:154553564694 keysExamined:0 docsExamined:108330 numYields:847 nreturned:108330 reslen:16777246 locks:{ Global: { acquireCount: { r: 1696 } }, Database: { acquireCount: { r: 848 } }, Collection: { acquireCount: { r: 848 } } } protocol:op_query 107ms
[52.220.238.102]2020-06-03T15:22:31.054+0000 I COMMAND  [conn117] command practicle.ActionsDB command: getMore { getMore: 40175217692, collection: "ActionsDB" } originatingCommand: { find: "ActionsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:40175217692 keysExamined:0 docsExamined:77508 numYields:605 nreturned:77508 reslen:16777232 locks:{ Global: { acquireCount: { r: 1212 } }, Database: { acquireCount: { r: 606 } }, Collection: { acquireCount: { r: 606 } } } protocol:op_query 104ms
[52.220.238.102]2020-06-03T15:22:31.461+0000 I COMMAND  [conn116] command practicle.LogsDB command: getMore { getMore: 154553564694, collection: "LogsDB" } originatingCommand: { find: "LogsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:154553564694 keysExamined:0 docsExamined:108337 numYields:846 nreturned:108337 reslen:16777231 locks:{ Global: { acquireCount: { r: 1694 } }, Database: { acquireCount: { r: 847 } }, Collection: { acquireCount: { r: 847 } } } protocol:op_query 111ms
[52.220.238.102]2020-06-03T15:22:32.443+0000 I COMMAND  [conn117] command practicle.ActionsDB command: getMore { getMore: 40175217692, collection: "ActionsDB" } originatingCommand: { find: "ActionsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:40175217692 keysExamined:0 docsExamined:77451 numYields:605 nreturned:77451 reslen:16776767 locks:{ Global: { acquireCount: { r: 1212 } }, Database: { acquireCount: { r: 606 } }, Collection: { acquireCount: { r: 606 } } } protocol:op_query 129ms
[52.220.238.102]2020-06-03T15:22:32.466+0000 I COMMAND  [conn118] command practicle.RecordsDB command: getMore { getMore: 199594201574, collection: "RecordsDB" } originatingCommand: { find: "RecordsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:199594201574 keysExamined:0 docsExamined:80043 numYields:625 nreturned:80043 reslen:16777090 locks:{ Global: { acquireCount: { r: 1252 } }, Database: { acquireCount: { r: 626 } }, Collection: { acquireCount: { r: 626 } } } protocol:op_query 102ms
[52.220.238.102]2020-06-03T15:22:39.019+0000 I COMMAND  [conn116] command practicle.LogsDB command: getMore { getMore: 154553564694, collection: "LogsDB" } originatingCommand: { find: "LogsDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:154553564694 keysExamined:0 docsExamined:71547 cursorExhausted:1 numYields:560 nreturned:71548 reslen:11523102 locks:{ Global: { acquireCount: { r: 1122 } }, Database: { acquireCount: { r: 561 } }, Collection: { acquireCount: { r: 561 } } } protocol:op_query 101ms
[52.220.238.102]2020-06-03T15:22:43.465+0000 I NETWORK  [thread1] connection accepted from 127.0.0.1:37714 #120 (12 connections now open)
[52.220.238.102]2020-06-03T15:22:51.384+0000 I COMMAND  [conn119] command practicle.transactions command: getMore { getMore: 293313308227, collection: "transactions" } originatingCommand: { find: "transactions", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:293313308227 keysExamined:0 docsExamined:23805 numYields:187 nreturned:23805 reslen:16718147 locks:{ Global: { acquireCount: { r: 376 } }, Database: { acquireCount: { r: 188 } }, Collection: { acquireCount: { r: 188 } } } protocol:op_query 116ms
[52.220.238.102]2020-06-03T15:22:52.835+0000 I COMMAND  [conn118] command practicle.PracticeDB command: getMore { getMore: 182207815812, collection: "PracticeDB" } originatingCommand: { find: "PracticeDB", skip: 0, snapshot: true } planSummary: COLLSCAN cursorid:182207815812 keysExamined:0 docsExamined:143 numYields:2 nreturned:143 reslen:15820574 locks:{ Global: { acquireCount: { r: 6 } }, Database: { acquireCount: { r: 3 } }, Collection: { acquireCount: { r: 3 } } } protocol:op_query 102ms
[52.220.238.102]2020-06-03T15:22:56.955+0000 I -        [conn118] end connection 127.0.0.1:37708 (12 connections now open)
[52.220.238.102]2020-06-03T15:22:56.955+0000 I -        [conn120] end connection 127.0.0.1:37714 (12 connections now open)
[52.220.238.102]2020-06-03T15:22:56.955+0000 I -        [conn119] end connection 127.0.0.1:37710 (12 connections now open)
[52.220.238.102]2020-06-03T15:22:56.955+0000 I -        [conn117] end connection 127.0.0.1:37706 (9 connections now open)
[52.220.238.102]2020-06-03T15:22:56.955+0000 I -        [conn116] end connection 127.0.0.1:37704 (9 connections now open)

I’m not sure how to read these, are there indications of a lockup?

I think this is the MongoDB driver version. Where are you hosting your database? Atlas or compose have admin dashboard with logs, and again double check the DB version, I can already see some global collection locks so it has to be an older MongoDB version.

I’m hosting it locally together with Meteor on a single EC2 instance, together with Oplog. Is that a bad idea? If so maybe I’ll shift it to Atlas. I use Meteor-up with the following configuration:

mongo: { // remove if using remote mongo server
oplog: true,
port: 27017,
version: ‘3.4.1’,
//version: ‘4.0.6’,
servers: {
one: {}
}
}

Would a faster instance help until I move the database to Atlas?

Yeah I don’t think it’s a good idea to have your database and sever on the same machine in production.

I would suggest moving to Atlas and make sure you select the same region as your server.

As said, I suspect the issue is due to the users collection locking, Atlas has WiredTiger with built-in concurrency access, so it should solve that problem.

Alright I’ll do the shift tonight, thank you for your advice Alawi!

1 Like

Hi Alawi and guys,

we moved our database to Atlas! We also identified a method call that ends up in an infinite loop under certain conditions. On the server side it goes into a loop and calls heavy lookups on the mongodb side, so this might be the one causing the lockups.

After we fixed those issues the login problem disappeared, but it reappeared once this morning. I checked the MongoDB logs and it only contains information logs so it doesn’t seem to be a problem with MongoDB this time.

During that time, our APM showed a flat curve for some methods:

After this happened, some subscriptions went dead, but the server still responds to some method calls.

We did a restart on the server at around 12:15 and it recovered immediately.

Does anyone have any idea what could cause something like this to happen?

When this happened to me it was due to a problen with the oplog. I’ll see if I can dig around my notes and fine the exact problem. It also involved either a primary failiver or switch to a different secondary for reads. We never had this again after switching to redis oplog. I think the issue relationship is to how the oplog driver handles reconnects

Hey znewsham, thanks! I went to reread the whole thread again, I realized I haven’t added readPreference=primary yet so I might just try that first. I didn’t see any indication in Atlas that there was a failover though…

So I made the shift to redis-oplog and the login problem went away. Pub/sub and method timings also came down. There was an instance where our server became unresponsive but I believe that’s a separate problem.

Kudos to the redis-oplog team! I think everyone should try redis-oplog for their Meteor app.

2 Likes