NodeJS ssh timeout issue

In my Meteor application I open multiple parallel SSH connections, up to a maximum of 200. I use the Node.js-based node-ssh library to establish the connections (node-ssh in turn uses the ssh2 library), and the async library to run the SSH tasks in parallel: I run 200 async tasks to open 200 parallel SSH connections.

When I run the application with Meteor 1.8, it runs properly. But when I run it with Meteor 1.9 or above, I get a lot of SSH timeouts and I can’t complete my activity. Meteor 1.9 and higher versions use Node v12.14.0 or higher. I ran the application using Node v12 in production and it worked, so I guess something has changed in Node v12.14.0 and higher versions.

I can tweak the SSH timeout and the number of parallel SSH connections to make the application work properly with Node v12.14.0/12.16.1. However, this decreases performance significantly.

My question is: has something changed in relation to SSH or async tasks in Node v12.14.0 or v12.16.1 such that I can’t run 200 SSH tasks in parallel? I am wondering whether the behaviour of worker threads has changed in higher versions of Node.js, so that I am no longer able to open 200 parallel SSH connections. Any suggestions on what the problem is or how I can debug it?

This sounds like a deeply technical issue, so if you haven’t already added an issue to the Meteor repo, you should do that now.
Maybe a core developer (Benjamin?) can shed some light on what has changed and how you might diagnose this further.

My gut feeling is that something might have changed with Fibers?
You said you’ve tested it with Node 12. Which version? Have you tested the old app in production with 12.14.0?

Thanks @coagmano for looking into this.

I do the following in my code: via the node-ssh library, I open parallel SSH connections, and I use the async library’s async.parallelLimit to schedule them as asynchronous parallel tasks.
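For readers who want to reproduce the pattern, here is a minimal sketch of a capped pool of connection tasks with a per-task timeout. It is not the code from my application and does not use node-ssh or async: `withTimeout`, `runWithLimit` and `fakeConnect` are hypothetical names, and `fakeConnect` is only a timer standing in for a real SSH connect call.

```javascript
// Sketch only: bounded concurrency plus a per-task timeout.
// All names here are hypothetical; fakeConnect stands in for an SSH connect.

// Reject if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Run `tasks` (functions returning promises) with at most `limit` in flight.
// Failures are recorded per task instead of aborting the whole batch.
async function runWithLimit(tasks, limit, timeoutMs) {
  const results = new Array(tasks.length);
  let next = 0;

  async function worker() {
    while (next < tasks.length) {
      const i = next++; // claim the next task index
      try {
        const value = await withTimeout(tasks[i](), timeoutMs);
        results[i] = { ok: true, value };
      } catch (err) {
        results[i] = { ok: false, error: err.message };
      }
    }
  }

  const workers = Array.from({ length: Math.min(limit, tasks.length) }, worker);
  await Promise.all(workers);
  return results;
}

// Hypothetical stand-in for an SSH connection attempt that takes `delayMs`.
function fakeConnect(host, delayMs) {
  return new Promise((resolve) =>
    setTimeout(() => resolve(`connected:${host}`), delayMs)
  );
}
```

In terms of this sketch, my original setup corresponds to 200 tasks with `limit` 200 and `timeoutMs` 20000, and the workaround to `limit` 50 and `timeoutMs` 180000.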

With an SSH timeout of 20 seconds and 200 parallel SSH connections, my application works properly on my production deployment with Meteor 1.8.0.2 and Node v8.11.4. With the same values on Meteor v1.9 or above and Node v12.14.0 or higher (including Node v12.16.1) in production, I get a lot of SSH timeouts and my application does not work. However, if I use Node v12.0.0 instead of Node v12.14.0 with Meteor v1.9 for the production deployment, my application works properly with the same SSH timeout and number of parallel SSH connections.

To make my application work properly with Meteor 1.9 on Node v12.14.0 or higher, I have to set the SSH timeout to 3 minutes instead of 20 seconds and the number of parallel SSH connections to 50 instead of 200. However, this slows down my application by 15-20 minutes.

I have been debugging this for quite some time to see whether I am doing something wrong in my code, but I can’t pinpoint anything. I suspect it’s a Node or Meteor Fibers issue, as you point out, @coagmano. I will raise this issue in the Meteor repo, thanks.
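One check that might help narrow this down is to measure event-loop delay while the connections are being opened. This rests on an assumption that nothing in this thread confirms: that the timeouts happen because the main thread is too busy to service the connections in time. The sketch uses Node's built-in `perf_hooks.monitorEventLoopDelay` (available in Node v12); `measureLoopDelay` and `busyWait` are hypothetical helpers, and `busyWait` only simulates blocking work.

```javascript
const { monitorEventLoopDelay } = require('perf_hooks');

// Hypothetical helper: sample event-loop delay while `work` runs.
async function measureLoopDelay(work) {
  const histogram = monitorEventLoopDelay({ resolution: 10 }); // sample every 10 ms
  histogram.enable();
  try {
    await work();
  } finally {
    histogram.disable();
  }
  // Histogram values are reported in nanoseconds; convert to milliseconds.
  return {
    meanMs: histogram.mean / 1e6,
    maxMs: histogram.max / 1e6,
    p99Ms: histogram.percentile(99) / 1e6,
  };
}

// Hypothetical stand-in for CPU-bound work that blocks the event loop.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // spin
  }
}

// Small promise-based sleep used to let the loop turn before and after blocking.
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}
```

Wrapping the batch of connection attempts in `measureLoopDelay` under Node v12.0.0 and again under v12.14.0 would show whether the loop is stalling for long stretches in one version and not the other. If the numbers look similar, the cause is probably somewhere else.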

It seems that there may be an issue with running 200 parallel SSH connections using the Node.js-based node-ssh library in Meteor applications with versions 1.9 or higher.

While the application runs properly with Meteor 1.8, running the same application with Meteor 1.9 or higher, which uses Node.js v12.14.0 or higher, results in SSH timeouts and incomplete activity.