I would be very interested in being able to run mocha units (`it`) or equivalent in parallel, with no side effects, even when those interact with the database. Making this work would require the following:
- Spawning multiple MongoDB instances
- Somehow tying the scope of each unit to a particular instance
- Recycling MongoDB instances for performance
In addition to that, `--full-app` would require parallel, recyclable web app instances, with scopes tied to particular instances of `window` or whatever browser driver is used (Puppeteer, for instance).
Why am I thinking about this? I am currently writing a suite of integration tests for an app whose maintainability I want to improve, and I can see the tests' total running time slowly climbing. I have eight logical cores on my machine, and most tests spend much of their time waiting for asynchronous calls to complete; it seems like a lot of wasted time.
Maybe this will require some deeper integration with the test driver, like passing these scope-specific variables explicitly. Maybe this is out of scope for Meteor and requires a separate tool. Does someone with more experience have thoughts on this?
Ideally I would switch the test driver to AVA, but that's not necessary. After sifting through Meteor's sources, it seems that's where the change needs to happen first, not in the test driver.
Truth is, I tried to figure out how difficult it would be to write my own Meteor test driver with Jest or AVA (without even thinking about parallel/isolated testing yet), and I could not get anywhere. It's a tangle of code where it took me an inordinate amount of time to figure out what is responsible for loading test files (it seems to be the `meteor test` command, though I'm not sure; I would have expected the driver, mocha, to be responsible). I was then unable to find where that list of files is passed down to the driver; `mocha.run` just knows what to do magically, and that is beyond my understanding.
I'd be very interested in potential full solutions to this. I've been looking into parts of this for my own projects, specifically the database portion. Side effects in general are hard to handle, as are full-app tests vs. unit tests (specifically client-driven tests vs. server ones), so I'll skip over those pieces.
Database-level isolation isn't too hard if you aren't afraid to get your hands dirty modifying a couple of packages, and if your test driver triggers tests from the server (or allows you to run custom code on the server prior to running the test).
You don't need multiple instances of a Mongo server running; you just need to connect to a different database each time. You can do this in a few steps:
- Create a Meteor environment variable to track the current test context. The value of this variable is bound to a fiber and can therefore be used safely in parallel.
- In the `beforeEach` of every test, set this to a new random id. This will be the database name for all future database interactions.
- Patch your Mongo collection operations, or better yet the mongo package, so that the underlying operations are issued against a new connection each time (connecting to the database name stored in the environment variable from step 1).
- You might need to patch your test driver to spawn a new fiber for each test if it doesn't already do this; you may also need to patch it to allow running tests in parallel at all.
- (Optional) In your `afterEach`, clean up the db.
There are limitations to this, of course; in particular, any asynchronous work done by your tests would need to be awaited before the test completes.
As I said, general side effects are hard to handle generically, so you'd still have trouble if you're doing things like counting invocations of methods that can be triggered asynchronously.
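To illustrate why the awaiting matters: a fire-and-forget operation can resolve after the test's database context has been torn down, so it runs against the wrong context. A minimal sketch (plain Node, all names hypothetical):

```javascript
// Stands in for the per-test database context set in beforeEach.
let activeContext = null;

// A deferred "insert" that resolves against whatever context is
// current at resolution time, not at call time.
async function insertLater(doc, delayMs) {
  await new Promise((resolve) => setTimeout(resolve, delayMs));
  return { context: activeContext, doc };
}

// Simulated test runner: sets up a context, runs the body, tears down.
async function runTest(name, body) {
  activeContext = name;        // beforeEach: fresh db context
  const result = await body(); // only work awaited here stays in context
  activeContext = null;        // afterEach: context torn down
  return result;
}
```

If the test body awaits `insertLater`, the operation sees its own context; if it merely schedules it, the promise resolves after teardown.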
I'd definitely be interested in this too. In our case, though, we aren't so much worried about testing the interaction with the database as about testing against the data stored in it. So my thought was to have the one database and use something like `hwillson:stub-collections` just to load the data from it into virtual collections for each worker; the `beforeEach` hook then wipes them with a fresh copy from the db each time.
Our test db is only ~7MB in size, so loading the whole thing isn't really an issue for us, but I can see it being a problem for people with larger data sets. Maybe a way to define which collections, and perhaps even which documents, to load would be needed?
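A rough sketch of that snapshot-and-wipe idea (all names here are hypothetical; the `snapshot` object stands in for data loaded once from the real test db, and the virtual collections are modeled as plain arrays rather than `hwillson:stub-collections` instances):

```javascript
// Loaded once per worker from the real test database (~7MB in our case).
const snapshot = {
  users: [{ _id: '1', name: 'alice' }, { _id: '2', name: 'bob' }],
};

// Build a fresh set of virtual collections for one test. Deep-copying
// ensures one test's mutations never leak into the next test's data.
function freshCollections(names = Object.keys(snapshot)) {
  const collections = {};
  for (const name of names) {
    collections[name] = JSON.parse(JSON.stringify(snapshot[name] || []));
  }
  return collections;
}

// beforeEach hook: wipe the worker's collections with a fresh copy.
function beforeEachHook(ctx) {
  ctx.collections = freshCollections();
}
```

Restricting the `names` argument is one way to load only selected collections when the data set is large.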