I’m researching the fastest way to search in a graph (an array of objects) and have compared Underscore’s .findWhere (the matching object can only occur once in the graph) with .find:
.find 0.363ms
.where 0.205ms
.findWhere 0.089ms
This is in local debug mode to purposely run it slower (otherwise it’s 0.013ms vs 0.113ms).
Even .where, which goes through the whole array of objects, is faster than .find (which is a more recent implementation).
Why is it slower, or rather, why is Underscore faster?
It comes down to computation: finding on the graph runs over the entire dataset, whereas in Underscore it’s an object comparison. The way this is done weighs heavily on the speed. I’m surprised a purely vanilla method isn’t faster, because Underscore is a library, so to make this even faster I would look at doing it in pure JS.
In the past I was able to achieve a huge speed increase in a similar manner by using the .includes method for array lookups in JS. It’s much faster than anything else out there, and I was told that this is due to the compilation of JS itself: these methods are worked in to be as quick to compile as possible.
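For reference, a minimal sketch of what a pure-JS, findWhere-style lookup could look like. The `graph` array, its `id`/`label` fields, and the `findFirstMatch` name are all hypothetical, since the thread does not show the real data; this is one way to write it, not how Underscore does it internally.

```javascript
// Hypothetical data shape: the real graph objects are not shown in the thread.
const graph = [
  { id: 'a', label: 'first' },
  { id: 'b', label: 'second' },
  { id: 'c', label: 'third' },
];

// Plain-loop lookup: returns the first object whose properties match
// every key/value pair in `attrs`, and stops as soon as it finds one.
function findFirstMatch(list, attrs) {
  const keys = Object.keys(attrs);
  for (let i = 0; i < list.length; i++) {
    const item = list[i];
    let matches = true;
    for (let k = 0; k < keys.length; k++) {
      if (item[keys[k]] !== attrs[keys[k]]) {
        matches = false;
        break;
      }
    }
    if (matches) return item;
  }
  return undefined;
}

const hit = findFirstMatch(graph, { id: 'b' });
console.log(hit);
```

Whether this actually beats either variant would still have to be measured on the real data.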
Out of interest, could you try without object destructuring in the args?
Underscore’s implementation calls quite a lot of functions (where calls find, which calls isArrayLike and findIndex), so I suspect the big difference would be the destructuring.
Is this in Meteor or native JS? In Meteor the object destructuring is transpiled by Babel, possibly making it more expensive (or cheaper, who knows, lol).
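To make the suggestion concrete, here are the two callback styles side by side. This is only a guess at what the original code looks like: `nodes`, `targetId`, and the `id` field are hypothetical.

```javascript
// Hypothetical data; the original call from the thread is not shown.
const nodes = [{ id: 1 }, { id: 2 }, { id: 3 }];
const targetId = 2;

// With object destructuring in the callback arguments.
const withDestructuring = nodes.find(({ id }) => id === targetId);

// Without destructuring: read the property off the argument instead.
const withoutDestructuring = nodes.find((node) => node.id === targetId);

// Both calls return the same object; only the callback shape differs.
console.log(withDestructuring === withoutDestructuring);
```

Timing both variants on the same data would show whether the destructuring (or its transpiled output) matters here.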
Be mindful of how you benchmark this. The measurements are really only useful if done in a production environment. The JIT compiler can optimize things in ways that aren’t obvious, so it’s entirely possible that the native version is in fact more performant under load.
This talk discusses the woes of benchmarking. It’s somewhat old, but I imagine the JIT optimizations have only improved since then.
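As a rough illustration of the JIT point, a benchmark can run each candidate many times, with a warm-up phase before timing starts. This sketch assumes server-side Node (it uses `process.hrtime.bigint()`); the `nodes` data, the `benchmark` helper, and the run counts are made up for the example, and the numbers it prints still won’t replace a measurement under real load.

```javascript
// Hypothetical sample data; the real graph from the thread is not shown.
const nodes = Array.from({ length: 1000 }, (_, i) => ({ id: i }));

// Runs `fn` a number of times untimed (warm-up), then times further runs
// and returns the average duration per call in milliseconds.
function benchmark(fn, warmupRuns = 1000, timedRuns = 10000) {
  let lastResult; // keep the result so the call is less likely to be optimized away
  for (let i = 0; i < warmupRuns; i++) lastResult = fn();
  const start = process.hrtime.bigint();
  for (let i = 0; i < timedRuns; i++) lastResult = fn();
  const elapsedNs = process.hrtime.bigint() - start;
  return { avgMs: Number(elapsedNs) / timedRuns / 1e6, lastResult };
}

const viaFind = benchmark(() => nodes.find((node) => node.id === 900));
const viaLoop = benchmark(() => {
  for (let i = 0; i < nodes.length; i++) {
    if (nodes[i].id === 900) return nodes[i];
  }
  return undefined;
});

console.log(`find: ${viaFind.avgMs.toFixed(6)} ms, loop: ${viaLoop.avgMs.toFixed(6)} ms`);
```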
It’s a bit more difficult to do benchmarking on the production system, as I can’t run this function 100x or even more just to see a proper time difference. Unless I maybe run the benchmark in a delayed job after the startup of the server is finished, but then I’d need to build up the array of objects myself, which isn’t as realistic as using real data.
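One possible alternative to a synthetic benchmark, sketched here under assumptions: wrap the lookup so that every real call on real data is timed and the durations are accumulated. The `withTiming` wrapper and the `lookup` function are hypothetical names, and the timer assumes server-side Node.

```javascript
// Wraps `fn` so each real call is timed; totals are kept in a closure.
function withTiming(fn) {
  const stats = { calls: 0, totalNs: 0n };
  function timed(...args) {
    const start = process.hrtime.bigint();
    const result = fn.apply(this, args);
    stats.totalNs += process.hrtime.bigint() - start;
    stats.calls += 1;
    return result;
  }
  // Average duration per call in milliseconds (0 if never called).
  timed.averageMs = () =>
    stats.calls === 0 ? 0 : Number(stats.totalNs) / stats.calls / 1e6;
  timed.calls = () => stats.calls;
  return timed;
}

// Hypothetical lookup; in practice this would be the function under test.
const lookup = (list, id) => list.find((node) => node.id === id);
const timedLookup = withTiming(lookup);

const found = timedLookup([{ id: 1 }, { id: 2 }], 2);
console.log(found, timedLookup.calls(), timedLookup.averageMs());
```

The wrapper adds a small overhead of its own to every call, so it is better suited to comparing two variants than to reading absolute numbers.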