Can we solve React cursor inefficiency? (vs Blaze)

gadicc · September 30, 2015, 10:57am

Here’s one for the React experts…

My biggest barrier to checking out React with Meteor is how inefficiently some things are handled. I agree in a lot of circumstances these things aren’t that noticeable, but when handling our own animations and working with big lists on mobile, this can be a real pain. Take e.g. the official TODO example:

App = React.createClass({

  mixins: [ReactMeteorData],

  getMeteorData() {
    return { Tasks.find().fetch() };
  },

  renderTasks() {
    return this.data.tasks.map((task) => {
      return <Task key={task._id} task={task} />;    
  },

  render() { ... <ul> {this.renderTasks()} </ul> ... }
});

So let’s say we have 2000 tasks already rendered, and a new one is added. What happens?

getMeteorData() is invalidated and Tasks.find().fetch() is rerun. So minimongo has to rebuild the entire list from scratch.
renderTasks() is rerun after the state change, iterating from scratch (via map()) over the entire query result set and rendering 2001 nodes to the virtual dom.
vdom diff (which iterates over 2001 children to figure out that 1 was added) and patch 1 element to the DOM.

That was painful just to type. Let’s look at Blaze: 1) cursor addedAt callback fired with just the new record, 2) Blaze.Each instance’s addedAt callback is run and a single view is renderered (and added to the DOM). In a super simple test, this is how long it took in each case to add the 2001st row:

Blaze: ~2 ms
React: ~160 ms
React: ~68 ms (after adding shouldComponentUpdate)

Is this solveable? Is anyone looking into a way to hack react and hijack the regular state update path? Is that idea an abomination that breaks react core principles and its implementation would be heresy? Is there already some advanced React feature I’m not aware of that helps address this?

I saw a proposal for sideways data loading with a gazillion comments including other proposals (even a ref to Meteor’s Tracker) and no consensus in sight. It wasn’t so obvious (to a React newcomer, at least) how exactly it would work and what problems it solves. Is this the missing magic we need? Will it ever land?

cc: @sashko - your input would be very welcome

serkandurusoy · September 30, 2015, 12:16pm

We should also ping our React expert in da house @SkinnyGeek1010

SkinnyGeek1010 · September 30, 2015, 1:17pm

If you can upload a repo to Github that replicates a simple example i’d love to play with it to try and optimize it.

When I last tried using getMeteorData it would diff 3-4x every time the list changed (putting a console.log in the component that has the posts.map(...). Here’s an example of that:
http://react-ive.meteor.com/feed

Adding shouldComponentUpdate helps if you can narrow down what’s changed without doing an expensive comparison (deep object comparisons are expensive). You want to ideally do this as high up in the tree as possible because it won’t diff any children if this should update returns false.

May I ask how you’re using shouldComponentUpdate? Perhaps theres a way to tune it. Using immutable data helps but isn’t quite do-able with a collection (though people are working on it). Then you can use a cheap === to compare and stop the cascading diff.

Is it still choppy after 68ms ? I think Blaze will always be faster in this case because it’s not checking the entire tree to ensure nothing has changed. Usually this cost is worth it in maintainability.

I know this isn’t applicable in your case but React Native has a construct for large lists that are animating:

gadicc · September 30, 2015, 2:05pm

Thanks, @SkinnyGeek1010. I knew I should have attached some example code so here it is: https://github.com/gadicc/react-vs-blaze. And yeah, it would have been better to run the test repeatedly and compute the average for a real benchmark, but I just wanted a general indication (I just ran it a few times to get a general idea and took that average). The shouldUpdate was just a simple _id check:

   shouldComponentUpdate(nextProps) { return nextProps.task._id !== this.props.task._id }

Well, anything above a few ms is guaranteed to be choppy, since we have 16.67ms to do our calculations, modify the DOM, repaint, etc. Actually Blaze’s 2ms is excessive and I’m sure could be improved, but that’s not for here (Yeah, ultimately we’ll be implementing our own ListView, just working out what to use for rendering content with it…).

But yeah, I know that currently Blaze will always be faster in this case. The question is if we can do something about it :>

SkinnyGeek1010 · September 30, 2015, 3:46pm

Cool thanks for posting a repo! Just browsing the code on my phone there should be a few things to speed it up… Also need to dive into the profiler a bit

I’ll try my best to report back tonight👍🏻

SkinnyGeek1010 · October 3, 2015, 2:07am

@gadicc I only had enough time to play with this for a bit but the only way I can drastically drop the time is to only diff the nodes that are visible. This gets around 6x faster than the shouldComponentUpdate method.

The only sane way to do this is to use a component… i’ve done something like this from scratch with Blonk (using Blaze) and it’s enough to pull your hair out

Here’s a few that I briefly tried:

Also using these tools may help (especially the wasted time func):

lai · October 3, 2015, 8:54pm

@jedwards created the immutable cursor observable that aims at speeding up things.

shock · October 3, 2015, 9:02pm

as @SkinnyGeek1010 prepared his cursor observing handlers, if we patch them to pass just observeChanges to some flux/redux immutable the way that just forward it as parameter for merge instead passing whole fetch() it could be interesting.

jedwards · October 3, 2015, 9:21pm

It will be interesting to see if it actually does – there is the overhead of doing the deep updates in the immutables. It may turn out that my package works best for handling deeply nested trees rather than large flat collections. Plus I haven’t ever checked the performance of Immutable.js itself,

jedwards · October 3, 2015, 9:28pm

Using immutable is totally do-able with a collection using my meteor-immutable-observer. And someone correct me if I’m wrong, but I don’t think Mongo.Cursor.observe or Mongo.Cursor.observeChanges loads a plain-JS copy in memory unless you do find().fetch() on a collection.

I’ll admit though, I think React’s performance is pretty overhyped. I don’t believe React will ever perform as well as tools like Blaze unless it undergoes major architectural changes. If you’re displaying a list of things in React, each time that list gets created, your render has to create the entire array of React elements again, and call their shouldComponentUpdate/diff those with the React components in the virtual DOM. Whereas Blaze can just create a single new element and add it in.

jedwards · October 3, 2015, 9:34pm

@gadicc Check out https://facebook.github.io/fixed-data-table/ – it only creates enough DOM elements to fill the screen, and swaps out their contents as you scroll, allowing efficient updates to tables containing potentially millions of rows. When you get to that scale you’ll run into problems with probably any framework, because you’d simply have too many DOM elements.

jedwards · October 3, 2015, 9:35pm

heh @SkinnyGeek1010 I just noticed you mentioned other similar components, cool!

SkinnyGeek1010 · October 4, 2015, 1:40am

I’m not sure it would help in this specific use case. The main bottleneck is running shouldComponentUpdate 2000 times.

An immutable array will be different once it adds an element. If you have an array of immutable objects you can use the === in the list item shouldComponentUpdate which in this case is the same as checking if the _id is the same.

However it would be huge if you needed to determine if a single document has changed (where the id would be the same but another key changed).

[quote="jedwards, post:10, topic:10590"] I'll admit though, I think React's performance is pretty overhyped. [/quote]

Yea I agree. I lot of people think React is so fast but performance is the bottom of the pros for me. It’s the architecture and maintainability that are a big win.

If you really need a fast declarative framework, Elm is pretty good (around 3x faster than React)
http://elm-lang.org/blog/blazing-fast-html

arunoda · October 4, 2015, 2:25am

This is one of the issue of the getMeteorData. It can invalidate due to a lot of reasons. So, this is something you can do. In somewhere else in your component, start a autorun and track this fetch.

Set the result into a ReactiveVar or a session. Then use it inside getMeteorData.
This is the way we can fix this issue.

To me, this is not an issue of React, but how getMeteorData is implemented.

gadicc · October 4, 2015, 3:30pm

Thank you everyone for all your detailed replies! So I think this is where we’re at:

I want to add a single new row:

Without rebuilding the entire dataset. Yes, instead of cursor.fetch() use @jedwards’s awesome meteor-immutable-observer, or via observe pattern (below).
Without re-rendering (virtually) every component for each member of that dataset (via map => <Comp>). Yes, presumably we can use an immutable map for this too, as long as we hold a ref to the “old” map, or via observe pattern (below).
Without diffing against the current component array. No, but thankfully this is the least resource intense part of the three, and good shouldComponentUpdate use can minimize it’s impact. It’s still something we need to solve though.

I did also find a solution though for fine-grained reactive updates (i.e. of existing components, but not adds/removes). I came across https://www.mendix.com/tech-blog/making-react-reactive-pursuit-high-performing-easily-maintainable-react-apps/ which takes a few pages out of the Blaze playbook and uses the mobservable mixin to reactively rerender only changed components (directly). Updating 10 components amongst 10,000 went from 2,433 ms to 63ms (still too much, but promising :)). I’m pretty sure we could use the same pattern with Meteor.

As best I can tell, for add/removes it looks like we’re out of luck, at least when trying to solve in “pure” React (i.e. as a react issue and not a DOM issue), in the current version.

The purpose of #3398 Implement Sideways Data Loading and alternatives (1,2, maybe more) is still not 100% clear to me (as a React newbie). I think it’s the same pattern as mobservable, so I’m still not sure if it would only help with updates vs adds/moves. If anyone understands this stuff better, some more details would be awesome. I did just notice that react-meteor-data does actually reference this issue so maybe @sashko could shed more light here.

I’ve never looked at the react source before (and am only just getting into using it at all) but when I have time I plan to look around and see how easily we can hack this without touching the react source, or maybe propose a PR of my own if nothing has happened by then. (Again, will be happy to be corrected if I misunderstood some part of React… it’s totally new for me).

@SkinnyGeek1010 and @jedwards, thanks again for all your input here. Regarding recycling DOM elements, yes, it’s a must for a case like this and is exactly what we’ll be doing, but I’m still looking if there’s a clear/transparent way of wrapping it with regular React and nothing fancy (like we can already do with Blaze). Really what I just want is addedAt, removedAt, movedTo methods, but maybe this breaks some React purist principles

@arunoda, I’m fortunate not to have come across that yet :> But I did double check that getMeteorData() is being run once only in this particular case and superfluous invalidations aren’t the issue here.

SkinnyGeek1010 · October 4, 2015, 4:15pm

Bummer, I was hoping that was worked out. In the early beta is was firing like 5x instead of once when the data was complete. I hope this gets resolved.

This is primarily the reason why I use Redux to store minimongo data. Instead of watching for subscription changes i’m just watching for a collection to change with meteor-flux-helpers. This only fires when the minimongo data changes and triggers one re-render at the same time.

jedwards · October 4, 2015, 5:49pm

No, if two immutables are ===, they are deeply equal. No matter what changed within a document, the cost to determine if it changed is one === comparison. With meteor-immutable-observer, when a single document changes, Meteor calls the changed callback of Mongo.Cursor.observeChanges that I provide with only the id and fields of the document that changed. I then update only those fields of the immutable for that document – all others remain the same. And the root immutable becomes a new object, but it still points to 1999 of the same objects as the last root immutable – only 1 is !== to the previous. Even unchanged fields within the changed document remain === the values for those fields in the previous immutable.

Now you could get more performance for changed documents by using observeChanges directly, and in your changed callback, calling forceUpdate of the React Component for the corresponding list item (or setState or some other sideways loading). That way you could avoid the other 1999 shouldComponentUpdate checks. You could even create organized way to do that so that the code is clean. But it wouldn’t help for added/removed documents, and for my own projects it hasn’t been necessary.

SkinnyGeek1010 · October 4, 2015, 6:10pm

Ah thanks for clarifying that!

mbrookes · October 25, 2015, 10:46pm

I’ve run into an issue where getMeteorData has a devastating impact on Colliction.insert performance (at least on Android).

I’m inserting accelerometer data (a tiny 4-attribute object) 10 times a second into a Meteor collection.

With no getMeteorData() it has sustained roughly 9.6 inserts per second for the last 2 hours 20 minutes - 80,800 documents and counting.

With getMeteorData() active, and an empty collection, insert performance rapidly tanks from 10 inserts per second, to 1/s within about 50 inserts, and continues to drop as the collection grows.

It isn’t rendering anything (for the purpose of comparison), so the difference is purely down to getMeteorData() / Collection.find() performance.

let watchID;

App = React.createClass({

  mixins: [ReactMeteorData],
  
  getMeteorData() {
    return {
      motions: Motions.find({}, {sort: {timestamp: -1}, limit: 100}).fetch()
 /  }
  },

  componentDidMount: function () {
    let options = { frequency: 100 };
    watchID = navigator.accelerometer.watchAcceleration(this.accelerometerSuccess, this.accelerometerError, options);
  },

  compontentWillUnmount: function () {
    navigator.accelerometer.clearWatch(watchID);
  },

  accelerometerSuccess: function (motion) {
    Motions.insert(motion)
  },

  accelerometerError: function (error) {
    console.log('Accelerometer error.', error);
  },

  render() {
    return <i />
  }
});

Most of the discussion here is over my head, but does this tally with what others are seeing?

Any suggestions on how to handle this volume of inserts, while also analyzing it and displaying feedback? I’m think perhaps a non-reactive fifo buffer that gets used as the data source, and to only store aggregated results to the collection, but that removes the possibility to perform more detailed post analysis on the server…

generalledger · October 27, 2015, 1:17pm

Out of curiosity, do you know if there would have been a similar performance hit with Blaze?