How to solve sync latency with multi-core or multiple web servers?

Hi, everyone. Apologies in advance for my poor English.
I’m new to both Meteor and MongoDB. Any help would be greatly appreciated.

I am trying to implement a chat application.

----- Problem -----
Chat is time-sensitive data, and normal pub/sub seems to lag when users insert messages from different web servers.

  • When I disable multicore support and the users are on the same web server IP,
    new messages are synchronized (published) to other users immediately.
  • When I enable the meteorhacks:cluster package, even if the users are on the same web server IP, new messages are only synchronized after a while (2-5 seconds on average). Sometimes another user sees the update faster than the user who actually sent the message.

Any idea, please?

  • Specifications and code snippets are as follows:

----- Specifications and code snippets -----

  • Webserver: AWS EC2 m4.large

  • MongoDB Server: AWS EC2 m4.large
    installed following a guide from DigitalOcean

  • Load Balancing and SSL: AWS ELB

  • Meteor version: 1.3.x (updated from 1.1.0.3 as of June 18th, 2016; the problem is still the same)

  • Collection name: 'Chats', defined using SimpleSchema/attachSchema

  • Subscription: Using SubsManager
    Subscription.get('chat').subscribe('chats', boardId, currentMsgCount, chatRoomId);

  • Publication:
    Meteor.publish('chats', function (boardId, currentCount, chatRoomId) {
      check(boardId, String);

      if (!this.userId) {
        return this.stop();
      }

      var self = this;
      var cursor;

      if (!chatRoomId) {
        var chatRoom = ChatRooms.findOne({boardId: boardId});
        chatRoomId = chatRoom ? chatRoom._id : '';
      }

      if (currentCount == null) {
        cursor = Chats.find(
          {$query: {boardId: boardId, chatRoomId: chatRoomId}, $hint: {boardId: 1, chatRoomId: 1, createdAt: -1}/*, $orderBy: {createdAt: -1}*/},
          {limit: 20});
      } else {
        var count = 20;
        if (currentCount <= 0) {
          currentCount = count;
        }

        cursor = Chats.find(
          {$query: {boardId: boardId, chatRoomId: chatRoomId}, $hint: {boardId: 1, chatRoomId: 1, createdAt: -1}/*, $orderBy: {createdAt: -1}*/},
          {skip: currentCount, limit: count});
      }

      var cursorHandle = cursor.observe({
        added: function (doc) {
          self.added('_chats', doc._id, doc);
        },
        removed: function (doc) {
          self.removed('_chats', doc._id);
        },
        changed: function (doc) {
          self.changed('_chats', doc._id, doc);
        }
      });

      self.onStop(function () {
        cursorHandle.stop();
      });

      self.ready();
    });

  • Chat message insert:
    Chats.insert({
      localId: localId,
      boardId: chatRoom.boardId,
      contentType: chatContentType,
      message: message,
      userId: userId,
      chatRoomId: chatRoomId
    });

Are you using the MongoDB oplog? Otherwise Meteor will poll the database, which is slower. You'll also see delays if SockJS is falling back to long polling instead of websockets. The two could conceivably combine to give you 2-5 second delays, especially on a slower network.
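I'm not sure how your deployment is set up, but oplog tailing is normally enabled by pointing the MONGO_OPLOG_URL environment variable at the replica set's local database when you start the app. If you want to double-check that it's actually active, something like this in meteor shell should tell you; it pokes at undocumented Meteor server internals, so treat it as a rough diagnostic rather than a supported API:

    // Rough diagnostic, relies on undocumented Meteor server internals:
    // if an oplog handle exists, the server is tailing the oplog instead of
    // falling back to poll-and-diff. Run this in `meteor shell` on the server.
    var driver = MongoInternals.defaultRemoteCollectionDriver();
    console.log('oplog tailing enabled:', !!driver.mongo._oplogHandle);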

In terms of latency compensation for the UI, you need to set up stub methods that run on the client and update the minimongo database, so the user sending the message sees it instantly. If the server method then fails, the local change is reverted; otherwise it sticks, and the new message is pushed to all the other clients via their subscriptions. Have a look at https://meteorhacks.com/introduction-to-latency-compensation/ for more details on that.
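Roughly something like this (just a sketch; the method name and fields are placeholders, so adapt it to your schema):

    // Shared (client + server) code: the client runs this as a stub and
    // inserts into minimongo immediately; the server runs it authoritatively.
    // If the server call fails, Meteor rolls the simulated insert back.
    Meteor.methods({
      'chats.insertMessage': function (boardId, chatRoomId, message) {
        check(boardId, String);
        check(chatRoomId, String);
        check(message, String);

        if (!this.userId) {
          throw new Meteor.Error('not-authorized');
        }

        return Chats.insert({
          boardId: boardId,
          chatRoomId: chatRoomId,
          message: message,
          userId: this.userId,
          createdAt: new Date()
        });
      }
    });

    // Client: call the method instead of inserting directly.
    Meteor.call('chats.insertMessage', boardId, chatRoomId, message, function (error) {
      if (error) {
        // The optimistic insert has already been reverted at this point;
        // surface the error to the user however you like.
        console.error(error);
      }
    });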

Thank you very much for your insight, mjmasn.
Yes, I do use oplog tailing.
But I didn't know that falling back to polling instead of websockets could cause this symptom. As far as I understand, AWS ELB does not support websockets, even though it supports sticky sessions (for a limited set of protocols only). But that doesn't cause any delay when I use just one EC2 instance without enabling multiple cores. Hmmm…
Anyway, I will consider moving to another cloud service.
Also, about using client-side minimongo: I did consider that, but I have been hesitant to implement it because of side effects such as flickering (if a series of requests' responses come back in a different order) or messages disappearing (in case of an error). You've made me think about it again.
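If I do end up going that route, I'm thinking of something along these lines to keep the display order stable no matter which response arrives first (just a sketch; the template and session key names are placeholders):

    // Client-side (Blaze) sketch: always render chats sorted by a
    // server-assigned createdAt field, so the visible order does not depend
    // on the order in which method responses or publication updates arrive.
    Template.chatRoom.helpers({
      messages: function () {
        return Chats.find(
          {chatRoomId: Session.get('currentChatRoomId')},
          {sort: {createdAt: 1}}
        );
      }
    });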
Again, thank you so much.