Streaming OpenAI responses to client?

altrome · March 21, 2024, 10:19pm

Hi all! I’m developing a chat interface in one of my applications, using OpenAI’s GPT-4 to ask questions about specific content retrieved from the app. I want to replicate the streaming effect of the original ChatGPT. The idea is to call the OpenAI API from a server method, and then stream the chunks to the client as they reach the server. What is the best way to achieve this effect?

I have seen that there are packages such as streamy, but I’m not sure if this is the optimal approach to achieve this…

illustreets · March 21, 2024, 11:29pm

Hello and welcome to the forums!

I assume you do not want to stream directly from OpenAI in the client, because that would involve exposing the API key or confidential data from your application.

Your best bet would be to use their official Node library, openai/openai-node on GitHub (the official Node.js / TypeScript library for the OpenAI API). Then, in your Meteor app, you would use a custom publication to send the chunks to the frontend through the pub/sub model. I am not familiar with streamy, so I can’t comment on that, but the custom pub/sub pattern is well established in Meteor.

For how that would work, see the “Publications and Data Loading” article in the Meteor Guide.

Off the top of my head, you would do something like this (none of this is tested, it’s just an example):

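// server

import { Meteor } from 'meteor/meteor';
import OpenAI from 'openai';

Meteor.publish('OpenAIEventsPub', async function (prompt) {
  // you may wish to run authorization here, using this.userId

  const openai = new OpenAI({
    apiKey: process.env['OPENAI_API_KEY'],
  });

  const stream = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: prompt }],
    stream: true,
  });

  // register this before the loop, so that stopping the subscription
  // kills the connection to the OpenAI API while chunks are still arriving
  this.onStop(() => stream.controller.abort());

  let index = 0;
  for await (const chunk of stream) {
    // every chunk gets its own id; with a single shared id the chunks
    // would not end up as separate documents on the client
    this.added('OpenAIChunks', String(index), {
      index,
      chunk: chunk.choices[0]?.delta?.content || '',
    });
    index += 1;
  }

  this.ready();
});

// client

import { Meteor } from 'meteor/meteor';
import { Mongo } from 'meteor/mongo';
import { Tracker } from 'meteor/tracker';

const Events = new Mongo.Collection('OpenAIChunks');

Meteor.subscribe('OpenAIEventsPub', 'Help me with .... [the prompt]');

Tracker.autorun(() => {
  // stitch all the chunks together in the order they were published,
  // or maybe you need to pick up the latest only, you need to see what works
  const allChunks = Events.find({}, { sort: { index: 1 } }).fetch();
  console.log(allChunks.map((doc) => doc.chunk).join(''));
});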

Good luck!

paulishca · March 22, 2024, 8:50am

I think Streamy or Meteor Streams (https://github.com/RocketChat/meteor-streamer/tree/master/packages/rocketchat-streamer) would be perfect for your case. Perhaps a subscription to a virtual MongoDB Collection could do it too.

I do prefer Streams because of the concept: a relation exists, at the edge of the server, in memory, while the client needs it. It is very … volatile. No Mongo, Minimongo, or anything else is required.
I use it extensively in a communication app, and it can be adapted for elastic computing or multi-server setups.

A case where I use it:
We are buddies in a chat or a contact list. I want to see you go online or offline only when I am on the screen. I ping you “secretly” to see if you are online. I have 100 contacts, I don’t want to be mongo-subscribed to something as trivial as … 100 booleans. If you come in or go out … just let me know… might not even be interested. Staying private is as simple as … ok … I will not emit anything.

altrome · March 22, 2024, 9:19am

Thanks @illustreets for your reply.

Exactly! This is why I need to call the OpenAI API from the server.

I realized that I did not specify the context well in my question (mental note: be less ambiguous). I will try to give more context and the evidence and research I have done to date.

First of all, regarding your proposal: the OpenAI native library for Node requires Node v18, so it cannot be used right now with Meteor 2.x (this is not the case in v3, since it uses Node v20, but I don’t want to upgrade yet), so you need to hard-code the fetch call. This is not a big problem, as it is solved with a simple POST call to the API using the required parameters (stream included). The main problem lies in how to transfer the chunks received from the OpenAI API to the client in an optimal way.
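
For anyone hard-coding the call in the same way, here is a minimal sketch of the parsing side only. It assumes the streamed response arrives as server-sent events, with each event on a `data: {...}` line and a final `data: [DONE]` line. The helper name `parseSSE` is hypothetical, and how you read the response body depends on the fetch implementation you use.

```javascript
// Hypothetical helper (not part of any library): extracts the text deltas
// from raw SSE text and returns any trailing partial line, which should be
// prepended to the next read from the network.
function parseSSE(buffer) {
  const lines = buffer.split('\n');
  const rest = lines.pop(); // the last element may be an incomplete line
  const deltas = [];
  let done = false;

  for (const line of lines) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue; // skip blank separator lines
    const payload = trimmed.slice('data:'.length).trim();
    if (payload === '[DONE]') {
      done = true; // end-of-stream marker
      break;
    }
    const delta = JSON.parse(payload).choices?.[0]?.delta?.content;
    if (delta) deltas.push(delta);
  }

  return { deltas, rest, done };
}
```

On every read you would call `parseSSE(rest + newText)`, forward `deltas` to the client, keep `rest` for the next round, and stop once `done` is true.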

As you propose, we can use the Pub/Sub strategy that Meteor uses for collections, but updating the collection on every chunk received, with the resulting subscription updates on the client, can be very resource-intensive (or at least that is my feeling).

The other option is to use something like streamy or meteor-streams to stream the chunks to the client, and only update the collection once the message is complete. This is where my doubt arose, since most of these packages are no longer maintained, or have not been updated for a long time, and I would not like to use a library that becomes obsolete in a short time.
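
A rough sketch of that second option, independent of which streaming package ends up carrying the chunks: forward each chunk as it arrives and write to the collection a single time at the end. The names `relayAndPersist`, `emit` and `save` are hypothetical placeholders for whatever transport and collection you use.

```javascript
// Hypothetical helper: forwards every chunk to `emit` as soon as it arrives
// and calls `save` exactly once with the full text after the stream ends.
async function relayAndPersist(chunks, emit, save) {
  let fullText = '';
  for await (const text of chunks) {
    if (!text) continue; // some stream events carry no text
    fullText += text;
    emit(text); // e.g. push the chunk to the client over a stream
  }
  await save(fullText); // e.g. one single insert into the messages collection
  return fullText;
}
```

Here `chunks` can be any async iterable of strings, so the same function works whether the chunks come from the official library or from a hand-written fetch call.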

altrome · March 22, 2024, 9:23am

Thanks @paulishca!! Good to hear that! I’m going to take a look at meteor-streamer, since I hadn’t come across it.

illustreets · March 22, 2024, 2:31pm

I also worked with meteor-streams in the past and I was about to suggest the same, originally. I didn’t know that RocketChat has forked Arunoda’s original package. I assumed you’d need to fork the package and bring it up to date.

@paulishca’s advice is more straightforward, although, for the sake of correctness, I should say that my suggestion does not involve any real MongoDB collection. That collection is only on the client side, only for that user, to capture the DDP messaging from the publication. The minimongo collections are in-memory constructs, in the browser.
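
To illustrate the point about nothing being written to MongoDB, here is a small untested sketch of one variation: the publication keeps a single client-only document and grows its text, using only the publication’s own `added`, `changed` and `ready` methods. The helper name `publishAsOneDoc` and the field names `text` and `done` are made up for the example.

```javascript
// Hypothetical helper: streams text into one client-only document.
// `sub` is the publication context (`this` inside Meteor.publish).
async function publishAsOneDoc(sub, collectionName, docId, chunks) {
  let text = '';
  sub.added(collectionName, docId, { text, done: false }); // create the doc once
  sub.ready();

  for await (const piece of chunks) {
    if (!piece) continue; // ignore events without text
    text += piece;
    sub.changed(collectionName, docId, { text }); // later updates reuse the id
  }

  sub.changed(collectionName, docId, { done: true }); // mark completion
}
```

Inside the publication you would call it as `await publishAsOneDoc(this, 'OpenAIChunks', 'answer', chunks)`. Sending the accumulated text on every change costs more bandwidth for long answers, but the client never has to stitch anything together.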

altrome · March 22, 2024, 4:20pm

Good point! I hadn’t considered using the Pub/Sub model in this way… I’ll try it anyway.
Thanks again!!

altrome · March 26, 2024, 10:24pm

Hi all, as a follow-up… I finally opted for the solution that @illustreets proposed. I like not depending on extra libraries, and his solution also fits better with Meteor’s Pub/Sub philosophy; the development has been easier than I imagined. Thank you all very much for your answers.