Poll: One or two databases with two different types of users?

arggh · July 9, 2020, 10:23am

I’ve run into a dire situation where I can’t make up my mind on the best way to proceed. Plz help.

Background

I have three apps.

App 1

Used by regular users like Jane and Joe. Registered users only.

App 2

Used by business users like Microsoft and Apple. Registered users only.

App 3

Public-facing app that lists the businesses from app 2, browsed by users of app 1 and anonymous users.

App 1 has a huge amount of data created by Janes and Joes. App 2 has an equal amount of completely non-related data created by Apple and Microsoft. App 3 calls both app 1 and app 2 to get data from both as needed.

The only overlaps in data (as we know today) are these cases:

Jane and Joe can send messages to Apple and Microsoft, and vice versa
Jane and Joe can mark Apple and Microsoft as favorited companies
Jane and Joe can browse businesses either in app 1 or app 3.

Option 1 - Hosting everything in one database

Pros:

One less database to maintain
Easy to point both apps to the same database

Cons:

Lots of non-related data in one DB
Less resilient, one DB hiccup takes both apps down
Less distributed, more load on that one DB
Have to maintain a distinction in the user documents, is the role business or people?
Complex user document schema
Complexity in logins etc. to make sure users log in at the correct places
Risking that updating one app requires updating both apps

Option 2 - Two separate databases

Pros:

Clear separation of the two domains
Resilient, one DB hiccup or attack doesn’t necessarily take both apps down
Distributed load between the two apps
No need to worry about mixing the (completely different) user document types

Cons:

More work and money required to host two databases
More work needed to get the two apps to communicate securely
Expected weird edge cases, e.g. what happens when there is an _id clash among users of different apps?

Since I’m maybe forgetting some important aspects, I thought I’d ask here for input. There’s even a poll! Any feedback highly appreciated

One database
Two databases


arggh · July 9, 2020, 1:32pm

Writing the options down, it’s starting to look like the separate DB has its merits, while the other option is mostly just about being lazy

illustreets · July 9, 2020, 4:12pm

A bit more work, a bit more expensive (though it depends!), but separate databases are cleaner and easier to reason about.

Regarding the issue of unique IDs, I did notice a problem with Meteor’s Random.id() even in the same database. Just generate one million IDs, and you will find multiple duplicates. It was long ago and I can’t remember what percentage, but we didn’t like it. So we found, and have used ever since, this little gem: https://www.npmjs.com/package/shortid
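As a rough sanity check on how often duplicates should be expected, the birthday bound can be sketched like this. It assumes IDs are drawn uniformly at random. The 55-character alphabet and 17-character length are my assumption for the default Random.id() output, not something verified in this thread, and the function name is made up for illustration.

```javascript
// Birthday-bound estimate: probability of at least one duplicate among
// `count` IDs drawn uniformly from alphabetSize ** idLength possible values.
// Hypothetical helper; real generators may be less uniform than assumed here.
function collisionProbability(count, alphabetSize, idLength) {
  const space = Math.pow(alphabetSize, idLength);
  const pairs = (count * (count - 1)) / 2;
  // Equivalent to 1 - exp(-pairs / space), but expm1 stays accurate
  // when the result is extremely small.
  return -Math.expm1(-pairs / space);
}

// Assumed parameters: one million IDs, 55-character alphabet, 17 characters each.
const estimate = collisionProbability(1e6, 55, 17);
console.log(estimate);
```

If the measured duplicate rate is far above such an estimate, the generator is probably not behaving uniformly (for example because of how it is seeded), which is a different problem from the ID space being too small.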

EDIT: regarding secure communication between apps, that would indeed require some additional effort. If you host them yourself, a good solution is to put them both in the same AWS VPC, and make them connect to each other via private interface. Otherwise, I’m sure you know, you don’t want to do DDP.connect("ws://x.x.x.x:3000") but force connection over wss. Shame MongoDB doesn’t have something akin to dblink or foreign data wrappers, in the way PostgreSQL has.

arggh · July 9, 2020, 5:11pm

Actually, I don’t! Could you elaborate? The Meteor documentation is kind of sparse in this area

illustreets · July 9, 2020, 7:04pm

So, if you have both apps communicating in a private network, in which only they (and other trusted machines) exist, you’re good with communication over plain HTTP or WS. If you plan to connect over the public network, then you must use a secure connection, because sniffing and reading this kind of traffic is dead easy. Sorry if I repeat something already obvious, this is just in case someone comes googling at a later stage…

It has been a long time, we used to connect different servers using only Meteor’s DDP package, but…

What I had in mind was something like this https://github.com/oortcloud/node-ddp-client when I gave the example, where one specifies the websocket URL directly, so you explicitly set it to use the secure WebSocket connection. But even if you use Meteor.connect() like in this question, using a URL with https://... should force the underlying driver to switch to wss (here’s a nice explanation on how this works).

Obviously, communicating over https / wss supposes you have proper SSL termination at the point of entry in both apps. With Galaxy that comes as standard. On your own, you must use something like NGINX.
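To make the "force wss" point concrete, here is a minimal sketch of normalising an endpoint URL before handing it to whichever DDP client you use. The helper name and the host name are hypothetical, and it only rewrites the scheme; it does not check certificates or replace proper SSL termination.

```javascript
// Upgrade an insecure HTTP/WebSocket endpoint URL to its TLS equivalent.
// Hypothetical helper; the host name used below is a placeholder.
function toSecureUrl(rawUrl) {
  const url = new URL(rawUrl);
  const upgrades = { 'http:': 'https:', 'ws:': 'wss:' };

  if (url.protocol === 'https:' || url.protocol === 'wss:') {
    return url.href; // already secure, leave untouched
  }

  const secureProtocol = upgrades[url.protocol];
  if (!secureProtocol) {
    throw new Error(`Unsupported protocol: ${url.protocol}`);
  }

  url.protocol = secureProtocol;
  return url.href;
}

console.log(toSecureUrl('ws://app2.example.com:3000/websocket'));
```

Inside a private network you could skip this step, but running it unconditionally means a misconfigured URL fails safe rather than leaking traffic.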

jkuester · July 9, 2020, 7:30pm

In a microservice architecture, DB sharing is highly discouraged due to the tight coupling of models between applications. Changes will cost you more in the long run when using a shared DB.

Also note that you can have one mongo service running multiple databases, so there is no need for a full second mongo server. When using mongo from a provider it may of course be different.

Syncing users between two servers is easy using ongoworks:ddp-login and a sync user account that is defined in the settings.json, but you could also implement your own OAuth workflow using accounts-oauth.

If you have more complex data sharing between more than two apps you might add another service that handles the sync (like a gateway service).

A good read here is Martin Fowler’s book on microservices

arggh · July 9, 2020, 7:35pm

Ah, I misread / misunderstood your first comment. Yep, our app servers share a private network and we have full control of the firewalls, so establishing a secure connection should be trivial.

arggh · July 9, 2020, 8:07pm

I believe you are very correct, these bullet points in the initial post touched this topic:

Have to maintain a distinction in the user documents, is the role business or people ?
Complex user document schema
Risking that updating one app requires updating both apps

This is a good point, with an easy transition path to multiple servers, should the need arise in the future.

This comment about syncing users I didn’t quite grasp, could you elaborate? Any tips or help regarding authentication in this scenario is very much appreciated

Our use case involves mostly these situations where the two datasets mix:

Accessing some of the documents from the other DB (“regular users” viewing business profiles, managed by the “business app”)
Users of both apps, businesses and regular Joes, can have shared conversations. These documents therefore need to reference users from both databases.

My idea was to not sync anything (maybe I misinterpreted the term used?) between the databases, but to provide an API for the app servers to talk to each other when needed, exposing the documents as required.
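A minimal sketch of what "exposing the documents as required" could look like: the owning app strips a document down to an allow-list of fields before returning it to the other app. The field names and the helper are hypothetical, not part of either app.

```javascript
// Fields of a business profile that the other app is allowed to see.
// These field names are made up for illustration.
const PUBLIC_PROFILE_FIELDS = ['_id', 'name', 'description'];

// Copy only the allow-listed fields, so internal data never leaves the app.
function toPublicProfile(doc) {
  const result = {};
  for (const field of PUBLIC_PROFILE_FIELDS) {
    if (Object.prototype.hasOwnProperty.call(doc, field)) {
      result[field] = doc[field];
    }
  }
  return result;
}

const business = {
  _id: 'b1',
  name: 'Example Corp',
  description: 'Sells examples',
  billingEmail: 'billing@example.com' // internal, must not be exposed
};
console.log(toPublicProfile(business));
```

Keeping the allow-list in the app that owns the data means the other app never depends on the full schema, which is the coupling the separate databases were meant to avoid.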

Meteor provides an easy way to connect and call methods & subscribe via DDP.connect, but since the user database is not shared, the resume token authentication method doesn’t work here (or was the syncing supposed to address this issue?).

I suppose I have to come up with a custom auth solution by sharing some secrets etc…
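One possible shape for such a shared-secret scheme, purely as a sketch: the calling server signs each request with an HMAC over a timestamp and the payload, and the receiving server recomputes and compares it. The function names, the 60-second window and the message format are all assumptions of mine, not an existing Meteor feature.

```javascript
const crypto = require('crypto');

// Sign `timestamp.payload` with a secret known to both servers.
function signRequest(secret, payload, timestamp) {
  return crypto
    .createHmac('sha256', secret)
    .update(`${timestamp}.${payload}`)
    .digest('hex');
}

// Verify the signature and reject requests outside the allowed time window.
function verifyRequest(secret, payload, timestamp, signature, now, maxAgeMs = 60000) {
  if (Math.abs(now - timestamp) > maxAgeMs) {
    return false; // too old (or too far in the future): possible replay
  }
  const expected = Buffer.from(signRequest(secret, payload, timestamp), 'hex');
  const given = Buffer.from(String(signature), 'hex');
  if (given.length !== expected.length) {
    return false; // timingSafeEqual requires equal lengths
  }
  return crypto.timingSafeEqual(expected, given);
}

const sharedSecret = 'replace-with-a-long-random-secret'; // placeholder value
const body = JSON.stringify({ method: 'getBusinessProfile', id: 'b1' });
const sentAt = Date.now();
const signature = signRequest(sharedSecret, body, sentAt);
console.log(verifyRequest(sharedSecret, body, sentAt, signature, Date.now()));
```

The secret would live in each server's settings and never be sent to clients. This only authenticates server-to-server calls; it says nothing about which end user triggered them.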

Thank you for the tip!

arggh · July 10, 2020, 6:44am

Connecting to both databases from both apps could also work, but this makes them more tightly coupled and could cause:

Risking that updating one app requires updating both apps

arggh · July 11, 2020, 10:51am

I was already pretty happy with my solution where I just cross-access the databases, so I can share documents:

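import { Mongo, MongoInternals } from 'meteor/mongo';

// Driver pointing at the other app's database
const driver = new MongoInternals.RemoteCollectionDriver(
  'mongodb://localhost:27017/app'
);

// Collection backed by the remote driver instead of this app's default one
export const Conversations = new Mongo.Collection('conversations', { _driver: driver });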

…until I tried to access the other app’s users in the same fashion:

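import { Mongo, MongoInternals } from 'meteor/mongo';

// Driver pointing at the other app's database
const driver = new MongoInternals.RemoteCollectionDriver(
  'mongodb://localhost:27017/app'
);

// Fails: the name 'users' is already taken by this app's own Meteor.users
export const Clients = new Mongo.Collection('users', { _driver: driver });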

Error: There is already a collection named "users"

Looks like I have to create some custom interface for querying the other app’s users after all

@jkuester it sounded like you were very familiar with this set of problems, any input highly appreciated!

arggh · July 11, 2020, 4:37pm

I re-read your message and possibly understood what you meant by this:

Have a “sync user”, meaning a user that has all the required permissions for the cross-instance tasks (eg. role === ‘system’)
Login from app1 to app2 using the “sync user” credentials
Do what you gotta do

Right?

I would love to hear a more detailed description about the flow on all of these (in the context of 2 DBs, two sets of users, shared documents referencing users from both DBs)

jkuester · July 14, 2020, 3:32pm

Correct, this user is only there to retrieve or get documents that have to be shared across both services.

So the idea is like this:

both apps have their own accounts (no shared third accounts server)
both apps have a default user in their settings.json that is created at startup, if it does not yet exist:

{
  "sync": {
    "user": {
      "username": "syncuser123",
      "password": "defaultPassword"
    }
  }
}

From my viewpoint this is fine here, because as long as these credentials do not leak from the server env to the client, there should be no issue with the password secrecy.

Note that the username/password combination needs to be the same for both apps.
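The "created at startup, if it does not yet exist" step can be sketched as an idempotent function. The `store` object here is a hypothetical stand-in for the accounts system (in Meteor this would presumably map onto Accounts.findUserByUsername and Accounts.createUser), and the in-memory version exists only so the sketch runs on its own.

```javascript
// Create the sync user only if it is missing. Returns true if it was created.
// `store` is a hypothetical interface, not a Meteor API.
function ensureSyncUser(store, settings) {
  const { username, password } = settings.sync.user;
  if (store.findByUsername(username)) {
    return false; // already there, nothing to do
  }
  store.create({ username, password });
  return true;
}

// In-memory store used only to exercise the function.
// A real accounts system would hash the password instead of keeping it as-is.
function createMemoryStore() {
  const users = new Map();
  return {
    findByUsername: (username) => users.get(username),
    create: (user) => { users.set(user.username, user); },
    count: () => users.size
  };
}

const settings = { sync: { user: { username: 'syncuser123', password: 'defaultPassword' } } };
const store = createMemoryStore();
console.log(ensureSyncUser(store, settings)); // true on first startup
console.log(ensureSyncUser(store, settings)); // false on later startups
```

Because the function is safe to call repeatedly, both apps can run it on every boot without checking anything beforehand.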

Open a DDP connection between both applications
Create hooks for any accounts-related operation
On every hook, let the apps check whether there is a diff between the given user docs, and update the app that has the outdated info (use timestamps to determine which side has the latest information)
Try to provide a common API for these operations via a package in order to not create a tight coupling
Your users are now shared across both apps

This is of course one approach and if you go beyond two apps you may use a shared auth-server and add this as login service to your apps.
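The timestamp comparison from the steps above could look roughly like this. The `updatedAt` field name and the action labels are assumptions for the sake of the sketch; the hooks and the DDP calls that would actually apply the result are left out.

```javascript
// Decide which side holds the outdated user doc, based on `updatedAt`.
// Returns the action to take and the doc that should win.
function resolveSync(localDoc, remoteDoc) {
  if (!localDoc && !remoteDoc) return { action: 'none' };
  if (!localDoc) return { action: 'updateLocal', doc: remoteDoc };
  if (!remoteDoc) return { action: 'updateRemote', doc: localDoc };

  const localTime = new Date(localDoc.updatedAt).getTime();
  const remoteTime = new Date(remoteDoc.updatedAt).getTime();

  if (localTime === remoteTime) return { action: 'none' };
  return localTime > remoteTime
    ? { action: 'updateRemote', doc: localDoc }
    : { action: 'updateLocal', doc: remoteDoc };
}

const localUser = { username: 'jane', updatedAt: '2020-07-14T10:00:00Z' };
const remoteUser = { username: 'jane', updatedAt: '2020-07-14T12:00:00Z' };
console.log(resolveSync(localUser, remoteUser).action); // updateLocal
```

This relies on both servers having reasonably synchronised clocks; with noticeable clock skew, a version counter might be a safer tie-breaker than wall-clock time.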

arggh · July 24, 2020, 8:31pm

Since users of both apps can participate in conversations, I have to reference user ids from both databases in one document, which is hosted on either of the databases.

The tricky part is, I have no guarantee against duplicate ids across the two databases (unless I manually cross-check every time a new user is created).

So in essence, I’m forced to create separate fields based on the user’s “origin”:

Conversation {
  app1_participants: [],
  app2_participants: []
}

…which I don’t like, at all. I’m still undecided whether I should go with 1 or 2 databases, and even the poll is pretty even with all the 7 votes given
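One alternative to two separate fields, sketched here under my own assumptions, is a single participants array where each entry carries its origin. The `origin` and `userId` field names are made up; the point is only that the pair stays unique even when both databases happen to contain the same id.

```javascript
// Build a key that stays unique even if both databases contain the same _id.
function participantKey(ref) {
  return `${ref.origin}:${ref.userId}`;
}

// Return a new conversation with the participant added, ignoring duplicates.
function addParticipant(conversation, ref) {
  const exists = conversation.participants.some(
    (existing) => participantKey(existing) === participantKey(ref)
  );
  if (exists) return conversation;
  return {
    ...conversation,
    participants: [
      ...conversation.participants,
      { origin: ref.origin, userId: ref.userId }
    ]
  };
}

let conversation = { participants: [] };
conversation = addParticipant(conversation, { origin: 'app1', userId: 'abc123' });
conversation = addParticipant(conversation, { origin: 'app2', userId: 'abc123' }); // same id, other app
conversation = addParticipant(conversation, { origin: 'app1', userId: 'abc123' }); // duplicate, ignored
console.log(conversation.participants.length); // 2
```

Queries then filter on both fields of an entry rather than on a bare id, which is a little more verbose but keeps the document shape the same no matter how many apps are added later.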

jkuester · July 25, 2020, 10:10am

You could also use an extra docId field that is your main identifier, then the _id field would not be required.
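A minimal sketch of such a docId, assuming you are free to choose its format: prefix a random value with the name of the app that created the user, so ids from the two databases cannot clash. The prefix convention and the helper name are hypothetical.

```javascript
const { randomBytes } = require('crypto');

// Generate an identifier that carries its origin, so ids created by
// different apps can never be equal. The format is a made-up convention.
function newDocId(origin) {
  return `${origin}_${randomBytes(16).toString('hex')}`;
}

const user = { docId: newDocId('app1'), username: 'jane' };
console.log(user.docId);
```

Conversations would then reference `docId` values only, and the per-database `_id` stays an internal detail of each app.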

rjdavid · July 25, 2020, 10:38am

Voting for 1 shared db

Is this really a big issue especially in the world of denormalizing data with nosql dbs?

Redundancy can make this less of an issue

Again, redundancy

This seems easier than what you have to go through with separate dbs

I find Meteor’s structure easy on the update side

arggh · February 14, 2021, 5:18pm

Turns out creating a DDP connection from server to server is easy, but authenticating from the server side isn’t.

In case someone stumbles across this post, I ended up using ongoworks:ddp-login and it seems to do the trick.