Design pattern to prevent data disclosure in case of a mistake

klaucode · August 14, 2023, 8:26am

Hi, I have a question about protecting client data in case of a developer mistake. It is probably a design-pattern question.

It is best explained with an example:

A Meteor app has 4 registered users from 2 companies (company1, company2). Company1 and company2 each store their own posts. The goal is that all users from company1 can create and modify company1's posts, and the same applies to company2.

But the most important thing is that company1 and company2 must not see each other’s data! Even if a developer makes a mistake, I would like the data to remain available only to the company it belongs to.

Is it possible to solve this without a separate app and a separate Mongo database per company? Can it be solved 100% within a single app and a single Mongo database, so that I can sleep peacefully without worry?

Thanks a lot for any help :-)

minhna · August 14, 2023, 8:40am

I guess you need a field companyId in your data and you need to filter data by this field on the server side.

rjdavid · August 14, 2023, 8:44am

Can it be done?

Yes

Can I sleep peacefully without worry?

This requires an exemplary process of testing from code review, unit tests, end-to-end tests, manual tests, and even user acceptance testing.

Nevertheless, bugs are part of development. The question is how fast you can identify a bug, at what stage, and how fast your team can react with a fix, especially for severe and critical bugs.

klaucode · August 14, 2023, 8:51am

Thanks a lot @minhna for your response. Yes, this is a good idea. But if company1 and company2 are competitors storing very secret information, I’m afraid that a companyId field will not be enough to guarantee that company1 cannot see company2's data if I make a mistake as a developer.

The 100% solution is probably to split the companies into separate apps and databases, but I think that would be harder to maintain later. I would also like to connect the databases to a BI tool to produce statistics, and separate databases mean creating a new BI connection and updating the reports for every new company (3, 4, 5).

The simplest solution for me is to store all clients' data in a single database, but then I must somehow guarantee 100% that no information is disclosed.

Do you have another idea? If it were possible to split the data by companyId on the MongoDB side, and I could rely on it, that could be one solution, but I’m not aware of Mongo allowing such a thing.

Thanks a lot.

minhna · August 14, 2023, 8:54am

Well, if you let your clients manage your database, then you’re right. If that’s not the case, then it’s up to your code to control permissions.
If you store the data in separate databases but your code is bad, that won’t help.

klaucode · August 14, 2023, 9:00am

Hi @rjdavid, yes, testing, testing, testing… is very important, I totally agree. If company1 and company2 have separate apps with separate DBs, then there is no risk of leaking data between them. But as I wrote to @minhna, that creates new problems which I would have to solve.

Storing the data of separate clients in a single DB (with a companyId field, as @minhna wrote) is of course no problem to do, but I don't think it is a very safe solution. If I make a mistake in a publication or method and omit the companyId field, all companies' data will be published. I would like to be protected from this kind of issue and be sure that, even if an error occurs, the companies cannot see each other's data.

Isn’t there an option at the mongo level?

klaucode · August 14, 2023, 9:06am

…I mean a separate database and app for each company (not separate databases that are all available in a single app). …Or maybe we don’t need a total 100% solution and can instead think about how to reduce the risk as much as possible.

@rjdavid => maximum focus to testing
@rjdavid => multi-tenant framework
@minhna => companyId (…how to integrate companyId so as to minimize the risk in code? …is it possible to somehow wrap Meteor's Mongo API so that companyId is added to selectors automatically?)

rjdavid · August 14, 2023, 9:12am

Use a multi-tenant framework. Built-in functions can help you minimize mistakes

klaucode · August 14, 2023, 9:40am

…hmmm, yes! This is great info and the correct solution. I found an interesting article about it: Multi-Tenant Application. Software Architecture | Update on… | by Sudheer Sandu | Medium

But I still want to keep my love for Meteor. A safe way of using companyId could also help me.

@rjdavid Do you think there is a way to integrate this with companyId that is both simple and safe? I would like to avoid complex architectures when development is so easy and fast with Meteor.

Thanks a lot for your answer, help and support.

rjdavid · August 14, 2023, 9:42am

I remember discussions of multi-tenancy solutions for Meteor. You can start searching in this forum.

pmogollon · August 14, 2023, 11:16pm

I think you may find a package called partitioner useful. I have never used it, but I use something similar in my apps.

klaucode · August 15, 2023, 4:39pm

Hi @pmogollon, thank you very much for the great info. Meteor partitioner could be the most effective solution, because the app I would like to extend to other companies already exists and runs on Meteor. I had already found it after the multi-tenancy architecture recommendation; @rjdavid opened my mind, and I’m currently studying more about architecture patterns.

I found an interesting article here: Multi-Tenant Application. Software Architecture | Update on… | by Sudheer Sandu | Medium

If you have more tips or experience with multi-tenancy in Meteor, I would appreciate any info.

pmogollon · August 15, 2023, 5:50pm

Well, I have a really simple approach. Keep in mind that I use grapher.

There are organizations, orgs have users, and orgs have some assets (other collections). I have a security class with a method called isAllowedInOrg({ userId, orgId, allowedRoles }), and on every firewall I use this method to check whether the user has permissions. In the query itself I just have one required field for the multitenancy, orgId. So, for example, querying all posts for an org is just Posts.find({ orgId }).

OrgUsers can be a collection with docs like { userId, orgId, role }, and in your security class you do the checks. Or you could add this data to your user docs; it depends on how else you will use this data.
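
As a rough illustration of the check described above, here is a minimal sketch in plain JavaScript. It is not pmogollon's actual security class: the in-memory `orgUsers` array stands in for an OrgUsers collection, and `assertAllowedInOrg` and the "not-authorized" error text are hypothetical names chosen for the example.

```javascript
// Hypothetical in-memory stand-in for an OrgUsers collection: { userId, orgId, role }.
const orgUsers = [
  { userId: "u1", orgId: "org1", role: "admin" },
  { userId: "u2", orgId: "org1", role: "member" },
  { userId: "u3", orgId: "org2", role: "admin" },
];

// Returns true only when the user is a member of the org with one of the allowed roles.
function isAllowedInOrg({ userId, orgId, allowedRoles }) {
  // Fail closed: any missing argument means "not allowed".
  if (!userId || !orgId || !Array.isArray(allowedRoles)) return false;
  return orgUsers.some(
    (membership) =>
      membership.userId === userId &&
      membership.orgId === orgId &&
      allowedRoles.includes(membership.role)
  );
}

// Firewall-style guard (hypothetical helper): throw before any query runs.
function assertAllowedInOrg(params) {
  if (!isAllowedInOrg(params)) {
    throw new Error("not-authorized");
  }
}
```

The point of the guard is that it fails closed: a missing orgId or an unknown user is treated the same as a user from another org, so a forgotten argument cannot silently widen access.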

elie · August 16, 2023, 7:55am

You’re likely overthinking this. 99.9% of companies will be storing in the same database. Unless you’re really in the 0.1% then don’t go too crazy and just write the correct query.

lucfranken · August 16, 2023, 9:09am

You can also try to make sure there are no errors in the publications, for example by creating custom linters that notify developers when a required tenant check is missing.

That way you automate the review process during coding and integration, and you accept that as your line of defense.

Also, in your test suite you could quite easily subscribe to all publications and check that they don’t contain any data from other tenants.
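
The test-suite idea can be sketched without Meteor as follows. Everything here is an assumption made for illustration: publications are modelled as plain functions that take a userId and return documents, the fixtures are invented, and `findTenantLeaks` is a hypothetical helper. In a real app the test would subscribe to (or invoke) the actual publish handlers instead.

```javascript
// Hypothetical fixtures: two tenants with one user and one post each.
const users = [
  { _id: "alice", companyId: "company1" },
  { _id: "bob", companyId: "company2" },
];
const posts = [
  { _id: "p1", companyId: "company1", title: "A" },
  { _id: "p2", companyId: "company2", title: "B" },
];

// Stand-ins for publish handlers: each takes the logged-in userId and returns documents.
const publications = {
  companyPosts(userId) {
    const user = users.find((u) => u._id === userId);
    if (!user) return [];
    return posts.filter((p) => p.companyId === user.companyId);
  },
  // Deliberately buggy: forgets the tenant filter, so the check below has something to catch.
  allPostsBuggy(userId) {
    return posts;
  },
};

// Runs every publication as every user and collects documents that belong to another tenant.
function findTenantLeaks(pubs, allUsers) {
  const leaks = [];
  for (const [name, handler] of Object.entries(pubs)) {
    for (const user of allUsers) {
      for (const doc of handler(user._id)) {
        if (doc.companyId !== user.companyId) {
          leaks.push({ publication: name, userId: user._id, docId: doc._id });
        }
      }
    }
  }
  return leaks;
}
```

A test would then assert that `findTenantLeaks(publications, users)` returns an empty array; with the buggy handler above it reports one leak per user. Because the loop covers every registered publication, a newly added publication is checked without anyone having to remember to write a test for it.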

klaucode · August 16, 2023, 9:10am

Hahaha, @elie explained it in the clearest way, I appreciate it. Yes, this is what I want to do, without wasting time maintaining multiple apps and databases. This is the reason I opened this discussion: I wanted to know how others deal with it, to prevent a headache later.

I only want to avoid having to put {companyId: “blabla”} into every query everywhere, because if I forget it, I will disclose users' data to each other. Therefore I'm looking for a way to prevent that.

I checked the partitioner package. It seems very useful for this case, but it’s a little bit old, so I'm still hesitating over which way to choose.

Thanks a lot @elie. If you have any tips, I would be very happy if you shared your experiences and decisions in this multi-tenancy area.

klaucode · August 16, 2023, 9:23am

Yes, this is a great point! It can also serve as an important measure in a security audit. I don’t want to open a new topic inside this discussion, but I'm also interested in how others handle Mongo querying in Meteor: whether the recommendation is to put queries directly into methods and publications, or whether some prefer to create models to avoid duplicating the same queries and field selections across methods and publications.

As the app grows, adding new fields to all the duplicated queries and finding every place where a collection is queried becomes more and more complicated and error-prone. This is why I'm interested in how others organize their database queries.

@lucfranken thanks a lot for the great point.

elie · August 16, 2023, 9:39am

Right. I think it’s a good question, and it's always good to learn new ideas. But I was just worried, based on the responses and your original question, that you would severely over-engineer the app.

You can write a function that requires you to pass the company id. And always call that function. eg:

function findCompany(companyId: string) {
  // Fail loudly rather than run a query without a company id.
  if (!companyId) throw new Error('companyId is required')
  return Company.findOne({ _id: companyId })
}

And never do Company.find anywhere else in the app. The throw is a bit overkill with TypeScript, but that's up to you.
Also write a function to check that the user is an admin of a specific company. You could potentially have this admin check called on every request by wrapping the request in it.
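
To extend the same idea from one function to a whole collection, here is a minimal sketch of a wrapper that adds companyId to every selector. It is an illustration under stated assumptions, not a Meteor API: `TenantCollection` and `forCompany` are hypothetical names, the data lives in an in-memory array, and the `matches` helper supports only flat equality selectors, unlike real Mongo.

```javascript
// Toy matcher (assumption): flat equality only, far simpler than real Mongo selectors.
function matches(doc, selector) {
  return Object.entries(selector).every(([key, value]) => doc[key] === value);
}

// Hypothetical wrapper: every read or write has to go through a companyId.
class TenantCollection {
  constructor(docs = []) {
    this.docs = docs;
  }

  // Returns an object whose methods are bound to a single company.
  forCompany(companyId) {
    if (typeof companyId !== "string" || companyId === "") {
      throw new Error("companyId is required");
    }
    const docs = this.docs;
    return {
      // companyId is merged last, so a caller-supplied companyId cannot override it.
      find: (selector = {}) =>
        docs.filter((doc) => matches(doc, { ...selector, companyId })),
      // Inserts are stamped with the bound companyId, whatever the caller passed.
      insert: (doc) => {
        const stored = { ...doc, companyId };
        docs.push(stored);
        return stored;
      },
    };
  }
}
```

Usage would look like `posts.forCompany(user.companyId).find({ title: "A" })`. The design choice is that there is no unscoped `find` to call by mistake, so forgetting the tenant filter becomes a thrown error instead of a data leak. In a real Meteor app the wrapper would presumably delegate to the underlying collection with the same merged selector.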

mikeTT · August 16, 2023, 5:02pm

This part:

...if developer will make a mistake, still will be the data available only for those company, to which they belong.

depends on what kind of developer you’re thinking about. If you separate frontend developers from backend developers, then guarding against “if a frontend developer makes a mistake” is easy, because you can guard against it on the backend.

My team and I have built a few healthcare apps that allow medical providers from separate hospitals to log in and see only their hospital’s data. It is extremely important that patient data from Hospital A cannot be seen by someone at Hospital B. So we do as much as possible on the server instead of passing an ID from the client.

Take a simple Meteor publication as an example:

Meteor.publish('companyPosts', function () {
    // Don't pass in the user's ID or the company's ID from the client; get them from Mongo.
    // This assumes each user document has a companyId property.
    var user = Meteor.users.findOne({ _id: this.userId });
    if (!user || !user.companyId) {
        // No logged-in user, or no company on the user: publish nothing.
        return this.ready();
    }
    // Return the cursor so it is actually published, restricted to the user's own company.
    return CompanyPosts.find({
        companyId: user.companyId
    }, {
        fields: { '_id': 1, 'companyId': 1, ...otherPropertiesYouNeed }
    });
});

This of course relies on a good backend developer, and also relies on lots and lots of testing. We personally don’t just rely on our own testing but get security penetration testing as well.

And of course the same practice can be employed with Methods as needed. As a rule, if security is a concern, always do as much as possible at the server level, not the client.
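
A minimal sketch of the same practice applied to a method, written in plain JavaScript so it can run on its own. The fixtures, the function name `updatePostTitle`, and the error strings are assumptions for illustration; in a Meteor method the `userId` argument would come from `this.userId` rather than from the client.

```javascript
// Hypothetical fixtures standing in for the users and posts collections.
const users = [
  { _id: "alice", companyId: "company1" },
  { _id: "bob", companyId: "company2" },
];
const posts = [
  { _id: "p1", companyId: "company1", title: "A" },
  { _id: "p2", companyId: "company2", title: "B" },
];

// Stand-in for a method body; the client supplies only postId and title.
function updatePostTitle(userId, postId, title) {
  // The company is looked up on the server, never taken from client input.
  const user = users.find((u) => u._id === userId);
  if (!user || !user.companyId) {
    throw new Error("not-authorized");
  }
  // Match on both _id and the server-side companyId, so a foreign postId matches nothing.
  const post = posts.find(
    (p) => p._id === postId && p.companyId === user.companyId
  );
  if (!post) {
    throw new Error("not-found");
  }
  post.title = title;
  return post;
}
```

Reporting "not-found" for another company's post, rather than "not-authorized", also avoids confirming to the caller that the post exists.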

paulishca · August 16, 2023, 7:04pm

“only want to somehow prevent everywhere put to the queries always {companyId: “blabla”}, because when I will forgot it, I will disclose user data to each other.”
This feels insufficient because … it is insufficient.

1. On the user you store the available company ids with the role (if you need different roles such as accounts/finance, sales, manager, approver etc).

2. On the company you store roles and if you don’t have many users, you can keep userIds within the role, otherwise you keep a separate DB for users with roles in companies.

When you do a query to display data, you use 1 (from above): show me MY things based on what I store in my user. This should prevent the possibility of you being somewhere where you can run a query for things that are not yours.

When you’re on a company page that you can view and want to move around, list more, change things etc, you use 2 in your queries: get me this data (or save this data) for this company id in which I should be part of a role.

A practice to not forget to use things in queries is to enforce them through schemas. I use SimpleSchema extensively. You can have all method schemas in a file which you can audit frequently to make sure schemas enforce the use of companyId. (manual job)

You can also write an ESLint rule to fail tests when companyId is missing in a method or method call. (automated job)
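
As one possible shape for such a rule, here is a minimal sketch using ESLint's custom rule format (an object with `meta` and a `create(context)` function that returns AST visitors). Note that it is a variation on the suggestion above: it checks literal query selectors rather than method schemas. The rule name, the list of method names, and the message are assumptions, and the rule can produce false positives on unrelated objects that happen to have a `find` or `update` method.

```javascript
// Method names the rule inspects (an assumption; adjust to the codebase).
const QUERY_METHODS = new Set(["find", "findOne", "update", "remove"]);

// True when an ObjectExpression node has a plain (non-computed) companyId key.
function hasCompanyIdKey(objectNode) {
  return objectNode.properties.some(
    (prop) =>
      prop.type === "Property" &&
      !prop.computed &&
      ((prop.key.type === "Identifier" && prop.key.name === "companyId") ||
        (prop.key.type === "Literal" && prop.key.value === "companyId"))
  );
}

// Hypothetical rule: report query calls whose literal selector lacks companyId.
const requireCompanyIdRule = {
  meta: {
    type: "problem",
    docs: { description: "require a companyId key in literal query selectors" },
    schema: [],
  },
  create(context) {
    return {
      CallExpression(node) {
        const callee = node.callee;
        if (callee.type !== "MemberExpression") return;
        if (callee.property.type !== "Identifier") return;
        if (!QUERY_METHODS.has(callee.property.name)) return;

        const selector = node.arguments[0];
        // Variables and function arguments cannot be checked statically; skip them.
        if (selector && selector.type !== "ObjectExpression") return;

        // A missing selector (e.g. Posts.find()) is reported as well.
        if (!selector || !hasCompanyIdKey(selector)) {
          context.report({
            node,
            message: "Query selector must include companyId.",
          });
        }
      },
    };
  },
};
```

In a real setup the rule object would be exported from a local ESLint plugin and enabled as an error, so a query without companyId fails linting in CI. Since selectors passed as variables are skipped, the rule complements runtime checks and tests rather than replacing them.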

There are many simple but professional ways that provide performance and security/privacy benefits. It starts with DB design.