Slow mongo query on createUser

This is the email index on the Users collection. It already has unique: true, which means that in theory you should not be able to insert a record with the same email. Would that result in an error that can be handled and reported back to the user?
What I mean is, maybe you don’t even search for the email: just attempt the insert, and if it fails on uniqueness then … it exists :slight_smile:
But if your index is corrupted … big drama …
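
For illustration, here is a minimal sketch of the insert-and-catch idea. It assumes `users` is any object exposing the MongoDB driver's `insertOne` method (for example what `Meteor.users.rawCollection()` returns); the helper name `insertUserIfNew` and the `'email-exists'` reason string are made up for this example. MongoDB reports a unique-index violation with error code 11000.

```javascript
// MongoDB signals a unique-index violation with error code 11000.
const DUPLICATE_KEY_CODE = 11000;

// Hypothetical helper: try the insert and let the unique index decide.
// `users` only needs an async insertOne(doc) method.
async function insertUserIfNew(users, doc) {
  try {
    const result = await users.insertOne(doc);
    return { created: true, id: result.insertedId };
  } catch (err) {
    if (err && err.code === DUPLICATE_KEY_CODE) {
      // The unique index rejected the insert, so the email is already taken.
      return { created: false, reason: 'email-exists' };
    }
    throw err; // anything else is a genuine failure and should surface
  }
}
```

Keep in mind that a plain unique index compares strings exactly, so this only catches duplicates that match character for character unless the emails are normalized first or the index carries a case-insensitive collation.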

We do not use this package but we do lowercase all emails before saving to optimize the query to a simple match.
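
Roughly, the lowercase-before-saving approach looks like the sketch below. The helper names `normalizeEmail` and `emailSelector` are invented for this example, and the `emails.address` field path is the one Meteor's users documents normally use; adjust it if your schema differs.

```javascript
// Hypothetical helper: normalize once, on every write and every lookup.
function normalizeEmail(email) {
  return email.trim().toLowerCase();
}

// With stored emails already lowercase, the lookup is a plain equality
// match, which a regular index on emails.address can serve directly.
function emailSelector(input) {
  return { 'emails.address': normalizeEmail(input) };
}
```

The same function has to be applied on insert and on query, otherwise the equality match silently misses.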

That existing algorithm to check for existing email looks like a very complex solution to a simple problem

I think the only reasonable solution would be to fix this once and for all directly in accounts:password. Email would henceforth be stored lowercase only, period.

Just re-copied the word because I agree with it more than the average of my general agreements :slight_smile:

I will open a GitHub issue for this. Could we agree this is a performance bug and not a feature request?
The argument here would be that usernames and emails should be stored lowercase only, enforced by the server, while in the UI the developer is free to handle case however they like in the input fields.
Let me know please what/how you think.

The fix will either be a breaking change or require a database update script. Not sure if that has been done before.

Perhaps a second package should do it? So that the user can do the DB migration and switch the packages or start with the new one if new project… Would that be too cumbersome?!

I consider a second package for the data migration a good solution. If needed, it can be added temporarily and invoked in Meteor.startup on the server. Subsequently both the invocation and the package itself can be removed.

Even if both stay, it normally wouldn’t do much harm. If the migration script is properly written, it won’t perform any actual update in the Users collection when all emails are already lowercase.

On that note, the migration script should ideally use bulk updates for the best performance, which is crucial for systems that may have thousands of users.
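
A rough sketch of such a migration, under some assumptions: user documents carry an `emails` array of `{ address, verified }` entries (Meteor's usual shape), and `collection` is a raw MongoDB driver collection with `find` and `bulkWrite`. The function names `buildLowercaseOps` and `migrateEmails` are invented for this example.

```javascript
// Build one updateOne operation per user that still has a non-lowercase
// address. Users whose emails are already lowercase produce no operation,
// so re-running the migration is a no-op.
function buildLowercaseOps(users) {
  const ops = [];
  for (const user of users) {
    const emails = user.emails || [];
    const needsUpdate = emails.some((e) => e.address !== e.address.toLowerCase());
    if (!needsUpdate) continue;
    ops.push({
      updateOne: {
        filter: { _id: user._id },
        update: {
          $set: {
            emails: emails.map((e) => ({ ...e, address: e.address.toLowerCase() })),
          },
        },
      },
    });
  }
  return ops;
}

// Send all updates in a single bulkWrite call. For simplicity this loads
// every user (emails only) into memory; a very large collection would
// want to process the cursor in batches instead.
async function migrateEmails(collection) {
  const users = await collection.find({}, { projection: { emails: 1 } }).toArray();
  const ops = buildLowercaseOps(users);
  if (ops.length === 0) return 0;
  const result = await collection.bulkWrite(ops, { ordered: false });
  return result.modifiedCount;
}
```

One caveat worth checking before running anything like this: if two accounts differ only by the case of their email, lowercasing one of them will collide with the unique index, so those duplicates need to be resolved by hand first.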

I was thinking to have a new accounts package and leave it up to the user to manage the DB migration, perhaps provide a best practice document.
Let’s say:

  1. “accounts-password” old. (already in use).
  2. Perform DB migration (within Meteor or with Mongo scripts) - developer’s deal
  3. “accounts-password” old still works ok.
  4. Switch to “accounts-password-new”.

You’re right, that should work.

I remember something similar was done with react-meteor-data, where version 2 was a breaking change. Users had to upgrade to version 2 explicitly and were not upgraded automatically by the usual meteor update.

Can that work in this case?

I think the breaking change was the version of react supported

I think the root cause is that the original query doesn’t make use of indexes. Using a regex will result in a collscan, even if the property is indexed. Instead you need to specify an index with a collation, and then specify the collation when querying; there’s info in the Mongo docs about it here:

In our app we have a separate collection for user profiles, and we perform a case-insensitive lookup on the email by specifying an index with the following collation:

// when creating the indexes 
const collation = {
    locale: 'en', 
    strength: 1, // ignores case and diacritics; use 2 to ignore case only
};

And then we query like this:

Profiles.rawCollection().find({ email }).collation(collation)

So to fix this you’d need to change the default index on the Users collection to use a collation, and then you’d need to modify the query to use the collation when searching so the search is case-insensitive.
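
Put together, the index change and the matching query could look something like the sketch below. This is not what accounts-password does today; it assumes `collection` is a raw MongoDB driver collection, and the helper names and the index name `emails_address_ci` are invented for this example. It uses strength 2, which ignores case but still distinguishes accents, whereas strength 1 as in the post above ignores both.

```javascript
// Collation shared by the index and every query that should use it.
// MongoDB only picks a collated index when the query specifies the
// same collation.
const CASE_INSENSITIVE = { locale: 'en', strength: 2 };

// Create a unique, case-insensitive index on the email address.
// sparse: true keeps users without any email out of the index.
async function ensureEmailIndex(collection) {
  return collection.createIndex(
    { 'emails.address': 1 },
    { unique: true, sparse: true, collation: CASE_INSENSITIVE, name: 'emails_address_ci' }
  );
}

// Equality lookup; the collation makes it case-insensitive without a regex.
function findByEmail(collection, email) {
  return collection
    .find({ 'emails.address': email })
    .collation(CASE_INSENSITIVE)
    .limit(1)
    .next();
}
```

Because the index is unique under the collation, it would also reject a second account whose email differs only by case, so existing case-variant duplicates would have to be cleaned up before the index can be built.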

That’s a brilliantly elegant solution! Plus it works without a data migration (apart from changing the default index).

That looks like an easier route to take

Hi @kimar,

I am working on a pull request to solve this issue. But I don’t see any significant slowness while creating a new account, using the createUser method. Of course I didn’t try with 400k users. If possible can you share some screenshots of the slowness? It would be really helpful.

My users collection has nearly a million documents now, which means that if someone tries to register with an email such as info@gmail.com, this query takes 20+ seconds. I see the GitHub pull requests all ended up getting skipped? So it seems like this is still an issue in 2022? Has anyone figured out an elegant solution?

Would it be possible to create the user without performing the query (for example, if a certain flag is passed as an option to turn it off) and let us perform the query on our own instead? That would at least be non-breaking and would allow those with performance needs to do the lookup in a performant manner.

Thanks @rjdavid - I remember seeing that thread but couldn’t figure out what to search for. Seems like there isn’t a simple solution out there yet.