Best data structure for having to find() documents by array

Basically, we need to create a system that allows certain documents in a collection to be flagged with certain item types (for example: Food, Entertainment, etc.), stored inside an array.

We are also using a pagination system, which means the “flag” data has to be in the document itself rather than in another collection.

I can just do a find() and have it search the array to see if it contains the correct flag… but would this be an acceptable approach? I’m worried about search optimization, and since this is used alongside a pagination system with many documents, I’m concerned about whether the performance will be acceptable.

Would an array be best in this case? Or are there alternative ways to structure the data that would be more optimized and still allow it to be filtered in the pagination system?

The first thing that comes to mind is storing the “flag” as derived data in the document itself. You could determine and store the required data when updating the array, using the matb33:collection-hooks package or equivalent handcrafted code. Then index the derived “flag” for efficient searches.

If I understand your problem correctly:
You store one or more item types per item, and you want to filter searches on those types.

You could just use the $all query selector (https://docs.mongodb.com/manual/reference/operator/query/all/#op._S_all) to find documents that have all of the wanted types, or use the $in query selector (https://docs.mongodb.com/manual/reference/operator/query/in/#op._S_in) to select documents that have at least one of them.
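As a rough sketch of what those two selectors could look like together with pagination options: the field name `types`, the collection name `Items`, and the helper function names below are all hypothetical and not taken from the original post. Sorting by `_id` is also just an assumption to keep the page order stable.

```javascript
// Hypothetical document shape: each document keeps its flags in a `types` array,
// e.g. { name: "Pizza night", types: ["Food", "Entertainment"] }.

// Selector for documents flagged with every one of the given types ($all).
function matchAllTypes(types) {
  return { types: { $all: types } };
}

// Selector for documents flagged with at least one of the given types ($in).
function matchAnyType(types) {
  return { types: { $in: types } };
}

// Options for a 0-based page number with `pageSize` documents per page.
// Sorting by `_id` is an assumption; any stable sort field would do.
function pageOptions(page, pageSize) {
  return { sort: { _id: 1 }, skip: page * pageSize, limit: pageSize };
}

// Possible usage with a hypothetical Meteor collection named Items:
// Items.find(matchAnyType(["Food", "Entertainment"]), pageOptions(0, 20)).fetch();
```

Because the filter lives in the selector and the paging lives in the options, the same pagination code should work regardless of which flags are being filtered on.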

If the field is indexed, there shouldn’t be any performance issues.
find() on indexed arrays is quite fast, since MongoDB indexes each element in the array (see the MongoDB documentation on multikey indexes for more information).
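A minimal sketch of creating such an index, assuming the flags live in a hypothetical `types` array field. The function name is made up for illustration, and `rawCollection` is assumed to be a MongoDB Node.js driver collection (in Meteor, something like `Items.rawCollection()` for a hypothetical `Items` collection).

```javascript
// `rawCollection` is assumed to be a MongoDB Node.js driver collection object.
// `types` is a hypothetical name for the array field holding the flags.
function ensureTypesIndex(rawCollection) {
  // An ascending index on an array field; MongoDB treats it as a multikey
  // index and creates one index entry per array element.
  return rawCollection.createIndex({ types: 1 });
}

// Possible usage on the server, for a hypothetical Meteor collection Items:
// ensureTypesIndex(Items.rawCollection());
```

No special option is needed to index an array: MongoDB marks the index as multikey on its own once it sees array values in the indexed field.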

Thanks! That was my primary concern. Glad to know arrays are indexed without issues!