Elaborately sorting based on another sort (wtf?)


#1

So I have a Posts collection where each doc has createdAt and submitCost fields.
I would like to sort by createdAt first and then by submitCost INSIDE EACH DAY.
Say if I have the following docs:

{_id:1, createdAt: 10.06.16, submitCost: 10}
{_id:2, createdAt: 10.06.16, submitCost: 30}
{_id:3, createdAt: 10.06.16, submitCost: 20}
{_id:4, createdAt: 11.06.16, submitCost: 10}
{_id:5, createdAt: 11.06.16, submitCost: 40}
{_id:6, createdAt: 14.06.16, submitCost: 10}

I’d like the result to be sorted: 2,3,1,5,4,6
Can this be done?
Thx in advance!


#2
YourCollection.find({}, { sort: { createdAt:1, submitCost: -1} } );

However, your createdAt needs to be a Date field (or a yymmdd string).

Also, you would be well advised to index that combination for performance if it’s important (run regularly and/or a large number of documents). In your server-side code:

Meteor.startup(() => {
  // Compound index matching the sort above
  YourCollection._ensureIndex({ createdAt: 1, submitCost: -1 });
});

#3

That’s very helpful, thanks Rob!

But now that I think of it, my createdAt field is already a Date field (I oversimplified my example, argh…), so the values don’t represent a particular day but a full timestamp. How could I group all the timestamps for a particular date and then sort inside each group?


#4

Hmm. I guess the answer is “it depends”.

If you’re crunching huge amounts of data your best bet will be to use the aggregation pipeline.

If the number of secondary keys per date is fairly small, you could take what the original sort provides and do mini-aggregations in code.


#5

Suppose I have 10,000 posts per day that need to be sorted by submitCost. Could you give pseudo-code that handles this plz?


#6

I could, but my “fairly small” was somewhere in the 100s. Sorting larger sets is certainly doable, but it will tie up the event loop and may be better in a separate process. Or, use the aggregation pipeline - it’s what it was designed for.

If you still want some code I’ll post something later (about to go into a meeting).


#7

Yep, I’d like some code for your “fairly small” scenario - when you have the time :smiley:


#8

So this is the sort of thing I mean. It doesn’t use the aggregation pipeline. It only needs enough memory to hold one day’s worth of results at a time. It doesn’t address the issue of what you do with the results (which are console.logged as each day is processed). My thinking is that you could roll this up into a publication and use the pub/sub API to deliver one day at a time to the client. You could also build a complete result array for all days - but then there wouldn’t be much point doing it this way in the first place!

let workingDate = null;
let subset = [];

// Sort the current day's docs by submitCost (highest first) and log them.
const flush = () => {
  if (subset.length > 0) {
    console.log(subset.sort((a, b) => b.submitCost - a.submitCost));
  }
};

MyCollection.find({}, { sort: { createdAt: 1 } }).forEach(doc => {
  // Calendar day in UTC, e.g. "2016-06-10"
  const thisDate = doc.createdAt.toISOString().slice(0, 10);
  if (workingDate !== thisDate) {
    // Day boundary: emit the previous day's results and start a new subset.
    flush();
    workingDate = thisDate;
    subset = [];
  }
  subset.push({
    _id: doc._id,
    createdAt: thisDate,
    submitCost: doc.submitCost,
  });
});

// Emit the last day, which has no following doc to trigger it.
flush();

No warranty provided: caveat emptor!


#9

I tried to implement a sort like this (a time sort), but Meteor seems to ignore the second sort key: it only sorts on the first field. It seems that when sorting dates it takes the time (hours, etc.) into account, so the second key never comes into play.

Also, what’s the best implementation of this kind of sort? You said to use the aggregation pipeline, but then you made a suggestion without the pipeline.


#10

Relevant date-related operators
General aggregation docs
note that aggregation is server-side only
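
As a rough sketch of what such a pipeline could look like for the question in this thread: $dateToString turns each createdAt into a day string, and $sort then orders by that day and by submitCost within it. The function name buildDailySortPipeline, the day field name and the Posts collection in the usage comment are assumptions made for illustration, and the day is computed in UTC.

```javascript
// Builds an aggregation pipeline that sorts docs by calendar day (UTC),
// then by submitCost descending within each day.
// Assumes each doc has a Date field createdAt and a numeric submitCost.
function buildDailySortPipeline() {
  return [
    {
      $project: {
        createdAt: 1,
        submitCost: 1,
        // Derived day string such as "2016-06-10" (hypothetical field name)
        day: { $dateToString: { format: '%Y-%m-%d', date: '$createdAt' } },
      },
    },
    { $sort: { day: 1, submitCost: -1 } },
  ];
}

const pipeline = buildDailySortPipeline();
console.log(JSON.stringify(pipeline, null, 2));

// Server-side usage (assumption: a Meteor collection named Posts; how the
// results come back depends on the MongoDB driver version in use):
// Posts.rawCollection().aggregate(pipeline)
```

Docs that share both the day and the submitCost come back in no guaranteed order; adding createdAt as a third sort key would be one way to pin that down.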


#11

Could you share your code with us?

If your sort is based on a field which holds a new Date(), then it sorts to the ms. So the second part of your sort may have no visible effect: it isn’t ignored, but it only breaks ties between docs whose timestamps are identical to the millisecond, which rarely happens. When you say you’re trying to do a “time sort”, what exactly do you want to achieve?
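
To illustrate, here is a minimal sketch in plain JavaScript (no Meteor or MongoDB involved). The sample docs and the helper names dayKey, byTimestampThenCost and byDayThenCost are made up for the example, and the day is taken in UTC, which is an assumption. Comparing on the full timestamp leaves submitCost nothing to break ties on; comparing on the truncated day lets it take effect.

```javascript
// Hypothetical sample data: three posts on one day, two on the next,
// each with a distinct millisecond timestamp.
const docs = [
  { _id: 1, createdAt: new Date('2016-06-10T08:00:00.000Z'), submitCost: 10 },
  { _id: 2, createdAt: new Date('2016-06-10T09:30:00.000Z'), submitCost: 30 },
  { _id: 3, createdAt: new Date('2016-06-10T17:45:00.000Z'), submitCost: 20 },
  { _id: 4, createdAt: new Date('2016-06-11T07:15:00.000Z'), submitCost: 10 },
  { _id: 5, createdAt: new Date('2016-06-11T12:00:00.000Z'), submitCost: 40 },
];

// Same ordering rule as { createdAt: 1, submitCost: -1 }:
// submitCost is only consulted when two timestamps are exactly equal.
const byTimestampThenCost = (a, b) =>
  (a.createdAt - b.createdAt) || (b.submitCost - a.submitCost);

// Calendar day in UTC, e.g. "2016-06-10"
const dayKey = date => date.toISOString().slice(0, 10);

// Compare on the day first, then on submitCost (highest first) within a day.
const byDayThenCost = (a, b) => {
  const dayA = dayKey(a.createdAt);
  const dayB = dayKey(b.createdAt);
  if (dayA !== dayB) return dayA < dayB ? -1 : 1;
  return b.submitCost - a.submitCost;
};

const timestampOrder = docs.slice().sort(byTimestampThenCost).map(d => d._id);
const dayOrder = docs.slice().sort(byDayThenCost).map(d => d._id);

console.log(timestampOrder); // [1, 2, 3, 4, 5]
console.log(dayOrder);       // [2, 3, 1, 5, 4]
```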

Well, it really depends on what you’re trying to do and the size/complexity of your data. The aggregation pipeline is really good for complex operations involving large numbers of documents and does a lot more than just sorting. Standard sort options on the find are good for simple data structures and can also be used with minimongo on the client (not true for the aggregation pipeline).


#12

I’m trying to sort my objects exactly as the first person indicated. Except, the field for the date does correspond to new Date() and when I do as you suggest above it doesn’t properly sort them. I’m guessing that it has to do with the sort function not being aware that I want it to be sorted for each day (as above).