Hello,
What is the best way to update a large number of small documents that change often?
Right now I am updating around 15k documents every 5 minutes (about 50 per second).
My documents are structured like this:
> db.streams.findOne()
{
  "_id" : "8S5d2w5YZBe9oCWL4",
  "channel" : "ECTVLoL",
  "title" : "Bienvenue sur l'Eclypsia TV LOL",
  "game" : "League of Legends",
  "followers" : 1663,
  "channel_url" : "http://hitbox.tv/ectvlol",
  "viewers" : 748,
  "avatar" : "http://edge.vie.hitbox.tv/static/img/channel/ECTVLoL_550fe222af2c4_small.png",
  "timestamp" : ISODate("2015-10-27T13:12:01.362Z"),
  "thumbnail" : "http://edge.vie.hitbox.tv/static/img/media/live/ectvlol_mid_000.jpg",
  "service" : "hitbox",
  "online" : true
}
>
They change quite frequently, and I can identify a document by matching its channel and service properties.
So for now I am updating them like this:
data.forEach(function(item) {
  Streams.upsert(
    {
      service: 'twitch',
      channel: item.channel.display_name
    },
    {
      $set: {
        title: item.channel.status,
        game: item.game,
        avatar: item.channel.logo,
        followers: Number(item.channel.followers),
        viewers: Number(item.viewers),
        timestamp: moment().toDate(),
        thumbnail: item.preview.medium,
        channel_url: item.channel.url,
        online: true
      }
    }
  );
});
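For comparison, here is a rough sketch of how the same per-item upserts could be collected into one batch instead of 15k separate calls. It only builds the operations array in the shape the MongoDB Node driver's bulkWrite accepts. The helper name buildStreamOps and its parameters are made up for illustration, and whether Streams.rawCollection() and bulkWrite are available depends on the Meteor and driver versions in use. I have not measured whether this lowers CPU usage.

```javascript
// Hypothetical helper (not part of Meteor or MongoDB): maps fetched stream
// items to bulkWrite operations, one upsert per item.
function buildStreamOps(data, service, now) {
  return data.map(function (item) {
    return {
      updateOne: {
        // Same selector as the single upsert: service + channel.
        filter: { service: service, channel: item.channel.display_name },
        update: {
          $set: {
            title: item.channel.status,
            game: item.game,
            avatar: item.channel.logo,
            followers: Number(item.channel.followers),
            viewers: Number(item.viewers),
            timestamp: now,
            thumbnail: item.preview.medium,
            channel_url: item.channel.url,
            online: true
          }
        },
        upsert: true
      }
    };
  });
}

// Possible usage on the server (assumption: rawCollection() exposes a
// driver collection that supports bulkWrite):
// Streams.rawCollection().bulkWrite(
//   buildStreamOps(data, 'twitch', new Date()),
//   { ordered: false }
// );
```

Passing ordered: false lets the server continue past a failed operation instead of stopping at the first error, which seems reasonable here because the upserts do not depend on each other.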
Things seem to go better after adding an index:
> db.streams.createIndex({ "service": 1, "channel": 1})
{
  "createdCollectionAutomatically" : false,
  "numIndexesBefore" : 1,
  "numIndexesAfter" : 2,
  "ok" : 1
}
But CPU usage on my 1 GB DigitalOcean droplet is still at 100%.
I also have one more index, for full-text search:
> db.streams.createIndex({ "channel": "text", "title": "text", "game": "text"})
{
  "createdCollectionAutomatically" : false,
  "numIndexesBefore" : 2,
  "numIndexesAfter" : 3,
  "ok" : 1
}
>
I am a little scared to add the YouTube grabbing package as well, since MongoDB's CPU usage is already high.
Am I missing an index, or is there a more effective way to update?
Should I fetch each document first and update only the changes, or is upsert already well optimized for that?
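To make the "fetch first and update just changes" option concrete, this is roughly what I have in mind. The helper name changedFields is made up, and it assumes the compared fields are plain strings, numbers, or booleans (a Date such as timestamp would always count as changed with this comparison). It is only a sketch of the idea, not something I have benchmarked against a plain upsert.

```javascript
// Hypothetical helper: returns only the fields of `incoming` whose values
// differ from `existing`. If there is no existing document, every field
// counts as changed.
function changedFields(existing, incoming) {
  var changes = {};
  Object.keys(incoming).forEach(function (key) {
    if (!existing || existing[key] !== incoming[key]) {
      changes[key] = incoming[key];
    }
  });
  return changes;
}

// Possible usage: skip the write entirely when nothing changed.
// var changes = changedFields(existingDoc, newFields);
// if (Object.keys(changes).length > 0) {
//   Streams.update(existingDoc._id, { $set: changes });
// }
```

The trade-off is an extra read per document in exchange for fewer and smaller writes, so it would only help if most streams are unchanged between runs.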
BTW, the basic full-text search as it stands is up at http://shocki.tv
It indexes twitch.tv, hitbox.tv, and livecoding.tv; there are some "meteor" or "meteorjs" streams from time to time.