What is prerender.io doing that I can't implement on my own?

waiholiu · February 3, 2016, 2:35am

So I have an app that I need to make visible to Facebook and search engines, and therefore I need some sort of server-side rendering solution.

The Meteor guide (http://guide.meteor.com/deployment.html#seo) recommends I use prerender.io. Having had a look, as far as I can see it just does what spiderable tried to do: it generates a static page for bots and crawlers.

Considering there may be a cost to prerender.io, I’m just wondering, what’s the advantage of doing this over just using spiderable or the meteorhacks:ssr package? What is prerender.io actually doing that is different?

joshowens · February 3, 2016, 3:31am

Prerender has an open source version (which is what I use for Crater) and I grab the bot requests at the Nginx level and send them over to the prerender server I have.

The issue with spiderable (besides the fact that it hasn’t been touched in forever) is that it is really, really, REALLY naive code that will just keep forking for every bot request, no matter how many bot requests you have. Basically, Google will likely come along and DDoS your server if you have a growing number of public-facing pages to consume. Seems better to fix spiderable, imo, but I am not sure that is on anyone’s radar at the moment. So instead, you get a guide that suggests you pay for a service.

Here is an example screenshot of Crater being DDoSed by Google.
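To make the forking problem concrete: the usual fix for "one process per bot request" is to cap how many renders run at once and queue the rest. The following is only a sketch of that idea in JavaScript, not spiderable's actual code; `createRenderQueue` and its `stats` helper are hypothetical names invented for this example.

```javascript
// Hypothetical helper: run at most `limit` render jobs at once and queue the rest.
function createRenderQueue(limit) {
  let active = 0;        // jobs currently running
  const waiting = [];    // jobs that have not started yet

  function next() {
    if (active >= limit || waiting.length === 0) return;
    active += 1;
    const { job, resolve, reject } = waiting.shift();
    Promise.resolve()
      .then(job)                 // start the job (it may return a promise)
      .then(resolve, reject)     // hand the outcome back to the caller
      .finally(() => {
        active -= 1;             // free the slot and start the next queued job
        next();
      });
  }

  return {
    // `job` is a function that performs one render and returns a promise.
    run(job) {
      return new Promise((resolve, reject) => {
        waiting.push({ job, resolve, reject });
        next();
      });
    },
    stats: () => ({ active, waiting: waiting.length }),
  };
}
```

With a limit of 2, a burst of five crawler requests starts two renders and parks the other three, instead of forking five PhantomJS processes at once.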

robertlowe · February 3, 2016, 4:19am

But I thought spiderable is web scale…

Joking aside, while still a DDoS threat, googlebot can at least have a crawl rate set…

Some of us try to make spiderable better:

But it takes a bunch of effort to get pulls through.

However, I think now that MDG is teaming up with prerender, spiderable will die a slow death or be ignored moving forward.

waiholiu · February 3, 2016, 5:08am

Hi joshowens,

I did not realize that prerender is open source. I will try to get it installed on my machine tonight and see how I go. Quick question: the guide says this:

“To do so, we can use the Prerender.io service, thanks to the dfischer:prerenderio package. It’s a simple as meteor add-ing it, and optionally setting your prerender token if you have a premium prerender account and would like to enable more frequent cache changes.”

Is all I need to do add dfischer:prerenderio like it says? Or do I actually need to install the prerender service on my machine? What did you mean when you said you did it at the Nginx level? Does that mean you’re not actually using the dfischer:prerenderio package to point to your prerender server?

I’m sorry if this is all sounding a little naive - I’m really just getting my head around how all this works!

joshowens · February 3, 2016, 2:48pm

I don’t think they are teaming up with Prerender, I think they just host an internal version of the open sourced app for Galaxy. Given the issues with spiderable I think it was just easier for @sashko and @tmeasday to recommend prerender vs looking over your pull requests.

joshowens · February 3, 2016, 2:49pm

I am pretty sure you can set an environment variable with the @dfischer package and point it at your own prerender server instead. Personally, I catch these requests at the Nginx level and let Nginx cache the responses for 60 minutes so that I don’t overwhelm my prerender server with duplicate requests for popular stuff.

sashko · February 3, 2016, 5:48pm

Some of the big problems with spiderable are:

PhantomJS is hard to run properly
Just like server side rendering with React, it uses a lot of CPU in a very spiky manner

By using an external service for prerender, you solve both issues - you let someone else handle the hard part of running and configuring Phantom, and you offload the CPU spikes to a different server. In fact, depending on the CPU characteristics of the server you’re hosting on it might actually be better for your app’s performance and let you get a cheaper hosting plan since you don’t have to worry about those CPU spikes.

dfischer · February 4, 2016, 6:06am

Yeah – what Sashko said. The package I put together works – prerender.io is the only thing that made it “work reliably.”

@joshowens I’m pretty sure prerender caches identical requests for you with a TTL.

elie · March 2, 2016, 12:11am

What’s the difference between doing this at the nginx level or node.js level?

Also, is there an example somewhere of combining a standard Meteor Nginx config file with what prerender suggests?

I don’t know Nginx very well, but I do use it. It would need to combine: https://gist.github.com/thoop/8165802

server {
    listen 80;
    server_name example.com;
 
    root   /path/to/your/root;
    index  index.html;

    location / {
        try_files $uri @prerender;
    }
 
    location @prerender {
        #proxy_set_header X-Prerender-Token YOUR_TOKEN;
        
        set $prerender 0;
        if ($http_user_agent ~* "baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator") {
            set $prerender 1;
        }
        if ($args ~ "_escaped_fragment_") {
            set $prerender 1;
        }
        if ($http_user_agent ~ "Prerender") {
            set $prerender 0;
        }
        if ($uri ~ "\.(js|css|xml|less|png|jpg|jpeg|gif|pdf|doc|txt|ico|rss|zip|mp3|rar|exe|wmv|doc|avi|ppt|mpg|mpeg|tif|wav|mov|psd|ai|xls|mp4|m4a|swf|dat|dmg|iso|flv|m4v|torrent|ttf|woff)") {
            set $prerender 0;
        }
        
        #resolve using Google's DNS server to force DNS resolution and prevent caching of IPs
        resolver 8.8.8.8;
 
        if ($prerender = 1) {
            
            #setting prerender as a variable forces DNS resolution since nginx caches IPs and doesnt play well with load balancing
            set $prerender "service.prerender.io";
            rewrite .* /$scheme://$host$request_uri? break;
            proxy_pass http://$prerender;
        }
        if ($prerender = 0) {
            rewrite .* /index.html break;
        }
    }
}

and something like this:

upstream myAppName {
  ip_hash;               # for sticky sessions, more below
  server 123.45.678.901:3000;  # server 1
  server 333.22.444.555:3000;  # server 2
}

server {
  listen 80;
  server_name www.myapp.com;
  # and all other "server" directives

  location / {
    # the "hostname" below must match myAppName from the upstream directive above
    proxy_pass http://myAppName/;
    # and all other "location" directives
  }
}

Would the following work:

upstream myAppName {
  ip_hash;               # for sticky sessions, more below
  server 123.45.678.901:3000;  # server 1
  server 333.22.444.555:3000;  # server 2
}

server {
  listen 80;
  server_name www.myapp.com;
  # and all other "server" directives

  location / {
    try_files $uri @prerender;
    proxy_pass http://myAppName/;
    # and all other "location" directives
  }

  # and the rest of the prerender stuff here...
}
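On the Nginx versus Node.js question, the decision that the prerender Nginx config above makes can also be written as a plain function at the Node level. This is a rough sketch that mirrors those rules, not the code of prerender-node or the dfischer package; `shouldPrerender` and `buildPrerenderUrl` are made-up names, the extension list is shortened, and the extension check is anchored to the end of the path, which the Nginx regex is not.

```javascript
// Crawler user agents copied from the Nginx config above.
const BOT_AGENTS = /baiduspider|twitterbot|facebookexternalhit|rogerbot|linkedinbot|embedly|quora link preview|showyoubot|outbrain|pinterest|slackbot|vkShare|W3C_Validator/i;

// Static asset extensions that should never be prerendered (shortened list).
const STATIC_FILE = /\.(js|css|xml|less|png|jpg|jpeg|gif|pdf|txt|ico|rss|zip|mp3|mp4|ttf|woff)$/i;

// Decide whether a request should be sent to the prerender server.
function shouldPrerender(userAgent, url) {
  const q = url.indexOf('?');
  const path = q === -1 ? url : url.slice(0, q);
  const query = q === -1 ? '' : url.slice(q + 1);

  if (/Prerender/.test(userAgent)) return false;      // the prerender server's own fetch
  if (STATIC_FILE.test(path)) return false;           // assets are served as-is
  if (query.includes('_escaped_fragment_')) return true;
  return BOT_AGENTS.test(userAgent);
}

// Build the upstream URL the same way the rewrite + proxy_pass pair does:
// http://<prerender host>/<scheme>://<host><original request URI>
function buildPrerenderUrl(serviceHost, scheme, host, requestUri) {
  return `http://${serviceHost}/${scheme}://${host}${requestUri}`;
}
```

Either layer applies the same rules. Doing it in Nginx means bot requests can be answered (and cached) before they ever reach the Node process, while doing it in Node keeps everything inside the app.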

joshowens · March 2, 2016, 12:18am

Nginx has a file caching solution, so I cache the response for 60 minutes in case it is a hot link or something. You could extend that to 6 hours or something if you wanted. Nginx just stores the results on the filesystem and serves that up super fast.
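The Nginx config itself isn't shown in this thread, so here is the same idea as a small JavaScript sketch instead: keep each rendered page for a fixed time and only re-render after it expires. `createTtlCache` is a hypothetical helper, it stores entries in memory rather than on the filesystem as Nginx does, and the injectable `now` parameter exists only to make the expiry easy to test.

```javascript
// Hypothetical in-memory cache with a fixed time-to-live per entry.
function createTtlCache(ttlMs, now = Date.now) {
  const entries = new Map();
  return {
    // Returns the cached value, or undefined if it is missing or expired.
    get(key) {
      const hit = entries.get(key);
      if (!hit) return undefined;
      if (now() - hit.storedAt >= ttlMs) {
        entries.delete(key); // expired: drop it so the caller renders again
        return undefined;
      }
      return hit.value;
    },
    // Stores a value and records when it was stored.
    set(key, value) {
      entries.set(key, { value, storedAt: now() });
    },
  };
}

// 60 minutes, as described above; 6 hours would be 6 * 60 * 60 * 1000.
const pageCache = createTtlCache(60 * 60 * 1000);
```

A hot link then costs one render per hour per URL, no matter how many bots request it.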

joshowens · March 2, 2016, 12:19am

Yeah, I just don’t agree here, @sashko. Phantom runs just fine; the issue is running too many instances at once.

lfisher · March 21, 2016, 1:01pm

I was also wondering about this. Do I need the package, or is it enough to configure it in Nginx?

elie · March 21, 2016, 1:36pm

I did it using the package. But I think doing it at the Nginx level might be more efficient. Both work, though.

piaf666 · January 5, 2017, 3:03pm

Hi, old post, but interesting.
I want to cache the static HTML pages generated by prerender using Nginx’s file caching (not the S3 or Redis solutions that come with the prerender.io plugins), and only for bot user agents (if ($prerender = 1) …). I understand you did this. So how did you configure Nginx? Do you have an example?
I’m a beginner, so my question may be off base.
thx !

gurjitmehta · September 11, 2017, 3:56pm

Hey guys, I know this is an old post, but is anyone here able to guide me? I was reading in the official Google docs that AJAX crawling has been deprecated as of October 2015: https://developers.google.com/webmasters/ajax-crawling/docs/specification

So does this affect the Meteor package for prerender.io (the dfischer one)? How are we supposed to specify the prerender.io service now?
I understand that the Nginx config is used for caching the response, but what is the advice now, given Google’s new specs? How can we specify the caching now (if needed)?

Any help, @joshowens @dfischer? I’d really appreciate it, guys.

xvendo · April 5, 2018, 10:26pm

@sashko Won’t my app still get busy when it connects to prerender.io through the Node package?

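import { Meteor } from 'meteor/meteor';
import { WebApp } from 'meteor/webapp';
import prerenderIO from 'prerender-node';

Meteor.startup(function () {
    // Configure the prerender-node middleware.
    prerenderIO.set('prerenderToken', "mytoken");
    prerenderIO.set('protocol', 'https');
    prerenderIO.set('host', 'myapp.com');
    // Assumption: the middleware also has to be registered so it sees incoming requests.
    WebApp.rawConnectHandlers.use(prerenderIO);
});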