Quite simple: how do I prevent search engines and other bots from crawling a Meteor app that is deployed on, say, staging.myapp.com? The robots.txt in the public dir is static and can't be touched by server code. NODE_ENV in this case has appropriate values, such as staging and testing.
Are you using something like mupx for deployments? If so, you could handle this in advance by wrapping your mupx call in a shell script that first sets up the appropriate robots.txt file based on the environment. For example:
your_app_root/deploy/robots.prod.txt
your_app_root/deploy/robots.staging.txt
your_app_root/deploy/deploy.sh
deploy.sh would look something like:
#!/bin/bash
# Default to the production robots file; switch to the staging one when asked.
robots="robots.prod.txt"
if [ "$1" = "staging" ]; then
  robots="robots.staging.txt"
fi

# Overwrite the app's robots.txt with the chosen file, then deploy.
cp "$robots" ../public/robots.txt
mupx deploy
When deploy.sh is run it will overwrite your app's existing public/robots.txt with the production one, then continue to deploy via mupx. If called as “deploy.sh staging” it will overwrite it with the staging one instead, then continue to deploy.
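To make the two modes concrete, the invocations from the deploy directory would look something like this (the paths match the layout suggested above):

# Production deploy: copies robots.prod.txt into public/, then runs mupx deploy
./deploy.sh

# Staging deploy: copies robots.staging.txt into public/ instead
./deploy.sh staging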
We’re sadly not using mup/mupx for deployments – for production we use Ansible scripts, and for staging/testing we use plain shell scripts.
But I reckon we can use the same thinking for those scripts when bundling and deploying. Thanks for the idea!
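Something along these lines, adapted to a plain staging deploy script, should do it. The paths and the meteor build flags below are just placeholders for illustration, not our actual scripts:

#!/bin/bash
# Hypothetical staging deploy script: same idea as deploy.sh above,
# just applied before bundling instead of before a mupx call.
set -e

# Swap in the staging robots.txt before the bundle is built.
cp deploy/robots.staging.txt public/robots.txt

# Build the bundle (output dir and architecture are placeholders).
meteor build ../output --architecture os.linux.x86_64

# ...then upload and extract the bundle with the existing staging script.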
Way late to the party here, but figured I’d share the solution I just implemented for the common good.
Install this package: https://atmospherejs.com/gadicohen/robots-txt
Then at startup, you can run:
// Server-side only. isStaging is however you detect the environment,
// e.g. const isStaging = process.env.NODE_ENV === 'staging'
Meteor.startup(() => {
  robots.addLine('User-agent: *');
  if (isStaging) {
    robots.addLine('Disallow: /'); // blocks all URLs
  } else {
    robots.addLine('Disallow: /specific/blocked/url/path/*');
  }
});
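Since the package's whole job is to serve robots.txt dynamically, you can sanity-check the result after deploying; on the staging host the disallow-all rules should come back (curl and the hostname here are just for illustration):

# Expect "User-agent: *" followed by "Disallow: /" on staging
curl https://staging.myapp.com/robots.txt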