Hi guys! When pen-testing for clients, I often do quite a bit of OSINT (Open Source Intelligence), Google dorking and searching of web archives.
And let me tell you! I find a lot of development apps and sites indexed and cached by Google. The funny thing is that even if you delete those pages, the copies will remain in web archives! So this is a friendly reminder to always add "noindex" and "nofollow" tags to your development pages. AND… always remember to remove them when you put apps into production!
To use the “noindex” and “nofollow” tags for search engines on your web page, you can add them to the <head> section of the HTML code for the page in question.
- For “noindex,” you can add a “meta” tag with the attribute “name” set to “robots” and the attribute “content” set to “noindex.”
- For “nofollow,” you can add a “meta” tag with the attribute “name” set to “robots” and the attribute “content” set to “nofollow.”
You can also use “noindex, nofollow” if you want to use both tags together.
<meta name="robots" content="noindex, nofollow">
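If you want to double-check that a page really carries the tag, here is a minimal Python sketch using only the standard library. The `RobotsMetaParser` class and `robots_directives` function are names I made up for this example, and the page is a hard-coded string rather than a live request.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. a bare attribute), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            for directive in attr_map.get("content", "").split(","):
                directive = directive.strip().lower()
                if directive:
                    self.directives.add(directive)


def robots_directives(html):
    """Return the set of robots directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


page = '<html><head><meta name="robots" content="noindex, nofollow"></head><body></body></html>'
print(sorted(robots_directives(page)))  # ['nofollow', 'noindex']
```

An empty result means the page has no robots meta tag at all, which is exactly what you don't want to see on a development site.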
If you want to apply the “noindex” and “nofollow” tags to all pages on your website, there are a few ways you can do this.
One way is to include the “meta” tags in the common template or header file that is used by all pages on your site. This way, the tags will be included on every page that uses the template or header file.
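As a rough illustration of the shared-header idea, the sketch below builds the `<head>` in one place and only adds the tag outside production. The `render_head` function and the `APP_ENV` environment variable are assumptions for this example, not part of any particular framework.

```python
import html
import os

ROBOTS_META = '<meta name="robots" content="noindex, nofollow">'


def render_head(title, environment=None):
    """Build the shared <head>; add the robots tag everywhere except production."""
    if environment is None:
        # APP_ENV is a hypothetical variable name; an unset value is treated as development.
        environment = os.environ.get("APP_ENV", "development")
    lines = ["<head>", f"  <title>{html.escape(title)}</title>"]
    if environment != "production":
        lines.append("  " + ROBOTS_META)
    lines.append("</head>")
    return "\n".join(lines)


print(render_head("Staging dashboard", "staging"))
print(render_head("Live site", "production"))
```

Defaulting to "development" when the variable is missing means a misconfigured server ends up hidden from search engines rather than exposed, and the tag disappears by itself once the environment is set to production.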
Another way is to use server-side scripting, such as PHP, to include the “meta” tags on all pages dynamically. You can write a script that checks for the presence of the tags on each page, and if they are not present, the script will add them automatically.
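The same check-and-add logic can be sketched in Python (the idea carries over to PHP or any other server-side language). The `ensure_robots_meta` function is a made-up name, and the regular expressions are a simplification that assumes reasonably well-formed HTML.

```python
import re

ROBOTS_META = '<meta name="robots" content="noindex, nofollow">'

# Matches an existing <meta name="robots" ...> tag, quoted or unquoted.
HAS_ROBOTS_META = re.compile(r'<meta\b[^>]*\bname\s*=\s*["\']?robots\b', re.IGNORECASE)
# Matches the opening <head> tag (but not <header>).
HEAD_OPEN = re.compile(r"<head\b[^>]*>", re.IGNORECASE)


def ensure_robots_meta(html):
    """Return the HTML with a robots meta tag, adding one only if it is missing."""
    if HAS_ROBOTS_META.search(html):
        return html  # already tagged, leave the page alone
    match = HEAD_OPEN.search(html)
    if match is None:
        return html  # no <head> to put the tag in
    return html[: match.end()] + ROBOTS_META + html[match.end():]


print(ensure_robots_meta("<html><head><title>Dev</title></head><body></body></html>"))
```

For responses that are not HTML, such as PDFs, the `X-Robots-Tag` HTTP response header carries the same directives.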
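A final way is to use the robots.txt file, which tells search engines which pages or sections of your site should not be crawled. You can simply add User-agent: * followed by Disallow: / on the next line of the robots.txt file to block all pages on your site. Keep in mind that this blocks crawling, not indexing: a blocked URL can still show up in search results if other sites link to it, and a crawler that is not allowed to fetch the page never gets to see its “noindex” tag.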
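To see how a well-behaved crawler reads such a file, here is a small sketch using Python's built-in `urllib.robotparser`. The `dev.example.com` URLs are placeholders, and the rules are parsed from a string instead of being downloaded.

```python
from urllib.robotparser import RobotFileParser

# The two lines a "block everything" robots.txt needs.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Both checks print False: no user agent may fetch any path.
print(parser.can_fetch("*", "https://dev.example.com/"))
print(parser.can_fetch("Googlebot", "https://dev.example.com/admin/login"))
```

Remember that robots.txt is only a request. Polite crawlers follow it, but it does not restrict access, and the file itself is public.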
It’s important to keep in mind that the “noindex” and “nofollow” tags keep your pages out of search engine indexes, which will hurt your website’s visibility and search engine rankings if they end up on a production site. So, make sure you only use them where necessary and appropriate.
Want to use a simple token to restrict access? Check this blog out!
I hope this helps!