Robots.txt File

How to Use the Robots.txt File to Allow or Disallow Everything

The robots.txt file lives in your site's root directory.
It's a simple text file, and its main purpose is to tell web crawlers which folders and files on your website to avoid.
Search engine robots regularly visit your website looking for new links on your pages and posts. Google's web crawler, for example, is called Googlebot.
Bots generally check a site's robots.txt file before crawling any other page. They do this to see whether they are allowed to crawl the site and whether there is anything they should avoid.
The robots.txt file should be placed in the top-level directory of your domain, so it looks something like www.yourdomain.com/robots.txt.
The easiest way to edit this file is to connect to your web host with an FTP client like FileZilla, then open the file in a text editor like TextEdit (Mac) or Notepad (Windows).
If you don't know how to log in to your FTP server, contact your hosting provider and ask for instructions.
Some WordPress plugins, like Yoast SEO, also let you edit the robots.txt file from your WordPress dashboard.

How to disallow all using robots.txt

If you want to instruct all robots to stay away from your website, put this in your robots.txt to disallow everything:

User-agent: *
Disallow: /

The “User-agent: *” part applies the rule to all robots, and “Disallow: /” blocks them from your entire website.

Important: Disallowing all robots from your website is dangerous. If no bots can crawl the site, your pages will drop out of the search engines, and you will lose all organic traffic and the revenue that comes with it. Only do this if you know exactly what you're doing.

How to allow all using robots.txt

Robots.txt works by exclusion: everything is considered allowed to be crawled unless you explicitly exclude the files and folders you don't want accessed.

If you want Googlebot and other crawlers to crawl your entire site, you can simply have no robots.txt file at all, or an empty one.

Alternatively, you can put this into your robots.txt file to allow all search engine bots:

User-agent: *
Disallow:

This is interpreted as disallowing nothing, so effectively everything is allowed for search engines.

How to disallow specific files and folders

You can use the “Disallow” directive to block individual files or folders.

Simply add a separate Disallow line for each file or folder you want to hide from crawlers.

Here’s an example:

User-agent: *
Disallow: /topsy/
Disallow: /crets/
Disallow: /hidden/file.html

In this case, everything is allowed to be crawled except the two subfolders and the single file.

How to disallow specific bots

If you want to block a specific bot from crawling your site, you can do it like this:

User-agent: Bingbot
Disallow: /

User-agent: *
Disallow:

This blocks Bing's search engine bot from crawling your website, while all other bots can still crawl everything.
You can do the same with Googlebot by using “User-agent: Googlebot”.
You can also block specific bots from accessing specific files or folders on your website.
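For example, to keep only Googlebot out of a single folder while leaving the rest of the site open to everyone, you could combine the two rules like this (the /private/ folder name is just a placeholder):

```
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow:
```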

A good robots.txt file for WordPress

The following is a good default robots.txt configuration for WordPress:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://searchfacts.com/sitemap.xml

This robots.txt file tells bots that they can crawl everything except the yourdomain.com/wp-admin/ folder. However, they are still allowed to crawl the admin-ajax.php file inside that folder.
Google Search Console sometimes reports an error when Googlebot can't crawl the admin-ajax.php file, which is why it is explicitly allowed here.
Googlebot understands the “Allow:” directive, which is used to permit crawling of particular files inside a folder that is otherwise disallowed.
You can also use the Sitemap: directive to tell bots where to find your XML sitemap, which lists all the pages and posts on your website.

When to use noindex instead of robots.txt

If you want to keep your entire website, or specific pages, out of search results on Google or Bing, robots.txt is not the best tool.
Search engines can still index pages you block with robots.txt; they just can't crawl them, so they can't show the page's metadata.
Instead, the search result description says something like: “A description for this result is not available because of this site's robots.txt.”

If you hide a folder or file with robots.txt but someone links to it, Google may still show it in search results, just without a description.
In these cases, it is better to use a noindex tag to keep the content out of search results.
In WordPress, you can go to Settings → Reading and check “Discourage search engines from indexing this site” to add a noindex tag site-wide, or add the tag to specific pages yourself.
It looks like this:

<meta name='robots' content='noindex,follow' />

You can also use a free WordPress SEO plugin like Yoast SEO to noindex specific posts, pages, or categories on your website.
In most cases, noindex is the better way to keep content out of search results.

When to block your entire site instead

In some cases, you may want to block access to your entire site, both from search engine bots and from people.
Putting a password on your website is the best way to do this. You can do it with a free WordPress plugin called Password Protected.

Important facts about the robots.txt file

Keep in mind that robots can ignore your robots.txt file, especially malicious bots like those run by hackers looking for security vulnerabilities.
Also, if you're trying to hide a folder on your website, robots.txt alone is not a smart way to do it.
Anyone can view your robots.txt file in their browser and see exactly which files and folders you're trying to hide.
In fact, you can see the robots.txt file of most popular websites by typing domain.com/robots.txt into your browser.
If you want to make sure your robots.txt file is working, you can test it with Google Search Console.
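You can also test rules locally with Python's built-in urllib.robotparser module. This minimal sketch parses a combination of the rules shown above; example.com and the “OtherBot” name are just placeholders:

```python
from urllib import robotparser

# Combined rules from the examples above: block Bingbot entirely,
# and keep every other bot out of the /hidden/ folder only.
rules = """\
User-agent: Bingbot
Disallow: /

User-agent: *
Disallow: /hidden/
""".splitlines()

rp = robotparser.RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("Bingbot", "https://example.com/page.html"))          # blocked
print(rp.can_fetch("OtherBot", "https://example.com/page.html"))         # allowed
print(rp.can_fetch("OtherBot", "https://example.com/hidden/file.html"))  # blocked
```

In production you would call rp.set_url("https://yourdomain.com/robots.txt") and rp.read() instead of parsing a string, so the parser fetches the live file.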

Summary

If you want to instruct all search engine robots to stay away from your website, use this code to disallow everything:

User-agent: *
Disallow: /

If you want search engine bots to crawl your entire website, use this instead:

User-agent: *
Disallow:

Add a separate Disallow line for each file or folder you don't want search engine bots to crawl:

User-agent: *
Disallow: /topsy/
Disallow: /crets/
Disallow: /hidden/file.html

If you want to block one specific bot, such as Bingbot, from crawling your website, use:

User-agent: Bingbot
Disallow: /

And the following is a good default robots.txt for WordPress:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://searchfacts.com/sitemap.xml
