Last updated on October 3rd, 2023 at 10:48 pm
The robots.txt file is one of a website's most important files. Its primary purpose is to tell Google and other search engine bots which pages they may crawl and which pages to avoid.
Search engine robots regularly visit websites looking for new or updated pages and posts. For example, Google's web crawler is called Googlebot.
Bots generally check a website's robots.txt file before crawling it. They do this to see whether they may crawl the site and which parts of it they should avoid.
You can easily edit your robots.txt file in WordPress with the help of an SEO plugin such as Rank Math, Yoast SEO, or AIOSEO.
In this article, we'll show you how to edit the robots.txt file to allow or disallow Google and other search engine bots from crawling your website.
How to disallow all using robots.txt
If you want to instruct all robots to stay away from your website, put this in your robots.txt to disallow everything:
User-agent: *
Disallow: /
The "User-agent: *" line means the rule applies to all robots, and "Disallow: /" blocks your entire website.
Important: disallowing all robots can be harmful. Blocking every bot removes your website from the search engines, which means you lose all organic traffic and revenue. Only do this if you know exactly what you're doing.
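You can verify how crawlers interpret these rules with Python's standard-library `urllib.robotparser` module; the domain `example.com` here is just a placeholder:

```python
from urllib.robotparser import RobotFileParser

# The "disallow all" rules shown above.
rules = [
    "User-agent: *",
    "Disallow: /",
]

rp = RobotFileParser()
rp.parse(rules)

# Every path is off-limits to every well-behaved bot.
print(rp.can_fetch("Googlebot", "https://example.com/"))         # False
print(rp.can_fetch("Bingbot", "https://example.com/any-page/"))  # False
```

Remember that this only tells you what compliant crawlers will do; it does not enforce anything on bots that ignore robots.txt.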
How to allow all using robots.txt
Robots.txt works primarily by exclusion: everything is considered allowed except the files and folders you explicitly block.
If you want Googlebot to crawl your entire site, you can simply have no robots.txt file at all, or an empty one.
Alternatively, you can put this in your robots.txt file to allow all search engine bots:
User-agent: *
Disallow:
An empty Disallow value is interpreted as disallowing nothing, so effectively everything is allowed for search engines.
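As a quick sanity check, Python's standard-library `urllib.robotparser` confirms that an empty Disallow value blocks nothing (`example.com` is a placeholder domain):

```python
from urllib.robotparser import RobotFileParser

# "Disallow:" with no value means nothing is disallowed.
rp = RobotFileParser()
rp.parse(["User-agent: *", "Disallow:"])

print(rp.can_fetch("Googlebot", "https://example.com/any/page.html"))  # True
```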
How to disallow specific files and folders
You can use the "Disallow" directive to block individual files or folders.
Simply put each file or folder you want to keep crawlers out of on its own Disallow line.
Here’s an example:
User-agent: *
Disallow: /topsy/
Disallow: /crets/
Disallow: /hidden/file.html
In this case, everything is allowed to be crawled except the two subfolders and the single file.
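If you want to double-check which paths such rules block, here is a small sketch using Python's standard-library `urllib.robotparser`, with `example.com` as a placeholder domain:

```python
from urllib.robotparser import RobotFileParser

# The example rules from above: two folders and one file are blocked.
rules = [
    "User-agent: *",
    "Disallow: /topsy/",
    "Disallow: /crets/",
    "Disallow: /hidden/file.html",
]

rp = RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("Googlebot", "https://example.com/topsy/page.html"))   # False
print(rp.can_fetch("Googlebot", "https://example.com/hidden/file.html"))  # False
print(rp.can_fetch("Googlebot", "https://example.com/blog/post.html"))    # True
```

Disallow matches by path prefix, so everything under /topsy/ and /crets/ is blocked, while the rest of the site stays crawlable.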
How to disallow specific bots
If you want to block one specific bot from crawling your site, you can do it like this:
User-agent: Bingbot
Disallow: /
User-agent: *
Disallow:
This specifically blocks Bing's search engine bot from crawling your website, while all other bots can still crawl everything.
You can do the same thing with Googlebot by using "User-agent: Googlebot".
You can also block specific bots from accessing specific folders or files from your website.
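Here is a hedged sketch of how a compliant crawler reads those two groups, again using Python's standard-library `urllib.robotparser` and the placeholder domain `example.com`:

```python
from urllib.robotparser import RobotFileParser

# One group blocks Bingbot entirely; the catch-all group allows everyone else.
rules = [
    "User-agent: Bingbot",
    "Disallow: /",
    "",
    "User-agent: *",
    "Disallow:",
]

rp = RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("Bingbot", "https://example.com/"))    # False
print(rp.can_fetch("Googlebot", "https://example.com/"))  # True
```

A bot first looks for a group naming it specifically and only falls back to the `User-agent: *` group if none matches, which is why Bingbot is blocked while Googlebot is not.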
A good robots.txt file for WordPress
The following is a good default robots.txt for WordPress:
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://searchfacts.com/sitemap.xml
This robots.txt file tells bots that they may not crawl anything in the yourdomain.com/wp-admin/ folder, except for the admin-ajax.php file.
Search engines sometimes report an error in Search Console when their bots can't crawl the admin-ajax.php file, which is why it is explicitly allowed.
Googlebot understands the "Allow:" directive, which permits crawling of particular files inside a folder that is otherwise disallowed.
You can also use the Sitemap: directive to tell bots where to find your XML sitemap, which lists all the pages and posts on your website.
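You can check how this WordPress default behaves with Python's standard-library `urllib.robotparser`. One caveat: Python's parser applies rules in file order (first match wins), so the Allow line is listed before the Disallow line in this sketch; Googlebot itself uses most-specific-match, so the order shown in the file above also works for it. `example.com` stands in for your domain:

```python
from urllib.robotparser import RobotFileParser

# WordPress default: block /wp-admin/ but keep admin-ajax.php crawlable.
# The Allow line comes first because urllib.robotparser uses first-match-wins.
rules = [
    "User-agent: *",
    "Allow: /wp-admin/admin-ajax.php",
    "Disallow: /wp-admin/",
]

rp = RobotFileParser()
rp.parse(rules)

print(rp.can_fetch("Googlebot", "https://example.com/wp-admin/options.php"))     # False
print(rp.can_fetch("Googlebot", "https://example.com/wp-admin/admin-ajax.php"))  # True
print(rp.can_fetch("Googlebot", "https://example.com/blog/"))                    # True
```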
When to use noindex instead of robots.txt
If you want to stop your entire website or specific pages from showing up in search engines like Google or Bing, then robots.txt is not the best tool for the job.
Search engines can still index files you block with robots.txt; they just can't show useful metadata such as the title and description of the page.
Instead, the search result description says something like: "A description for this result is not available because of this site's robots.txt."
If you hide a folder or file with robots.txt but someone links to it, Google can still show that page in its search results, only without a description.
In these cases, it is better to use the noindex tag to stop search engines from showing the file or page in search results.
In WordPress, you can go to Settings → Reading and check "Discourage search engines from indexing this site" to apply the noindex tag site-wide.
It looks like this:
<meta name='robots' content='noindex,follow' />
You can also use a free WordPress SEO plugin like Yoast SEO to noindex specific posts, pages, or categories on your website.
In most cases, noindex is better than robots.txt for keeping content out of search engine indexes.
When to block your entire site instead
In some cases, you may want to block your entire site from being accessed, both by search engine bots and by people.
Putting a password on your website is the best way to do this. You can do it with a free WordPress plugin called Password Protected.
Important facts about the robots.txt file
Keep in mind that robots can ignore your robots.txt file, especially malicious bots like those run by hackers looking for security vulnerabilities.
Also, if you're trying to hide a private folder on your website, robots.txt alone is not a smart way to do it.
Anyone can view your robots.txt file by entering its URL in their browser, and from it they can figure out what you're trying to hide.
Most popular websites publish their robots.txt file; you can see any site's by typing domain.com/robots.txt into your browser.
If you want to make sure your robots.txt file is working, you can test it with Google Search Console.
To sum up: if you want to instruct all search engine robots to stay away from your website, use this code to disallow everything:
User-agent: *
Disallow: /
If you want search engine bots to crawl your entire website instead, use this:
User-agent: *
Disallow:
Use a separate Disallow line for each file or folder that you don't want search engine bots to crawl:
User-agent: *
Disallow: /topsy/
Disallow: /crets/
Disallow: /hidden/file.html
If you want to block one specific bot, such as Bingbot, from crawling your website, you can do this:
User-agent: Bingbot
Disallow: /
And the following code is a good default robots.txt for WordPress:
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
Sitemap: https://searchfacts.com/sitemap.xml