Content is crawlable and indexable
Crawling is how search engines discover most new content: spiders (crawlers) visit known pages and posts, download their content, and follow the links they find.
For example, let’s say you add a new page to your site and link to it from your homepage. The next time Google crawls your homepage, it’ll discover the link to the new page. Then, if it decides the content on that page is valuable for searchers, it’ll index it.
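To make that discovery step concrete, here’s a minimal sketch of how a crawler extracts the links it will visit next. It uses only Python’s standard library; the homepage URL is a placeholder.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags — the links a crawler would follow."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# "https://yourdomain.com" is a placeholder; point this at a real page.
html = urlopen("https://yourdomain.com").read().decode("utf-8", errors="replace")
parser = LinkExtractor()
parser.feed(html)
print(parser.links)  # any new page linked here is now discoverable
```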
This process works fine, as long as search engines aren’t blocked from crawling or indexing the page.
robots.txt is a file that tells search engines like Google which pages they can and can’t crawl. You can view yours by navigating to yourdomain.com/robots.txt.
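For reference, a simple robots.txt looks something like this; the disallowed path is purely illustrative:

```
# Rules apply to all crawlers
User-agent: *
# Keep crawlers out of this (illustrative) section
Disallow: /admin/

# Optional: point crawlers at your sitemap
Sitemap: https://yourdomain.com/sitemap.xml
```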
A misconfigured robots.txt file can quietly block important pages from being crawled, so check yours and fix any problems. Google Search Console shows which pages are affected: open the Coverage report, toggle to view excluded URLs, and look for the “Blocked by robots.txt” error.
If there are any URLs in there that shouldn’t be blocked, you’ll need to remove or edit your robots.txt file to fix the issue.
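If you’d rather spot-check URLs programmatically, Python’s standard library can parse a live robots.txt; the domain and paths below are placeholders:

```python
from urllib import robotparser

# Parse the live robots.txt file (placeholder domain).
rp = robotparser.RobotFileParser()
rp.set_url("https://yourdomain.com/robots.txt")
rp.read()

# "Googlebot" is Google's crawler user-agent.
for url in ["https://yourdomain.com/", "https://yourdomain.com/new-page/"]:
    status = "crawlable" if rp.can_fetch("Googlebot", url) else "blocked by robots.txt"
    print(url, "->", status)
```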
However, crawlable pages aren’t always indexable. If your webpage has a meta robots tag or an x‑robots-tag HTTP header set to “noindex,” search engines won’t index the page.
Fix these by removing the “noindex” meta tag or x‑robots-tag from any pages that should be indexed.
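For reference, the two forms of the directive look like this. The meta tag sits in the page’s HTML:

```html
<!-- remove this tag to let search engines index the page -->
<meta name="robots" content="noindex">
```

The x‑robots-tag version is sent as an HTTP response header instead, typically set in your server configuration:

```
X-Robots-Tag: noindex
```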
Resources & further reading