If you want to find out why a given page is not indexable by search engines, please take a look at the indexability report of the Zoom module: my.onpage.org/zoom
Generally, there are several reasons why a page is not indexable.
1. Disallow via robots.txt:
Be careful when using the robots.txt file to tell search engine bots not to crawl your website. Pages that are disallowed via robots.txt will not be indexed in many cases, but they can still be indexed if other pages refer or link to them. If you do not want a page to be indexed, please use the canonical tag or the meta robots tag noindex.
2. Page is using a canonical tag which refers to another page:
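As a rough illustration of the robots.txt check, the following sketch uses Python's standard urllib.robotparser to test whether a URL is disallowed for a given bot. The function name is_crawl_allowed, the sample rules and the example.com URLs are assumptions made for this example, not part of the indexability report.

```python
from urllib.robotparser import RobotFileParser


def is_crawl_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt content lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() expects the file content as an iterable of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content used only for illustration.
EXAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

if __name__ == "__main__":
    for path in ("/private/page.html", "/public/page.html"):
        url = "https://example.com" + path
        print(url, is_crawl_allowed(EXAMPLE_ROBOTS_TXT, "Googlebot", url))
```

Note that a result of False only means the page may not be crawled. As described above, such a page can still end up in the index if other pages link to it.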
The canonical tag defines which page should be indexed. This means that if a page has a canonical tag to another page, the page using the canonical tag is not indexable.
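A minimal sketch of this check, assuming the page's HTML is already available as a string: it looks for a link element with rel="canonical" and compares its target with the page's own URL. The names canonical_points_elsewhere and _CanonicalFinder are hypothetical, and the URL comparison is deliberately simple (it only ignores a trailing slash).

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class _CanonicalFinder(HTMLParser):
    """Remembers the href of the first <link rel="canonical"> in a document."""

    def __init__(self):
        super().__init__()
        self.href = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.href is not None:
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.href = attr["href"]


def canonical_points_elsewhere(page_url: str, html: str) -> bool:
    """Return True if the page declares a canonical URL other than itself."""
    finder = _CanonicalFinder()
    finder.feed(html)
    if finder.href is None:
        return False  # no canonical tag, so nothing points away
    target = urljoin(page_url, finder.href)  # resolve relative hrefs
    return target.rstrip("/") != page_url.rstrip("/")
```

A page for which this returns True would be treated as not indexable, because its canonical tag names a different page as the one to index.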
3. Redirects:
When a page redirects to another page, the page it redirects to gets indexed instead.
4. Meta robots tag noindex:
By using the meta robots tag noindex you can tell search engines that pages using this tag should not be indexed. But be careful: do not use the meta robots noindex tag in combination with a robots.txt disallow. If crawling of the page is disallowed, search engine bots never get to read the noindex tag on it.
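The sketch below shows one way to detect the tag and the problematic combination with a robots.txt disallow. It assumes the HTML is available as a string and that the crawl permission has been determined separately; has_noindex and noindex_is_invisible are hypothetical helper names chosen for this example.

```python
from html.parser import HTMLParser


class _MetaRobotsFinder(HTMLParser):
    """Collects the directives of all <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        content = attr.get("content") or ""
        # Directives are comma-separated, e.g. "noindex, follow".
        for directive in content.split(","):
            directive = directive.strip().lower()
            if directive:
                self.directives.add(directive)


def has_noindex(html: str) -> bool:
    """Return True if the page carries a meta robots noindex directive."""
    finder = _MetaRobotsFinder()
    finder.feed(html)
    return "noindex" in finder.directives


def noindex_is_invisible(html: str, crawl_allowed: bool) -> bool:
    """Return True if a noindex tag exists but bots are not allowed to read it."""
    return has_noindex(html) and not crawl_allowed
```

If noindex_is_invisible returns True, the page is in exactly the situation warned about above: the tag is present, but a bot that respects the disallow will never see it.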
5. Pagination:
Paginated pages using rel next/prev won't be indexed by search engines.
6. Broken pages:
Pages that respond with HTTP status codes 4xx, 5xx or 9xx also won't get indexed by search engines.
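The status code rule can be sketched as a simple range check. The function name is_broken_status is hypothetical, and the 9xx range is taken from the text above as it stands; those codes are not part of the standard HTTP status code ranges.

```python
def is_broken_status(status_code: int) -> bool:
    """Return True for the status code ranges the text lists as broken."""
    return (
        400 <= status_code <= 499      # 4xx: client errors such as 404
        or 500 <= status_code <= 599   # 5xx: server errors such as 503
        or 900 <= status_code <= 999   # 9xx: assumed range, as named in the text
    )


if __name__ == "__main__":
    for code in (200, 301, 404, 503, 950):
        print(code, is_broken_status(code))
```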