First Method:- With the help of Robots.txt
robots.txt is a special file, placed at the root of a website, that directs search engine crawlers not to crawl certain web pages. Example contents:-
User-agent: *
Disallow: /cgi-bin/
Disallow: /banner/
Disallow: /~flower/
These directives disallow all robots from accessing the /cgi-bin/, /banner/, and /~flower/ directories.
User-agent: *
Disallow: /
This directive disallows all robots from accessing any content on the server. Note that robots.txt is only advisory: well-behaved crawlers obey it, but it does not actually prevent access.
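A robots.txt file can also target a single crawler by naming it in the User-agent line. For example, the following sketch (using Googlebot purely for illustration) blocks only that bot while leaving all others unrestricted:
User-agent: Googlebot
Disallow: /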
Second Method:- With the help of the Meta Robots tag
The Meta Robots tag, placed in the <head> section of a page, directs search engine bots not to index that page with the help of the noindex attribute. Some examples are given below:-
<meta name="robots" content="index,follow">
This tag directs the search engine robots to index the content and follow the URLs on the page.
<meta name="robots" content="noindex,nofollow">
This tag directs the search engine robots not to index the content and not to follow the URLs on the page.
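As a quick sketch of placement, the tag belongs inside the <head> section of the page (the surrounding markup here is only illustrative):
<head>
<title>Example Page</title>
<meta name="robots" content="noindex,nofollow">
</head>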
Third Method:- IP Blocking
This method blocks certain IPs from accessing the server entirely. You can set this up with the help of an .htaccess file.
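A minimal sketch, assuming an Apache web server (192.0.2.1 is a placeholder address from the documentation range):
Order Allow,Deny
Allow from all
Deny from 192.0.2.1
On Apache 2.4 and later, the equivalent is Require not ip 192.0.2.1 placed inside a <RequireAll> block.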