Instructing GoogleBot using robots.txt

According to Google, its indexing crawler (GoogleBot) respects the contents of robots.txt.

If we specify something in robots.txt, such as:

User-agent: *
Disallow: /restrict_folder/

Its crawler will then respect this directive and will not crawl anything inside /restrict_folder/.
Some other crawlers might not respect this directive, though,
so Google recommends protecting pages that are not meant to be public with a password or some other form of authentication.
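
To see how a rule like this behaves, here is a minimal sketch in Python using the standard library's urllib.robotparser. The example.com URLs, the "User-agent: *" line and the is_allowed helper are my own assumptions for illustration. Python's parser is not Google's own, so the real GoogleBot may differ on edge cases.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content built around the Disallow rule above.
# The "User-agent: *" line is an assumption: it applies the rule to all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /restrict_folder/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URLs: one inside the restricted folder, one outside it.
for url in (
    "https://example.com/restrict_folder/secret.html",
    "https://example.com/public/page.html",
):
    print(url, "allowed" if is_allowed(ROBOTS_TXT, "Googlebot", url) else "blocked")
```

This only tells you what a well-behaved crawler should do. It does nothing to stop a crawler that ignores robots.txt, which is why the password advice above still matters.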

OK, but if you don’t have a robots.txt defined, or your robots.txt allows everything and does not restrict the folder,
only then will GoogleBot crawl the page and read its robots META directives.

Depending on the instructions in the robots META tag, it may index and archive the page, or skip indexing or archiving it (for example with noindex or noarchive).
If nothing restricts it, it will index and archive the page, and do everything else that is needed for search.
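
As a rough illustration of what "reading the meta directive" means, here is a small Python sketch that pulls the directives out of a robots META tag using the standard library's html.parser. The class name, the helper name and the sample HTML are all made up for this example. This is not how GoogleBot is implemented, just one way to inspect your own pages.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma separated, e.g. "noindex, noarchive".
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html: str) -> set:
    """Return the set of robots META directives found in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page that asks crawlers not to index or archive it.
SAMPLE = '<html><head><meta name="robots" content="noindex, noarchive"></head></html>'
print(sorted(robots_directives(SAMPLE)))
```

A page with no robots META tag gives an empty set, which matches the default case where nothing is restricted.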

Then comes the canonical directive, which is written as a link element in the page head rather than as a robots META tag.

What this one defines is:
if a page has two different links pointing to it that display the same content,
we can use this directive to tell the crawler which one we prefer to be indexed.

Example:

1. https://namran.com/2009/05/19/instructing-googlebot-using-robotstxt
2. https://namran.com/2009/05/19/instructing-googlebot-using-robotstxt#comment

Both links point to the same page, and we would prefer that only the first one is indexed
instead of both.

We can then write the canonical link as:

<link rel='canonical' href='https://namran.com/2009/05/19/instructing-googlebot-using-robotstxt'>
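
If you want to check which canonical URL a page actually declares, a short Python sketch like the one below can do it with the standard library's html.parser. The names CanonicalParser and find_canonical are hypothetical, and the sample HTML is hand-written here rather than fetched from the live site.

```python
from html.parser import HTMLParser


class CanonicalParser(HTMLParser):
    """Remember the href of the first <link rel="canonical"> element."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values.
        rels = (attr.get("rel") or "").lower().split()
        if "canonical" in rels:
            self.canonical = attr.get("href")


def find_canonical(html: str):
    """Return the canonical URL declared in the HTML, or None if there is none."""
    parser = CanonicalParser()
    parser.feed(html)
    return parser.canonical


# Hand-written sample using the first URL from the example above.
SAMPLE = (
    "<html><head>"
    "<link rel='canonical' "
    "href='https://namran.com/2009/05/19/instructing-googlebot-using-robotstxt'>"
    "</head></html>"
)
print(find_canonical(SAMPLE))
```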

More details can be found here.

To examine your robots.txt settings:

1. Log in to Google Webmaster Tools.
2. Click Tools in the left menu.
3. Then you can see “Analyze robots.txt”.

My link would be something like https://www.google.com/webmasters/tools/robots?siteUrl=http%3A%2F%2Fnamran.com%2F&hl=en
This tool tests whether the robots.txt is properly written, and whether or not it blocks the crawler from accessing a certain page.
Just fill in the desired URL in the box provided, and you will be able to see the analysis.

Something like this:

[Screenshot: robot-analyzer]

P.S. I still can’t understand why my posts since 10th May 2009 haven’t been archived or indexed, though. *sigh*


6 Responses

  1. Stotti says:

    Hi namran, you should not use MD5 hashing as “encryption” for passwords. Why? I wrote in my blog how easy it is to crack MD5 passwords using local software (see http://www.stottmeister.com/blog/2009/06/29/how-to-crack-md5-passwords-with-john-the-ripper-a-live-example-exploiting-typo3/ ) and using online services (see http://www.stottmeister.com/blog/2009/04/14/how-to-crack-md5-passwords/ ). These articles tell you how to crack MD5 hashes quite easily (for educational purposes only). So please don’t use MD5 as password “encryption”.

    Even SHA-1 is considered unsafe nowadays. Better to use a newer hashing mechanism such as SHA-256 or something similar.

    Best regards
    Stotti

    • namran says:

      Hi Stotti,

      Thanks for your comment.

      Apparently, to change that to use SHA-256,
      I just need to change the line
      md5()
      to use SHA-256 instead:
      sha256()

      and make sure the password field in the SQL table is long enough to store the hash.
      Also, you will not be able to add a new user via the phpMyAdmin interface, as there is no built-in SHA-256 function there, so you have to calculate the password hash yourself.

      Correct?

  2. Brian says:

    Be sure to salt your hashes if you do use the MD5 algorithm. Simple reverse lookup attacks could crack your hashes otherwise. There are sites such as http://ww.netmd5crack.com and http://gdataonline.com that specialize in this sort of attack.

    Brian

  3. Alessandro says:

    Hash cracker is a web-service that allows you to encrypt your passwords
    or crack your hashed passwords with MD5, SHA1 or NTLM algorithms.
    You can also encode or decode texts with Base64 system.

    http://www.hash-cracker.com

    Video tutorial:

    http://www.youtube.com/watch?v=JVxdQPdGXec

