Saturday 4 October 2014

What is a sitemap and how to use it?

What is a sitemap?

A sitemap is a file where you can list the web pages of your site to tell Google and other search engines how your site's content is organized. Search engine web crawlers like Googlebot read this file to crawl your site more intelligently.



Types of Sitemaps

There are two popular versions of a sitemap. An XML sitemap is a structured file that users don't normally see; it tells the search engine about the pages in a site, their relative importance to each other, and how often they are updated. An HTML sitemap is designed for the user, to help them find content on the site, and doesn't need to include each and every subpage. This helps visitors and search engine bots find pages on the site.
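For comparison, a bare-bones HTML sitemap is just an ordinary page of links that visitors can click through. A minimal sketch (the page names below are made up for illustration):

    <ul>
      <li><a href="/">Home</a></li>
      <li><a href="/about.html">About</a></li>
      <li><a href="/articles/">Articles</a></li>
      <li><a href="/contact.html">Contact</a></li>
    </ul>

The rest of this post focuses on the XML version, since that is the one search engine crawlers read.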

Sitemaps are written in XML (Extensible Markup Language) because search engine crawlers can parse an XML file easily. You should know that every directory can contain its own XML file, but it is a good web developer's habit to put the main sitemap.xml file in the root directory of the website.


Here's an example of a sitemap.xml file:



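The snippet below is a minimal sketch; the domain, dates and values are just placeholders for illustration:

    <?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      <url>
        <loc>http://www.yourwebsite.com/</loc>
        <lastmod>2014-10-04</lastmod>
        <changefreq>daily</changefreq>
        <priority>1.0</priority>
      </url>
      <url>
        <loc>http://www.yourwebsite.com/about.html</loc>
        <lastmod>2014-09-15</lastmod>
        <changefreq>monthly</changefreq>
        <priority>0.5</priority>
      </url>
    </urlset>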



Below we will go through the lines of the sitemap file one by one (a complete annotated entry follows the list):



  • Every XML sitemap file must begin with an opening <urlset> tag and end with a closing </urlset> tag.
  • Every "parent" entry should begin with a <url> tag and end with </url>.
  • In a similar way, every "child" entry should be placed between <loc> and </loc> tags. Inside the <loc> tag a URL is expected, which should start with "http://". The URL can be at most 2,048 characters long.


  • The <lastmod> tag expects a date in the YYYY-MM-DD format. Be advised that you do not have to update this tag every time you modify the document; the search engines will pick up the dates of the documents once they crawl them.


  • The <changefreq> tag is used as a hint for the crawlers to indicate how often the page is modified and how often it should be re-indexed. Note that this value may or may not affect crawl bot behavior; that depends solely on the search engine.


  • The <changefreq> tag expects one of the following values: always, hourly, daily, weekly, monthly, yearly, never. Be advised that "always" is meant for pages that are dynamically generated or modified upon every access. As for the "never" value, be advised that even if you mark a page with it, the page will most probably still be crawled occasionally, for example once a week.


  • The <priority> value can vary from 0.0 to 1.0. Be advised that this indicates only your personal preference for the order in which you would like your website to be indexed. The default value for a page that is not prioritized is 0.5. Any page with a higher value will be crawled before a page with priority 0.5, and pages with a lower priority will be crawled after it. Since the priority is relative, it applies only within your own website: even if you set a high priority on all of your pages, this does not mean they will be indexed more often, because this value is not used to compare different websites.
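Putting the tags from the list above together, a single annotated <url> entry might look like this (the URL and the values are, again, placeholders):

    <url>
      <!-- the full address of the page; must start with the protocol, e.g. "http://" -->
      <loc>http://www.yourwebsite.com/blog/my-first-post.html</loc>
      <!-- last modification date in the YYYY-MM-DD format -->
      <lastmod>2014-10-04</lastmod>
      <!-- a hint about how often the page changes: always, hourly, daily, weekly, monthly, yearly or never -->
      <changefreq>weekly</changefreq>
      <!-- relative importance within your own site, from 0.0 to 1.0 (0.5 is the default) -->
      <priority>0.8</priority>
    </url>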




What's Next?

I hope you now know what a website's sitemap is and what its tags mean. For those who don't want to write all of this by themselves, don't worry: there is a good website that will do it all automatically. You can then download the generated sitemap.xml file and upload it to the root folder of your website.

Sitemap Generator



And Then...?

Well, once you have the sitemap.xml file and have uploaded it to the root of your website, you can go to Google Webmaster Tools and submit your website's sitemap so that Google's bots can crawl it easily!


How to submit...?

If you have uploaded the sitemap.xml file to the root folder of your website, simply go to the Crawl tab in Google Webmaster Tools, then Sitemaps, click Test/Add Sitemap, and enter your sitemap's URL.

example: www.yourwebsite.com/sitemap.xml



I hope this was informative for you. Please share!
