Monday, June 9, 2008

What to Know About Search Engine Spiders


A spider is a program that visits Web pages and other information in order to create entries for a search engine index. The major search engines on the Web all use such a program, which is also known as a "crawler" or a "bot," to collect information. Spiders get their name because they usually visit many sites in parallel, their "legs" spanning a large area of the "web."

Paying attention to what search engine spiders like and dislike can prove crucial to the success of your website.

How The Spider Works

Spiders crawl over the Web in search of content. Search engines gather data about a website by sending spiders, or bots, to read the site and copy its content, which is then stored in the search engine's database. As the spiders copy content from one document, they record the links it contains and send other bots to copy the content of those linked documents. This process goes on and on. Using it, the major search engines have built databases that measure their size in the tens of billions of documents.
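The copy-and-follow-links loop described above can be sketched in a few lines of Python. This is only an illustrative sketch, not how any particular search engine works: the PAGES dictionary is a hypothetical in-memory stand-in for the Web (so nothing is downloaded), and the names LinkExtractor and crawl are made up for this example.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical in-memory "web": URL -> HTML content (assumption for the demo).
PAGES = {
    "http://example.com/": '<a href="/about.html">About</a> <a href="/products.html">Products</a>',
    "http://example.com/about.html": '<a href="/">Home</a>',
    "http://example.com/products.html": '<a href="/about.html">About</a> <a href="/missing.html">Gone</a>',
}


def crawl(start_url, fetch=PAGES.get):
    """Copy each reachable document, record its links, and follow them."""
    index = {}                  # the "database": URL -> copied content
    queue = deque([start_url])  # documents still waiting to be visited
    seen = {start_url}          # avoids visiting the same URL twice
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:        # nothing to copy, e.g. a broken link
            continue
        index[url] = html
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return index


if __name__ == "__main__":
    for url in crawl("http://example.com/"):
        print(url)
```

A real spider would replace the fetch argument with a function that downloads pages over HTTP and would run many such fetches in parallel.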

Spiders are programmed to read site content much as a human would. A spider starts at the top left-hand corner and reads line by line. Where a page uses columns, it reads the left-hand column before moving on to the central and right-hand columns. It reads and records the whole site, down to the very last word.

What Spiders Love

Provide clear paths for spiders to follow. Put easy-to-follow text links to the most important pages in the site at the bottom of each document. The link to the sitemap is the most important link in the site, as it directs spiders through the website.

Provide as much relevant, keyword-rich text as possible. Search engine spiders read the text on your website to decide whether your site is relevant. If they do not find the keyword being searched for, they will simply ignore your site and move on to the next one. Make sure the text is relevant and makes sense.

Provide clear identification of which areas may be used and which areas are off limits to the spiders.
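One common way to mark areas as off limits is a robots.txt file at the root of the site. The sketch below uses Python's standard urllib.robotparser module to show how a well-behaved spider might interpret such a file. The rules in ROBOTS_TXT, the "ExampleBot" user agent, and the example.com URLs are all assumptions made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every spider may crawl the site
# except the /private/ and /cgi-bin/ areas.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /cgi-bin/
"""


def build_rules(robots_text):
    """Parse robots.txt text into a rule set a spider can query."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


rules = build_rules(ROBOTS_TXT)

if __name__ == "__main__":
    for url in (
        "http://example.com/index.html",
        "http://example.com/private/notes.html",
    ):
        allowed = rules.can_fetch("ExampleBot", url)
        print(url, "allowed" if allowed else "off limits")
```

Keep in mind that robots.txt is a request, not an access control: polite spiders honor it, but it does not protect the content.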

Provide clear titles. After the URL of a site, the first information a spider records is the title of the site. Titles should contain your most important target keywords.

Provide a well-written Description Meta tag. Search engines use the Description Meta tag to collect information on the topic or theme of the site.
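To make the title and Description Meta tag advice concrete, here is a rough sketch of how a spider might pull both out of a page using Python's standard html.parser module. The HeadSummary class, the summarize function, and the SAMPLE page are invented for this example. Real search engines are certainly more elaborate.

```python
from html.parser import HTMLParser


class HeadSummary(HTMLParser):
    """Records the <title> text and the description meta tag of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.description = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            # Meta names are matched case-insensitively here.
            if (attr.get("name") or "").lower() == "description":
                self.description = attr.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def summarize(html):
    """Return (title, description) for an HTML document."""
    parser = HeadSummary()
    parser.feed(html)
    return parser.title.strip(), parser.description.strip()


# Hypothetical page used only for this demonstration.
SAMPLE = """<html><head>
<title>Handmade Oak Furniture | Example Shop</title>
<meta name="description" content="Solid oak tables and chairs, made to order.">
</head><body><p>Welcome</p></body></html>"""

if __name__ == "__main__":
    title, description = summarize(SAMPLE)
    print("Title:", title)
    print("Description:", description)
```

Running a check like this against your own pages is a quick way to spot documents with a missing title or an empty description.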

What Spiders Hate

Avoid very large volumes of content. Spiders can only download a fraction of a site's Web pages within a given time frame, so if you have too much content they might skip the most important parts.

Avoid dynamic page generation. Spiders generally tend to avoid pages whose URLs contain a "?", i.e. pages that are dynamic in nature. Stick to static pages; spiders prefer them.
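A simple way to audit your own links for the "?" mentioned above is to check each URL for a query string. The following sketch uses Python's standard urllib.parse module. The function names is_dynamic and query_parameters and the sample URLs are assumptions for illustration only.

```python
from urllib.parse import parse_qs, urlsplit


def is_dynamic(url):
    """Return True if the URL carries a query string (the part after "?")."""
    return bool(urlsplit(url).query)


def query_parameters(url):
    """Return the query string parameters as a dict of value lists."""
    return parse_qs(urlsplit(url).query)


if __name__ == "__main__":
    # Hypothetical URLs: one dynamic, one static.
    for url in (
        "http://example.com/product.php?id=42&cat=7",
        "http://example.com/products/oak-table.html",
    ):
        kind = "dynamic" if is_dynamic(url) else "static"
        print(url, "->", kind, query_parameters(url))
```

Pages flagged as dynamic are candidates for a static, descriptive URL such as the second example.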

Avoid too many graphics. Spiders are not fond of graphics because they cannot read them, so keep it simple.

Avoid keyword misuse. Don't just string a lot of keywords together. Remember that spiders are very smart, and search engines will ban you for spamming.
