
How Search Engines Work

Posted on : 16-03-2010 | By : Graham McKenzie | In : Internet_Marketing



Sometimes referred to as ‘spiders’ or crawlers, automated search engine robots seek out web pages to index on behalf of the search engine. Just how do they accomplish this, and why does it matter? What is the real purpose of these robots?

Robots actually have the same basic functionality that early browsers had, and just like those early browsers, they cannot do certain things. Robots cannot get past password-protected areas, and they do not understand frames, Flash movies, images, or JavaScript.

Unlike you, a robot also cannot click the buttons on your website, and it can stall on JavaScript-based navigation or while indexing a dynamically generated URL. In essence, a search engine robot retrieves pages and extracts the information and links they contain.

The robot starts from the list of web pages submitted through the search engine’s ‘submit a URL’ page, then visits each of those pages in turn on its next trip across the web. Sometimes a robot will find your page whether you have submitted it or not, because links on other sites may lead the robot to yours. That is why building your link popularity and getting links from other topical sites back to your site is important.

The first thing a robot does when it arrives is check for a robots.txt file. This file tells robots which parts of the site are off-limits. Usually these are files the robot has no use for anyway, such as binaries or other files it does not need to index.
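To see how these rules work in practice, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The robots.txt content and the `example.com` URLs are made up for illustration; a real robot would fetch the file from the site's root.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt, given inline for illustration; a real robot
# would download it from http://example.com/robots.txt
rules = """
User-agent: *
Disallow: /cgi-bin/
Disallow: /private/
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Before fetching any URL, the robot checks it against the rules
print(parser.can_fetch("*", "http://example.com/index.html"))     # True
print(parser.can_fetch("*", "http://example.com/private/a.bin"))  # False
```

Any well-behaved spider performs exactly this check before requesting a page, which is why robots.txt is the standard way to keep crawlers out of areas they should not index.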

The robot collects links from every page it visits and then follows those links to other pages. That is how it gets around the World Wide Web: by moving from link to link.
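Link collection itself is straightforward. The sketch below uses Python's standard-library `html.parser` to pull every anchor `href` out of a page, the way a robot gathers the links it will follow next; the sample HTML snippet is invented for the example.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects every href found in <a> tags, the way a robot
    gathers links from a page to decide where to go next."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

# A made-up page fragment for illustration
page = '<p>See <a href="/about.html">About</a> and <a href="http://example.com/">home</a>.</p>'
collector = LinkCollector()
collector.feed(page)
print(collector.links)  # ['/about.html', 'http://example.com/']
```

A real crawler would add each collected link to a queue of pages still to visit, skipping any URL it has already seen.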

When the robots return, the information they gathered is assimilated into the search engine’s database. Through a complex algorithm, this data is interpreted and web sites are ranked according to how relevant they are to the various topics people search for. Some of the bots are quite easy to spot – Google’s is the appropriately named Googlebot, while Inktomi uses a more ambiguously named bot called Slurp. Others may be difficult to identify at all.

There may be robots you do not want visiting your website, such as aggressive bandwidth-grabbing robots. Being able to identify individual robots and count their visits is useful, and so is information about which ones are undesirable.

IP names and addresses of search engine robots are listed at the end of this article in a resources section. A robot reads a page on your website by looking first at the text visible on the page and then at source-code tags such as the title tag and meta tags.

It also looks at the hyperlinks on your page. From this text and these links, the search engine robot can determine what your page is about. Each search engine has its own algorithm for deciding what is important, and the information is indexed and delivered to the search engine’s database according to how that engine has configured its robot.

If you’re interested in seeing which pages the spiders have visited on your website, you can check your server logs or the results from your log statistics. From this information you’ll know which spiders have visited, where they went, when they came, and which pages they crawl most often.
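As a quick illustration of reading spider visits out of a server log, here is a sketch in Python. The log lines are hypothetical entries in the common combined log format; in practice you would read them from your own access log, and you would match on whichever bot names matter to you.

```python
# Hypothetical access-log lines for illustration; real entries
# come from your own server's access log.
log_lines = [
    '66.249.66.1 - - [16/Mar/2010:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "Googlebot/2.1"',
    '10.0.0.5 - - [16/Mar/2010:10:01:00 +0000] "GET /about HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
    '72.30.0.7 - - [16/Mar/2010:10:02:00 +0000] "GET /news HTTP/1.1" 200 256 "-" "Slurp"',
]

KNOWN_BOTS = ("Googlebot", "Slurp")  # extend with any bots you track

def bot_visits(lines):
    """Count visits per known bot by matching its name in the user-agent field."""
    counts = {}
    for line in lines:
        agent = line.rsplit('"', 2)[-2]  # user-agent is the last quoted field
        for bot in KNOWN_BOTS:
            if bot in agent:
                counts[bot] = counts.get(bot, 0) + 1
    return counts

print(bot_visits(log_lines))  # {'Googlebot': 1, 'Slurp': 1}
```

Dedicated log-analysis tools report the same thing with more detail, but even this simple matching shows which spiders are visiting and how often.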

In addition to identifying which spiders visit, your logs can also reveal any spiders that are draining your bandwidth, so that you can block them from your site. The internet has plenty of information on identifying these bad bots.

There are also certain things that can prevent good spiders from crawling your site, such as the site being down or huge amounts of traffic. This can keep your site from being re-indexed, though most spiders will eventually come by again and try to re-access the page.

Justin Harrison is an internationally recognised Internet Marketing expert who provides world class SEO Services to website owners. For more information visit: http://www.seorankings.co.za
