Web Site

Internet-description.com



Search engine


Page modified: Saturday, June 24, 2006 10:36:50

History

Archie can be regarded as the oldest forerunner of today's widely known search engines and web directories.

An early forerunner of today's search engines was Gopher, a piece of software developed in 1991 at the University of Minnesota, chiefly by Paul Lindner and Mark P. McCahill. It was designed as a campus-wide information system (CWIS) to link the university's information servers and is based on the client-server principle. Gopher's structure was groundbreaking for its time: all Gopher pages were catalogued and could be searched in full by the Gopher search tool Veronica (Very Easy Rodent-Oriented Net-wide Index to Computerized Archives). Nevertheless, Gopher disappeared only a few years later, probably mainly because it could not embed pictures and graphics.

With the release of the WWW standard for free use in 1993, and with only a handful of web pages, the unique success story of the worldwide data network began. The first web crawler, named The Wanderer, was programmed in the same year by Matthew Gray, a student at the Massachusetts Institute of Technology (MIT). From 1993 to 1996, The Wanderer scanned and catalogued the web every six months; at that time the web was still very easy to survey. In June 1993 a total of 130 websites were counted. In October of the same year Aliweb (Archie-Like Indexing of the Web) was developed. With Aliweb, the operators of web servers had to place a description of their service in a file in order to become part of the searchable index.

In December 1993 the search engines Jumpstation, WorldWideWeb Worm and RBSE Spider went online. The first two were crawlers that indexed web pages by title and URL. RBSE Spider was the first search engine to display its results sorted according to its own ranking system. None of these search engines is still in operation today.

In April 1994 another search engine, named WebCrawler, went online; it could likewise present a hit list sorted by ranking. In 1995 it was sold to AOL, and one year later it was sold on to Excite. In May 1994 Michael Mauldin began work on the search engine Lycos, which went online in July 1994. In addition to the frequency of the search terms within a web page, Lycos also took into account how close the search terms were to one another in the document.
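The combination of word frequency and term proximity described above can be illustrated with a minimal sketch. It is not Lycos's actual algorithm, which the text does not specify; the function names and the way the two signals are combined (`tf + number_of_terms / shortest_window`) are assumptions chosen only for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def term_frequency(tokens, terms):
    """Count how often any of the query terms occurs in the document."""
    wanted = set(terms)
    return sum(1 for tok in tokens if tok in wanted)

def min_span(tokens, terms):
    """Length of the shortest token window containing every query term,
    or None if at least one term does not occur in the document."""
    wanted = set(terms)
    last_seen = {}  # term -> most recent position
    best = None
    for pos, tok in enumerate(tokens):
        if tok in wanted:
            last_seen[tok] = pos
            if len(last_seen) == len(wanted):
                span = pos - min(last_seen.values()) + 1
                if best is None or span < best:
                    best = span
    return best

def score(text, query):
    """Hypothetical score: term frequency plus a proximity bonus in (0, 1]."""
    tokens = tokenize(text)
    terms = tokenize(query)
    tf = term_frequency(tokens, terms)
    span = min_span(tokens, terms)
    if span is None:
        return float(tf)
    # The bonus is 1.0 when the terms are adjacent and shrinks as they spread out.
    return tf + len(set(terms)) / span
```

With this sketch, two documents that contain each query term once receive different scores if the terms sit closer together in one of them, which is the effect a proximity signal is meant to capture.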

In the same year David Filo and Jerry Yang, both then students of electrical engineering at Stanford University, put a collection of their best web addresses into an online directory service, marking the birth of Yahoo! (short for Yet Another Hierarchical Officious Oracle).

The year 1995 proved to be an important turning point in the still short history of search engines: for the first time, search engines were developed by commercial companies. These developments gave rise to Infoseek, Architext (later renamed Excite) and AltaVista. One year later Inktomi Corp. was founded; its search engine of the same name became the basis of Hotbot and other search sites. During this period Yahoo's directory service was the leader, but AltaVista became increasingly popular (the name means "view from above" and is also a play on its location, Palo Alto).

1996 was the launch year of two metasearch engines: MetaCrawler in the USA and MetaGer in Germany. Until Google's market breakthrough, metasearch engines were considered one of the most interesting ways of obtaining information, because the index of each individual search engine mostly covered only parts of the Internet. Metasearch engines forward the user's query to several search engines in parallel and present the hits in a combined, formatted list.
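The fan-out-and-merge principle of a metasearch engine can be sketched as follows. The two engine functions are hypothetical stand-ins that return fixed lists instead of making network requests, and the merge rule (summing reciprocal ranks) is an assumption for illustration, not the method used by MetaCrawler or MetaGer.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for real search engines: each takes a query and
# returns a ranked list of URLs. A real metasearch engine would send the
# query over the network here.
def engine_a(query):
    return ["http://example.org/1", "http://example.org/2", "http://example.org/3"]

def engine_b(query):
    return ["http://example.org/2", "http://example.org/4"]

def metasearch(query, engines):
    """Send the query to all engines in parallel and merge the hit lists."""
    with ThreadPoolExecutor(max_workers=len(engines)) as pool:
        result_lists = list(pool.map(lambda engine: engine(query), engines))

    # Merge step: every URL collects 1/rank from each engine that returned it,
    # so hits found by several engines or ranked highly move towards the top.
    scores = {}
    for hits in result_lists:
        for rank, url in enumerate(hits, start=1):
            scores[url] = scores.get(url, 0.0) + 1.0 / rank

    # Sort by descending score; ties are broken alphabetically by URL.
    return sorted(scores, key=lambda url: (-scores[url], url))
```

Calling `metasearch("anything", [engine_a, engine_b])` places `http://example.org/2` first, because it is the only hit reported by both stand-in engines.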

Larry Page and Sergey Brin published their innovative search engine technology at the end of 1998 in the article "The Anatomy of a Large-Scale Hypertextual Web Search Engine". This work was the starting signal for the most successful search engine in the world: Google. In September 1999 Google reached beta status. The uncluttered user interface, the speed and the relevance of the search results were the cornerstones of its success in winning over computer-savvy users. In the years since, crowds of new Internet users have followed them. Google does not dominate the search engine market alone, however: through spectacular acquisitions in spring 2003, Yahoo! kept pace in this market segment.

Since 2004, following several company takeovers, only three large index-based commercial web search engines remain (large in terms of the number of documents indexed). Besides Google, these are Yahoo! Search and Microsoft's MSN Search.

As weblogs have become ever more widespread, special search engines for weblog content have been developed, for example Technorati, Feedster, Plazoo, IceRocket and Google Blog Search.

Challenges

In operation, search engines have to deal with a variety of problems:

  • Ambiguity - search queries are often imprecise. A search engine cannot decide on its own whether a query for the German term Laster refers to a truck or to a vice (a bad habit). Conversely, the search engine should not insist too rigidly on the term entered. It should also take synonyms into account, so that the query Rechner Linux also finds pages that contain the word Computer instead of Rechner.
  • Grammar - many possible hits are lost because the user searches for one particular grammatical form of a search term. A search for the term Auto (car) finds all indexed pages that contain this term, but not those that contain the term Autos (cars). Some search engines permit searching with wildcards, which partly works around this problem (the query Auto*, for example, also matches Autos or Automatismus), but the user has to know that this option exists. In addition, stemming is often used, in which words are reduced to their stem. On the one hand this makes it possible to query for similar word forms (a search for beautiful flowers then also finds beautiful flower); on the other hand it reduces the number of terms in the index. A further option is the use of statistical methods, with which the search engine assesses, for example from the occurrence of variously related terms on web pages, whether a search for Auto Reparatur (car repair) could also have meant Autos Reparatur or Automatismus Reparatur.
  • Data volume - the web grows faster than search engines can index it with current technology. This does not even take into account the part that is unknown to search engines, the so-called Deep Web.
  • Freshness - many web pages are updated frequently, which forces search engines to revisit these pages again and again according to definable rules (robots). This is also necessary in order to detect documents that have since been removed, so that they are no longer offered as results. Regularly downloading the several billion documents that a search engine holds in its index places heavy demands on the network resources (traffic) of the search engine operator.
  • Spam - by means of search engine spamming, some website operators try to outwit the ranking algorithm of the search engines in order to obtain a better placement for certain search queries. This harms both the operators of the search engine and their customers, because the most relevant documents are then no longer displayed first.
  • Technology - implementing searches over very large data sets so that availability is high (despite hardware failures and network bottlenecks) and response times are low (although a single search query often requires reading and processing several hundred MB of index data) places heavy demands on the search engine operator. Systems must be designed with a high degree of redundancy: on the one hand across the computers within a single data centre, and on the other hand there should be more than one data centre offering the complete search engine functionality.
  • Legal issues - search engines are mostly operated internationally and therefore offer users results from servers located in other countries. Because the laws of different countries take different views on which content is permitted, search engine operators often come under pressure to exclude certain pages from their results. The German Internet search engines intend to remove pages that are harmful to minors from their hit lists through voluntary self-regulation (Freiwillige Selbstkontrolle).
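The wildcard and stemming techniques mentioned under Grammar can be illustrated with a small sketch over a toy inverted index. The index contents, the function names and the crude plural-stripping stemmer are all assumptions made for illustration; production search engines use far more careful stemmers. English terms (car, cars, carpet) are used as an analogue of the Auto, Autos, Automatismus example.

```python
from fnmatch import fnmatchcase

# Toy inverted index: term -> set of document ids (contents are made up).
INDEX = {
    "car": {1},
    "cars": {2},
    "carpet": {3},
    "flower": {4},
    "flowers": {5},
}

def exact_search(term):
    """Only documents containing exactly this word form are found."""
    return set(INDEX.get(term, set()))

def wildcard_search(pattern):
    """Union of the postings of every indexed term matching the pattern."""
    docs = set()
    for term, postings in INDEX.items():
        if fnmatchcase(term, pattern):
            docs |= postings
    return docs

def naive_stem(word):
    """Very crude stemmer (an assumption): strips a trailing plural 's' only."""
    if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def build_stemmed_index(index):
    """Merge the postings of all word forms that share the same stem."""
    stemmed = {}
    for term, postings in index.items():
        stemmed.setdefault(naive_stem(term), set()).update(postings)
    return stemmed

STEMMED = build_stemmed_index(INDEX)

def stemmed_search(term):
    """Stem the query term and look it up in the stemmed index."""
    return set(STEMMED.get(naive_stem(term), set()))
```

In this sketch the exact search for `car` misses the document containing `cars`, the wildcard query `car*` finds it but also pulls in the unrelated `carpet`, and the stemmed search finds both word forms while the stemmed index holds fewer terms than the original one.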

See also

  • Hubs and Authorities, Google bomb, link farm, Web Impact Factor, information retrieval, vector space retrieval, data mining, Semantic Web, Open Archives Initiative, web mining, Schnitzelmitkartoffelsalat, Nutch, objectivity, search engine optimization, business search engine, search function

Page cached: Wednesday, July 5, 2006 23:58:16