Krishna Bharat, Andrei Broder, Monika Henzinger, Puneet Kumar and Suresh Venkatasubramanian
Summary
The Connectivity Server gives the AltaVista search engine immediate access to the neighborhood of pages around a given page. The underlying idea is that pages that are strongly connected within a web subgraph are more likely to be related than weakly connected ones. The server returns this linkage information quickly at query time, where it can be used to rank web pages. The web is modeled as a graph whose nodes are URLs stored in a database; each URL carries an 'inlist' (the pages that link to it) and an 'outlist' (the pages it links to).
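The inlist/outlist structure described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the server's actual storage format: the class name `LinkGraph`, its methods, and the example URLs are all hypothetical, and everything lives in memory rather than in a database.

```python
from collections import defaultdict


class LinkGraph:
    """Toy in-memory link graph (hypothetical; not the paper's implementation)."""

    def __init__(self):
        # outlist[u] = pages that u links to; inlist[u] = pages that link to u
        self.outlist = defaultdict(set)
        self.inlist = defaultdict(set)

    def add_link(self, src, dst):
        """Record a hyperlink from src to dst in both directions of lookup."""
        self.outlist[src].add(dst)
        self.inlist[dst].add(src)

    def neighborhood(self, url):
        """Return every page one link away from url, in either direction."""
        # .get avoids creating empty entries for URLs that were never seen
        return self.inlist.get(url, set()) | self.outlist.get(url, set())


g = LinkGraph()
g.add_link("http://a.example/", "http://b.example/")
g.add_link("http://c.example/", "http://b.example/")
g.add_link("http://b.example/", "http://d.example/")

print(sorted(g.inlist["http://b.example/"]))
print(sorted(g.neighborhood("http://b.example/")))
```

Storing each link twice, once per direction, is what makes both "who links to this page" and "what does this page link to" a single lookup.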
The server was built on top of AltaVista's index of the web, which suggests that the approach is efficient and effective at scale. However, the paper offers little material from which to build a crawler application.
Keywords
connectivity, web links, web graph, web visualization, search
Methods
The Connectivity Server represents the web as a directed graph, and for every page (node) it stores an 'inlist' and an 'outlist' of links. It supports connectivity analysis for queries by generating a hub score and an authority score for each page (the normalized sum of the links from the page and to the page, respectively). It can also generate a graph of the connections between pages in real time.
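As a rough illustration of the scores as this summary describes them, the sketch below computes a hub score from a page's outgoing links and an authority score from its incoming links. The function name `degree_scores` is hypothetical, and normalizing by the total number of links in the subgraph is an assumption, since the summary does not say how the normalization is done. This is the simple degree-based reading, not an iterative hub/authority computation.

```python
from collections import Counter


def degree_scores(edges):
    """Degree-based hub and authority scores for a set of (src, dst) links.

    Assumption: scores are normalized by the total number of distinct links,
    so the hub scores sum to 1 and the authority scores sum to 1.
    """
    edges = set(edges)  # ignore duplicate links
    total = len(edges)
    if total == 0:
        return {}, {}

    out_deg = Counter(src for src, _ in edges)  # links from a page
    in_deg = Counter(dst for _, dst in edges)   # links to a page
    pages = {page for edge in edges for page in edge}

    hub = {p: out_deg[p] / total for p in pages}
    authority = {p: in_deg[p] / total for p in pages}
    return hub, authority


hub, authority = degree_scores([("a", "b"), ("a", "c"), ("b", "c")])
print(hub)
print(authority)
```

In this small example page "a" has the highest hub score because it has the most outgoing links, and page "c" has the highest authority score because it receives the most links.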
Rating
6
Bibtex Entry
@inproceedings{ bharat98,
author = "Krishna Bharat and Andrei Broder and Monika Henzinger and Puneet Kumar and Suresh Venkatasubramanian",
title = "The Connectivity Server: fast access to linkage information on the web",
booktitle = "WWW7 Conference",
year = "1998",
url = "http://www7.scu.edu.au/programme/fullpapers/1938/com1938.htm"
}