C. Mic Bowman, Peter B. Danzig, Udi Manber and Michael F. Schwartz
Summary
This 1994 paper highlights the scaling problems that arise when searching a dynamic data source such as the Internet. The authors provide a vocabulary for discussing these problems, and stress the importance of classifying web pages correctly and of using indexes appropriately. These concepts are largely taken for granted today, so the paper offers little foundation for new work.
Keywords
scalable, data mining, server replication, data object caching, indexing schemes
Methods
Rather than describing specific algorithms, this paper gives a technical audience general descriptions of, and approaches to, how Internet content should be retrieved. The methods include indexing pages across replicated servers and maintaining a cache of relevant pages that can be customized for each user.
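Since the paper gives no concrete algorithms, the following is only an illustrative sketch of the two ideas named above, not the authors' design. It assumes a small in-memory inverted index copied to every replica, round-robin routing of queries, and a least-recently-used page cache that would be kept per user. All class and method names (IndexReplica, ReplicatedIndex, PageCache) are hypothetical.

```python
from collections import OrderedDict
from itertools import cycle


class IndexReplica:
    """One copy of an inverted index: term -> set of page URLs."""

    def __init__(self, name):
        self.name = name
        self.postings = {}

    def add(self, url, text):
        # Index every whitespace-separated term, case-insensitively.
        for term in text.lower().split():
            self.postings.setdefault(term, set()).add(url)

    def lookup(self, term):
        return sorted(self.postings.get(term.lower(), set()))


class ReplicatedIndex:
    """Writes go to every replica; reads rotate across replicas.

    Round-robin routing is an assumption made for this sketch.
    """

    def __init__(self, replica_names):
        self.replicas = [IndexReplica(n) for n in replica_names]
        self._next = cycle(self.replicas)

    def add(self, url, text):
        for replica in self.replicas:
            replica.add(url, text)

    def lookup(self, term):
        # Return which replica answered, plus the matching URLs.
        replica = next(self._next)
        return replica.name, replica.lookup(term)


class PageCache:
    """Small LRU cache of page bodies; one instance would be kept per user."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity
        self.fetch = fetch  # callable: url -> page body
        self.pages = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, url):
        if url in self.pages:
            self.pages.move_to_end(url)  # mark as most recently used
            self.hits += 1
            return self.pages[url]
        self.misses += 1
        body = self.fetch(url)
        self.pages[url] = body
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict least recently used
        return body
```

Here the replicas spread query load while holding identical data, and the cache keeps a user's recently viewed pages close at hand. A real system would also have to propagate index updates between servers and expire stale cache entries, which this sketch leaves out.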
Rating
5
Bibtex Entry
@article{bowman94,
author = "C. Mic Bowman and Peter B. Danzig and Udi Manber and Michael F. Schwartz",
title = "Scalable {Internet} resource discovery: research problems and approaches",
journal = "Communications of the ACM",
volume = "37",
number = "8",
pages = "98--107",
year = "1994",
url = "citeseer.nj.nec.com/bowman94scalable.html"
}