Kurt Bollacker, Steve Lawrence and C. Lee Giles
Summary
This paper presents an overview of the CiteSeer implementation, which can automatically generate a list of related papers customized to a particular user. CiteSeer maintains a profile for each user, who is identified by a unique ID stored in a cookie. The profile is a list of keywords and topics the user may be interested in. As a user searches with CiteSeer, the retrieved papers also yield a list of related papers, which CiteSeer recommends based on a 'relatedness' measure. Paper relatedness is either text relatedness or citation relatedness, meaning that a paper contains similar wording or similar citations, respectively.
CiteSeer is one of the most advanced databases of papers, and its ease of use makes it one of the most useful sites for research. The relatedness measures are commonly used ones, which supports their robustness in the face of the dynamic web. The use of a profile is unique and generates useful results. Overall, the paper gives a well-documented, high-level view of the CiteSeer engine.
Methods
The text relatedness measure is computed by applying term frequency × inverse document frequency (TFIDF) weighting, and the citation relatedness measure uses common citation × inverse document frequency (CCIDF). Profiles are indexed on the server and manually tuned by the user.
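To make the two measures concrete, here is a minimal Python sketch. It is not CiteSeer's code: the function names, the logarithmic form of the inverse document frequency, the use of cosine similarity to compare TFIDF vectors, and the 1/frequency citation weight are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map each document id to a {term: tf * idf} vector.

    docs: dict of document id -> list of tokens (hypothetical input format).
    Assumed weighting: raw term count times log(N / document frequency).
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))  # count each term once per document
    vectors = {}
    for doc_id, tokens in docs.items():
        term_freq = Counter(tokens)
        vectors[doc_id] = {
            term: count * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def ccidf(citations, a, b):
    """Citation relatedness of papers a and b.

    citations: dict of paper id -> set of cited work ids (hypothetical format).
    Assumed weighting: each shared citation contributes 1 / (number of
    papers in the collection that cite it), so rarely cited works count more.
    """
    cite_freq = Counter()
    for cited in citations.values():
        cite_freq.update(cited)
    return sum(1.0 / cite_freq[c] for c in citations[a] & citations[b])
```

Under these assumptions, two papers that share rare terms or rarely cited references score higher than two papers that share only terms or references common across the whole collection.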
Keywords
user profile, citation index, knowledge representation, information filtering
Rating
7
Bibtex Entry
@inproceedings{bollacker99,
author = "Kurt Bollacker and Steve Lawrence and C. Lee Giles",
title = "A System for Automatic Personalized Tracking of Scientific Literature on the Web",
booktitle = "Digital Libraries 99 - The Fourth {ACM} Conference on Digital Libraries",
publisher = "ACM Press",
address = "New York",
pages = "105--113",
year = "1999",
url = "citeseer.nj.nec.com/bollacker99system.html"
}