ARACHNID: Adaptive Retrieval Agents Choosing Heuristic Neighborhoods for Information Discovery

Filippo Menczer

Back to index

Summary

This paper discusses the ARACHNID adaptive agent for web-based searches, emphasizing the learning and representation portions of the agent. The agent itself is built up from a simple naive implementation using only user relevance feedback to provide a guideline for the agent to select which links should be followed. The next step up includes learning through a Q-learning algorithm using weighted keywords, which may be modified during the search itself. In addition, the environmental characteristics of the web are explored, using the concept of relevant pages occurring frequently together, defined to be the relevance autocorrelation.

Experimental evidence indicates drastic improvement over the breadth first search, increasing with user feedback and Q-learning techniques. This paper is widely cited, and its strong experimental performance proves the success of this method. Possible improvements may be experimental performance of different, more intelligent agents than just BFS.

Keywords

ARACHNID, keyword vector, measure of fitness, relevance feedback, relevance computation, neural net, relevance autocorrelation, semantic topology conjecture, internalization, local selection, search length, Q-learning

Methods

ARACHNID searches the web by predicting which links to follow based on relevance feedback and reinforcement learning. The web itself is represented as a graph G = (V, E), where generally related documents are clustered, which is the semantic topology. The relevance autocorrelation measure uses both the generality of a query and the search graph to find the probability that the link will lead to a relevant page. This leads to the hypothesis if the relevance autocorrelation is greater than the generality of the query, then it is likely relevant documents will be clustered, which is called the semantic topology conjecture. Local selection occurs based on the energy of the individual agent, allowing agents with high energy to reproduce and agents with zero energy to die. Relevance feedback in conjunction with Q-learning leads to the highest performing agents.

Rating

8

Bibtex Entry

@inproceedings = { menczer97,

author = "Filippo Menczer",

title = "ARACHNID: Adaptive Retrieval Agents Choosing Heuristic Neighborhoods for Information Discovery",

booktitle = "Machine Learning: Proceedings of the Fourteenth International Conference",

pages = "227--235",

year = "1997",

url = "citeseer.nj.nec.com/menczer97arachnid.html"

}

Back to index