A Simple Focused Crawler

Frosali, Daniele; Gori, Marco; Scarselli, Franco
2003-01-01

Abstract

A focused crawler is a crawler that returns web pages relevant to a given topic as it traverses the web. Existing focused crawlers raise a number of issues, in particular the ability to "tunnel" through low-ranked pages on the search path to reach highly ranked, topic-related pages that may recur further along that path. We introduce a simple focused crawler described by two parameters, namely degree of relatedness and depth. Both give the crawler an opportunity to "tunnel" through low-ranked pages. Results from initial experiments are promising and motivate further research.
2003
Tsoi, A. C., Frosali, D., Gori, M., Hagenbuchner, M., Scarselli, F. (2003). A Simple Focused Crawler. In Proceedings of the 12th International Conference on World Wide Web (pp. 356-365).
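
The abstract names two parameters, degree of relatedness and depth, but does not spell out the algorithm. The following is only a minimal sketch of one plausible reading, not the authors' implementation: a page counts as relevant when a toy keyword score reaches `min_relatedness`, and the crawler may pass through at most `max_depth` consecutive low-ranked pages before giving up on a path. The function names, the scoring rule, and the in-memory `pages`/`links` graph are all assumptions made for illustration.

```python
from collections import deque


def relatedness(text, topic_terms):
    """Toy score (an assumption): fraction of topic terms found in the text."""
    words = set(text.lower().split())
    return sum(term in words for term in topic_terms) / len(topic_terms)


def focused_crawl(pages, links, seed, topic_terms, min_relatedness, max_depth):
    """Breadth-first crawl over an in-memory graph that tunnels through at
    most `max_depth` consecutive low-ranked pages on any path."""
    relevant = []
    best = {}  # url -> largest tunnelling budget the page was expanded with
    queue = deque([(seed, max_depth)])
    while queue:
        url, budget = queue.popleft()
        if url not in pages or best.get(url, -1) >= budget:
            continue  # unknown page, or already expanded with as much budget
        first_visit = url not in best
        best[url] = budget
        if relatedness(pages[url], topic_terms) >= min_relatedness:
            if first_visit:
                relevant.append(url)
            next_budget = max_depth   # relevant page: reset the budget
        else:
            next_budget = budget - 1  # low-ranked page: spend one hop
        if next_budget >= 0:
            for target in links.get(url, []):
                queue.append((target, next_budget))
    return relevant


# Hypothetical five-page chain: a -> b -> c -> d -> e
pages = {
    "a": "focused crawler topic",
    "b": "cooking recipes",
    "c": "gardening tips",
    "d": "web crawler focused search",
    "e": "sports news",
}
links = {"a": ["b"], "b": ["c"], "c": ["d"], "d": ["e"]}
topic = ["focused", "crawler"]

print(focused_crawl(pages, links, "a", topic, 0.5, 2))  # ['a', 'd']
print(focused_crawl(pages, links, "a", topic, 0.5, 1))  # ['a']
```

In this toy graph, page `d` is reachable only through two unrelated pages, so it is found with `max_depth=2` and missed with `max_depth=1`. That is the trade-off the depth parameter controls in this reading: a larger budget finds more distant relevant pages at the cost of fetching more irrelevant ones.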

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11365/43682