This paper presents an algorithm to bound the bandwidth of a Web crawler. The crawler collects statistics on the transfer rate of each server to predict the expected bandwidth use for future downloads. The prediction allows us to activate the optimal number of fetcher threads in order to exploit the assigned bandwidth. The experimental results show the effectiveness of the proposed technique.
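The abstract gives only the outline of the approach, so the following is a minimal sketch of the general idea and not the authors' algorithm. It assumes per-server transfer rates are smoothed with an exponential moving average and that downloads are admitted greedily while the sum of predicted rates stays within the assigned bandwidth. The class `BandwidthLimiter`, its methods, and the parameters `budget`, `default_rate`, and `alpha` are all hypothetical names introduced for illustration.

```python
class BandwidthLimiter:
    """Hypothetical helper: predicts per-server rates and caps total bandwidth.

    All rates are in bytes per second. The smoothing and the greedy admission
    rule are assumptions of this sketch, not details taken from the paper.
    """

    def __init__(self, budget, default_rate, alpha=0.3):
        self.budget = budget              # assigned bandwidth
        self.default_rate = default_rate  # guess for servers never seen before
        self.alpha = alpha                # weight of the newest observation
        self.rates = {}                   # server -> smoothed transfer rate

    def record(self, server, num_bytes, seconds):
        """Update the statistics of a server after a completed download."""
        if seconds <= 0:
            raise ValueError("seconds must be positive")
        observed = num_bytes / seconds
        previous = self.rates.get(server)
        if previous is None:
            self.rates[server] = observed
        else:
            # Exponential moving average of the observed transfer rates.
            self.rates[server] = (
                self.alpha * observed + (1 - self.alpha) * previous
            )

    def predict(self, server):
        """Expected bandwidth use of the next download from this server."""
        return self.rates.get(server, self.default_rate)

    def select(self, pending_servers, in_use=0.0):
        """Pick the downloads to start without exceeding the budget.

        The length of the returned list is the number of fetcher threads
        to activate; `in_use` is the predicted rate of running downloads.
        """
        selected = []
        used = in_use
        for server in pending_servers:
            rate = self.predict(server)
            if used + rate <= self.budget:
                selected.append(server)
                used += rate
        return selected
```

In this sketch a crawler would call `record` after every completed download and `select` whenever a fetcher thread becomes idle. A single pass over the pending queue keeps the decision cheap, at the cost of not searching for the best-fitting combination of servers.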
Diligenti, M., Maggini, M., F. M., P., Scarselli, F. (2004). Design of a crawler with bounded bandwidth. In Proceedings of the Alternate track papers, poster session of the World Wide Web conference (WWW13) (pp. 292-293) [doi:10.1145/1013367.1013441].
Design of a crawler with bounded bandwidth
DILIGENTI, MICHELANGELO;MAGGINI, MARCO;SCARSELLI, FRANCO
2004-01-01
File | Size | Format |
---|---|---|
www04.pdf (not available; Type: Post-print; License: NOT PUBLIC, private/restricted access) | 51.16 kB | Adobe PDF |
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/11365/36100
Warning: the data displayed here have not been validated by the university.