This paper presents an algorithm to bound the bandwidth of a Web crawler. The crawler collects statistics on the transfer rate of each server to predict the expected bandwidth use for future downloads. The prediction allows us to activate the optimal number of fetcher threads in order to exploit the assigned bandwidth. The experimental results show the effectiveness of the proposed technique.
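
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It keeps a running estimate of each server's transfer rate, then admits pending downloads only while the predicted total stays within the assigned bandwidth. The names `RateEstimator` and `select_fetches`, the exponential moving average, the smoothing factor `alpha`, the default rate for unseen servers, and the greedy in-order selection are all assumptions made for illustration.

```python
class RateEstimator:
    """Per-server transfer-rate estimate (exponential moving average, assumed)."""

    def __init__(self, alpha=0.3, default_rate=10_000.0):
        self.alpha = alpha                # smoothing factor (hypothetical value)
        self.default_rate = default_rate  # bytes/s guess for servers never seen
        self.rates = {}                   # server name -> estimated bytes/s

    def observe(self, server, num_bytes, seconds):
        """Update the estimate for `server` after one finished download."""
        if seconds <= 0:
            return  # ignore degenerate measurements
        rate = num_bytes / seconds
        old = self.rates.get(server)
        if old is None:
            self.rates[server] = rate
        else:
            self.rates[server] = self.alpha * rate + (1 - self.alpha) * old

    def predict(self, server):
        """Expected bandwidth use (bytes/s) of a future download from `server`."""
        return self.rates.get(server, self.default_rate)


def select_fetches(queue, estimator, budget, in_use=0.0):
    """Greedily pick servers from `queue`, in order, while the predicted
    total bandwidth (starting from `in_use`) stays within `budget`.

    Returns the chosen servers and the predicted total; the number of
    fetcher threads to activate is len(chosen).
    """
    chosen = []
    total = in_use
    for server in queue:
        rate = estimator.predict(server)
        if total + rate <= budget:
            chosen.append(server)
            total += rate
    return chosen, total


if __name__ == "__main__":
    est = RateEstimator(alpha=0.5)
    est.observe("a.example", 100_000, 2.0)  # first sample for a.example
    est.observe("a.example", 60_000, 2.0)   # second sample, smoothed with the first
    est.observe("b.example", 80_000, 1.0)
    chosen, total = select_fetches(
        ["a.example", "b.example", "c.example"], est, budget=100_000
    )
    print(chosen, total)
```

In this sketch a server whose predicted rate would exceed the remaining budget is skipped rather than blocking the queue, so slower servers further down can still be scheduled. A real crawler would also need to release bandwidth when a download finishes and to respect per-server politeness limits, neither of which is modelled here.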

Diligenti, M., Maggini, M., F. M., P., Scarselli, F. (2004). Design of a crawler with bounded bandwidth. In Proceedings of the Alternate track papers, poster session of the World Wide Web conference (WWW13) (pp. 292-293) [10.1145/1013367.1013441].

Design of a crawler with bounded bandwidth

DILIGENTI, MICHELANGELO; MAGGINI, MARCO; SCARSELLI, FRANCO
2004-01-01

Year: 2004
ISBN: 1581139128


Use this identifier to cite or link to this document: https://hdl.handle.net/11365/36100