Learning probabilistic graphical models from high-dimensional datasets is a computationally challenging task. In many interesting applications, the domain dimensionality is such as to prevent state-of-the-art statistical learning techniques from delivering accurate models in reasonable time. This paper presents a hybrid random field model for pseudo-likelihood estimation in high-dimensional domains. A theoretical analysis proves that the class of pseudo-likelihood distributions representable by hybrid random fields strictly includes the class of joint probability distributions representable by Bayesian networks. In order to learn hybrid random fields from data, we develop the Markov Blanket Merging algorithm. Theoretical and experimental evidence shows that Markov Blanket Merging scales up very well to high-dimensional datasets. As compared to other widely used statistical learning techniques, Markov Blanket Merging delivers accurate results in a number of link prediction tasks, while achieving also significant improvements in terms of computational efficiency. Our software implementation of the models investigated in this paper is publicly available at http://www.dii.unisi.it/~freno/. The same website also hosts the datasets used in this work that are not available elsewhere in the same preprocessing used for our experiments.

A., F., Trentin, E., Gori, M. (2009). Scalable pseudo-likelihood estimation in hybrid random fields. In Proceedings of the 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2009) (pp.319-328). New York : ASSOC COMPUTING MACHINERY [10.1145/1557019.1557059].

Scalable pseudo-likelihood estimation in hybrid random fields

TRENTIN, EDMONDO;GORI, MARCO
2009-01-01

Abstract

Learning probabilistic graphical models from high-dimensional datasets is a computationally challenging task. In many interesting applications, the domain dimensionality is such as to prevent state-of-the-art statistical learning techniques from delivering accurate models in reasonable time. This paper presents a hybrid random field model for pseudo-likelihood estimation in high-dimensional domains. A theoretical analysis proves that the class of pseudo-likelihood distributions representable by hybrid random fields strictly includes the class of joint probability distributions representable by Bayesian networks. In order to learn hybrid random fields from data, we develop the Markov Blanket Merging algorithm. Theoretical and experimental evidence shows that Markov Blanket Merging scales up very well to high-dimensional datasets. As compared to other widely used statistical learning techniques, Markov Blanket Merging delivers accurate results in a number of link prediction tasks, while achieving also significant improvements in terms of computational efficiency. Our software implementation of the models investigated in this paper is publicly available at http://www.dii.unisi.it/~freno/. The same website also hosts the datasets used in this work that are not available elsewhere in the same preprocessing used for our experiments.
2009
978-1-60558-495-9
A., F., Trentin, E., Gori, M. (2009). Scalable pseudo-likelihood estimation in hybrid random fields. In Proceedings of the 15th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2009) (pp.319-328). New York : ASSOC COMPUTING MACHINERY [10.1145/1557019.1557059].
File in questo prodotto:
Non ci sono file associati a questo prodotto.

I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11365/24211
 Attenzione

Attenzione! I dati visualizzati non sono stati sottoposti a validazione da parte dell'ateneo