Bayesian phylogeny on Grid

CARAPELLI, ANTONIO; NARDI, FRANCESCO; FRATI, FRANCESCO
2008-01-01

Abstract

Grid computing is the combination of computers or clusters of computers across networks, such as the Internet, to form a distributed supercomputer. This infrastructure allows scientists to run complex and time-consuming computations in parallel on demand. Phylogenetic inference for large datasets of DNA or protein sequences is known to be computationally intensive. Bayesian algorithms allow the estimation of important parameters on the mode and timing of species divergence, but at the price of running long, repetitive series of Monte Carlo simulations. As part of the BioinfoGrid project, we ported parallel MrBayes to the EGEE (Enabling Grids for E-sciencE) grid infrastructure. As a case study, we use this supercomputer and program to investigate both a challenging dataset of arthropod phylogeny and the most appropriate model of amino acid replacement for that dataset. We report here the parallel use of Bayesian inference and Metropolis-Coupled Markov Chain Monte Carlo (MCMCMC) to analyse our dataset of complete mitochondrial genomes from the Pancrustacea, aiming to resolve the position of basal hexapod lineages with respect to Insecta and Crustacea. In this effort, a new matrix of protein change was derived from the dataset itself, and its performance was compared with other currently used models.
Year: 2008
ISBN: 9783540705987
van der Wath, R. C., van der Wath, E., Carapelli, A., Nardi, F., Frati, F., Milanesi, L., et al. (2008). Bayesian phylogeny on Grid. In Bioinformatics Research and Development (BIRD), Communications in Computer and Information Science (CCIS) 13 (pp. 404-416). Berlin: Springer [10.1007/978-3-540-70600-7_30].
Use this identifier to cite or link to this document: https://hdl.handle.net/11365/22997