Textbook in PDF format
This book presents an integrated collection of representative approaches for scaling up machine learning and data mining methods on parallel and distributed computing platforms. Demand for parallelizing learning algorithms is highly task-specific: in some settings it is driven by enormous dataset sizes, in others by model complexity or by real-time performance requirements. Making task-appropriate algorithm and platform choices for large-scale machine learning requires understanding the benefits, trade-offs, and constraints of the available options. The solutions presented in the book cover a range of parallelization platforms, from FPGAs and GPUs to multi-core systems and commodity clusters; concurrent programming frameworks, including CUDA, MPI, MapReduce, and DryadLINQ; and learning settings (supervised, unsupervised, semi-supervised, and online learning). Extensive coverage of the parallelization of boosted trees, SVMs, spectral clustering, belief propagation, and other popular learning algorithms, along with deep dives into several applications, makes the book equally useful for researchers, students, and practitioners.
Frameworks for Scaling Up Machine Learning:
MapReduce and its application to massively parallel learning of decision tree ensembles.
Large-scale machine learning using DryadLINQ.
IBM parallel machine learning toolbox.
Uniformly fine-grained data-parallel computing for machine learning algorithms.
Supervised and Unsupervised Learning Algorithms:
PSVM: parallel support vector machines with incomplete Cholesky factorization.
Massive SVM parallelization using hardware accelerators.
Large-scale learning to rank using boosted decision trees.
The transform regression algorithm.
Parallel belief propagation in factor graphs.
Distributed Gibbs sampling for latent variable models.
Large-scale spectral clustering with MapReduce and MPI.
Parallelizing information-theoretic clustering methods.
Alternative Learning Settings:
Parallel online learning.
Parallel graph-based semi-supervised learning.
Distributed transfer learning via cooperative matrix factorization.
Parallel large-scale feature selection.
Applications:
Large-scale learning for vision with GPUs.
Large-scale FPGA-based convolutional networks.
Mining tree-structured data on multicore systems.
Scalable parallelization of automatic speech recognition.