Textbook in PDF format
The study of most scientific fields now relies on an ever-increasing amount of data, due to instrumental and experimental progress in monitoring and manipulating complex systems made of many microscopic constituents. How can we make sense of such data, and use them to enhance our understanding of biological, physical, and chemical systems?
Aimed at graduate students in physics, applied mathematics, and computational biology, this textbook's primary objective is to introduce the concepts and methods necessary to answer this question, at the intersection of probability theory, statistics, optimisation, statistical physics, inference, and machine learning.
The second objective of this book is to provide practical applications of these methods, which will allow students to assimilate the underlying ideas and techniques. Readers will need basic knowledge of programming (Python or an equivalent language); the main emphasis is not on mathematical rigour, but on developing intuition and the deep connections with statistical physics.
Preface
Introduction to Bayesian inference
Why Bayesian inference?
Notations and definitions
The German tank problem
Laplace's birth rate problem
Tutorial 1: diffusion coefficient from single-particle tracking
Asymptotic inference and information
Asymptotic inference
Notions of information
Inference and information: the maximum entropy principle
Tutorial 2: entropy and information in neural spike trains
High-dimensional inference: searching for principal components
Dimensional reduction and principal component analysis
The retarded learning phase transition
Tutorial 3: replay of neural activity during sleep following task learning
Priors, regularisation, sparsity
Lp-norm based priors
Conjugate priors
Invariant priors
Tutorial 4: sparse estimation techniques for RNA alternative splicing
Graphical models: from network reconstruction to Boltzmann machines
Network reconstruction for multivariate Gaussian variables
Boltzmann machines
Pseudo-likelihood methods
Tutorial 5: inference of protein structure from sequence data
Unsupervised learning: from representations to generative models
Autoencoders
Restricted Boltzmann machines and representations
Generative models
Learning from streaming data: principal component analysis revisited
Tutorial 6: online sparse principal component analysis of neural assemblies
Supervised learning: classification with neural networks
The perceptron, a linear classifier
Case of few data: overfitting
Case of many data: generalisation
A glimpse at multi-layered networks
Tutorial 7: prediction of binding between PDZ proteins and peptides
Time series: from Markov models to hidden Markov models
Markov processes and inference
Hidden Markov models
Tutorial 8: CG content variations in viral genomes
References
Index