Textbook in PDF format
Statistical and Machine Learning methods have many applications in the environmental sciences, including prediction and data analysis in meteorology, hydrology and oceanography; pattern recognition for satellite images from remote sensing; management of agriculture and forests; assessment of climate change; and much more. With rapid advances in Machine Learning in the last decade, this book provides an urgently needed, comprehensive guide to Machine Learning and statistics for students and researchers interested in environmental Data Science. It includes intuitive explanations covering the relevant background mathematics, with examples drawn from the environmental sciences. A broad range of topics is covered, including correlation, regression, classification, clustering, neural networks, random forests, boosting, kernel methods, evolutionary algorithms and Deep Learning, as well as the recent merging of Machine Learning and physics. End-of-chapter exercises allow readers to develop their problem-solving skills, and online datasets allow readers to practise analysis of real data.
Modern Data Science has two main branches – statistics and Machine Learning – analogous to physics containing classical mechanics and quantum mechanics. Statistics, the much older branch, grew out of mathematics, while the advent of the computer and Computer Science in the post–World War II era led to an interest in intelligent machines, that is, Artificial Intelligence (AI), and in Machine Learning (ML), the fastest growing branch of AI.
Environmental Data Science is the intersection between environmental science and Data Science. Environmental science is composed of many parts – atmospheric science, oceanography, hydrology, cryospheric science, ecology, agricultural science, remote sensing, climate science, and so on.
Environmental datasets have their own characteristics. For example, most non-environmental datasets used in ML contain discrete or categorical data (letters and numbers in text, colour pixels in an image, etc.), whereas most environmental datasets contain continuous variables (temperature, air pressure, precipitation amount, pollutant concentration, sea level height, streamflow, crop yield, etc.). Hence, environmental scientists need to assess astutely whether data methods developed in non-environmental fields would work well for particular environmental datasets.
This book is an introduction to environmental Data Science, attempting to balance the yin (ML) and the yang (statistics) when teaching Data Science to environmental science students. Written as a textbook for advanced undergraduates and beginning graduate students, it should also be useful for researchers and practitioners in environmental science.
The reader is assumed to know multivariate calculus, linear algebra and basic probability.
Contents:
Introduction
Basics
Probability distributions
Statistical inference
Linear regression
Neural networks
Nonlinear optimization
Learning and generalization
Principal components and canonical correlation
Unsupervised learning
Time series
Classification
Kernel methods
Decision trees, random forests and boosting
Deep Learning
Forecast verification and post-processing
Merging of Machine Learning and physics
Appendices
References
Index