Externally indexed torrent
If you are the original uploader, contact staff to have it moved to your account
Textbook in PDF format
Data Science is the foundation of our modern world. It underlies applications used by billions of people every day, providing new tools, forms of entertainment, economic growth, and potential solutions to difficult, complex problems. These opportunities come with significant societal consequences, raising fundamental questions about issues such as data quality, fairness, privacy, and causation. In this book, four leading experts convey the excitement and promise of Data Science and examine the major challenges in gaining its benefits and mitigating its harms. They offer frameworks for critically evaluating the ingredients and the ethical considerations needed to apply data science productively, illustrated by extensive application examples. The authors' far-ranging exploration of these complex issues will stimulate Data Science practitioners and students, as well as humanists, social scientists, scientists, and policy makers, to study and debate how Data Science can be used more effectively and more ethically to better our world.
Data science emerged from combining three fields. For the purposes of this book,we define them as follows:
Statistics is the mathematical field that interprets and presents numerical data, making inferences and describing properties of the data.
Operations research (OR) is a scientific method for decision-making in the management of organizations, focused on understanding systems and taking optimal actions in the real world. It is heavily focused on the optimization of an objective function – a precise statement of a goal, such as maximizing profit or minimizing travel distance.
Computing is the design, development, and deployment of software and hardware to manage data and complete tasks. Software engineering gives us the ability to implement the algorithms that make data science work, as well as the tools to create and deploy those algorithms at scale. Hardware design gives us ever-increasing processing speed, storage capacity, and throughput to handle Big Data.
Some of Data Science’s most important techniques emerged from work across disciplines. While we include machine learning within computing, its development included contributions from statistics, pattern recognition, and neuropsychology. Information visualization arose from statistics, but has benefited greatly from computing’s contributions.
Introduction
Data Science
Foundations of Data Science
Data Science Is Transdisciplinary
A Framework for Ethical Considerations
Applying Data Science
Data Science Applications: Six Examples
The Analysis Rubric
Applying the Analysis Rubric
A Principlist Approach to Ethical Considerations
Challenges in Applying Data Science
Tractable Data
Building and Deploying Models
Dependability
Understandability
Setting the Right Objectives
Toleration of Failures
Ethical, Legal, and Societal Challenges
Addressing Concerns
Societal Concerns
Education and Intelligent Discourse
Regulation
Research and Development
Quality and Ethical Governance
Concluding Thoughts