Machine Learning: A Concise Introduction


Price:   $177.95



Overview

New edition of a PROSE Award finalist title on the core concepts of machine learning, updated with the latest developments in the field, now with Python and R source code presented side by side.

Machine Learning: A Concise Introduction is a comprehensive text on the core concepts, approaches, and applications of machine learning. It presents fundamental ideas, terminology, and techniques for solving applied problems in classification, regression, clustering, density estimation, and dimension reduction. New content in this edition includes expanded chapters that provide further computational and algorithmic insight to improve reader understanding, and several chapters have been revised to account for developments since the prior edition.

The book emphasizes the design principles behind the techniques, including the bias-variance trade-off and its influence on the design of ensemble methods, enabling readers to solve applied problems more efficiently and effectively. It also covers methods for optimization, risk estimation, model selection, and dealing with biased data samples and software limitations, all essential elements of most applied projects.

Written by an expert in the field, this important resource:

- Illustrates many classification methods with a single running example, highlighting similarities and differences between methods
- Presents side-by-side Python and R source code showing how to apply and interpret many of the techniques covered
- Includes many thoughtful exercises as an integral part of the text, with an appendix of selected solutions
- Contains useful information for effectively communicating with clients on both technical and ethical topics
- Details classification techniques including likelihood methods, prototype methods, neural networks, classification trees, and support vector machines

A volume in the popular Wiley Series in Probability and Statistics, Machine Learning: A Concise Introduction offers the practical information needed to understand the methods and applications of machine learning, for advanced undergraduate and beginning graduate students, data science and machine learning practitioners, and other technical professionals in adjacent fields.

Full Product Details

Author:   Steven W. Knox (University of Illinois; Carnegie Mellon University)
Publisher:   John Wiley & Sons Inc
Imprint:   John Wiley & Sons Inc
Edition:   2nd edition
ISBN 13:   9781394325252
ISBN 10:   1394325258
Pages:   432
Publication Date:   28 January 2026
Audience:   Professional and scholarly; College/higher education; Professional & Vocational; Postgraduate, Research & Scholarly
Format:   Hardback
Publisher's Status:   Active
Availability:   Awaiting stock
The supplier is currently out of stock of this item. It will be ordered for you and placed on backorder. Once it comes back in stock, we will ship it to you.

Table of Contents

Preface xi
Organization — How to Use This Book xii
Acknowledgments xiv
About the Companion Website xiv
1 Introduction – Examples from Real Life 1
2 The Problem of Learning 3
   2.1 Domain 3
   2.2 Range 4
   2.3 Data 4
   2.4 Loss 5
   2.5 Risk 8
   2.6 The Reality of the Unknown Function 12
   2.7 Training and Selection of Models 12
   2.8 Purposes of Learning 14
   2.9 Notation 14
3 Regression 15
   3.1 General Framework 16
   3.2 Loss 17
   3.3 Estimating the Model Parameters 17
   3.4 Properties of Fitted Values 19
   3.5 Estimating the Variance 22
   3.6 A Normality Assumption 23
   3.7 Computation 25
   3.8 Categorical Features 26
   3.9 Feature Expansions, Interactions, and Transformations 28
   3.10 Penalized Regression: Model Transformation for Risk Reduction 31
   3.11 Variations in Linear Regression 37
   3.12 Nonlinear Regression 39
   3.13 Nonparametric Regression 42
4 Classification 45
   4.1 The Bayes Classifier 46
   4.2 Introduction to Classifiers 47
   4.3 Mitigating Biases in Software, Biases in Data, and Zero Probabilities 49
   4.4 Class Boundaries 53
   4.5 A Running Example 54
   4.6 Likelihood Methods 55
   4.7 Prototype Methods 69
   4.8 Logistic Regression 76
   4.9 Neural Networks 81
   4.10 Classification Trees 93
   4.11 Support Vector Machines 100
   4.12 Postscript: Example Problem Revisited 119
5 Bias-Variance Trade-Off 121
   5.1 Squared-Error Loss 121
   5.2 General Loss 125
6 Combining Classifiers 131
   6.1 Ensembles 131
   6.2 Ensemble Design 136
   6.3 Bootstrap Aggregation (Bagging) 138
   6.4 Random Forests 141
   6.5 Boosting and Arcing 142
   6.6 Classification by Regression Ensemble 147
   6.7 Gradient Boosting 151
   6.8 Stacking and Mixture of Experts 156
   6.9 Postscript: Example Problem Revisited 160
7 Risk Estimation and Model Selection 163
   7.1 Risk Estimation via Training Data 164
   7.2 Risk Estimation via Validation or Test Data 164
   7.3 Cross-Validation 169
   7.4 Improvements on Cross-Validation 171
   7.5 Out-of-Bag Risk Estimation 172
   7.6 Akaike's Information Criterion 173
   7.7 Schwartz's Bayesian Information Criterion 174
   7.8 Rissanen's Minimum Description Length Criterion 175
   7.9 R² and Adjusted R² 175
   7.10 Stepwise Model Selection 177
   7.11 Occam's Razor 177
   7.12 Size of Validation and Test Data Sets 178
8 Consistency 187
   8.1 Convergence of Sequences of Random Variables 187
   8.2 Consistency for Parameter Estimation 188
   8.3 Consistency for Prediction 188
   8.4 There Are Consistent and Universally Consistent Classifiers 189
   8.5 Convergence to Asymptopia Is Not Uniform and May Be Slow 191
9 Clustering 193
   9.1 Gaussian Mixture Models 194
   9.2 k-Means 194
   9.3 Clustering by Mode-Hunting in a Density Estimate 195
   9.4 Using Classifiers to Cluster 196
   9.5 Dissimilarity 196
   9.6 k-Medoids 197
   9.7 k-Modes and k-Prototypes 197
   9.8 Agglomerative Hierarchical Clustering 198
   9.9 Divisive Hierarchical Clustering 199
   9.10 How Many Clusters Are There? Interpretation of Clustering 200
   9.11 An Impossibility Theorem 201
10 Optimization 203
   10.1 Quasi-Newton Methods 204
   10.2 The Nelder–Mead Algorithm 207
   10.3 Simulated Annealing 207
   10.4 Genetic Algorithms 209
   10.5 Particle Swarm Optimization 210
   10.6 General Remarks on Optimization 211
   10.7 Solving Least-Squares Problems via Quasi-Newton Methods 213
   10.8 Gradient Computation for Neural Networks via Backpropagation 214
   10.9 Handling Missing Data via the Expectation-Maximization Algorithm 219
   10.10 Fitting Support Vector Machines via Sequential Minimal Optimization 224
11 High-Dimensional Data 235
   11.1 The Curse of Dimensionality 236
   11.2 Two Running Examples 242
   11.3 Reducing Dimension While Preserving Information 243
   11.4 Model Regularization 261
12 Communication with Clients 267
   12.1 Binary Classification and Hypothesis Testing 267
   12.2 Terminology for Binary Decisions 269
   12.3 Receiver Operating Characteristic (ROC) Curves 271
   12.4 One-Dimensional Measures of Performance 273
   12.5 Confusion Matrices 276
   12.6 Pairwise Model Comparison 277
   12.7 Multiple Testing 277
   12.8 Expert Systems 279
   12.9 Ethics in Machine Learning 280
13 Current Challenges in Machine Learning 283
   13.1 Streaming Data 283
   13.2 Distributed Data 283
   13.3 Semi-Supervised Learning 283
   13.4 Active Learning 284
   13.5 Feature Construction via Deep Neural Networks 284
   13.6 Transfer Learning 284
   13.7 Interpretability and Protection of Complex Models 285
14 R and Python Source Code 287
   14.1 Author's Biases 288
   14.2 Packages and Code 288
   14.3 The Running Example (Section 4.5) 289
   14.4 The Bayes Classifier (Section 4.1) 292
   14.5 Quadratic Discriminant Analysis (Section 4.6.1) 294
   14.6 Linear Discriminant Analysis (Section 4.6.2) 296
   14.7 Gaussian Mixture Models (Section 4.6.3) 297
   14.8 Kernel Density Estimation (Section 4.6.4) 300
   14.9 Histograms (Section 4.6.5) 304
   14.10 The Naive Bayes Classifier (Section 4.6.6) 309
   14.11 k-Nearest-Neighbor (Section 4.7.1) 312
   14.12 Learning Vector Quantization (Section 4.7.4) 314
   14.13 Logistic Regression (Section 4.8) 317
   14.14 Neural Networks (Section 4.9) 319
   14.15 Classification Trees (Section 4.10) 324
   14.16 Support Vector Machines (Section 4.11) 332
   14.17 Bootstrap Aggregation (Bagging) (Section 6.3) 341
   14.18 Random Forests (Section 6.4) 343
   14.19 Boosting by Reweighting (Section 6.5) 345
   14.20 Boosting by Sampling (Arcing) (Section 6.5) 346
   14.21 Gradient Boosted Trees (Section 6.7) 347
Appendix A: List of Symbols 351
Appendix B: The Condition Number of a Matrix with Respect to a Norm 353
Appendix C: Converting Between Normal Parameters and Level-Curve Ellipsoids 357
Appendix D: The Geometry of Linear Functions and Linear Classifiers 359
Appendix E: Training Data and Fitted Parameters 367
Appendix F: Solutions to Selected Exercises 371
Bibliography 399
Index 413

Author Information

Steven W. Knox holds a Ph.D. in Mathematics from the University of Illinois and an M.S. in Statistics from Carnegie Mellon University. He has almost thirty years' experience using machine learning, statistics, and mathematics to solve real-world problems. He is currently a Data Science Subject Matter Expert at the National Security Agency, where he has also served as Technical Director of Mathematics Research and in other senior technical and leadership roles.

Countries Available:   All regions