Probability and Statistics for Computer Science

Author: David Forsyth
Publisher: Springer International Publishing AG
Edition: Softcover reprint of the original 1st ed. 2018
ISBN:

9783319877884

Pages: 367
Publication Date: 04 June 2019
Format: Paperback
Availability: Manufactured on demand

We will order this item for you from a manufactured on demand supplier.

Our Price $145.17 Quantity:

Share |

Probability and Statistics for Computer Science

Overview

This textbook is aimed at computer science undergraduates late in sophomore or early in junior year, supplying a comprehensive background in qualitative and quantitative data analysis, probability, random variables, and statistical methods, including machine learning. With careful treatment of topics that fill the curricular needs for the course, Probability and Statistics for Computer Science features: • A treatment of random variables and expectations dealing primarily with the discrete case. • A practical treatment of simulation, showing how many interesting probabilities and expectations can be extracted, with particular emphasis on Markov chains. • A clear but crisp account of simple point inference strategies (maximum likelihood; Bayesian inference) in simple contexts. This is extended to cover some confidence intervals, samples and populations for random sampling with replacement, and the simplest hypothesis testing. • Achapter dealing with classification, explaining why it’s useful; how to train SVM classifiers with stochastic gradient descent; and how to use implementations of more advanced methods such as random forests and nearest neighbors. • A chapter dealing with regression, explaining how to set up, use and understand linear regression and nearest neighbors regression in practical problems. • A chapter dealing with principal components analysis, developing intuition carefully, and including numerous practical examples. There is a brief description of multivariate scaling via principal coordinate analysis. • A chapter dealing with clustering via agglomerative methods and k-means, showing how to build vector quantized features for complex signals. Illustrated throughout, each main chapter includes many worked examples and other pedagogical elements such as boxed Procedures, Definitions, Useful Facts, and Remember This (short tips). Problems and Programming Exercises are at the end of each chapter, with a summary of what the reader should know. Instructor resources include a full set of model solutions for all problems, and an Instructor's Manual with accompanying presentation slides.

Full Product Details

Author: David Forsyth
Publisher: Springer International Publishing AG
Imprint: Springer International Publishing AG
Edition: Softcover reprint of the original 1st ed. 2018
Weight: 0.990kg
ISBN:

9783319877884

ISBN 10: 3319877887
Pages: 367
Publication Date: 04 June 2019
Audience: Professional and scholarly , Professional & Vocational
Format: Paperback
Publisher's Status: Active
Availability: Manufactured on demand

We will order this item for you from a manufactured on demand supplier.

1 Notation and conventions 9 1.0.1 Background Information........................................................................ 10 1.1 Acknowledgements................................................................................................. 11 I Describing Datasets ; 12 2 First Tools for Looking at Data 13 2.1 Datasets.................................................................................... ................................... 13 2.2 What’s Happening? - Plotting Data................................................................. 15 2.2.1 Bar< Charts.................................................................................................... 16 2.2.2 Histograms................................................................................................... 16 2.2.3 How to Make Histograms...................................................................... 17 2.2.4 Conditional Histograms.......................................................................... 19 2.3 Summarizing 1D Data................................. ........................................................... 19 2.3.1 The Mean...................................................................................................... 20 2.3.2 Standard Deviation................................................................................... 22 2.3.3 Computing Mean and Standard Deviation Online...................... 26 2.3.4 Variance......................................................................................................... 26 2.3.5 The Median.................................................................................................. 27 2.3.6 Interqu artile Range.................................................................................. 29 2.3.7 Using Summaries Sensibly.................................................................... 30 2.4 Plots and Summaries............................................................................................. 31 2.4.1 Some Properties of Histograms.......................................................... 31 2.4.2 Standard Coordinates and Normal Data......................................... 34 2.4.3 Box Plots....................................................................................................... 38 2.5 Whose is bigger? Inves tigating Australian Pizzas...................................... 39 2.6 You should.................................................................................................................. 43 2.6.1 remember these definitions:................................................................. 43 2.6.2 remember these terms............................................................................ 43 2.6.3 remember these facts:............................................................................. 43 2.6.4 be able to...................................................................................................... 43 3 Looking at Relationships 47 3.1 Plotting 2D Data...................................................................................................... 47 3.1.1 3.1.2 Series............................... ............................................................................... 51 3.1.3 Scatter Plots for Spatial Data.............................................................. 53 3.1.4 Exposing Relationships with Scatter Plots..................................... 54 3.2 Correlation.................................................................................................................. 57 3.2.1 The Correlation Coefficient................................................................... 60 3.2.2 Using Correlation to Predict................................................................ 64 3.2.3 Confusion caused by co rrelation......................................................... 68 1 <3.3 Sterile Males in Wild Horse Herds.................................................................. 68 3.4 You should.................................................................................................................. 72 3.4.1 remember these definitions:................................................................. 72 3.4.2 remember these terms............................................................................ 72 3.4.3 remember these facts: . . . . . 3.4.4 use these procedures: . . . . . . 3.4.5 be able to: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 . . . . . . . . . . . . . . . . . 72 . . . . . . . . . . . . . . . . . 72 II Probability & nbsp; 78 4 Basic ideas in probability 79 4.1 Experiments, Outcomes and Probability....................................................... 79 4.1.1 Outcomes and Probability...................................................................... 79 4.2 Events................. .......................................................................................................... 81 4.2.1 Computing Event Probabilities by Counting Outcomes............. 83 4.2.2 The Probability of Events...................................................................... 87 4.2.3 Computing Probabilities by Reasoning about Sets...................... 89 4.3 Independence............................................................................................................ 92 4.3.1 Example: Airline Overbooking............................................................ 96 4.4 Conditional ........................................... ............. 99 4.4.1 Evaluating Conditional Probabilities.............................................. 100 4.4.2 Detecting Rare Events is Hard......................................................... 104 4.4.3 Conditional Probability and Various Forms of Independence . 106 4.4.4 The Prosecutor’s Fallacy 108 4.4.5 Example: The Monty Hall Problem................................................ 110 4.5 Extra Worked Examples.................................................................................... 112 4.5.1 Outcomes and Probability........................................ ........................... 112 4.5.2 Events.......................................................................................................... 114 4.5.3 Independence........................................................................................... 115 4.5.4 Conditional Probability......................................................................... 117 4.6 You should............................................................................................................... 121 4.6.1 remember these definitions:.............................................................. 121 4.6.2 remember these terms........ ................................................................. 121 4.6.3 remember and use these facts.......................................................... 121 4.6.4 remember these points:....................................................................... 121 4.6.5 be able to.................................................................................................... 121 5 Random Variables and Expectations 128 5.1 Random Variables................................................................................................. 128 5.1.1 Joint and Conditional Probability for Random Variables . . . 131 5.1.2 Just a Little Continuous Probability............................................... 134 5.2 Expectations and Expected Values................................................................ 137 5.2.1 Expected Values...................................................................................... 138 5.2.2 Mean, Variance and Covariance....................................................... 141 5.2.3 Expectations and Statistics................................................................. 145 5.3 The Weak Law of Large Numbers................................................................ 145 5.3.1 IID Samples . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145 5.3.2 Two Inequalities . . . . . . . . . . . . . . . . . . . . . . . .< . 146 5.3.3 Proving the Inequalities . . . . . . . . . . . . . . . . . . . . . 147 5.3.4 The Weak Law of Large Numbers.................................................. 149 5.4 Using the Weak Law of Large Numbers 151 5.4.1 Should you accept a bet?..................................................................... 151 5.4.2 Odds, Expectations and Bookmaking — a Cultural Diversion 152 5.4.3 Ending a Game Early 154 5.4.4 Making a Decision with Decision Trees and Expectations . . 154 5.4.5 Utility 156 5.5 You should................................................................................... 159 5.5.1 remember these definitions:.............................................................. 159 5.5.2 remember these terms......................................................................... 159 5.5.3 use and remember these facts.......................................................... 159 5.5.4 be able to.................................................................................................... 160 6 Useful Probability Distributions ; 167 6.1 Discrete Distributions 167 6.1.1 The Discrete Uniform Distribution................................................. 167 6.1.2 Bernoulli Random Variables.......................................................... ..... 168 6.1.3 The Geometric Distribution................................................................ 168 6.1.4 The Binomial Probability Distribution........................................... 169 6.1.5 Multinomial probabilities..................................................................... 171 6.1.6 The Poisson Distribution..................................................................... 172 6.2 Continuous Distributions ; 174 6.2.1 The Continuous Uniform Distribution........................................... 174 6.2.2 The Beta Distribution........................................................................... 174 6.2.3 The Gamma Distribution..................................................................... 176 6.2.4 The Exponential Distribution............................................................ 176 6.3 The Normal Distribution ; 178 6.3.1 The Standard Normal Distribution................................................. 178 6.3.2 The Normal Distribution..................................................................... 179 6.3.3 Properties of The Normal Distribution......................................... 180 6.4 Approximating Binomials with Large N 182 6.4.1 Large N.............................................................. ......................................... 183 6.4.2 Getting Normal<........................................................................................ 185 6.4.3 Using a Normal Approximation to the Binomial Distribution 187 6.5 You should . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6.5.1 remember these definitions: . . . . . . . . . . . . . . . . 6.5.2 remember these terms: . . . . . . . . . . . . . . . . . . . 6.5.3 remember these facts: . . . . . . . . . . . . . . . . . . . 6.5.4 remember these points: . . . . . . . . . . . . . . . . . .< . . . 188 . . . 188 . . . 188 . . . 188 . . . 188 III Inference ; 196 7 Samples and Populations 197 7.1 The Sample Mean................................................................................................. 197 7.1.1 The Sample Mean is an Estimate of the Population Mean . . 197 7.1.2 The Varianc e of the Sample Mean.................................................. 198 7.1.3 When The Urn Model Works............................................................ 201 7.1.4 Distributions are Like Populations................................................. 202 7.2 Confidence Intervals............................................................................................ 203 7.2.1 Constructing Confidence Intervals.................................................. 203 7.2.2 Estimating the Variance of the Sample Mean............................ 204 7.2.3 The Probability Distribution of the Sample Mean..................... 206 & lt; 7.2.4 Confidence Intervals for Population Means................................. 208 7.2.5 Standard Error Estimates from Simulation................................. 212 7.3 You should............................................................................................................... 216 7.3.1 remember these definitions:.............................................................. 216 7.3.2 remember these terms......................................................................... 216 7.3.3 remember these facts:........................................................................... 216 7.3.4 use these procedures............................................................................. 216 7.3.5 be able to.................................................................................................... 216 8 The Significance of Evidence 221 8.1 Significance.............................................................. ................................................ 222 8.1.1 Evaluating Significance......................................................................... 223 8.1.2 P-values....................................................................................................... 225 8.2 Comparing the Mean of Two Populations.................................................. 230 8.2.1 Assuming Known Population Standard Deviations................... 231 8.2.2 Assuming Same, Unknown Population Standard Deviation . 233 8.2.3 Assuming Different, Unknown Population Stand ard Deviation 235 8.3 Other Useful Tests of Significance................................................................. 237 8.3.1 F-tests and Standard Deviations...................................................... 237 8.3.2 χ2 Tests of Model Fit............................................................................ 239 8.4 Dangerous Behavior............................................................................................. 244 8.5 You should............................................................................................................... 246 8.5.1 remember these definitions:...................................... ........................ 246 8.5.2 remember 8.5.3 remember these facts: . . . . . 8.5.4 use these procedures: . . . . . . 8.5.5 be able to: . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246 . . . . . . . . . . . . . . . . . 246 . . . . . . . . . . . . . . . . . 246 9 Experiments &nbs p; 251 9.1 A Simple Experiment: The Effect of a Treatment.................................. 251 9.1.1 Randomized Balanced Experiments............................................... 252 9.1.2 Decomposing Error in Predictions.................................................. 253 9.1.3 Estimating the Noise Variance......................................................... 253 9.1.4 The ANOVA Table.................................................................................. 255 9.1.5 Unbalanced Experiments.................................................................... 257 9.1.6 Significant Differences.......................................................................... 259 9.2 Two Factor Experiments.................................................................................... 261 9.2.1 &n bsp; Decomposing the Error........................................................................ 264 9.2.2 Interaction Between Effects................................................................ 265 9.2.3 The Effects of a Treatment................................................................. 266 9.2.4 Setting up an ANOVA Table.............................................................. 267 9.3 You should............................................................................................................... 272 9.3.1 remember these definitions:.............................................................. 272 9.3.2 remember these terms......................................................................... 272 9.3.3 remember these facts:........................................................................... 272 9.3.4 use these procedures............................................................................. 272 9.3.5 be able to.................................................................................................... 272 9.3.6 Two-Way Experiments.......................................................................... 274 10 Inferring Probability Models from Data &n bsp; 275 10.1 Estimating Model Parameters with Maximum Likelihood.................. 275 10.1.1 The Maximum Likelihood Principle............................................... 277 10.1.2 Binomial, Geometric and Multinomial Distributions................ 278 10.1.3 Poisson and Normal Distributions................................................... 281 10.1.4 Confidence Intervals for Model Parameters................................ 286 10.1.5 Cautions about Maximum Likelihood............................................ 288 10.2 Incorporating Prio rs with Bayesian Inference.......................................... 289 10.2.1 Conjugacy................................................................................................... 292 10.2.2 MAP Inference......................................................................................... 294 10.2.3 Cautions about Bayesian Inference................................................. 296 10.3 Bayesian Inference for Normal Distributions............................................ 296 10.3.1 Example: Measuring Depth of a Borehole................................... 296 10.3.2 Normal Prior and Normal Likelihood Yield Normal Posterior 297 10.3.3 Filtering.................................... .................................................................. 300 10.4 You should............................................................................................................... 303 10.4.1 remember these definitions:.............................................................. 303 10.4.2 remember these terms......................................................................... 303 10.4.3 remember these facts:........................................................................... 304 10.4.4 use these procedures............................................................................. 304 10.4.5 be able to.................................................................................................... 304 & lt; IV Tools 312 11 Extracting Important Relationships in High Dimensions 313 11.1 Summaries and Simple Plots........................................................................... 313 11.1.1 The Mean...................... ............................................................................. 314 11.1.2 Stem Plots and Scatterplot Matrices.............................................. 315 11.1.3 Covariance.................................................................................................. 317 11.1.4 The Covariance Matrix......................................................................... 319 11.2 Using Mean and Covariance to Understand High Dimensional Data . 321 11.2.1 Mean and Covariance under Affine Transformations............... 322 11.2.2 . . 324 . . 325 . . 326 . . 327 . . 329 . 332 . . 334 . . 335 . . 335 . . 338 . . 339 . . 341 . . < 345 . . 345 . . 345 . . 345 . . 345 . . 345 349 . . 349 . . 350 . . 350 . . 351 . . 351 . . 353 . . 355 . . 357 . . 358 . . 359 . . <360 .< . 361 < Eigenvectors and Diagonalization . . . . . . . . . . . . . . 11.2.3 Diagonalizing Covariance by Rotating Blobs . . . . . . . . 11.2.4 Approximating Blobs . . . . . . . . . . . . . . . . . . . . 11.2.5 Example: Transforming the Height-Weight Blob . . . . . 11.3 Principal Components Analysis . . . . . . . . . . . . . . . . . . . 11.3.1 Example: Representing Colors with Principal Components 11.3.2 Example: Representing Faces with Principal Components 11.4 Multi-Dimensional Scaling . . . . . . . . . . . . . . . . . . . . . . 11.4.1 Choosing Low D Points using High D Distances . . . . . . 11.4.2 Factoring a Dot-Product Matrix . . . . . . . . . . . . . . 11.4.3 Example: Mapping with Multidimensional Scaling . . . . 11.5 Example: Understanding Height and Weight . . . . . . . . . . . 11.6 You should . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11.6.1 remember these definitions: . . . . . . . . . . . . . . . . . 11.6.2 remember these terms: . . . . . . . . . . . . . . . . . . . . 11.6.3 remember these facts: .& nbsp; . . . . . . . . . . . . . . . . . . . 11.6.4 use these procedures: . . . . . . . . . . . . . . . . . . . . . 11.6.5 be able to: . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 Learning to Classify 12.1 Classification: The Big Ideas . . . . . . . . . . . . . . . . . . . . 12.1.1 The Error Rate . . . . . . . . . . . . . . . . . . . . . . . . 12.1.2 Overfitting . . . . . . . . . . . . . . . . . . . . . . . . . . 12.1.3 Cross-Validation . . . . . . . . . . . . . . . . . . . . . . . 12.1.4 Is the Classifier Working Well? . . . . . . . . . . . . . . . 12.2 Classifying with Nearest Neighbors . . . . . . . . . . . . . . . . . 12.3 Classifying with Naive Bayes . . . . . . . . . . . . . . . . . . . . 12.3.1 Missing Data . . . . . . . . . . . . . . . . . . . . . . . . . 12.4 The Support 12.4.1 Choosing a Classifier with the Hinge Loss . . . . . . . . . 12.4.2 Finding a Minimum: General Points . . . . . . . . . . . . 12.4.3 Finding a Minimum: Stochastic Gradient Descent . . . . 12.4.4 Example: Training an SVM with Stochastic Gradient Descent 363 12.4.5 Multi-Class Classification with SVMs.............................................. 366 12.5 Classifying with Random Forests................................................................... 367 12.5.1 Building a Decision Tree..................................................................... 367 12.5.2 Choosing a Split with Information Gain........................................ 370 12.5.3 Forests......................................................................................................... 373 12.5.4 Building and Evaluating a Decision Forest.................................. 374 12.5.5 Classifying Data Items with a Decision Forest........................... 375 12.6 You should................................................................... ............................................ 378 12.6.1 remember these definitions:.............................................................. 378 12.6.2 remember these terms......................................................................... 378 12.6.3 remember these facts:........................................................................... 379 12.6.4 use these procedures............................................................................. 379 12.6.5 be able to.................................................................................................... 379 < 13.1 The Curse of Dimension........................................................................ ............. 384 13.1.1 The Curse: Data isn’t Where You Think it is............................. 384 13.1.2 Minor Banes of Dimension.................................................................. 386 13.2 The Multivariate Normal Distribution......................................................... 387 13.2.1 Affine Transformations and Gaussians.......................................... 387 13.2.2 Plotting a 2D Gaussian: Covariance Ellipses.............................. 388 13.3 Agglomerative and Divisive Clustering........................................................ 389 13.3.1 Clustering and Distance....................................................................... 391 13.4 & nbsp; The K-Means Algorithm and Variants......................................................... 392 13.4.1 How to choose K...................................................................................... 395 13.4.2 Soft Assignment....................................................................................... 397 13.4.3 General Comments on K-Means....................................................... 400 13.4.4 K-Mediods.................................................................................................. 400 13.5 Application Example: Clustering Documents........................................... 401 13.5.1 A Topic Model...................................................................................... .... 402 13.6 Describing Repetition with Vector Quantization...................................... 403 13.6.1 Vector Quantization............................................................................... 404 13.6.2 Example: Groceries in Portugal....................................................... 406 13.6.3 Efficient Clustering and Hierarchical K Means.......................... 409 13.6.4 Example: Activity from Accelerometer Data............................... 409 13.7 You should............................................................................................................... 413 13.7.1 remember these definitions:.............................................................. 413 13.7.2 remember these terms......................................................................... 413 13.7.3 remember these facts:........................................................................... 413 13.7.4 use these procedures............................................................................. 413 14 Regression &nbs p; 417 14.1.1 Regression to Make Predictions....................................................... 417 14.1.2 Regression to Spot Trends.................................................................. 419 14.1 Linear Regression and Least Squares.......................................................... 421 14.1.1 Linear Regression................................................................................... 421 14.1.2 Choosing β.................................................................................................. 422 14.1.3 Solving the Least Squares Problem................................................ 423 14.1.4 &n bsp; Residuals..................................................................................................... 424 14.1.5 R-squared.................................................................................................... 424 14.2 Producing Good Linear Regressions............................................................. 427 14.2.1 Transforming Variables........................................................................ 428 14.2.2 Problem Data Points have Significant Impact............................ 431 14.2.3 Functions of One Explanatory Variable........................................ 433 14.2.4 Regularizing Linear Regressions...................................................... 435 14.3 &nbs p; Exploiting Your Neighbors 14.3.1 Using your Neighbors to Predict More than a Number............ 441 14.3.2 Example: Filling Large Holes with Whole Images.................... 441 14.4 You should . . . . . . . . . . . . . . . . . . . . . . . . . . . 14.4.1 remember these definitions: . . . . . . . . . . . . . . 14.4.2 remember these terms: . . . . . . . . . . . . . . . . . . . . . . 444 . . . . . 444 . . . . . 444 14.4.3 remember these facts:........................................................................... 444 14.4.4 remember these procedures:............................................................. 444 15 Markov Chains and Hidden Markov Models &n bsp; 454 15.1 Markov Chains........................................................................................................ 454 15.1.1 Transition Probability Matrices........................................................ 457 15.1.2 Stationary Distributions....................................................................... 459 15.1.3 Example: Markov Chain Models of Text...................................... 462 15.2 Estimating Properties of Markov Chains.................................................... 465 15.2.1 Simulation.................................................... .............................................. 465 15.2.2 Simulation Results as Random Variables..................................... 467 15.2.3 Simulating Markov Chains.................................................................. 469 15.3 Example: Ranking the Web by Simulating a Markov Chain................ 472 15.4 Hidden Markov Models and Dynamic Programming............................. 473 15.4.1 Hidden Markov Models........................................................................ 474 15.4.2 Picturing Inference with a Trellis.................................................... 474 15.4.3 Dynamic Programming for HMM’s: Formalities....................... 478 15.4.4 &nb sp; Example: Simple Communication Errors..................................... 478 15.5 You should............................................................................................................... 481 15.5.1 remember these definitions:.............................................................. 481 15.5.2 remember these terms......................................................................... 481 15.5.3 remember these facts:........................................................................... 481 15.5.4 be able to.................................................................................................... 481 V Some Mathematical Background &nb sp; 484 16 Resources 485 16.1 Useful Material about Matrices....................................................................... 485 16.1.1 The Singular Value Decomposition................................................. 486 16.1.2 Approximating A Symmetric Matrix............................................... 487 16.2 Some Special Functions..................................................................................... 489 16.3 Finding Nearest Neighbors............................................................................... 490 16.4 Entropy and Information Gain........................................................................ 493

Reviews

Author Information

David Alexander Forsyth is Fulton Watson Copp Chair in Computer Science at the University of Illinois at Urbana-Champaign, where he is a leading researcher in computer vision. Professor Forsyth has regularly served as a program or general chair for the top conferences in computer vision, and has just finished a second term as Editor-in-Chief for IEEE Transactions on Pattern Analysis and Machine Intelligence. A Fellow of the ACM (2014) and IEEE (2009), Forsyth has also been recognized with the IEEE Computer Society’s Technical Achievement Award (2005), the Marr Prize, and a prize for best paper in cognitive computer vision (ECCV 2002). Many of his former students are famous in their own right as academics or industry leaders. He is the co-author with Jean Ponce of Computer Vision: A Modern Approach (2002; 2011), published in four languages, and a leading textbook on the topic. Among a variety of odd hobbies, he is a compulsive diver, certified up to normoxic trimix level.

Tab Content 6

Author Website:

Customer Reviews

Recent Reviews

No review item found!

Add your own review!

Countries Available

All regions

Latest Reading Guide

Shopping Cart

Your cart is empty

Mailing List

Probability and Statistics for Computer Science

9783319877884

Availability Information

Overview

Full Product Details

9783319877884

Table of Contents

Reviews

Author Information

Tab Content 6

Customer Reviews

Recent Reviews

Countries Available

Sign up now