A Computational Approach to Statistical Learning gives a novel introduction to predictive modeling by focusing on the algorithmic and numeric motivations behind popular statistical methods. The text contains annotated code for over 80 original reference functions. These functions provide minimal working implementations of common statistical learning algorithms. Every chapter concludes with a fully worked-out application that illustrates predictive modeling tasks using a real-world dataset.

The text begins with a detailed analysis of linear models and ordinary least squares. Subsequent chapters explore extensions such as ridge regression, generalized linear models, and additive models. The second half focuses on the use of general-purpose algorithms for convex optimization and their application to tasks in statistical learning. Models covered include the elastic net, dense neural networks, convolutional neural networks (CNNs), and spectral clustering. A unifying theme throughout the text is the use of optimization theory in the description of predictive models, with a particular focus on the singular value decomposition (SVD). Through this theme, the computational approach motivates and clarifies the relationships between various predictive models.
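To illustrate how the SVD can tie two of these models together, here is a minimal sketch in Python with NumPy. It is not taken from the book, whose reference functions are written in R; the function name `svd_fit`, the simulated data, and the penalty value are all assumptions made for this example. With the thin SVD X = U diag(s) Vᵀ, ordinary least squares rescales each singular direction by 1/s, while ridge regression with penalty λ rescales it by s/(s² + λ).

```python
import numpy as np

def svd_fit(X, y, lam=0.0):
    """Least-squares coefficients via the thin SVD X = U diag(s) V^T.

    lam = 0 gives ordinary least squares (X is assumed to have full
    column rank); lam > 0 gives ridge regression with penalty
    lam * ||beta||^2. Hypothetical helper, not the book's code.
    """
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Each singular direction is rescaled by s / (s^2 + lam);
    # with lam = 0 this reduces to 1 / s, the pseudoinverse solution.
    scale = s / (s**2 + lam)
    return Vt.T @ (scale * (U.T @ y))

# Simulated data (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.1 * rng.normal(size=50)

beta_ols = svd_fit(X, y)              # ordinary least squares
beta_ridge = svd_fit(X, y, lam=5.0)   # ridge regression, lambda = 5

print("OLS:  ", beta_ols)
print("Ridge:", beta_ridge)
```

The only difference between the two fits is the scaling applied to the singular values, which is one way to see ridge regression as a shrunken version of least squares.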

Taylor Arnold is an assistant professor of statistics at the University of Richmond. His work at the intersection of computer vision, natural language processing, and digital humanities has been supported by multiple grants from the National Endowment for the Humanities (NEH) and the American Council of Learned Societies (ACLS). His first book, Humanities Data in R, was published in 2015.

Michael Kane is an assistant professor of biostatistics at Yale University. He is the recipient of grants from the National Institutes of Health (NIH), DARPA, and the Bill and Melinda Gates Foundation. His R package bigmemory won the Chambers prize for statistical software in 2010.

Bryan Lewis is an applied mathematician and author of many popular R packages, including irlba, doRedis, and threejs.

1. Introduction: Computational approach; Statistical learning; Example; Prerequisites; How to read this book; Supplementary materials; Formalisms and terminology; Exercises
2. Linear Models: Introduction; Ordinary least squares; The normal equations; Solving least squares with the singular value decomposition; Directly solving the linear system; (⋆) Solving linear models with orthogonal projection; (⋆) Sensitivity analysis; (⋆) Relationship between numerical and statistical error; Implementation and notes; Application: Cancer incidence rates; Exercises
3. Ridge Regression and Principal Component Analysis: Variance in OLS; Ridge regression; (⋆) A Bayesian perspective; Principal component analysis; Implementation and notes; Application: NYC taxicab data; Exercises
4. Linear Smoothers: Non-linearity; Basis expansion; Kernel regression; Local regression; Regression splines; (⋆) Smoothing splines; (⋆) B-splines; Implementation and notes; Application: US census tract data; Exercises
5. Generalized Linear Models: Classification with linear models; Exponential families; Iteratively reweighted GLMs; (⋆) Numerical issues; (⋆) Multi-class regression; Implementation and notes; Application: Chicago crime prediction; Exercises
6. Additive Models: Multivariate linear smoothers; Curse of dimensionality; Additive models; (⋆) Additive models as linear models; (⋆) Standard errors in additive models; Implementation and notes; Application: NYC flights data; Exercises
7. Penalized Regression Models: Variable selection; Penalized regression with the ℓ0- and ℓ1-norms; Orthogonal data matrix; Convex optimization and the elastic net; Coordinate descent; (⋆) Active set screening using the KKT conditions; (⋆) The generalized elastic net model; Implementation and notes; Application: Amazon product reviews; Exercises
8. Neural Networks: Dense neural network architecture; Stochastic gradient descent; Backward propagation of errors; Implementing backpropagation; Recognizing hand written digits; (⋆) Improving SGD and regularization; (⋆) Classification with neural networks; (⋆) Convolutional neural networks; Implementation and notes; Application: Image classification with EMNIST; Exercises
9. Dimensionality Reduction: Unsupervised learning; Kernel functions; Kernel principal component analysis; Spectral clustering; t-Distributed stochastic neighbor embedding (t-SNE); Autoencoders; Implementation and notes; Application: Classifying and visualizing fashion MNIST; Exercises
10. Computation in Practice: Reference implementations; Sparse matrices; Sparse generalized linear models; Computation on row chunks; Feature hashing; Data quality issues; Implementation and notes; Application; Exercises

A. Matrix Algebra: Vector spaces; Matrices; Other useful matrix decompositions

B. Floating Point Arithmetic and Numerical Computation: Floating point arithmetic; Numerical sources of error; Computational effort

As best as I can determine, 'A Computational Approach to Statistical Learning' (CASL) is unique among R books devoted to statistical learning and data science. Other popular texts...cover much of the same ground, and include extensive R code implementing statistical models. What makes CASL different is the unifying mathematical structure underlying the presentation and the focus on the computations themselves...CASL's great strengths are the use of linear algebra to provide a coherent, unifying mathematical framework for explaining a wide class of models, a lucid writing style that appeals to geometric intuition, clear explanations of many details that are mostly glossed over in more superficial treatments, the inclusion of historical references, and R code that is tightly integrated into the text. The R code is extensive, concise without being opaque, and in many cases, elegant. The code illustrates R's advantages for developing statistical algorithms as well as its power to present versatile and compelling visualizations...CASL ought to appeal to anyone working in data science or machine learning seeking a sophisticated understanding of both the theoretical basis and efficient algorithms underlying a modern approach to computational statistics.
~Joe Rickert, RStudio

The 'literate programming' style is my favorite part of this book (borrowing the term from Don Knuth). It would be well suited for an engineer seeking to understand the implementations and ideas behind these statistical models. Real code beats pseudocode, because one can easily tweak and experiment with it...The other part I especially like is the development of neural nets based on extending the models previously introduced in the text. This takes some of the mystery out of neural nets and makes them more accessible to a statistician studying them for the first time... I would happily buy this book for my own reference and self-study... I'm not aware of any books that are written at this level that combine the motivation, the mathematics and the code in such a nice way. If I ever happen to be teaching a course on this material, then I would definitely teach from this book.
~Clark Fitzgerald, University of California, Davis

I think the book is quite clearly written and covers really important things to consider that can help optimize model building. The book does a really great job of following its theme throughout and explicitly mentioning why they are explaining something the way they explain it. Reading the book, it is clear they considered how all the parts they included (at least the chapters I read) fit into the broader scope of the book's goal.
~Justin Post, North Carolina State University