Fundamentals of Robust Machine Learning: Handling Outliers and Anomalies in Data Science


About the Book

An essential guide to tackling outliers and anomalies in machine learning and data science.

In recent years, machine learning (ML) has transformed virtually every area of research and technology, becoming one of the key tools for data scientists. Robust machine learning addresses an often-overlooked aspect of data science: the handling of outliers in datasets. Ignoring outliers can lead to bad business decisions, wrong medical diagnoses, incorrect conclusions, or misjudged feature importance, to name just a few consequences. Fundamentals of Robust Machine Learning offers a thorough but accessible overview of the subject, focusing on how to properly handle outliers and anomalies in datasets. The book describes two main approaches: using outlier-tolerant ML tools, or removing outliers before applying conventional tools. Balancing theoretical foundations with practical Python code, it provides all the skills needed to enhance the accuracy, stability, and reliability of ML models.

Readers of Fundamentals of Robust Machine Learning will also find:
  • A blend of robust statistics and machine learning principles
  • Detailed discussion of a wide range of robust machine learning methodologies, from robust clustering, regression, and classification to neural networks and anomaly detection
  • Python code with immediate application to data science problems

Fundamentals of Robust Machine Learning is ideal for undergraduate and graduate students in data science, machine learning, and related fields, as well as for professionals looking to enhance their understanding of building models in the presence of outliers.
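To make the two approaches concrete, here is a minimal Python sketch (our own illustration, not code from the book) fitting a line to data that contains one gross outlier. Approach 1 uses an outlier-tolerant estimator, with scikit-learn's HuberRegressor standing in for the robust losses the book develops; approach 2 removes the outlier with a MAD-based edit rule, in the spirit of the 4.5-MAD rule listed in Chapter 4, and then fits ordinary least squares.

    import numpy as np
    from sklearn.linear_model import HuberRegressor, LinearRegression

    rng = np.random.default_rng(0)
    X = np.linspace(0.0, 10.0, 50).reshape(-1, 1)
    y = 2.0 * X.ravel() + 1.0 + rng.normal(0.0, 0.5, size=50)
    y[10] = 100.0  # inject one gross outlier

    # Approach 1: an outlier-tolerant loss (Huber) absorbs the outlier.
    huber = HuberRegressor().fit(X, y)

    # Approach 2: detect and remove the outlier, then use a conventional tool.
    # A 4.5-MAD edit rule on y; 1.4826 rescales MAD to sigma for Gaussian data.
    med = np.median(y)
    mad = 1.4826 * np.median(np.abs(y - med))
    keep = np.abs(y - med) <= 4.5 * mad
    ols = LinearRegression().fit(X[keep], y[keep])

    print(f"true slope 2.0 | Huber {huber.coef_[0]:.2f} | "
          f"OLS after removal {ols.coef_[0]:.2f}")

Applying a location-based edit rule to trending data is the crudest possible choice and is used here only to keep the sketch short; the regression-based detection the book covers in Chapter 4 (Section 4.6) is the more appropriate tool for this setting.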

Table of Contents:
Preface
About the Companion Website

1 Introduction
   1.1 Defining Outliers
   1.2 Overview of the Book
   1.3 What Is Robust Machine Learning?
      1.3.1 Machine Learning Basics
      1.3.2 Effect of Outliers
      1.3.3 What Is Robust Data Science?
      1.3.4 Noise in Datasets
      1.3.5 Training and Testing Flows
   1.4 Robustness of the Median
      1.4.1 Mean vs. Median
      1.4.2 Effect on Standard Deviation
   1.5 ℓ1 and ℓ2 Norms
   1.6 Review of Gaussian Distribution
   1.7 Unsupervised Learning Case Study
      1.7.1 Clustering Example
      1.7.2 Clustering Problem Specification
   1.8 Creating Synthetic Data for Clustering
      1.8.1 One-Dimensional Datasets
      1.8.2 Multidimensional Datasets
   1.9 Clustering Algorithms
      1.9.1 k-Means Clustering
      1.9.2 k-Medians Clustering
   1.10 Importance of Robust Clustering
      1.10.1 Clustering with No Outliers
      1.10.2 Clustering with Outliers
      1.10.3 Detection and Removal of Outliers
   1.11 Summary
   Problems
   References

2 Robust Linear Regression
   2.1 Introduction
   2.2 Supervised Learning
   2.3 Linear Regression
   2.4 Importance of Residuals
      2.4.1 Defining Errors and Residuals
      2.4.2 Residuals in Loss Functions
      2.4.3 Distribution of Residuals
   2.5 Estimation Background
      2.5.1 Linear Models
      2.5.2 Desirable Properties of Estimators
      2.5.3 Maximum-Likelihood Estimation
      2.5.4 Gradient Descent
   2.6 M-Estimation
   2.7 Least Squares Estimation (LSE)
   2.8 Least Absolute Deviation (LAD)
   2.9 Comparison of LSE and LAD
      2.9.1 Simple Linear Model
      2.9.2 Location Problem
   2.10 Huber’s Method
      2.10.1 Huber Loss Function
      2.10.2 Comparison with LSE and LAD
   2.11 Summary
   Problems
   References

3 The Log-Cosh Loss Function
   3.1 Introduction
   3.2 An Intuitive View of Log-Cosh
   3.3 Hyperbolic Functions
   3.4 M-Estimation
      3.4.1 Asymptotic Behavior
      3.4.2 Linear Regression Using Log-Cosh
   3.5 Deriving the Distribution for Log-Cosh
   3.6 Standard Errors for Robust Estimators
      3.6.1 Example: Swiss Fertility Dataset
      3.6.2 Example: Boston Housing Dataset
   3.7 Statistical Properties of Log-Cosh Loss
      3.7.1 Maximum-Likelihood Estimation
   3.8 A General Log-Cosh Loss Function
   3.9 Summary
   Problems
   References

4 Outlier Detection, Metrics, and Standardization
   4.1 Introduction
   4.2 Effect of Outliers
   4.3 Outlier Diagnosis
      4.3.1 Boxplots
      4.3.2 Histogram Plots
      4.3.3 Exploratory Data Analysis
   4.4 Outlier Detection
      4.4.1 3-Sigma Edit Rule
      4.4.2 4.5-MAD Edit Rule
      4.4.3 1.5-IQR Edit Rule
   4.5 Outlier Removal
      4.5.1 Trimming Methods
      4.5.2 Winsorization
      4.5.3 Anomaly Detection Method
   4.6 Regression-Based Outlier Detection
      4.6.1 LS vs. LC Residuals
      4.6.2 Comparison of Detection Methods
      4.6.3 Ordered Absolute Residuals (OARs)
      4.6.4 Quantile–Quantile Plot
      4.6.5 Quad-Plots for Outlier Diagnosis
   4.7 Regression-Based Outlier Removal
      4.7.1 Iterative Boxplot Method
   4.8 Regression Metrics with Outliers
      4.8.1 Mean Square Error (MSE)
      4.8.2 Median Absolute Error (MAE)
      4.8.3 MSE vs. MAE on Realistic Data
      4.8.4 Selecting Hyperparameters for Robust Regression
   4.9 Dataset Standardization
      4.9.1 Robust Standardization
   4.10 Summary
   Problems
   References

5 Robustness of Penalty Estimators
   5.1 Introduction
   5.2 Penalty Functions
      5.2.1 Multicollinearity
      5.2.2 Penalized Loss Functions
   5.3 Ridge Penalty
   5.4 LASSO Penalty
   5.5 Effect of Penalty Functions
   5.6 Penalty Functions with Outliers
   5.7 Ridge Traces
   5.8 Elastic Net (Enet) Penalty
   5.9 Adaptive LASSO (aLASSO) Penalty
   5.10 Penalty Effects on Variance and Bias
      5.10.1 Effect on Variance
      5.10.2 Geometric Interpretation of Bias
   5.11 Variable Importance
      5.11.1 The t-Statistic
      5.11.2 LASSO and aLASSO Traces
   5.12 Summary
   Problems
   References

6 Robust Regularized Models
   6.1 Introduction
   6.2 Overfitting and Underfitting
   6.3 The Bias–Variance Trade-Off
   6.4 Regularization with Ridge
      6.4.1 Selection of Hyperparameter λ
      6.4.2 Example: Diabetes Dataset
   6.5 Generalization Using Robust Estimators
      6.5.1 Training and Test Sets
      6.5.2 k-Fold Cross-Validation
   6.6 Robust Generalization and Regularization
      6.6.1 Regularization with LC-Ridge
   6.7 Model Complexity
      6.7.1 Variable Selection Using LS-LASSO
      6.7.2 Variable Ordering Using LC-aLASSO
      6.7.3 Building a Compact Model
   6.8 Summary
   Problems
   References

7 Quantile Regression Using Log-Cosh
   7.1 Introduction
   7.2 Understanding Quantile Regression
   7.3 The Crossing Problem
   7.4 Standard Quantile Loss Function
   7.5 Smooth Regression Quantiles (SMRQ)
   7.6 Evaluation of Quantile Methods
      7.6.1 Qualitative Assessment
      7.6.2 Quantitative Assessment
   7.7 Selection of Robustness Coefficient
   7.8 Maximum-Likelihood Procedure for SMRQ
   7.9 Standard Error Computation
   7.10 Summary
   Problems
   References

8 Robust Binary Classification
   8.1 Introduction
   8.2 Binary Classification Problem
      8.2.1 Why Linear Regression Fails
      8.2.2 Outliers in Binary Classification
   8.3 The Cross-Entropy (CE) Loss
      8.3.1 Deriving the Cross-Entropy Loss
      8.3.2 Understanding Logistic Regression
      8.3.3 Gradient Descent
   8.4 The Log-Cosh (LC) Loss Function
      8.4.1 General Formulation
   8.5 Algorithms for Logistic Regression
   8.6 Example: Motor Trend Cars
   8.7 Regularization of Logistic Regression
      8.7.1 Overfitting and Underfitting
      8.7.2 k-Fold Cross-Validation
      8.7.3 Penalty Functions
      8.7.4 Effect of Outliers
   8.8 Example: Circular Dataset
   8.9 Outlier Detection
   8.10 Robustness of Binary Classifiers
      8.10.1 Support Vector Classifier (SVC)
      8.10.2 Support Vector Machines (SVMs)
      8.10.3 k-Nearest Neighbors (k-NN)
      8.10.4 Decision Trees and Random Forest
   8.11 Summary
   Problems
   Reference

9 Neural Networks Using Log-Cosh
   9.1 Introduction
   9.2 A Brief History of Neural Networks
   9.3 Defining Neural Networks
      9.3.1 Basic Computational Unit
      9.3.2 Four-Layer Neural Network
      9.3.3 Activation Functions
   9.4 Training of Neural Networks
   9.5 Forward and Backward Propagation
      9.5.1 Forward Propagation
      9.5.2 Backward Propagation
      9.5.3 Log-Cosh Gradients
   9.6 Cross-Entropy and Log-Cosh Algorithms
   9.7 Example: Circular Dataset
   9.8 Classification Metrics and Outliers
      9.8.1 Precision, Recall, F1 Score
      9.8.2 Receiver Operating Characteristics (ROCs)
   9.9 Summary
   Problems
   References

10 Multi-class Classification and Adam Optimization
   10.1 Introduction
   10.2 Multi-class Classification
      10.2.1 Multi-class Loss Functions
      10.2.2 Softmax Activation Function
   10.3 Example: MNIST Dataset
      10.3.1 Neural Network Architecture
      10.3.2 Comparing Cross-Entropy with Log-Cosh Losses
      10.3.3 Outliers in MNIST
   10.4 Optimization of Neural Networks
      10.4.1 Momentum
      10.4.2 RMSprop Approach
      10.4.3 Optimizer Warm-Up Phase
      10.4.4 Adam Optimizer
   10.5 Summary
   Problems
   References

11 Anomaly Detection and Evaluation Metrics
   11.1 Introduction
   11.2 Anomaly Detection Methods
      11.2.1 k-Nearest Neighbors
      11.2.2 DBSCAN
      11.2.3 Isolation Forest
   11.3 Anomaly Detection Using MADmax
      11.3.1 Robust Standardization
      11.3.2 k-Medians Clustering
      11.3.3 Selecting MADmax
      11.3.4 k-Nearest Neighbors (k-NN)
      11.3.5 k-Nearest Medians (k-NM)
   11.4 Qualitative Evaluation Methods
   11.5 Quantitative Evaluation Methods
   11.6 Summary
   Problems
   Reference

12 Case Studies in Data Science
   12.1 Introduction
   12.2 Example: Boston Housing Dataset
      12.2.1 Exploratory Data Analysis
      12.2.2 Neural Network Architecture
      12.2.3 Comparison of LSNN and LCNN
      12.2.4 Predicting Housing Prices
      12.2.5 RMSE vs. MAE
      12.2.6 Correlation Coefficients
   12.3 Example: Titanic Dataset
      12.3.1 Exploratory Data Analysis
      12.3.2 LCLR vs. CELR
      12.3.3 Outlier Detection and Removal
      12.3.4 Robustness Coefficient for Log-Cosh
      12.3.5 The Implications of Robustness
      12.3.6 Ridge and aLASSO
   12.4 Application to Explainable Artificial Intelligence (XAI)
      12.4.1 Case Study: Logistic Regression
      12.4.2 Case Study: Neural Networks
   12.5 Time Series Example: Climate Change
      12.5.1 Autoregressive Model
      12.5.2 Forecasting Using AR(p)
      12.5.3 Stationary Time Series
      12.5.4 Moving Average
      12.5.5 Finding Outliers in Time Series
   12.6 Summary and Conclusions
   Problems
   References

Index
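The log-cosh loss is the thread running through Chapters 3, 7, 8, and 9. As a rough intuition for why it keeps recurring, here is a short Python sketch (our own illustration, assuming the standard log-cosh form rather than the book's generalized version from Section 3.8):

    import numpy as np

    def log_cosh_loss(r):
        # log(cosh(r)), written as logaddexp(r, -r) - log(2) for numerical
        # stability: behaves like r**2 / 2 near zero (smooth, like least
        # squares) and like |r| - log(2) for large |r| (robust, like LAD).
        return np.mean(np.logaddexp(r, -r) - np.log(2.0))

    def log_cosh_grad(r):
        # d/dr log(cosh(r)) = tanh(r) is bounded in (-1, 1), so a single
        # gross outlier cannot dominate a gradient-descent update.
        return np.tanh(r)

    residuals = np.array([-0.5, 0.1, 0.3, 80.0])  # last value mimics an outlier
    print(log_cosh_loss(residuals))   # the outlier contributes ~|r|, not r**2
    print(log_cosh_grad(residuals))   # its gradient saturates near 1.0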


Product Details
  • ISBN-13: 9781394294374
  • Publisher: John Wiley & Sons Inc
  • Publisher Imprint: John Wiley & Sons Inc
  • Height: 234 mm
  • No of Pages: 416
  • Spine Width: 28 mm
  • Weight: 839 g
  • ISBN-10: 1394294379
  • Publisher Date: 09 May 2025
  • Binding: Hardback
  • Language: English
  • Returnable: Y
  • Sub Title: Handling Outliers and Anomalies in Data Science
  • Width: 188 mm

