Bookswagon-24x7 online bookstore
A General Introduction to Data Analytics

Available



About the Book

A guide to the principles and methods of data analysis that does not require knowledge of statistics or programming

A General Introduction to Data Analytics is an essential guide to understanding and using data analytics. It is written in easy-to-understand terms and does not require familiarity with statistics or programming. The authors, noted experts in the field, explain the intuition behind the basic data analytics techniques. The text also contains exercises and illustrative examples.

Designed to be accessible to non-experts, the book motivates the need to analyze data. It explains how to visualize and summarize data, and how to find natural groups and frequent patterns in a dataset. The book also explores predictive tasks, be they classification or regression. Finally, it discusses popular data analytics applications, such as web mining, information retrieval, social network analysis, working with text, and recommender systems. The learning resources offer:

  • A guide to the reasoning behind data mining techniques
  • A unique illustrative example that extends throughout all the chapters
  • Exercises at the end of each chapter and larger projects at the end of each of the text’s two main parts

Together with these learning resources, the book can serve as the guide for a 13-week course, with one chapter per course topic.

The book is written so that non-mathematicians, non-statisticians and non-computer scientists who want an introduction to data science can understand the main data analytics concepts. A General Introduction to Data Analytics is a basic guide to data analytics written in highly accessible terms.



Table of Contents:

Preface
Acknowledgments
Presentational Conventions
About the Companion Website

Part I Introductory Background

1 What Can We Do With Data?
1.1 Big Data and Data Science
1.2 Big Data Architectures
1.3 Small Data
1.4 What is Data?
1.5 A Short Taxonomy of Data Analytics
1.6 Examples of Data Use
1.6.1 Breast Cancer in Wisconsin
1.6.2 Polish Company Insolvency Data
1.7 A Project on Data Analytics
1.7.1 A Little History on Methodologies for Data Analytics
1.7.2 The KDD Process
1.7.3 The CRISP-DM Methodology
1.8 How this Book is Organized
1.9 Who Should Read this Book

Part II Getting Insights from Data

2 Descriptive Statistics
2.1 Scale Types
2.2 Descriptive Univariate Analysis
2.2.1 Univariate Frequencies
2.2.2 Univariate Data Visualization
2.2.3 Univariate Statistics
2.2.4 Common Univariate Probability Distributions
2.3 Descriptive Bivariate Analysis
2.3.1 Two Quantitative Attributes
2.3.2 Two Qualitative Attributes, at Least one of them Nominal
2.3.3 Two Ordinal Attributes
2.4 Final Remarks
2.5 Exercises

3 Descriptive Multivariate Analysis
3.1 Multivariate Frequencies
3.2 Multivariate Data Visualization
3.3 Multivariate Statistics
3.3.1 Location Multivariate Statistics
3.3.2 Dispersion Multivariate Statistics
3.4 Infographics and Word Clouds
3.4.1 Infographics
3.4.2 Word Clouds
3.5 Final Remarks
3.6 Exercises

4 Data Quality and Preprocessing
4.1 Data Quality
4.1.1 Missing Values
4.1.2 Redundant Data
4.1.3 Inconsistent Data
4.1.4 Noisy Data
4.1.5 Outliers
4.2 Converting to a Different Scale Type
4.2.1 Converting Nominal to Relative
4.2.2 Converting Ordinal to Relative or Absolute
4.2.3 Converting Relative or Absolute to Ordinal or Nominal
4.3 Converting to a Different Scale
4.4 Data Transformation
4.5 Dimensionality Reduction
4.5.1 Attribute Aggregation
4.5.1.1 Principal Component Analysis
4.5.1.2 Independent Component Analysis
4.5.1.3 Multidimensional Scaling
4.5.2 Attribute Selection
4.5.2.1 Filters
4.5.2.2 Wrappers
4.5.2.3 Embedded
4.5.2.4 Search Strategies
4.6 Final Remarks
4.7 Exercises

5 Clustering
5.1 Distance Measures
5.1.1 Differences between Values of Common Attribute Types
5.1.2 Distance Measures for Objects with Quantitative Attributes
5.1.3 Distance Measures for Non-conventional Attributes
5.2 Clustering Validation
5.3 Clustering Techniques
5.3.1 K-means
5.3.1.1 Centroids and Distance Measures
5.3.1.2 How K-means Works
5.3.2 DBSCAN
5.3.3 Agglomerative Hierarchical Clustering Technique
5.3.3.1 Linkage Criterion
5.3.3.2 Dendrograms
5.4 Final Remarks
5.5 Exercises

6 Frequent Pattern Mining
6.1 Frequent Itemsets
6.1.1 Setting the min_sup Threshold
6.1.2 Apriori – a Join-based Method
6.1.3 Eclat
6.1.4 FP-Growth
6.1.5 Maximal and Closed Frequent Itemsets
6.2 Association Rules
6.3 Behind Support and Confidence
6.3.1 Cross-support Patterns
6.3.2 Lift
6.3.3 Simpson’s Paradox
6.4 Other Types of Pattern
6.4.1 Sequential Patterns
6.4.2 Frequent Sequence Mining
6.4.3 Closed and Maximal Sequences
6.5 Final Remarks
6.6 Exercises

7 Cheat Sheet and Project on Descriptive Analytics
7.1 Cheat Sheet of Descriptive Analytics
7.1.1 On Data Summarization
7.1.2 On Clustering
7.1.3 On Frequent Pattern Mining
7.2 Project on Descriptive Analytics
7.2.1 Business Understanding
7.2.2 Data Understanding
7.2.3 Data Preparation
7.2.4 Modeling
7.2.5 Evaluation
7.2.6 Deployment

Part III Predicting the Unknown

8 Regression
8.1 Predictive Performance Estimation
8.1.1 Generalization
8.1.2 Model Validation
8.1.3 Predictive Performance Measures for Regression
8.2 Finding the Parameters of the Model
8.2.1 Linear Regression
8.2.1.1 Empirical Error
8.2.2 The Bias-variance Trade-off
8.2.3 Shrinkage Methods
8.2.3.1 Ridge Regression
8.2.3.2 Lasso Regression
8.2.4 Methods that use Linear Combinations of Attributes
8.2.4.1 Principal Components Regression
8.2.4.2 Partial Least Squares Regression
8.3 Technique and Model Selection
8.4 Final Remarks
8.5 Exercises

9 Classification
9.1 Binary Classification
9.2 Predictive Performance Measures for Classification
9.3 Distance-based Learning Algorithms
9.3.1 K-nearest Neighbor Algorithms
9.3.2 Case-based Reasoning
9.4 Probabilistic Classification Algorithms
9.4.1 Logistic Regression Algorithm
9.4.2 Naive Bayes Algorithm
9.5 Final Remarks
9.6 Exercises

10 Additional Predictive Methods
10.1 Search-based Algorithms
10.1.1 Decision Tree Induction Algorithms
10.1.2 Decision Trees for Regression
10.1.2.1 Model Trees
10.1.2.2 Multivariate Adaptive Regression Splines
10.2 Optimization-based Algorithms
10.2.1 Artificial Neural Networks
10.2.1.1 Backpropagation
10.2.1.2 Deep Networks and Deep Learning Algorithms
10.2.2 Support Vector Machines
10.2.2.1 SVM for Regression
10.3 Final Remarks
10.4 Exercises

11 Advanced Predictive Topics
11.1 Ensemble Learning
11.1.1 Bagging
11.1.2 Random Forests
11.1.3 AdaBoost
11.2 Algorithm Bias
11.3 Non-binary Classification Tasks
11.3.1 One-class Classification
11.3.2 Multi-class Classification
11.3.3 Ranking Classification
11.3.4 Multi-label Classification
11.3.5 Hierarchical Classification
11.4 Advanced Data Preparation Techniques for Prediction
11.4.1 Imbalanced Data Classification
11.4.2 For Incomplete Target Labeling
11.4.2.1 Semi-supervised Learning
11.4.2.2 Active Learning
11.5 Description and Prediction with Supervised Interpretable Techniques
11.6 Exercises

12 Cheat Sheet and Project on Predictive Analytics
12.1 Cheat Sheet on Predictive Analytics
12.2 Project on Predictive Analytics
12.2.1 Business Understanding
12.2.2 Data Understanding
12.2.3 Data Preparation
12.2.4 Modeling
12.2.5 Evaluation
12.2.6 Deployment

Part IV Popular Data Analytics Applications

13 Applications for Text, Web and Social Media
13.1 Working with Texts
13.1.1 Data Acquisition
13.1.2 Feature Extraction
13.1.2.1 Tokenization
13.1.2.2 Stemming
13.1.2.3 Conversion to Structured Data
13.1.2.4 Is the Bag of Words Enough?
13.1.3 Remaining Phases
13.1.4 Trends
13.1.4.1 Sentiment Analysis
13.1.4.2 Web Mining
13.2 Recommender Systems
13.2.1 Feedback
13.2.2 Recommendation Tasks
13.2.3 Recommendation Techniques
13.2.3.1 Knowledge-based Techniques
13.2.3.2 Content-based Techniques
13.2.3.3 Collaborative Filtering Techniques
13.2.4 Final Remarks
13.3 Social Network Analysis
13.3.1 Representing Social Networks
13.3.2 Basic Properties of Nodes
13.3.2.1 Degree
13.3.2.2 Distance
13.3.2.3 Closeness
13.3.2.4 Betweenness
13.3.2.5 Clustering Coefficient
13.3.3 Basic and Structural Properties of Networks
13.3.3.1 Diameter
13.3.3.2 Centralization
13.3.3.3 Cliques
13.3.3.4 Clustering Coefficient
13.3.3.5 Modularity
13.3.4 Trends and Final Remarks
13.4 Exercises

Appendix A: Comprehensive Description of the CRISP-DM Methodology

References

Index



Product Details
  • ISBN-13: 9781119296249
  • Publisher: John Wiley & Sons Inc
  • Publisher Imprint: Wiley-Interscience
  • Height: 226 mm
  • No of Pages: 352
  • Spine Width: 23 mm
  • Width: 155 mm
  • ISBN-10: 1119296242
  • Publisher Date: 24 Aug 2018
  • Binding: Hardback
  • Language: English
  • Returnable: N
  • Weight: 598 g

