Best practices in data cleaning: a complete guide to everything you need to do before and after collecting your data

"Many researchers jump straight from data collection to data analysis without realizing how analyses and hypothesis tests can go profoundly wrong without clean data. This book provides a clear, step-by-step process to examining and cleaning data in order to decrease error rates and increase bot...

Bibliographic Details
Main Author: Osborne, Jason W.
Format: Book
Language: English
Published: Thousand Oaks, Calif. : SAGE, [2013]
Table of Contents:
  • Machine generated contents note: ch. 1 Why Data Cleaning Is Important: Debunking the Myth of Robustness
  • Origins of Data Cleaning
  • Are Things Really That Bad?
  • Why Care About Testing Assumptions and Cleaning Data?
  • How Can This State of Affairs Be True?
  • The Best Practices Orientation of This Book
  • Data Cleaning Is a Simple Process; However...
  • One Path to Solving the Problem
  • For Further Enrichment
  • SECTION I BEST PRACTICES AS YOU PREPARE FOR DATA COLLECTION
  • ch. 2 Power and Planning for Data Collection: Debunking the Myth of Adequate Power
  • Power and Best Practices in Statistical Analysis of Data
  • How Null-Hypothesis Statistical Testing Relates to Power
  • What Do Statistical Tests Tell Us?
  • How Does Power Relate to Error Rates?
  • Low Power and Type I Error Rates in a Literature
  • How to Calculate Power
  • The Effect of Power on the Replicability of Study Results
  • Can Data Cleaning Fix These Sampling Problems?
  • Conclusions
  • For Further Enrichment
  • Appendix
  • ch. 3 Being True to the Target Population: Debunking the Myth of Representativeness
  • Sampling Theory and Generalizability
  • Aggregation or Omission Errors
  • Including Irrelevant Groups
  • Nonresponse and Generalizability
  • Consent Procedures and Sampling Bias
  • Generalizability of Internet Surveys
  • Restriction of Range
  • Extreme Groups Analysis
  • Conclusion
  • For Further Enrichment
  • ch. 4 Using Large Data Sets With Probability Sampling Frameworks: Debunking the Myth of Equality
  • What Types of Studies Use Complex Sampling?
  • Why Does Complex Sampling Matter?
  • Best Practices in Accounting for Complex Sampling
  • Does It Really Make a Difference in the Results?
  • So What Does All This Mean?
  • For Further Enrichment
  • SECTION II BEST PRACTICES IN DATA CLEANING AND SCREENING
  • ch. 5 Screening Your Data for Potential Problems: Debunking the Myth of Perfect Data
  • The Language of Describing Distributions
  • Testing Whether Your Data Are Normally Distributed
  • Conclusions
  • For Further Enrichment
  • Appendix
  • ch. 6 Dealing With Missing or Incomplete Data: Debunking the Myth of Emptiness
  • What Is Missing or Incomplete Data?
  • Categories of Missingness
  • What Do We Do With Missing Data?
  • The Effects of Listwise Deletion
  • The Detrimental Effects of Mean Substitution
  • The Effects of Strong and Weak Imputation of Values
  • Multiple Imputation: A Modern Method of Missing Data Estimation
  • Missingness Can Be an Interesting Variable in and of Itself
  • Summing Up: What Are Best Practices?
  • For Further Enrichment
  • Appendixes
  • ch. 7 Extreme and Influential Data Points: Debunking the Myth of Equality
  • What Are Extreme Scores?
  • How Extreme Values Affect Statistical Analyses
  • What Causes Extreme Scores?
  • Extreme Scores as a Potential Focus of Inquiry
  • Identification of Extreme Scores
  • Why Remove Extreme Scores?
  • Effect of Extreme Scores on Inferential Statistics
  • Effect of Extreme Scores on Correlations and Regression
  • Effect of Extreme Scores on t-Tests and ANOVAs
  • To Remove or Not to Remove?
  • For Further Enrichment
  • ch. 8 Improving the Normality of Variables Through Box-Cox Transformation: Debunking the Myth of Distributional Irrelevance
  • Why Do We Need Data Transformations?
  • When a Variable Violates the Assumption of Normality
  • Traditional Data Transformations for Improving Normality
  • Application and Efficacy of Box-Cox Transformations
  • Reversing Transformations
  • Conclusion
  • For Further Enrichment
  • Appendix
  • ch. 9 Does Reliability Matter? Debunking the Myth of Perfect Measurement
  • What Is a Reasonable Level of Reliability?
  • Reliability and Simple Correlation or Regression
  • Reliability and Partial Correlations
  • Reliability and Multiple Regression
  • Reliability and Interactions in Multiple Regression
  • Protecting Against Overcorrecting During Disattenuation
  • Other Solutions to the Issue of Measurement Error
  • What If We Had Error-Free Measurement?
  • An Example From My Research
  • Does Reliability Influence Other Analyses?
  • The Argument That Poor Reliability Is Not That Important
  • Conclusions and Best Practices
  • For Further Enrichment
  • SECTION III ADVANCED TOPICS IN DATA CLEANING
  • ch. 10 Random Responding, Motivated Misresponding, and Response Sets: Debunking the Myth of the Motivated Participant
  • What Is a Response Set?
  • Common Types of Response Sets
  • Is Random Responding Truly Random?
  • Detecting Random Responding in Your Research
  • Does Random Responding Cause Serious Problems With Research?
  • Example of the Effects of Random Responding
  • Are Random Responders Truly Random Responders?
  • Summary
  • Best Practices Regarding Random Responding
  • Magnitude of the Problem
  • For Further Enrichment
  • ch. 11 Why Dichotomizing Continuous Variables Is Rarely a Good Practice: Debunking the Myth of Categorization
  • What Is Dichotomization and Why Does It Exist?
  • How Widespread Is This Practice?
  • Why Do Researchers Use Dichotomization?
  • Are Analyses With Dichotomous Variables Easier to Interpret?
  • Are Analyses With Dichotomous Variables Easier to Compute?
  • Are Dichotomous Variables More Reliable?
  • Other Drawbacks of Dichotomization
  • For Further Enrichment
  • ch. 12 The Special Challenge of Cleaning Repeated Measures Data: Lots of Pits in Which to Fall
  • Treat All Time Points Equally
  • What to Do With Extreme Scores?
  • Missing Data
  • Summary
  • ch. 13 Now That the Myths Are Debunked...: Visions of Rational Quantitative Methodology for the 21st Century