### Regression Analysis and Linear Models: Concepts, Applications, and Implementation

**by Richard Darlington and Andrew Hayes**

*Guilford Publications*

- Pub Date: 09/2016
- ISBN: 9781462521135
- Format: Hbk, *661 pages*
- Price: **AU$156.00** *NZ$160.87*

**Product Status:**

*Available in Approx 14 days*

Emphasizing conceptual understanding over mathematics, this user-friendly text introduces linear regression analysis to students and researchers across the social, behavioral, consumer, and health sciences. Coverage includes model construction and estimation, quantification and measurement of multivariate and partial associations, statistical control, group comparisons, moderation analysis, mediation and path analysis, and regression diagnostics, among other important topics. Engaging worked-through examples demonstrate each technique, accompanied by helpful advice and cautions. The use of SPSS, SAS, and STATA is emphasized, with an appendix on regression analysis using R. The author's website (*www.afhayes.com*) provides datasets for the book's examples as well as the RLM macro for SPSS and SAS.

Pedagogical Features:

- Chapters include SPSS, SAS, or STATA code pertinent to the analyses described, with each distinctively formatted for easy identification.
- An appendix documents the RLM macro, which facilitates computations for estimating and probing interactions, dominance analysis, heteroscedasticity-consistent standard errors, and linear spline regression, among other analyses.
- Students are guided to practice what they learn in each chapter using datasets provided online.
- Addresses topics not usually covered, such as ways to measure a variable’s importance, coding systems for representing categorical variables, causation, and myths about testing interaction.

List of Symbols and Abbreviations

1. Statistical Control and Linear Models
    - 1.1 Statistical Control
        - 1.1.1 The Need for Control
        - 1.1.2 Five Methods of Control
        - 1.1.3 Examples of Statistical Control
    - 1.2 An Overview of Linear Models
        - 1.2.1 What You Should Know Already
        - 1.2.2 Statistical Software for Linear Modeling and Statistical Control
        - 1.2.3 About Formulas
        - 1.2.4 On Symbolic Representations
    - 1.3 Chapter Summary
2. The Simple Regression Model
    - 2.1 Scatterplots and Conditional Distributions
        - 2.1.1 Scatterplots
        - 2.1.2 A Line through Conditional Means
        - 2.1.3 Errors of Estimate
    - 2.2 The Simple Regression Model
        - 2.2.1 The Regression Line
        - 2.2.2 Variance, Covariance, and Correlation
        - 2.2.3 Finding the Regression Line
        - 2.2.4 Example Computations
        - 2.2.5 Linear Regression Analysis by Computer
    - 2.3 The Regression Coefficient versus the Correlation Coefficient
        - 2.3.1 Properties of the Regression and Correlation Coefficients
        - 2.3.2 Uses of the Regression and Correlation Coefficients
    - 2.4 Residuals
        - 2.4.1 The Three Components of Y
        - 2.4.2 Algebraic Properties of Residuals
        - 2.4.3 Residuals as Y Adjusted for Differences in X
        - 2.4.4 Residual Analysis
    - 2.5 Chapter Summary
3. Partial Relationship and the Multiple Regression Model
    - 3.1 Regression Analysis with More Than One Predictor Variable
        - 3.1.1 An Example
        - 3.1.2 Regressors
        - 3.1.3 Models
        - 3.1.4 Representing a Model Geometrically
        - 3.1.5 Model Errors
        - 3.1.6 An Alternative View of the Model
    - 3.2 The Best-Fitting Model
        - 3.2.1 Model Estimation with Computer Software
        - 3.2.2 Partial Regression Coefficients
        - 3.2.3 The Regression Constant
        - 3.2.4 Problems with Three or More Regressors
        - 3.2.5 The Multiple Correlation R
    - 3.3 Scale-Free Measures of Partial Association
        - 3.3.1 Semipartial Correlation
        - 3.3.2 Partial Correlation
        - 3.3.3 The Standardized Regression Coefficient
    - 3.4 Some Relations among Statistics
        - 3.4.1 Relations among Simple, Multiple, Partial, and Semipartial Correlations
        - 3.4.2 Venn Diagrams
        - 3.4.3 Partial Relationships and Simple Relationships May Have Different Signs
        - 3.4.4 How Covariates Affect Regression Coefficients
        - 3.4.5 Formulas for bj, prj, srj, and R
    - 3.5 Chapter Summary
4. Statistical Inference in Regression
    - 4.1 Concepts in Statistical Inference
        - 4.1.1 Statistics and Parameters
        - 4.1.2 Assumptions for Proper Inference
        - 4.1.3 Expected Values and Unbiased Estimation
    - 4.2 The ANOVA Summary Table
        - 4.2.1 Data = Model + Error
        - 4.2.2 Total and Regression Sums of Squares
        - 4.2.3 Degrees of Freedom
        - 4.2.4 Mean Squares
    - 4.3 Inference about the Multiple Correlation
        - 4.3.1 Biased and Less Biased Estimation of TR²
        - 4.3.2 Testing a Hypothesis about TR
    - 4.4 The Distribution of and Inference about a Partial Regression Coefficient
        - 4.4.1 Testing a Null Hypothesis about Tbj
        - 4.4.2 Interval Estimates for Tbj
        - 4.4.3 Factors Affecting the Standard Error of bj
        - 4.4.4 Tolerance
    - 4.5 Inferences about Partial Correlations
        - 4.5.1 Testing a Null Hypothesis about Tprj and Tsrj
        - 4.5.2 Other Inferences about Partial Correlations
    - 4.6 Inferences about Conditional Means
    - 4.7 Miscellaneous Issues in Inference
        - 4.7.1 How Great a Drawback Is Collinearity?
        - 4.7.2 Contradicting Inferences
        - 4.7.3 Sample Size and Nonsignificant Covariates
        - 4.7.4 Inference in Simple Regression (When k = 1)
    - 4.8 Chapter Summary
5. Extending Regression Analysis Principles
    - 5.1 Dichotomous Regressors
        - 5.1.1 Indicator or Dummy Variables
        - 5.1.2 Y Is a Group Mean
        - 5.1.3 The Regression Coefficient for an Indicator Is a Difference
        - 5.1.4 A Graphic Representation
        - 5.1.5 A Caution about Standardized Regression Coefficients for Dichotomous Regressors
        - 5.1.6 Artificial Categorization of Numerical Variables
    - 5.2 Regression to the Mean
        - 5.2.1 How Regression Got Its Name
        - 5.2.2 The Phenomenon
        - 5.2.3 Versions of the Phenomenon
        - 5.2.4 Misconceptions and Mistakes Fostered by Regression to the Mean
        - 5.2.5 Accounting for Regression to the Mean Using Linear Models
    - 5.3 Multidimensional Sets
        - 5.3.1 The Partial and Semipartial Multiple Correlation
        - 5.3.2 What It Means If PR = 0 or SR = 0
        - 5.3.3 Inference Concerning Sets of Variables
    - 5.4 A Glance at the Big Picture
        - 5.4.1 Further Extensions of Regression
        - 5.4.2 Some Difficulties and Limitations
    - 5.5 Chapter Summary
6. Statistical versus Experimental Control
    - 6.1 Why Random Assignment?
        - 6.1.1 Limitations of Statistical Control
        - 6.1.2 The Advantage of Random Assignment
        - 6.1.3 The Meaning of Random Assignment
    - 6.2 Limitations of Random Assignment
        - 6.2.1 Limitations Common to Statistical Control and Random Assignment
        - 6.2.2 Limitations Specific to Random Assignment
        - 6.2.3 Correlation and Causation
    - 6.3 Supplementing Random Assignment with Statistical Control
        - 6.3.1 Increased Precision and Power
        - 6.3.2 Invulnerability to Chance Differences between Groups
        - 6.3.3 Quantifying and Assessing Indirect Effects
    - 6.4 Chapter Summary
7. Regression for Prediction
    - 7.1 Mechanical Prediction and Regression
        - 7.1.1 The Advantages of Mechanical Prediction
        - 7.1.2 Regression as a Mechanical Prediction Method
        - 7.1.3 A Focus on R Rather Than the Regression Weights
    - 7.2 Estimating True Validity
        - 7.2.1 Shrunken versus Adjusted R
        - 7.2.2 Estimating TRS
        - 7.2.3 Shrunken R Using Statistical Software
    - 7.3 Selecting Predictor Variables
        - 7.3.1 Stepwise Regression
        - 7.3.2 All Subsets Regression
        - 7.3.3 How Do Variable Selection Methods Perform?
    - 7.4 Predictor Variable Configurations
        - 7.4.1 Partial Redundancy (the Standard Configuration)
        - 7.4.2 Complete Redundancy
        - 7.4.3 Independence
        - 7.4.4 Complementarity
        - 7.4.5 Suppression
        - 7.4.6 How These Configurations Relate to the Correlation between Predictors
        - 7.4.7 Configurations of Three or More Predictors
    - 7.5 Revisiting the Value of Human Judgment
    - 7.6 Chapter Summary
8. Assessing the Importance of Regressors
    - 8.1 What Does It Mean for a Variable to Be Important?
        - 8.1.1 Variable Importance in Substantive or Applied Terms
        - 8.1.2 Variable Importance in Statistical Terms
    - 8.2 Should Correlations Be Squared?
        - 8.2.1 Decision Theory
        - 8.2.2 Small Squared Correlations Can Reflect Noteworthy Effects
        - 8.2.3 Pearson’s r as the Ratio of a Regression Coefficient to Its Maximum Possible Value
        - 8.2.4 Proportional Reduction in Estimation Error
        - 8.2.5 When the Standard Is Perfection
        - 8.2.6 Summary
    - 8.3 Determining the Relative Importance of Regressors in a Single Regression Model
        - 8.3.1 The Limitations of the Standardized Regression Coefficient
        - 8.3.2 The Advantage of the Semipartial Correlation
        - 8.3.3 Some Equivalences among Measures
        - 8.3.4 Cohen’s f²
        - 8.3.5 Comparing Two Regression Coefficients in the Same Model
    - 8.4 Dominance Analysis
        - 8.4.1 Complete and Partial Dominance
        - 8.4.2 Example Computations
        - 8.4.3 Dominance Analysis Using a Regression Program
    - 8.5 Chapter Summary
9. Multicategorical Regressors
    - 9.1 Multicategorical Variables as Sets
        - 9.1.1 Indicator (Dummy) Coding
        - 9.1.2 Constructing Indicator Variables
        - 9.1.3 The Reference Category
        - 9.1.4 Testing the Equality of Several Means
        - 9.1.5 Parallels with Analysis of Variance
        - 9.1.6 Interpreting Estimated Y and the Regression Coefficients
    - 9.2 Multicategorical Regressors as or with Covariates
        - 9.2.1 Multicategorical Variables as Covariates
        - 9.2.2 Comparing Groups and Statistical Control
        - 9.2.3 Interpretation of Regression Coefficients
        - 9.2.4 Adjusted Means
        - 9.2.5 Parallels with ANCOVA
        - 9.2.6 More Than One Covariate
    - 9.3 Chapter Summary
10. More on Multicategorical Regressors
    - 10.1 Alternative Coding Systems
        - 10.1.1 Sequential (Adjacent or Repeated Categories) Coding
        - 10.1.2 Helmert Coding
        - 10.1.3 Effect Coding
    - 10.2 Comparisons and Contrasts
        - 10.2.1 Contrasts
        - 10.2.2 Computing the Standard Error of a Contrast
        - 10.2.3 Contrasts Using Statistical Software
        - 10.2.4 Covariates and the Comparison of Adjusted Means
    - 10.3 Weighted Group Coding and Contrasts
        - 10.3.1 Weighted Effect Coding
        - 10.3.2 Weighted Helmert Coding
        - 10.3.3 Weighted Contrasts
        - 10.3.4 Application to Adjusted Means
    - 10.4 Chapter Summary
11. Multiple Tests
    - 11.1 The Multiple-Test Problem
        - 11.1.1 An Illustration through Simulation
        - 11.1.2 The Problem Defined
        - 11.1.3 The Role of Sample Size
        - 11.1.4 The Generality of the Problem
        - 11.1.5 Do Omnibus Tests Offer “Protection”?
        - 11.1.6 Should You Be Concerned about the Multiple-Test Problem?
    - 11.2 The Bonferroni Method
        - 11.2.1 Independent Tests
        - 11.2.2 The Bonferroni Method for Nonindependent Tests
        - 11.2.3 Revisiting the Illustration
        - 11.2.4 Bonferroni Layering
        - 11.2.5 Finding an “Exact” p-Value
        - 11.2.6 Nonsense Values
        - 11.2.7 Flexibility of the Bonferroni Method
        - 11.2.8 Power of the Bonferroni Method
    - 11.3 Some Basic Issues Surrounding Multiple Tests
        - 11.3.1 Why Correct for Multiple Tests at All?
        - 11.3.2 Why Not Correct for the Whole History of Science?
        - 11.3.3 Plausibility and Logical Independence of Hypotheses
        - 11.3.4 Planned versus Unplanned Tests
    - 11.4 Summary
    - 11.5 Chapter Summary
12. Nonlinear Relationships
    - 12.1 Linear Regression Can Model Nonlinear Relationships
        - 12.1.1 When Must Curves Be Fitted?
        - 12.1.2 The Graphical Display of Curvilinearity
    - 12.2 Polynomial Regression
        - 12.2.1 Basic Principles
        - 12.2.2 An Example
        - 12.2.3 The Meaning of the Regression Coefficients for Lower-Order Regressors
        - 12.2.4 Centering Variables in Polynomial Regression
        - 12.2.5 Finding a Parabola’s Maximum or Minimum
    - 12.3 Spline Regression
        - 12.3.1 Linear Spline Regression
        - 12.3.2 Implementation in Statistical Software
        - 12.3.3 Polynomial Spline Regression
        - 12.3.4 Covariates, Weak Curvilinearity, and Choosing Joints
    - 12.4 Transformations of Dependent Variables or Regressors
        - 12.4.1 Logarithmic Transformation
        - 12.4.2 The Box–Cox Transformation
    - 12.5 Chapter Summary
13. Linear Interaction
    - 13.1 Interaction Fundamentals
        - 13.1.1 Interaction as a Difference in Slope
        - 13.1.2 Interaction between Two Numerical Regressors
        - 13.1.3 Interaction versus Intercorrelation
        - 13.1.4 Simple Linear Interaction
        - 13.1.5 Representing Simple Linear Interaction with a Cross-product
        - 13.1.6 The Symmetry of Interaction
        - 13.1.7 Interaction as a Warped Surface
        - 13.1.8 Covariates in a Regression Model with an Interaction
        - 13.1.9 The Meaning of the Regression Coefficients
        - 13.1.10 An Example with Estimation Using Statistical Software
    - 13.2 Interaction Involving a Categorical Regressor
        - 13.2.1 Interaction between a Dichotomous and a Numerical Regressor
        - 13.2.2 The Meaning of the Regression Coefficients
        - 13.2.3 Interaction Involving a Multicategorical and a Numerical Regressor
        - 13.2.4 Inference When Interaction Requires More Than One Regression Coefficient
        - 13.2.5 A Substantive Example
        - 13.2.6 Interpretation of the Regression Coefficients
    - 13.3 Interaction between Two Categorical Regressors
        - 13.3.1 The 2 × 2 Design
        - 13.3.2 Interaction between a Dichotomous and a Multicategorical Regressor
        - 13.3.3 Interaction between Two Multicategorical Regressors
    - 13.4 Chapter Summary
14. Probing Interactions and Various Complexities
    - 14.1 Conditional Effects as Functions
        - 14.1.1 When the Interaction Involves Dichotomous or Numerical Variables
        - 14.1.2 When the Interaction Involves a Multicategorical Variable
    - 14.2 Inference about a Conditional Effect
        - 14.2.1 When the Focal Predictor and Moderator Are Numerical or Dichotomous
        - 14.2.2 When the Focal Predictor or Moderator Is Multicategorical
    - 14.3 Probing an Interaction
        - 14.3.1 Examining Conditional Effects at Various Values of the Moderator
        - 14.3.2 The Johnson–Neyman Technique
        - 14.3.3 Testing versus Probing an Interaction
        - 14.3.4 Comparing Conditional Effects
    - 14.4 Complications and Confusions in the Study of Interactions
        - 14.4.1 The Difficulty of Detecting Interactions
        - 14.4.2 Confusing Interaction with Curvilinearity
        - 14.4.3 How the Scaling of Y Affects Interaction
        - 14.4.4 The Interpretation of Lower-Order Regression Coefficients When a Cross-Product Is Present
        - 14.4.5 Some Myths about Testing Interaction
        - 14.4.6 Interaction and Nonsignificant Linear Terms
        - 14.4.7 Homogeneity of Regression in ANCOVA
        - 14.4.8 Multiple, Higher-Order, and Curvilinear Interactions
        - 14.4.9 Artificial Categorization of Continua
    - 14.5 Organizing Tests on Interaction
        - 14.5.1 Three Approaches to Managing Complications
        - 14.5.2 Broad versus Narrow Tests
    - 14.6 Chapter Summary
15. Mediation and Path Analysis
    - 15.1 Path Analysis and Linear Regression
        - 15.1.1 Direct, Indirect, and Total Effects
        - 15.1.2 The Regression Algebra of Path Analysis
        - 15.1.3 Covariates
        - 15.1.4 Inference about the Total and Direct Effects
        - 15.1.5 Inference about the Indirect Effect
        - 15.1.6 Implementation in Statistical Software
    - 15.2 Multiple Mediator Models
        - 15.2.1 Path Analysis for a Parallel Mediation Model
        - 15.2.2 Path Analysis for a Serial Mediation Model
    - 15.3 Extensions, Complications, and Miscellaneous Issues
        - 15.3.1 Causality and Causal Order
        - 15.3.2 The Causal Steps Approach
        - 15.3.3 Mediation of a Nonsignificant Total Effect
        - 15.3.4 Multicategorical Independent Variables
        - 15.3.5 Fixing Direct Effects to Zero
        - 15.3.6 Nonlinear Effects
        - 15.3.7 Moderated Mediation
    - 15.4 Chapter Summary
16. Detecting and Managing Irregularities
    - 16.1 Regression Diagnostics
        - 16.1.1 Shortcomings of Eyeballing the Data
        - 16.1.2 Types of Extreme Cases
        - 16.1.3 Quantifying Leverage, Distance, and Influence
        - 16.1.4 Using Diagnostic Statistics
        - 16.1.5 Generating Regression Diagnostics with Computer Software
    - 16.2 Detecting Assumption Violations
        - 16.2.1 Detecting Nonlinearity
        - 16.2.2 Detecting Non-Normality
        - 16.2.3 Detecting Heteroscedasticity
        - 16.2.4 Testing Assumptions as a Set
        - 16.2.5 What about Nonindependence?
    - 16.3 Dealing with Irregularities
        - 16.3.1 Heteroscedasticity-Consistent Standard Errors
        - 16.3.2 The Jackknife
        - 16.3.3 Bootstrapping
        - 16.3.4 Permutation Tests
    - 16.4 Inference without Random Sampling
    - 16.5 Keeping the Diagnostic Analysis Manageable
    - 16.6 Chapter Summary
17. Power, Measurement Error, and Various Miscellaneous Topics
    - 17.1 Power and Precision of Estimation
        - 17.1.1 Factors Determining Desirable Sample Size
        - 17.1.2 Revisiting the Standard Error of a Regression Coefficient
        - 17.1.3 On the Effect of Unnecessary Covariates
    - 17.2 Measurement Error
        - 17.2.1 What Is Measurement Error?
        - 17.2.2 Measurement Error in Y
        - 17.2.3 Measurement Error in Independent Variables
        - 17.2.4 The Biggest Weakness of Regression: Measurement Error in Covariates
        - 17.2.5 Summary: The Effects of Measurement Error
        - 17.2.6 Managing Measurement Error
    - 17.3 An Assortment of Problems
        - 17.3.1 Violations of the Basic Assumptions
        - 17.3.2 Collinearity
        - 17.3.3 Singularity
        - 17.3.4 Specification Error and Overcontrol
        - 17.3.5 Noninterval Scaling
        - 17.3.6 Missing Data
        - 17.3.7 Rounding Error
    - 17.4 Chapter Summary
18. Logistic Regression and Other Linear Models
    - 18.1 Logistic Regression
        - 18.1.1 Measuring a Model’s Fit to Data
        - 18.1.2 Odds and Logits
        - 18.1.3 The Logistic Regression Equation
        - 18.1.4 An Example with a Single Regressor
        - 18.1.5 Interpretation of and Inference about the Regression Coefficients
        - 18.1.6 Multiple Logistic Regression and Implementation in Computing Software
        - 18.1.7 Measuring and Testing the Fit of the Model
        - 18.1.8 Further Extensions
        - 18.1.9 Discriminant Function Analysis
        - 18.1.10 Using OLS Regression with a Dichotomous Y
    - 18.2 Other Linear Modeling Methods
        - 18.2.1 Ordered Logistic and Probit Regression
        - 18.2.2 Poisson Regression and Related Models of Count Outcomes
        - 18.2.3 Time Series Analysis
        - 18.2.4 Survival Analysis
        - 18.2.5 Structural Equation Modeling
        - 18.2.6 Multilevel Modeling
        - 18.2.7 Other Resources
    - 18.3 Chapter Summary

Appendices

- A. The RLM Macro for SPSS and SAS
- B. Linear Regression Analysis Using R
- C. Statistical Tables
- D. The Matrix Algebra of Linear Regression Analysis

Author Index

Subject Index

References

About the Authors

"This is a great textbook for students who have only basic knowledge of statistics yet would like to gain a deep conceptual understanding of regression. The book is up to date in current methods in regression, with strong examples using SAS/SPSS/STATA."--Chris Oshima, PhD, Department of Educational Policy Studies, Georgia State University

"A terrific addition to the regression literature. I am often asked, 'How do I determine which regressor(s) is/are the most important?' The treatment of this topic is excellent, and the authors have done a fantastic job of bringing important issues to light. The applied nature of the text and the interweaving of software syntax and output are major improvements over similar books. I like the fact that the book has software package information for SPSS, SAS, and STATA. It has a nice balance; not too technical on the statistical side, but not simply a 'how to' on the software side. I could see this book being used as the main text in our department's graduate-level regression course."--Scott C. Roesch, PhD, Department of Psychology, San Diego State University

"This fantastic introduction to the general linear model takes the reader from first principles through to widely used techniques such as mediation and path analysis. The clear writing makes it a pleasure to read. Students will find the book an invaluable resource. There are plenty of insights, too, for even seasoned researchers and data analysts. Instructors and students will appreciate the logical structure and bite-sized chapters that break the material up into manageable chunks."--Andy Field, PhD, Professor of Child Psychopathology, University of Sussex, United Kingdom

**Richard B. Darlington**, PhD, is Emeritus Professor of Psychology at Cornell University. He is a Fellow of the American Association for the Advancement of Science and has published extensively on regression and related methods, the cultural bias of mental tests, the long-term effects of preschool programs, and, most recently, the neuroscience of brain development and evolution.

**Andrew F. Hayes**, PhD, is Professor of Quantitative Psychology at The Ohio State University. His research and writing on data analysis have been published widely, and he is the author of *Introduction to Mediation, Moderation, and Conditional Process Analysis* and *Statistical Methods for Communication Science*, as well as coauthor, with Richard B. Darlington, of *Regression Analysis and Linear Models*. Dr. Hayes teaches data analysis, primarily at the graduate level, and frequently conducts workshops on statistical analysis throughout the world. His website is *www.afhayes.com*.