Mathematical Geology, Vol. 5, No. 4, 1973

Grouped Regression Analysis—A Sedimentologic Example¹

Frank G. Ethridge² and David K. Davies³

When sample data are divided into groups, and observations consist of the independent variable x and associated dependent variable y, a logical form of analysis is "grouped regression." This statistical technique allows testing of the relationship between the two variables and assessment of how the relationship is affected by the grouping. A sedimentologic example illustrates the usefulness of such a technique in classifying environments of deposition based on the size of quartz grains and the quartz content.

KEY WORDS: analysis of variance, regression analysis, statistics, sedimentation.

INTRODUCTION

Classification is of particular importance in many areas of the geological sciences. Grouped regression is a statistical technique which is ideally suited to problems of geological classification. This technique combines regression and analysis of variance. In so doing, grouped regression allows for the testing of the relationship between two variables, and also enables an assessment to be made of how the relationship between these variables is affected by groupings (or classifications) of the data. As such, the statistical technique demands that we be interested not only in relationships between variables, but also in how classification of data modifies these relationships.

The paper deals with the use of grouped regression in objectively discriminating between some Tertiary environments of deposition. Tertiary marine, fluvial, and bay-fill sediments from central Texas are shown to be significantly different in quartz grain size and quartz content. This fact suggests that future environmental discriminations in the Tertiary of central Texas can be made on the basis of easily quantifiable data: quartz size and quartz content.

¹ Manuscript received 2 August 1972; revised 29 January 1973.
² Department of Geology, Southern Illinois University (USA).
³ Department of Geology, University of Missouri, Columbia (USA).

© 1973 Plenum Publishing Corporation, 227 West 17th Street, New York, N.Y. 10011. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, microfilming, recording, or otherwise, without written permission of the publisher.

GROUPED REGRESSION—GENERAL

In grouped regression analysis, sample data are divided into groups (i.e., different environments of deposition, different analytical techniques, etc.). Observations consist of the independent variable x and the dependent variable y, as in simple regression analysis. In this situation, however, we are interested not only in the relationship between the two variables x and y, but also in how the relationship is affected by the groups. At least three models can be specified, as follows.

Model A. One unique linear relationship holds for all observations regardless of group (Fig. 1A). For this model we ignore the grouping and perform the usual regression analysis on the entire set of data. Model A is regarded as the simplest model and estimates m parameters in addition to the mean (one for each regression coefficient). With a single independent variable (m = 1), the residual mean square (deviations from regression) therefore has $\sum n_i - 2$ degrees of freedom, where $n_i$ is the number of observations in group i and the sum runs over all groups.

Model B. One regression coefficient (slope) holds for all data regardless of group, but each group is characterized by a different y intercept (Fig. 1B). Because the intercepts are essentially functions of the means, Model B describes the data as having one regression relationship if the data are adjusted or corrected for differences among group means. Model B (Fig. 1B) estimates $1 + p$ parameters (1 coefficient and p intercepts, one per group); hence, the residual mean square has $\sum n_i - 1 - p$ degrees of freedom. Therefore, any reduction in error sums of squares (deviations from regression) from Model A to Model B can be said to be a reduction in deviations due to the fitting of $p - 1$ additional intercepts. Operationally, each group's sum of squares and sum of cross products are adjusted or corrected for its respective group mean. These are then summed and subtracted from the uncorrected sums of squares and cross products to give the corrected sums of squares and cross products used in the analysis of variance.

Figure 1. Hypothetical illustration of the three grouped regression models (modified from Freund, 1968).

Model C. A completely different regression relationship characterizes each group (Fig. 1C). Model C estimates $m \times p$ parameters in addition to the group means; hence, it has a residual mean square with $\sum n_i - 2p$ degrees of freedom. In Model C, each group is treated as an independent sample, and the usual regression analysis and analysis of variance are performed. The appropriate sums of squares, mean squares, and degrees of freedom are then obtained by summing over all groups.

The model which best fits a given set of data is that model which explains most of the variation in terms of the parameters estimated. It is thus the model with the least amount of variation due to deviations from regression. By definition, each least-squares line is fitted to the data by minimizing the sums of the squared deviations. We can conclude, therefore, that if the two or more sets of data come from a population with a common regression line, the sum of squared deviations from such a common line will not be significantly greater than the sum of the squared deviations from individual lines (Griffiths, 1967, p. 453).

The logical procedure for analysis is one which starts with the most complicated model and steps down to successively lower order models until further simplifications are not statistically significant.
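The three models and the step-down comparison can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' computation: the function names (`corrected_sums`, `residuals_by_model`, `f_ratio`) and the data in `example_groups` are invented for this sketch, and Model B is fitted with the standard analysis-of-covariance shortcut of pooling the within-group corrected sums of squares and cross products. The degrees of freedom follow the text: $\sum n_i - 2$, $\sum n_i - 1 - p$, and $\sum n_i - 2p$.

```python
from typing import Dict, List, Sequence, Tuple

Group = Tuple[Sequence[float], Sequence[float]]  # (x values, y values) for one group
Fit = Tuple[float, int]  # (residual sum of squares, residual degrees of freedom)


def corrected_sums(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (Sxx, Sxy, Syy), each corrected for the means of x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxx, sxy, syy


def residuals_by_model(groups: List[Group]) -> Dict[str, Fit]:
    """Residual sum of squares and degrees of freedom for Models A, B, and C."""
    p = len(groups)
    n_total = sum(len(x) for x, _ in groups)

    # Model A: ignore the grouping and fit one line to the pooled data.
    all_x = [a for x, _ in groups for a in x]
    all_y = [b for _, y in groups for b in y]
    sxx, sxy, syy = corrected_sums(all_x, all_y)
    rss_a = syy - sxy ** 2 / sxx

    # Within-group sums, each corrected for its own group means.
    within = [corrected_sums(x, y) for x, y in groups]

    # Model B: one common slope, a separate intercept for each group.
    wxx = sum(w[0] for w in within)
    wxy = sum(w[1] for w in within)
    wyy = sum(w[2] for w in within)
    rss_b = wyy - wxy ** 2 / wxx

    # Model C: a separate line for each group; residuals summed over groups.
    rss_c = sum(w[2] - w[1] ** 2 / w[0] for w in within)

    return {
        "A": (rss_a, n_total - 2),
        "B": (rss_b, n_total - 1 - p),
        "C": (rss_c, n_total - 2 * p),
    }


def f_ratio(simpler: Fit, fuller: Fit) -> float:
    """Extra-sum-of-squares F ratio for a simpler model against a fuller one."""
    rss_s, df_s = simpler
    rss_f, df_f = fuller
    return ((rss_s - rss_f) / (df_s - df_f)) / (rss_f / df_f)


# Made-up numbers for three hypothetical groups (not data from the paper):
# similar slopes, clearly different intercepts.
example_groups: List[Group] = [
    ([1, 2, 3, 4, 5], [2.1, 3.9, 6.2, 7.8, 10.1]),
    ([1, 2, 3, 4, 5], [5.0, 7.1, 8.9, 11.2, 12.8]),
    ([1, 2, 3, 4, 5], [8.2, 9.9, 12.1, 13.8, 16.1]),
]

fits = residuals_by_model(example_groups)
# Step down from the most complicated model: C to B, then B to A.
print("F for B against C (separate slopes):", f_ratio(fits["B"], fits["C"]))
print("F for A against B (separate intercepts):", f_ratio(fits["A"], fits["B"]))
```

Each F ratio would be compared with a tabulated F value for (difference in degrees of freedom, degrees of freedom of the fuller model). For these invented data, dropping the separate slopes costs almost nothing while dropping the separate intercepts inflates the residual sum of squares sharply, so the step-down would stop at Model B.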
AN EXAMPLE FROM SEDIMENTOLOGY—THE CLAIBORNE OF TEXAS

General Philosophy

The most abundant and common mineral constituent of detrital sediments is quartz. To a large extent it is ignored in environmental studies, despite the fact that its abundance and ubiquity offer it as an obvious line of attack for the successful unravelling of depositional environments (Ferm, 1962). It is the thesis of this research that the quartz content of detrital sediments is significantly affected by changes in the depositional environment. Environments of deposition may be segregated on the basis of quartz content, and samples of unknown origin may be classified into one of these predetermined environments on the basis of a properly designed statistical study.
