Factor Analysis and Structural Equation Modeling


Factor Analysis

Factor analysis essentially consists of methods for finding clusters of related variables. Each such cluster, or factor, consists of a group of variables whose members correlate more highly among themselves than they do with variables outside the cluster. Factor analysis usually concerns variables and factors, but some applications concern individual scores and factor scores. The goal is to define factor weights that provide these scores.

Principal component analysis (PCA) and exploratory factor analysis (EFA)

PCA is a method of data reduction. It takes scores on a large set of measured variables and reduces them to scores on a smaller set of composite variables that retain as much information from the original variables as possible. EFA is based on the common factor model. This model postulates that each measured variable in a battery of measured variables is a linear function of one or more common factors (unobservable latent variables that influence more than one measured variable) and one unique factor (a latent variable that influences only one measured variable). Unique factors have a specific factor component and an error of measurement component. The goal of the common factor model is to understand the structure of correlations among measured variables by estimating the pattern of relations between the common factors and each of the measured variables. In contrast, PCA does not differentiate between common and unique variance. Rather, PCA defines each measured variable as a linear function of principal components, which contain both common and unique variance and thus are not latent variables. Therefore, if the goal is data reduction, PCA is more appropriate. On the other hand, if the goal is to arrive at a parsimonious representation of the associations among measured variables, EFA is more appropriate.
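As a rough illustration of this contrast, the following sketch builds a correlation matrix from a hypothetical one-factor common factor model and then extracts the first principal component from the same matrix. The four loadings are invented for illustration and do not come from any data set. Because the component is computed from the full correlation matrix, with ones on the diagonal, it absorbs unique variance as well as common variance.

```python
import numpy as np

# Hypothetical one-factor model: four measured variables with these loadings.
loadings = np.array([0.8, 0.7, 0.6, 0.5])

# Common factor model: R = (loadings)(loadings)' + diag(unique variances).
communalities = loadings ** 2
uniquenesses = 1.0 - communalities
R = np.outer(loadings, loadings) + np.diag(uniquenesses)

# PCA: eigendecomposition of the full correlation matrix.
eigvals, eigvecs = np.linalg.eigh(R)
order = np.argsort(eigvals)[::-1]          # sort eigenvalues in descending order
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Component loadings = eigenvector scaled by the square root of its eigenvalue.
first_vec = eigvecs[:, 0]
if first_vec.sum() < 0:                    # the sign of an eigenvector is arbitrary
    first_vec = -first_vec
pc1_loadings = first_vec * np.sqrt(eigvals[0])

print("factor loadings:   ", loadings)
print("component loadings:", np.round(pc1_loadings, 3))
print("common variance:", communalities.sum(), " first eigenvalue:", round(eigvals[0], 3))
```

In this example the component loadings are larger than the factor loadings, and the first eigenvalue exceeds the total common variance, which is what one would expect when a component mixes common and unique variance.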

Rotation methods in EFA

Rotations are linear transformations that do not affect the h^2 values (communalities), the estimated correlations, or the overall fit. Factors are usually rotated to make the factor solution more interpretable. Proper rotation will (1) strengthen the relation between variables and factors, (2) concentrate the variance shared by two highly correlated variables on a single factor, and (3) level the variances of the factors, i.e., make them more nearly equal in magnitude. In an orthogonal rotation, (1) the sum of the squared weights for any one factor equals 1, (2) the sum of cross products for pairs of factors equals zero, and (3) the weights represent cosines of angles between original and rotated factors. Oblique rotations allow factors to be placed nearer to groups of variables; properties (1) and (2) of orthogonal rotation hold, but property (3) does not. Quartimax is an orthogonal procedure that maximizes the average variance of squared structure elements over factors. Varimax, which is also an orthogonal procedure, maximizes the average variance of squared structure elements within factors and much more nearly meets the goal of simple structure. Promax is an oblique procedure that rotates to a target matrix and increases the disparity between large and small elements. Orthogonal rotations offer the advantage of simplicity at the expense of poorer factor definition. Oblique rotations offer the converse. Therefore, the rules of thumb are (1) use orthogonal rotations until you feel confident about the distinction between pattern and structure, (2) use an orthogonal rotation if the factor correlations in an oblique rotation are all very low, and (3) consider replacing two factors with one if their factor correlation is very high.

Fabrigar et al. (1999) question the routine use of orthogonal rotation. Their reasons are that (1) there is often a substantial theoretical and empirical basis for expecting the constructs to be correlated with one another, (2) orthogonal rotations are likely to produce solutions with poorer simple structure when clusters of variables are less than 90 degrees from one another in multidimensional space, and (3) orthogonal rotations provide less information than oblique rotations.
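The invariance of the communalities under orthogonal rotation can be checked numerically. The sketch below is a minimal varimax implementation based on a commonly used SVD iteration; it is an illustration under stated assumptions, not the procedure of any particular statistical package. The "unrotated" loadings are hypothetical and are constructed by turning an invented simple-structure pattern by 30 degrees.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a (variables x factors) loading matrix toward varimax."""
    p, k = loadings.shape
    T = np.eye(k)                      # transformation matrix, starts at identity
    objective = 0.0
    for _ in range(max_iter):
        rotated = loadings @ T
        # Target matrix of the varimax criterion for the current rotation.
        target = rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        T = u @ vt                     # closest orthogonal matrix
        new_objective = s.sum()
        if new_objective < objective * (1 + tol):
            break
        objective = new_objective
    return loadings @ T, T

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return np.var(loadings ** 2, axis=0).sum()

# Hypothetical simple-structure pattern, turned by 30 degrees to mimic an
# unrotated solution (all numbers are invented for illustration).
simple = np.array([[0.80, 0.10], [0.70, 0.00], [0.75, 0.10],
                   [0.10, 0.70], [0.00, 0.80], [0.10, 0.75]])
angle = np.deg2rad(30)
turn = np.array([[np.cos(angle), -np.sin(angle)],
                 [np.sin(angle),  np.cos(angle)]])
A = simple @ turn

rotated, T = varimax(A)

print("T'T (should be the identity):\n", np.round(T.T @ T, 6))
print("h^2 before:", np.round((A ** 2).sum(axis=1), 4))
print("h^2 after: ", np.round((rotated ** 2).sum(axis=1), 4))
print("criterion before/after:", varimax_criterion(A), varimax_criterion(rotated))
```

The columns of T have unit sums of squares and zero cross products, which corresponds to properties (1) and (2) of an orthogonal rotation. The rotated loadings may come out with columns reordered or with flipped signs relative to the original pattern, since those are arbitrary in factor analysis.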

Methodological issues in using EFA

When using EFA, we have to determine (1) what variables to include in the study and the size and nature of the sample on which the study will be based, (2) whether EFA is the most appropriate form of analysis, (3) a specific procedure to fit the model, (4) how many factors should be included, and (5) a method of rotating the initial factor analytic solution to a final solution. Researchers should carefully define their domain of interest and specify sound guidelines for the selection of measured variables. Researchers should consider the nature and number of common factors they expect might emerge. In doing so, at least three to five measured variables representing each common factor should be included in a study. Reliability and validity of measurement should also be considered. The adequate sample size is influenced by the extent to which factors are overdetermined and by the level of the communalities of the measured variables. Overly homogeneous samples and samples whose selection is related to measured variables in the analysis should be avoided. For selecting the number of factors, there are (1) the Kaiser criterion, in which the eigenvalues of the correlation matrix are computed to determine how many of them are greater than 1, (2) the scree test, in which the eigenvalues of the correlation matrix are computed and then plotted in order of descending values to identify the last substantial drop in the magnitude of the eigenvalues, and (3) parallel analysis, which is based on a comparison of eigenvalues obtained from sample data to eigenvalues one would expect to obtain from completely random data.
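The Kaiser criterion and parallel analysis can be sketched in a few lines. The example below is a simplified version under several assumptions: it uses eigenvalues of the full correlation matrix, compares them with the 95th percentile of eigenvalues from random normal data, and runs on simulated data with a hypothetical two-factor structure. The function names, the number of simulations, and the percentile are illustrative choices, not fixed conventions.

```python
import numpy as np

def sorted_eigenvalues(data):
    """Eigenvalues of the correlation matrix of `data`, in descending order."""
    R = np.corrcoef(data, rowvar=False)
    return np.sort(np.linalg.eigvalsh(R))[::-1]

def kaiser_count(data):
    """Number of eigenvalues of the correlation matrix greater than 1."""
    return int(np.sum(sorted_eigenvalues(data) > 1.0))

def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Count leading eigenvalues that exceed those expected from random data."""
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = sorted_eigenvalues(data)
    random_eigs = np.empty((n_sims, p))
    for i in range(n_sims):
        random_eigs[i] = sorted_eigenvalues(rng.standard_normal((n, p)))
    threshold = np.percentile(random_eigs, percentile, axis=0)
    exceeds = observed > threshold
    # Retain factors up to the first eigenvalue that fails the comparison.
    n_factors = p if exceeds.all() else int(np.argmin(exceeds))
    return n_factors, observed, threshold

# Simulated data: two uncorrelated factors, three indicators each, loadings 0.7.
rng = np.random.default_rng(42)
n = 500
pattern = np.array([[0.7, 0.0], [0.7, 0.0], [0.7, 0.0],
                    [0.0, 0.7], [0.0, 0.7], [0.0, 0.7]])
factors = rng.standard_normal((n, 2))
unique_sd = np.sqrt(1 - 0.7 ** 2)
X = factors @ pattern.T + unique_sd * rng.standard_normal((n, 6))

n_factors, observed, threshold = parallel_analysis(X)
print("observed eigenvalues:", np.round(observed, 3))
print("random-data threshold:", np.round(threshold, 3))
print("Kaiser:", kaiser_count(X), " parallel analysis:", n_factors)
```

A scree plot would simply plot `observed` against its index; it is omitted here to keep the sketch free of plotting dependencies.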

EFA and CFA (Confirmatory Factor Analysis)

EFA essentially consists of methods for finding clusters of related variables from the data. Each such cluster, or factor, consists of a group of variables whose members correlate more highly among themselves than they do with variables outside the cluster. CFA approaches examine whether or not existing data are consistent with a highly constrained a priori structure that meets the conditions of model identification.

EFA uses a correlation matrix whose diagonal contains either 1s or estimates of communality, whereas CFA usually uses a covariance matrix whose diagonal contains estimates of communality. The reason for this difference is that, in CFA, choosing the covariance matrix is expected to keep measurement error small. Using a covariance matrix increases statistical power for significance testing compared with using a correlation matrix. A covariance matrix also keeps information that is lost when a correlation matrix is used. In terms of errors, CFA hypothesizes relations among errors, whereas EFA does not. Another difference is that EFA requires a rotation method because it admits an infinite number of solutions, whereas CFA does not because the model is identified. That is, CFA has a theoretical basis for the model specification. Finally, a researcher makes a subjective judgment about the meaningfulness of factors in EFA, whereas in CFA a researcher makes subjective overall assessments of fit for each indicator.

EFA is primarily a data-driven approach. No a priori number of common factors is specified, and few restrictions are placed on the patterns of relations between the common factors and the measured variables. EFA provides procedures for determining an appropriate number of factors and the pattern of factor loadings primarily from the data. In contrast, CFA requires a researcher to specify a specific number of factors as well as the pattern of zero and nonzero loadings of the measured variables on the common factors. In situations in which a researcher has relatively little theoretical or empirical basis for making strong assumptions about how many common factors exist or what specific measured variables these common factors are likely to influence, EFA is probably a more sensible approach than CFA. This is because the number of plausible alternative models might be so large that it would be impractical to specify and test each one in CFA. However, if there is a sufficient theoretical and empirical basis for a researcher to specify the model, or a small subset of models, that is most plausible, CFA is likely to be the better approach. This is because CFA allows for focused testing of specific hypotheses about the data.

CFA approaches begin with a theoretical model that has to be identified, and then examine whether or not the data are consistent with that model, which is thereby subject to falsification. That is, CFA approaches cannot actually confirm a model; they can only disconfirm it. CFA tests the viability of a priori structures and is a form of latent variable SEM.

Structural Equation Modeling

Path Analysis. Path analysis is one of the components of SEM. Path analysis models are only those models (a) with unidirectional causal flow and (b) in which the measure of each conceptual variable is perfectly reliable. That is, path analysis models are recursive models that assume no measurement error. The parameters of path analysis models are estimated by solving a system of equations using linear algebra or multiple regression, such as the ordinary least squares (OLS) technique. Path analysis is a useful analytical tool for those who are interested in the effects of particular predictors on the criterion variables and in their regression weights in the multiple regression analysis (regression for explanation). Path analysis can also deal with indirect or noncausal relationships.

In path models, the direct effects are estimated via least squares regression approaches. Each endogenous variable is thought of as having its own regression equation. Indirect causal effects are composed of two or more direct effects, and each indirect effect is the product of the path coefficients along the pathway between the two causally related variables.
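The decomposition into direct and indirect effects can be illustrated with a small simulation. The sketch below assumes a hypothetical three-variable recursive model (x affects m, and both x and m affect y) with invented population coefficients, and estimates each endogenous variable's equation by OLS. The coefficients are unstandardized; standardizing the variables first would yield standardized path coefficients.

```python
import numpy as np

def ols_slopes(y, *predictors):
    """OLS slopes of y on the predictors (an intercept is fitted and dropped)."""
    design = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef[1:]

# Simulated data from a hypothetical recursive model (coefficients are invented).
rng = np.random.default_rng(1)
n = 1000
x = rng.standard_normal(n)
m = 0.5 * x + rng.standard_normal(n)
y = 0.3 * x + 0.4 * m + rng.standard_normal(n)

# One regression equation per endogenous variable.
(a,) = ols_slopes(m, x)            # path x -> m
direct, b = ols_slopes(y, x, m)    # paths x -> y (direct) and m -> y

indirect = a * b                   # product of the paths along x -> m -> y
(total,) = ols_slopes(y, x)        # total effect of x on y

print("direct:", round(direct, 3), " indirect:", round(indirect, 3))
print("direct + indirect:", round(direct + indirect, 3), " total:", round(total, 3))
```

With OLS estimates from the same sample, the total effect equals the sum of the direct and indirect effects exactly, up to floating-point error.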

Because the relationships predicted in any model include all the causal and noncausal variance components, the test of fit for a model is not one of how well the predictors explain the dependent or endogenous variables but rather of how well the entire model fits the data. The information used to estimate paths is the correlation of variables with one another. The correlations are the "knowns" in path analysis, whereas the path coefficients to be estimated are the "unknowns." Path analysis is based on the ordinary least squares (OLS) technique. When the logic of factor analysis is integrated with path modeling, the resulting models cannot be solved by ordinary least squares regression techniques.

Relationship among factor analysis, path analysis, and structural equation modeling

Path analysis makes no allowance for measurement error in any variable. Factor analysis enables us to calculate factor scores, which contain only the variance explained by common factors. If we calculate factor scores and put them into a path analysis, then we can eliminate the effect of measurement error. This is the essence of structural equation modeling.
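Why measurement error matters for path coefficients can be seen in a simple simulation. The sketch below is only an illustration under invented assumptions: a true predictor with unit variance, an observed version contaminated by error of equal variance (so its reliability is 0.5), and a population slope of 0.6. It shows the attenuation that latent variable methods are meant to remove; it does not estimate a structural equation model.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 5000

# Hypothetical population: the true score affects y with slope 0.6.
x_true = rng.standard_normal(n)
y = 0.6 * x_true + rng.standard_normal(n)

# Observed predictor = true score + measurement error (reliability = 1 / (1 + 1) = 0.5).
x_observed = x_true + rng.standard_normal(n)
reliability = 0.5

# np.polyfit with degree 1 returns [slope, intercept].
slope_true = np.polyfit(x_true, y, 1)[0]
slope_observed = np.polyfit(x_observed, y, 1)[0]

print("slope using the error-free predictor:", round(slope_true, 3))
print("slope using the fallible predictor:  ", round(slope_observed, 3))
print("expected attenuated slope:", 0.6 * reliability)
```

The slope estimated from the fallible measure is pulled toward zero by roughly the reliability of the measure, which is the bias that would carry over into a path analysis of observed variables.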

The difference between latent variable path models and CFA models is that in latent variable path models the latent variables (unmeasured constructs) are hypothesized to be causally interrelated, whereas in CFA models they are intercorrelated. That is, in CFA models all the latent variables are viewed as exogenous.

In latent variable modeling, the variables that appear in the path models actually are factors extracted through CFA. The primary differences between latent variable structural models and basic path analytic models are that (a) the variables in latent variable models typically are not measured and (b) when calculating values for parameter estimates, no distinction needs to be made between recursive and nonrecursive models or models with residual covariation among latent variables. In sum, latent variable SEM methods represent a logical coupling of regression and factor analytic approaches, or a straightforward combination of regression and factor analysis.

Overall fit in SEM

Testing the significance of individual paths is very different from testing the overall fit of the model. In terms of model fitting, all statistical tests of the model are tests of the difference between the variance/covariance matrix predicted by the model and the sample variance/covariance matrix from the observed data. That difference is referred to as "fit" or "goodness of fit," namely, how similar the hypothesized model is to the observed data.

With the chi-square goodness-of-fit statistic, a large value of the chi-square statistic relative to its degrees of freedom is evidence that the model is not a very good description of the data, whereas a small chi-square is evidence that the model is a good one for the data. However, a significant goodness-of-fit chi-square value may reflect model misspecification, the power of the test, or the violation of some technical assumptions underlying the estimation method. Thus the standard chi-square test may not be a good enough guide to model adequacy. Because alternative fit indices have been developed as a result, applied researchers inevitably face a constant challenge in selecting appropriate fit indices from the large number that have recently become available. They often have difficulty determining the adequacy of their covariance structure models because the values of various fit indices yield conflicting conclusions about the extent to which the model matches the observed data.

The conventional overall test of fit in covariance structure analysis assesses the magnitude of the discrepancy between the sample and fitted covariance matrices. The T or chi-square statistic derived from ML under the assumption of multivariate normality is the most widely used summary statistic for assessing the adequacy of a structural equation model.
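The discrepancy that T summarizes can be written out directly. The sketch below assumes the standard normal-theory ML fit function, F = ln|Sigma| + tr(S Sigma^-1) - ln|S| - p, and T = (N - 1) F, where S is the sample covariance matrix, Sigma the model-implied matrix, and p the number of measured variables. The matrices and the sample size are invented for illustration, and Sigma is simply fixed by hand rather than estimated.

```python
import numpy as np

def ml_discrepancy(S, Sigma):
    """Normal-theory ML discrepancy between sample (S) and model-implied (Sigma) matrices."""
    p = S.shape[0]
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    _, logdet_s = np.linalg.slogdet(S)
    return logdet_sigma + np.trace(S @ np.linalg.inv(Sigma)) - logdet_s - p

def chi_square_statistic(S, Sigma, n_obs):
    """Test statistic T = (N - 1) * F_ML."""
    return (n_obs - 1) * ml_discrepancy(S, Sigma)

# Hypothetical sample covariance matrix for three standardized variables.
S = np.array([[1.0, 0.5, 0.4],
              [0.5, 1.0, 0.3],
              [0.4, 0.3, 1.0]])

# Independence model with unit variances: Sigma is the identity matrix.
Sigma_independence = np.eye(3)

F_perfect = ml_discrepancy(S, S)                       # a model that reproduces S exactly
F_independence = ml_discrepancy(S, Sigma_independence)
T_independence = chi_square_statistic(S, Sigma_independence, n_obs=200)

print("F for a perfectly fitting model:", round(F_perfect, 6))
print("F for the independence model:", round(F_independence, 4))
print("T for the independence model (N = 200):", round(T_independence, 2))
```

The discrepancy is zero when the model reproduces the sample matrix exactly and grows as the two matrices diverge. Because T multiplies the discrepancy by N - 1, the same discrepancy yields a larger statistic in a larger sample.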

A fit index can be used to quantify the degree of fit along a continuum. It is an overall summary statistic that evaluates how well a particular covariance structure model explains sample data, like R^2 in multiple regression. An absolute fit index directly assesses how well an a priori model reproduces the sample data, and an implicit or explicit comparison may be made to a saturated model that exactly reproduces the observed covariance matrix. In contrast, incremental fit indices, or comparative fit indices, measure the proportionate improvement in fit by comparing a target model with a more restricted, nested, baseline model. Among incremental fit indices, there are Type 1 indices, which use information only from the optimized statistic T; Type 2 indices, which additionally use information from the expected value of Tt (the statistic for the target model) under the central chi-square distribution; and Type 3 indices, which additionally use information from the expected values of Tt or Tb (the statistic for the baseline model), or both, under the relevant noncentral chi-square distribution. Type 2 and Type 3 indices should perform better than Type 1 indices because more information is being used. However, Type 2 and Type 3 indices may use inappropriate information, because any particular T may not have the distributional form assumed.
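One commonly cited example of each type can be computed from the target and baseline statistics alone. The sketch below assumes the usual formulas for the normed fit index (NFI, Type 1), the Tucker-Lewis index (TLI, Type 2), and the comparative fit index (CFI, Type 3); the chi-square values and degrees of freedom are hypothetical.

```python
def nfi(t_target, t_base):
    """Normed fit index: proportionate reduction in T relative to the baseline."""
    return (t_base - t_target) / t_base

def tli(t_target, df_target, t_base, df_base):
    """Tucker-Lewis index: compares T/df ratios against the expected ratio of 1."""
    ratio_target = t_target / df_target
    ratio_base = t_base / df_base
    return (ratio_base - ratio_target) / (ratio_base - 1.0)

def cfi(t_target, df_target, t_base, df_base):
    """Comparative fit index: based on estimated noncentrality (T - df)."""
    d_target = max(t_target - df_target, 0.0)
    d_base = max(t_base - df_base, t_target - df_target, 0.0)
    return 1.0 if d_base == 0 else 1.0 - d_target / d_base

# Hypothetical results: target model T = 36 on 24 df, baseline T = 480 on 36 df.
t_target, df_target = 36.0, 24
t_base, df_base = 480.0, 36

print("NFI:", round(nfi(t_target, t_base), 3))
print("TLI:", round(tli(t_target, df_target, t_base, df_base), 3))
print("CFI:", round(cfi(t_target, df_target, t_base, df_base), 3))
```

For these made-up values the three indices are about 0.925, 0.959, and 0.973, which shows how indices computed from the same pair of statistics can differ in magnitude.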

A correct specification implies that a population exactly matches the hypothesized model and also that the parameters estimated in a sample reflect this structure. A model is said to be misspecified when (a) one or more parameters are estimated whose population values are zero, (b) one or more parameters are fixed to zero whose population values are nonzero, or both. Sample size is substantially associated with several fit indices under both true and false models. Test statistics are likely to perform more poorly in smaller samples that cannot be considered asymptotic enough. Thus, the decision to accept or reject a particular model may vary as a function of sample size, which is certainly not desirable. Estimation methods such as ML and GLS were traditionally developed under multivariate normality assumptions. Therefore, a violation of multivariate normality can seriously invalidate normal-theory test statistics. Good fit indices should be (a) sensitive to model misspecification and (b) stable across different estimation methods, sample sizes, and distributions.

Capitalization on chance

Capitalization on chance in structural equation modeling means that the model fits the observed data well because of an upward fit bias in the sample. This may occur by chance because of sample specificity. In particular, when researchers improve their model based on the "modification indices" or "automatic model modification" provided by some computer programs, the probability of capitalization becomes high. In this case, the model may be meaningless even if it fits the data well, because the fit reflects upward sample bias. To avoid this type of capitalization, researchers should be aware of Cliff's (1983) four principles of scientific inference that researchers might be enticed to violate. That is, (1) data never can confirm a model; they can only fail to disconfirm it, (2) post hoc is not propter hoc, (3) the nominalistic fallacy, and (4) ex post facto analysis. Basically, SEM techniques are intended to be used for model confirmation, not model development. Therefore, the emphasis on model modification is a substantial shift from the confirmatory intent of latent variable SEM approaches. If researchers improve their model based on information from the sample data, it is literally impossible to disconfirm the model, because the improved model was developed in order to fit the sample data. Another limitation of improving the model based on modification indices is that the improved model is just one of the alternative models that may be more appropriate than the original model; it does not mean that researchers have found the true model. When researchers consider model development, they should be open to these criticisms and should adopt a conservative stance.

If researchers conduct structural equation modeling without enough theoretical basis, the possibility of capitalization on chance increases. They might try to explore the data using such tools as modification indices and arrive at a model that fits the data well. However, a structural equation model uses the variance/covariance matrix. If major variables are missing from the initial model, the result may be a well-fitting but meaningless model, because it merely fits the variance/covariance matrix of the observed data. Therefore, using structural equation modeling without enough theorizing is very hard to defend.


Fabrigar, L. R., Wegener, D. T., MacCallum, R. C., & Strahan, E. J. (1999). Evaluating the use of exploratory factor analysis in psychological research. Psychological Methods, 4 (3), 272-299.

Hu, L., & Bentler, P. M. (1998). Fit indices in covariance structure modeling: Sensitivity to underparameterized model misspecification. Psychological Methods, 3 (4), 424-453.

Maruyama, G. M. (1997). Basics of Structural Equation Modeling. Thousand Oaks, CA: Sage.

Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric Theory: Third Edition. New York: McGraw-Hill.