
What is factor analysis?


Date added: 17-06-26



1.0 DEFINITION OF FACTOR ANALYSIS

Factor analysis (FA) is a latent structure approach for analyzing the interrelationships among a large number of variables. It explains the observed variables (manifest variables) in terms of a smaller set of underlying unobservable variables (latent variables) known as factors. With FA, the researcher first identifies the separate dimensions of the structure and then determines the extent to which each variable is explained by each dimension. Once these dimensions and the explanation of each variable are determined, summarization and reduction of the data can be achieved. In summarizing the data, FA describes the underlying dimensions in a much smaller number of items than the original variables by examining the pattern of correlations (or covariances) between the observed measures. Data reduction can be achieved by calculating scores for each underlying dimension and substituting them for the original variables.

FA is an interdependence technique in which variates (factors) are formed to maximize their explanation of the entire variable set. The resulting groups of variables represent dimensions within the data, which the researcher needs to label. There are two types of FA, exploratory and confirmatory. Exploratory FA is used to discover the nature of the constructs that influence a set of responses, whereas confirmatory FA tests whether a specified set of constructs influences responses in a predicted way.

Data summarization

The goal of data summarization is achieved by defining a small number of factors that adequately represent the original set of variables.

Data reduction

Data reduction is achieved either by identifying representative variables from a much larger set of variables for use in subsequent multivariate analyses, or by creating an entirely new set of variables that retains the nature and character of the original variables. Data reduction relies on the factor loadings and uses them as a basis either for identifying variables for subsequent analysis with other techniques or for making estimates of the factors themselves (factor scores or summated scales), which then replace the original variables in subsequent analysis.

Factor analytic techniques are applied from either an exploratory or a confirmatory perspective, depending on their purpose. Many researchers use Exploratory Factor Analysis (EFA) when they are searching for structure among a set of variables or as a data reduction technique. Unlike Confirmatory Factor Analysis (CFA), EFA does not set any a priori constraints on the estimation of the components or on the number of components to be extracted. CFA is used to confirm what is expected on the basis of pre-established theory.

2.0 PURPOSE OF FACTOR ANALYSIS

The primary purpose of FA is to discover simple patterns in the relationships amongst variables by defining the underlying structure in a data matrix. This is done through data summarization and data reduction.

3.0 HISTORY OF FACTOR ANALYSIS

FA was pioneered in 1904 by the psychologist Charles Spearman, who hypothesized that the enormous variety of tests of mental ability (measures of mathematical skill, vocabulary, verbal skills and others) could be explained by one underlying factor of general intelligence, which he called g. FA was developed to analyze test scores so as to determine whether intelligence is made up of a single underlying general factor or of several more limited factors measuring attributes such as mathematical ability. Raymond Cattell expanded on Spearman's idea of g by using a multi-factor theory to explain intelligence. He also developed several mathematical methods, such as the scree test and the similarity coefficient. His statistical methods led statisticians to improved versions of factor analysis.

4.0 PRINCIPAL COMPONENT (PCA) VERSUS FACTOR ANALYSIS (FA)

There is much debate amongst statisticians about the difference between principal component analysis (PCA) and FA. One distinct difference is that FA assumes the measured responses are based on underlying factors, whilst in PCA the principal components are based on the measured responses. PCA is used when the objective is to summarize most of the original information (variance) in a minimum number of factors for prediction purposes. In contrast, FA is used primarily to identify underlying factors or dimensions that reflect what the variables share in common. Principal components are defined as linear combinations of the measurements. Because PCA considers the total variance, the components contain small proportions of unique variance and, in some instances, error variance. FA considers only the common or shared variance, assuming that both the unique and the error variance are not of interest in defining the structure of the variables. PCA produces an orthogonal transformation of the variables without reference to an underlying model, whilst FA is based on a proper statistical model and is more concerned with explaining the covariance structure of the variables than with explaining the variances (Chatfield, 1980). The calculation of principal component scores is straightforward, whilst the calculation of factor scores is more complex and a variety of methods can be used.

From a practical perspective, PCA is most appropriate when the primary concern is data reduction, focusing on the minimum number of factors needed to account for the maximum portion of the total variance represented in the original set of variables. FA is most appropriate when the primary objective is to identify the latent dimensions or constructs represented in the original variables.
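The difference between analysing total variance and analysing common variance can be illustrated with a short sketch. The 3-variable correlation matrix below is hypothetical, and the use of squared multiple correlations as communality estimates is one common choice assumed here, not something prescribed by the text.

```python
import numpy as np

# Hypothetical 3-variable correlation matrix (illustrative values only).
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])

# PCA analyses total variance: the diagonal stays at 1, so the eigenvalues
# sum to the number of variables.
pca_eigenvalues = np.linalg.eigvalsh(R)[::-1]

# Common factor analysis analyses shared variance only: the diagonal is
# replaced by communality estimates, here squared multiple correlations (SMC).
smc = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
R_reduced = R.copy()
np.fill_diagonal(R_reduced, smc)
fa_eigenvalues = np.linalg.eigvalsh(R_reduced)[::-1]

print("PCA eigenvalues:", pca_eigenvalues.round(3), "sum:", round(pca_eigenvalues.sum(), 3))
print("FA eigenvalues: ", fa_eigenvalues.round(3), "sum:", round(fa_eigenvalues.sum(), 3))
```

The PCA eigenvalues add up to the number of variables, while the eigenvalues of the reduced matrix add up only to the total estimated common variance.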

5.0 STEPS IN FACTOR ANALYSIS

5.1 TEST ASSUMPTIONS

5.1.1 FA is robust to assumptions of normality

If the variables are normally distributed, then the solution is enhanced. To check normality, .....

5.1.2 Measure the sampling adequacy of sample size

Many sample sizes have been proposed for FA. Guilford (1954) recommended that the sample size should be at least 200, whilst Hair, Black, Babin & Anderson (2010) stated that the minimum is to have at least five times as many observations as the number of variables to be analyzed, and that a more acceptable size would have a 10:1 ratio. Comrey and Lee (1992) provided the following guidance for determining the adequacy of the sample size:

Table 1: Determining the Adequacy of Sample Size

Sample Size        Indication
100                Poor
200                Fair
300                Good
500                Very good
1,000 or more      Excellent

5.1.3 All variables must be suitable for correlational analysis.

The sample must be homogeneous with respect to the underlying factor structure. It is inappropriate to apply FA to a pooled sample whose subgroups are known to differ on the items, for example by gender, because the pooled solution will misrepresent the unique structure of each group. There are various ways to quantify the degree of intercorrelation amongst the variables, such as the Measure of Sampling Adequacy (MSA). The index ranges from 0 to 1, reaching 1 when each variable is perfectly predicted without error by the other variables. If the MSA value falls below 0.50, the researcher should identify variables for deletion until an overall value of at least 0.50 is achieved. According to Hair et al. (2010), the MSA can be interpreted as follows:

Table 2: Measure of Sampling Adequacy (MSA)

Measure of Sampling Adequacy    Indication
0.80 or above                   Meritorious
0.70 or above                   Middling
0.60 or above                   Mediocre
0.50 or above                   Miserable
Below 0.50                      Unacceptable

Other methods of determining the appropriateness of FA are the Bartlett test of sphericity and the Kaiser-Meyer-Olkin (KMO) measure. The Bartlett test is a statistical test for the presence of correlations among the variables, indicating whether the correlation matrix has significant correlations among at least some of the variables. The KMO value should be more than 0.5. The factor analyst must ensure that the data matrix has sufficient correlations to justify the application of FA. The anti-image correlation matrix can also be used to indicate whether the data matrix is suitable for FA. It is based on the partial correlations among the variables, that is, the correlations that remain once the influence of the other variables has been removed by multiple regression. A variable whose anti-image correlation value is less than 0.5 lacks sufficient correlation with the other variables and should not be included in the FA.
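The two diagnostics described above can be sketched with NumPy, using their usual textbook formulas. The function names, the 4-variable matrix `R_demo` and the sample size of 200 are illustrative assumptions, not values from this essay, and a p-value would additionally need a chi-square distribution function, which is left out here.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Chi-square statistic and degrees of freedom for Bartlett's test.

    R is a p x p correlation matrix; n is the sample size supplied by the caller.
    """
    R = np.asarray(R, dtype=float)
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)            # log of the determinant of R
    chi_square = -(n - 1 - (2 * p + 5) / 6.0) * logdet
    df = p * (p - 1) // 2
    return chi_square, df

def kmo(R):
    """Overall KMO and the per-variable measures of sampling adequacy."""
    R = np.asarray(R, dtype=float)
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)             # partial (anti-image) correlations
    np.fill_diagonal(partial, 0.0)
    r = R.copy()
    np.fill_diagonal(r, 0.0)
    r2, a2 = r ** 2, partial ** 2
    msa = r2.sum(axis=0) / (r2.sum(axis=0) + a2.sum(axis=0))
    overall = r2.sum() / (r2.sum() + a2.sum())
    return overall, msa

# Hypothetical correlation matrix and sample size, for illustration only.
R_demo = np.array([
    [1.0, 0.6, 0.5, 0.4],
    [0.6, 1.0, 0.5, 0.4],
    [0.5, 0.5, 1.0, 0.3],
    [0.4, 0.4, 0.3, 1.0],
])
chi_square, df = bartlett_sphericity(R_demo, n=200)
overall_kmo, msa = kmo(R_demo)
print("Bartlett chi-square:", round(chi_square, 2), "df:", df)
print("Overall KMO:", round(overall_kmo, 3), "MSA per variable:", msa.round(3))
```

For 16 variables the degrees of freedom come to 16 × 15 / 2 = 120, which is the df value reported in the KMO and Bartlett's Test output in section 7.0.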

6.0 SELECT TYPE OF ANALYSIS

6.2.1 EXTRACTION

In FA, the researcher groups variables by their correlations, such that the variables in a group (factor) have high correlations with each other. It is important to understand how much of a variable's variance is shared with the other variables in that factor and how much cannot be shared. The total variance of any variable is composed of its common, unique and error variances. As a variable becomes more highly correlated with one or more other variables, its common variance, known as its communality, increases.
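As an illustration of extraction based on common variance, the following is a minimal sketch of iterated principal axis factoring, the extraction method named in the SPSS output in section 7.0. It is a plain NumPy approximation under stated assumptions, not the SPSS implementation: the function name, the stopping rule and the one-factor test matrix are all hypothetical.

```python
import numpy as np

def principal_axis_factoring(R, n_factors, max_iter=100, tol=1e-6):
    """Iterated principal axis factoring on a correlation matrix R.

    Returns (loadings, communalities). Illustrative sketch only.
    """
    R = np.asarray(R, dtype=float)
    # Initial communality estimates: squared multiple correlations.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(max_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)           # keep common variance only
        eigvals, eigvecs = np.linalg.eigh(reduced)
        order = np.argsort(eigvals)[::-1][:n_factors]
        lam = np.clip(eigvals[order], 0.0, None)
        loadings = eigvecs[:, order] * np.sqrt(lam)
        new_h2 = (loadings ** 2).sum(axis=1)    # communality = sum of squared loadings
        converged = np.max(np.abs(new_h2 - h2)) < tol
        h2 = new_h2
        if converged:
            break
    return loadings, h2

# Hypothetical data: a correlation matrix generated by one factor.
true_loadings = np.array([0.8, 0.7, 0.6, 0.5])
R_one_factor = np.outer(true_loadings, true_loadings)
np.fill_diagonal(R_one_factor, 1.0)

loadings, communalities = principal_axis_factoring(R_one_factor, n_factors=1)
print("Recovered loadings (sign is arbitrary):", np.abs(loadings[:, 0]).round(3))
print("Communalities:", communalities.round(3))
```

The sign of an extracted factor is arbitrary, so the sketch compares absolute loadings with the values used to build the matrix.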

6.2.2 ROTATION

This important tool refers to turning the reference axes of the factors about the origin until some other position has been reached. The ultimate effect of rotating the factor matrix is to redistribute the variance from earlier factors to later ones to achieve a simpler, theoretically more meaningful pattern. There are two kinds of rotation, orthogonal factor rotation and oblique factor rotation. In orthogonal factor rotation the axes are maintained at 90 degrees, whereas in oblique factor rotation they are not. The major orthogonal approaches are Varimax, Quartimax and Equimax. The Varimax method encourages the detection of factors each of which is related to few variables. Quartimax, on the other hand, seeks to maximize the variance of the squared loadings for each variable and tends to produce factors with high loadings for all variables. Equimax is a compromise between Varimax and Quartimax. For oblique factor rotation, Oblimin, Promax, Orthoblique, Dquart and Doblimin have been developed. Oblimin allows factors to covary and to correlate with each other.

The researcher needs to choose either orthogonal or oblique factor rotation based on the particular needs of a given research problem. However, Hair et al. (2010) suggested that orthogonal rotation is preferred when the research goal is data reduction to either a smaller number of variables or a set of uncorrelated measures for subsequent use in other multivariate techniques, whereas oblique rotation methods are best suited to the goal of obtaining several theoretically meaningful factors or constructs.
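To make orthogonal rotation concrete, here is a minimal sketch of raw Varimax using a widely used SVD-based iteration. It omits the Kaiser normalization that SPSS applies, and the loading matrix is hypothetical: a clean two-factor pattern deliberately blurred by a 30-degree turn.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-8):
    """Raw Varimax rotation (no Kaiser normalization).

    Returns (rotated_loadings, transformation_matrix).
    """
    L = np.asarray(loadings, dtype=float)
    p, k = L.shape
    T = np.eye(k)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = L @ T
        # Gradient of the varimax criterion with respect to the rotation.
        grad = L.T @ (rotated ** 3 - rotated @ np.diag((rotated ** 2).sum(axis=0)) / p)
        u, s, vt = np.linalg.svd(grad)
        T = u @ vt                      # nearest orthogonal matrix, so axes stay at 90 degrees
        new_criterion = s.sum()
        if new_criterion < criterion * (1 + tol):
            break
        criterion = new_criterion
    return L @ T, T

def varimax_criterion(L):
    """Sum over factors of the variance of the squared loadings."""
    return (L ** 2).var(axis=0).sum()

# Hypothetical loadings: simple structure turned by 30 degrees.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.6, 0.0],
                   [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
theta = np.deg2rad(30)
turn = np.array([[np.cos(theta), -np.sin(theta)],
                 [np.sin(theta),  np.cos(theta)]])
unrotated = simple @ turn

rotated, T = varimax(unrotated)
print("Criterion before:", round(varimax_criterion(unrotated), 4))
print("Criterion after: ", round(varimax_criterion(rotated), 4))
print(np.abs(rotated).round(2))
```

Because the transformation is orthogonal, each variable's communality is unchanged by the rotation. Only the distribution of variance across the factors changes.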

The Significance of Factor Loadings

Factor loadings indicate how strongly a measured variable is correlated with a factor. Because the squared loading is the proportion of the variable's variance accounted for by the factor, a loading of 0.30 translates to approximately 10 percent explanation and a loading of 0.50 indicates that 25 percent of the variance is accounted for by the factor. Using the practical significance of factor loadings, Hair et al. (2010) proposed the following (for sample sizes of 100 or above):

Table 3: Significance of Factor Loadings

Factor Loadings        Indication
± 0.30 to 0.49         Meets the minimal level for interpretation of structure
± 0.50 or greater      Practically significant
± 0.70 or greater      Indicative of well-defined structure

Comrey & Lee (1992) also proposed guidelines for the practical significance of factor loadings, as below:

Table 4: Significance of Factor Loadings

Factor Loadings        Indication
More than 0.70         Excellent
0.63 or more           Very good
0.55 or more           Good
0.45 or more           Fair
0.32 or more           Poor

In relation to the table above, Hair et al. (2010) provide guidelines for identifying significant factor loadings based on sample size, as below:

Table 5: Guidelines for Identifying Significant Factor Loadings Based on Sample Size

Factor Loading    Sample Size Needed for Significance (a)
0.30              350
0.35              250
0.40              200
0.45              150
0.50              120
0.55              100
0.60              85
0.65              70
0.70              60
0.75              50

(a) Significance is based on a 0.05 significance level (α), a power level of 80 percent, and standard errors assumed to be twice those of conventional correlation coefficients. Source: Computations made with SOLO Power Analysis, BMDP Statistical Software, Inc., 1993.

Assess the Communalities of Variables

The communality measures the percentage of variance in a given variable explained by all the factors jointly, and may be interpreted as the reliability of the indicator. Communalities are used to identify any variables that are not adequately accounted for by the factor solution. Variables with communalities of less than 0.50 are considered not to have an acceptable level of explanation, and the researcher may then need to extract more factors to explain the variance.
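The arithmetic behind loadings and communalities can be checked with a few lines of Python. The communality values are the Extraction column of the Communalities table in section 7.0. The 0.50 cutoff is the guideline quoted above, and the variable names in the code are otherwise arbitrary.

```python
# A squared loading is the share of a variable's variance explained by one factor.
for loading in (0.30, 0.50, 0.70):
    print(f"loading {loading:.2f} -> {loading ** 2:.0%} of variance")

# Extraction communalities copied from the Communalities table in section 7.0.
extraction = {
    "att1": 0.617, "att2": 0.606, "att3": 0.526, "att4": 0.514,
    "att5": 0.645, "att6": 0.360, "att7": 0.278, "att8": 0.164,
    "att9": 0.598, "att10": 0.308, "att11": 0.576, "att12": 0.274,
    "att13": 0.320, "att14": 0.550, "att15": 0.356, "att16": 0.682,
}

# Flag variables below the 0.50 guideline.
low = [name for name, h2 in extraction.items() if h2 < 0.50]

# Total common variance and its share of the 16 units of total variance.
total_common = sum(extraction.values())
pct_explained = 100 * total_common / len(extraction)

print("Below 0.50:", low)
print("Total common variance:", round(total_common, 3))
print("Percent of total variance:", round(pct_explained, 2))
```

The total of the extraction communalities agrees with the cumulative extraction sums of squared loadings in the Total Variance Explained table (about 46 percent of the variance), which is a quick consistency check on the output.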

6.3 DETERMINE NUMBER OF FACTORS

There are a number of methods to determine the optimal number of factors.

Latent root criterion (Kaiser criterion)

The latent root criterion, also known as the Kaiser criterion, states that factors whose latent roots (eigenvalues of the correlation matrix) are greater than 1 are considered significant. An eigenvalue is the amount of variance accounted for by a factor. Hair et al. (2010) suggested that using the eigenvalue to establish a cutoff is most reliable when the number of variables is between 20 and 50.

Scree test criterion

The Cattell scree test plots the latent roots against the number of factors in their order of extraction, and the shape of the resulting curve is used to evaluate the cutoff point. As one moves to the right, toward later components, the eigenvalues drop. The scree test says to drop all components after the one starting the elbow (the point after which the remaining eigenvalues decline in an approximately linear fashion).

Variance criterion

The variance criterion is an approach that ensures practical significance for the derived factors by requiring the cumulative percentage of the variance extracted by successive factors to reach a specified level. Hair et al. (2010) noted that it is not uncommon to accept a solution that accounts for 60 percent of the total variance as satisfactory.
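The latent root and variance criteria can be applied directly to the initial eigenvalues reported in the Total Variance Explained table in section 7.0. The sketch below only reuses those published numbers. The 60 percent target is the guideline mentioned above, and the variable names are arbitrary.

```python
import numpy as np

# Initial eigenvalues from the Total Variance Explained table (16 variables).
eigenvalues = np.array([
    6.452, 1.340, 1.062, 0.951, 0.841, 0.756, 0.656, 0.643,
    0.577, 0.528, 0.499, 0.421, 0.389, 0.348, 0.302, 0.235,
])

# Latent root (Kaiser) criterion: keep factors with eigenvalue greater than 1.
n_kaiser = int((eigenvalues > 1.0).sum())

# Variance criterion: cumulative percentage of total variance.
cumulative_pct = 100 * np.cumsum(eigenvalues) / eigenvalues.sum()
n_for_60 = int(np.argmax(cumulative_pct >= 60.0)) + 1

# Successive drops between eigenvalues, a numeric stand-in for eyeballing a scree plot.
drops = -np.diff(eigenvalues)

print("Factors retained by the Kaiser criterion:", n_kaiser)
print("Cumulative % at that point:", round(cumulative_pct[n_kaiser - 1], 2))
print("Factors needed to reach 60%:", n_for_60)
print("First five drops:", drops[:5].round(3))
```

With these values the two criteria disagree slightly: three eigenvalues exceed 1, but a fourth factor is needed before the cumulative variance passes 60 percent, so the choice still calls for judgement.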

6.4 NAME AND DEFINE FACTORS

As the variables become correlated and group together, the researcher needs to give each group a label that represents its variables as accurately as possible.

6.5 ANALYSE INTERNAL RELIABILITY

Reliability is an indicator of the internal consistency of a scale. The rationale for internal consistency is that the individual items or indicators of the scale should all measure the same construct and therefore be highly correlated. There are two diagnostic measures of reliability: the item-to-total and inter-item correlations, or the reliability coefficient. If the researcher chooses the first method, the item-to-total correlations should exceed 0.50 and the inter-item correlations should exceed 0.30. For the reliability coefficient, Zikmund, Babin, Carr & Griffin (2010) provide guidelines for determining reliability, as in the table below:

Table 6: Coefficient alpha (α) to Determine Reliabilities

Coefficient alpha (α)      Indication
Between 0.80 and 0.95      Very good
Between 0.70 and 0.80      Good
Between 0.60 and 0.70      Fair
Below 0.60                 Poor
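Both diagnostic measures of reliability can be computed from a respondents-by-items score matrix. The sketch below uses the standard formula for coefficient alpha and a corrected item-to-total correlation (each item against the sum of the remaining items). The simulated data, function names and random seed are assumptions for illustration only.

```python
import numpy as np

def cronbach_alpha(items):
    """Coefficient alpha for a (respondents x items) score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_variances.sum() / total_variance)

def corrected_item_total(items):
    """Correlation of each item with the sum of the other items."""
    items = np.asarray(items, dtype=float)
    total = items.sum(axis=1)
    return np.array([
        np.corrcoef(items[:, j], total - items[:, j])[0, 1]
        for j in range(items.shape[1])
    ])

# Simulated scale: four items driven by one hypothetical trait plus noise.
rng = np.random.default_rng(0)
trait = rng.normal(size=200)
scores = np.column_stack([trait + rng.normal(scale=0.8, size=200) for _ in range(4)])

alpha = cronbach_alpha(scores)
item_total = corrected_item_total(scores)
print("Coefficient alpha:", round(alpha, 3))
print("Corrected item-to-total correlations:", item_total.round(3))
```

The resulting alpha can be read against Table 6, and the item-to-total correlations against the 0.50 guideline mentioned above.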

7.0 EXPLORATORY FACTOR ANALYSIS USING STATISTICAL PACKAGE FOR THE SOCIAL SCIENCES (SPSS)

Correlation Matrix

att1 att2 att3 att4 att5 att6 att7 att8 att9 att10 att11 att12 att13 att14 att15 att16
Correlation att1 1.000 .664 .250 .435 .490 .315 .378 .328 .574 .336 .575 .338 .176 .436 .379 .560
att2 .664 1.000 .383 .506 .444 .456 .345 .260 .525 .316 .468 .414 .320 .533 .480 .674
att3 .250 .383 1.000 .457 .210 .321 .216 .054 .217 .206 .231 .225 .429 .425 .314 .296
att4 .435 .506 .457 1.000 .351 .352 .336 .240 .415 .352 .405 .416 .331 .558 .439 .529
att5 .490 .444 .210 .351 1.000 .210 .318 .194 .303 .216 .603 .330 .188 .296 .238 .352
att6 .315 .456 .321 .352 .210 1.000 .358 .128 .379 .475 .329 .290 .276 .421 .311 .486
att7 .378 .345 .216 .336 .318 .358 1.000 .256 .373 .344 .332 .320 .175 .333 .265 .397
att8 .328 .260 .054 .240 .194 .128 .256 1.000 .348 .209 .215 .128 .128 .200 .231 .265
att9 .574 .525 .217 .415 .303 .379 .373 .348 1.000 .437 .368 .383 .203 .492 .398 .609
att10 .336 .316 .206 .352 .216 .475 .344 .209 .437 1.000 .366 .296 .181 .325 .289 .419
att11 .575 .468 .231 .405 .603 .329 .332 .215 .368 .366 1.000 .338 .176 .382 .333 .445
att12 .338 .414 .225 .416 .330 .290 .320 .128 .383 .296 .338 1.000 .186 .377 .266 .386
att13 .176 .320 .429 .331 .188 .276 .175 .128 .203 .181 .176 .186 1.000 .391 .233 .318
att14 .436 .533 .425 .558 .296 .421 .333 .200 .492 .325 .382 .377 .391 1.000 .428 .579
att15 .379 .480 .314 .439 .238 .311 .265 .231 .398 .289 .333 .266 .233 .428 1.000 .559
att16 .560 .674 .296 .529 .352 .486 .397 .265 .609 .419 .445 .386 .318 .579 .559 1.000

Anti-image Matrices

att1 att2 att3 att4 att5 att6 att7 att8 att9 att10 att11 att12 att13 att14 att15 att16
Anti-image Covariance att1 .399 -.141 -.001 -.012 -.051 .047 -.042 -.058 -.112 -.005 -.120 .029 .048 -.002 .014 -.016
att2 -.141 .366 -.057 -.013 -.054 -.080 .030 -.010 -.004 .051 .016 -.060 -.030 -.024 -.045 -.102
att3 -.001 -.057 .647 -.135 -.004 -.062 -.023 .075 .021 -.001 .006 .016 -.189 -.074 -.066 .063
att4 -.012 -.013 -.135 .519 -.027 .021 -.017 -.052 .011 -.046 -.024 -.097 -.019 -.107 -.057 -.047
att5 -.051 -.054 -.004 -.027 .570 .040 -.068 -.010 .010 .034 -.224 -.063 -.038 .021 .028 .009
att6 .047 -.080 -.062 .021 .040 .605 -.090 .041 -.010 -.183 -.039 -.004 -.033 -.046 .017 -.060
att7 -.042 .030 -.023 -.017 -.068 -.090 .719 -.089 -.024 -.061 -.002 -.075 .008 -.015 .001 -.035
att8 -.058 -.010 .075 -.052 -.010 .041 -.089 .817 -.102 -.032 .002 .053 -.055 .016 -.054 .017
att9 -.112 -.004 .021 .011 .010 -.010 -.024 -.102 .482 -.099 .039 -.071 .022 -.072 -.013 -.095
att10 -.005 .051 -.001 -.046 .034 -.183 -.061 -.032 -.099 .647 -.082 -.040 -.006 .023 -.014 -.029
att11 -.120 .016 .006 -.024 -.224 -.039 -.002 .002 .039 -.082 .491 -.025 .020 -.029 -.034 -.015
att12 .029 -.060 .016 -.097 -.063 -.004 -.075 .053 -.071 -.040 -.025 .711 .006 -.038 .011 .004
att13 .048 -.030 -.189 -.019 -.038 -.033 .008 -.055 .022 -.006 .020 .006 .737 -.094 .016 -.040
att14 -.002 -.024 -.074 -.107 .021 -.046 -.015 .016 -.072 .023 -.029 -.038 -.094 .501 -.026 -.064
att15 .014 -.045 -.066 -.057 .028 .017 .001 -.054 -.013 -.014 -.034 .011 .016 -.026 .632 -.125
att16 -.016 -.102 .063 -.047 .009 -.060 -.035 .017 -.095 -.029 -.015 .004 -.040 -.064 -.125 .362
Anti-image Correlation att1 .897a -.369 -.002 -.026 -.107 .095 -.079 -.101 -.256 -.009 -.271 .055 .089 -.005 .028 -.042
att2 -.369 .911a -.118 -.030 -.118 -.169 .059 -.019 -.008 .105 .037 -.118 -.058 -.055 -.095 -.280
att3 -.002 -.118 .865a -.234 -.006 -.099 -.034 .104 .037 -.001 .011 .024 -.274 -.130 -.104 .131
att4 -.026 -.030 -.234 .939a -.050 .037 -.027 -.079 .023 -.079 -.047 -.159 -.031 -.209 -.100 -.107
att5 -.107 -.118 -.006 -.050 .875a .069 -.106 -.015 .018 .056 -.423 -.098 -.058 .039 .047 .021
att6 .095 -.169 -.099 .037 .069 .907a -.136 .058 -.019 -.293 -.072 -.007 -.049 -.083 .027 -.127
att7 -.079 .059 -.034 -.027 -.106 -.136 .950a -.117 -.041 -.089 -.003 -.106 .011 -.025 .002 -.070
att8 -.101 -.019 .104 -.079 -.015 .058 -.117 .894a -.162 -.044 .003 .069 -.071 .024 -.075 .032
att9 -.256 -.008 .037 .023 .018 -.019 -.041 -.162 .921a -.177 .080 -.121 .036 -.147 -.023 -.229
att10 -.009 .105 -.001 -.079 .056 -.293 -.089 -.044 -.177 .901a -.145 -.059 -.008 .041 -.022 -.061
att11 -.271 .037 .011 -.047 -.423 -.072 -.003 .003 .080 -.145 .883a -.042 .034 -.057 -.060 -.036
att12 .055 -.118 .024 -.159 -.098 -.007 -.106 .069 -.121 -.059 -.042 .944a .009 -.063 .016 .008
att13 .089 -.058 -.274 -.031 -.058 -.049 .011 -.071 .036 -.008 .034 .009 .887a -.154 .023 -.078
att14 -.005 -.055 -.130 -.209 .039 -.083 -.025 .024 -.147 .041 -.057 -.063 -.154 .946a -.047 -.150
att15 .028 -.095 -.104 -.100 .047 .027 .002 -.075 -.023 -.022 -.060 .016 .023 -.047 .943a -.262
att16 -.042 -.280 .131 -.107 .021 -.127 -.070 .032 -.229 -.061 -.036 .008 -.078 -.150 -.262 .922a
a. Measures of Sampling Adequacy(MSA)

KMO and Bartlett's Test

Kaiser-Meyer-Olkin Measure of Sampling Adequacy. .914
Bartlett's Test of Sphericity Approx. Chi-Square 2491.010
df 120
Sig. .000

Communalities

Initial Extraction
att1 .601 .617
att2 .634 .606
att3 .353 .526
att4 .481 .514
att5 .430 .645
att6 .395 .360
att7 .281 .278
att8 .183 .164
att9 .518 .598
att10 .353 .308
att11 .509 .576
att12 .289 .274
att13 .263 .320
att14 .499 .550
att15 .368 .356
att16 .638 .682
Extraction Method: Principal Axis Factoring.

Total Variance Explained

Factor Initial Eigenvalues Extraction Sums of Squared Loadings Rotation Sums of Squared Loadings
Total % of Variance Cumulative % Total % of Variance Cumulative % Total % of Variance Cumulative %
dimension0 1 6.452 40.324 40.324 5.959 37.243 37.243 3.346 20.915 20.915
2 1.340 8.373 48.697 .833 5.206 42.449 2.150 13.438 34.353
3 1.062 6.639 55.336 .582 3.637 46.086 1.877 11.733 46.086
4 .951 5.942 61.278
5 .841 5.253 66.531
6 .756 4.727 71.257
7 .656 4.101 75.359
8 .643 4.017 79.376
9 .577 3.608 82.985
10 .528 3.298 86.283
11 .499 3.118 89.401
12 .421 2.633 92.033
13 .389 2.431 94.464
14 .348 2.176 96.640
15 .302 1.889 98.529
16 .235 1.471 100.000
Extraction Method: Principal Axis Factoring.

Factor Matrix (a)

Factor
1 2 3
att16 .797
att2 .778
att1 .725 -.301
att14 .702
att9 .696 -.324
att4 .688
att11 .643 -.333
att15 .581
att6 .569
att5 .562 -.401 .410
att10 .526
att12 .522
att7 .519
att3 .487 .441 .308
att13 .412 .345
att8 .352
Extraction Method: Principal Axis Factoring.
a. 3 factors extracted. 18 iterations required.

Rotated Factor Matrix (a)

Factor
1 2 3
att9 .732
att16 .710 .359
att1 .567 .525
att2 .555 .391 .382
att10 .498
att6 .464 .366
att15 .462 .344
att7 .424
att8 .370
att12 .361
att3 .709
att13 .545
att14 .470 .544
att4 .403 .527
att5 .770
att11 .341 .655
Extraction Method: Principal Axis Factoring. Rotation Method: Varimax with Kaiser Normalization.
a. Rotation converged in 6 iterations.

Factor Transformation Matrix

Factor 1 2 3
dimension0 1 .717 .515 .470
2 -.113 .751 -.650
3 -.688 .412 .597
Extraction Method: Principal Axis Factoring. Rotation Method: Varimax with Kaiser Normalization.
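Two properties of an orthogonal (Varimax) rotation can be checked against the output above: the Factor Transformation Matrix should be orthogonal, and the total of the sums of squared loadings should be the same before and after rotation. The sketch below copies the published three-decimal values, so small rounding differences are expected. The variable names are arbitrary.

```python
import numpy as np

# Factor Transformation Matrix as reported above (rounded to three decimals).
T = np.array([
    [ 0.717, 0.515,  0.470],
    [-0.113, 0.751, -0.650],
    [-0.688, 0.412,  0.597],
])

# For an orthogonal rotation, T multiplied by its transpose is the identity.
max_deviation = np.abs(T @ T.T - np.eye(3)).max()

# Sums of squared loadings from the Total Variance Explained table.
extraction_ss = np.array([5.959, 0.833, 0.582])
rotation_ss = np.array([3.346, 2.150, 1.877])

print("Largest deviation of T T' from the identity:", round(max_deviation, 4))
print("Total before rotation:", round(extraction_ss.sum(), 3))
print("Total after rotation: ", round(rotation_ss.sum(), 3))
```

Rotation leaves the total explained common variance essentially unchanged but spreads it more evenly: the first factor's share falls from 5.959 to 3.346, while the second and third factors gain.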