Question
Asked 30th Apr, 2016

Factor Analysis issue (PCA) - data validity issue?

I have already obtained the authors' permission to use the TEIP and SACIE-R questionnaires in my research study. I have read their documents on how they established validity using factor analysis.
However, since I am administering their questionnaires to my own respondents (400 teachers in total), I would like to test the validity of my data by running a factor analysis in SPSS.
When I ran the factor analysis in SPSS, I noticed that the factor loadings under the different components in the Pattern Matrix differed from the original authors' factor analysis results. Some questions showed high factor loadings in the authors' results but much lower loadings in my case. For example:
Question 1) I find it difficult to overcome my initial shock when meeting people with severe physical disabilities.
Authors' factor loading (under Component 1): 0.730
My factor loading (under Component 1): 0.026 
For the same question, my result shows that this item is only weakly associated with that factor, since its loading falls below 0.3 (a commonly used, if somewhat arbitrary, cut-off).
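The screening rule I am applying can be sketched in a few lines. This is only an illustration: the item names and most loading values below are made up, and only the 0.026 figure comes from my own results.

```python
# Minimal sketch: flag pattern-matrix loadings below a |0.3| cut-off.
# Item names and most values are illustrative; 0.026 is the loading
# reported for Question 1 in this sample.
loadings = {
    "Q1_initial_shock": 0.026,
    "Q2_example_item": 0.730,
    "Q3_example_item": -0.415,
}

CUTOFF = 0.3
weak_items = [item for item, value in loadings.items() if abs(value) < CUTOFF]
print(weak_items)  # → ['Q1_initial_shock']
```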
So my question is: even though we use the same questions, and I took every precaution to ensure that my study meets most of the requirements, should I keep this question in my questionnaire? What should I do with my data, which contradicts the original authors' factor analysis results? Should I accept the authors' results and the original questionnaire blindly even though my own investigation does not agree with their findings?
Your advice is most appreciated.
Thanks

All Answers (5)

Ines Kawgan-Kagan
AEM Institute
Dear Yap,
First of all, I am sure you can use the questions and data anyway.
If you only need the answers to these questions for your analysis, everything is fine. If you actually need to regenerate the components as was done before, just be transparent about the issue. If your findings contradict previous studies, then mention it. You might even find an interesting reason for the difference. Discuss it!
Please keep in mind that PCA is not the same as factor analysis.
Best,
Ines
Yap Adrian
Waseda University
Dear Ines,
Thank you for your kind response; it helps clarify my doubts on this issue.
Just to reconfirm my understanding: there is no problem with using these questionnaires and data even though my findings contradict the previous studies, as long as I mention that there are discrepancies between the previous findings and my own.
I have one simple question for you. Why do you think everything is fine if I only need the answers to these questions for my analysis? May I ask you for a slightly more detailed explanation, please?
Best
Adrian
Lisete Mónico
University of Coimbra
Dear Yap,
You can follow two types of reasoning:
1) The easier (I don't recommend it): you accept the authors' results and original questionnaire blindly (you don't run EFA or CFA).
2) The more accurate: a) you run a confirmatory factor analysis (CFA) and check the fit of the model. If the model doesn't fit, your sample probably differs from the original one and you should test another factor structure. b) You run an exploratory factor analysis (EFA) and try to obtain an interpretable solution, eliminating items with low factor loadings (below .50, though below .40 is also a commonly used criterion). Then run your model again, rotating the factors: varimax rotation if you want uncorrelated factors, oblimin rotation if you want correlated factors. Test the reliability as well (with Cronbach's alpha).
Good luck with your research!
Yap Adrian
Waseda University
Dear Lisete,
Thank you so much for your suggestions. I have just one simple question.
If I adopt your second suggestion but do not run CFA or EFA, and instead only use Cronbach's alpha to test the reliability of my questionnaire instruments: currently my Cronbach's alpha is above .70, which suggests the instrument is reliable in this sense. Is it acceptable to rely on the Cronbach's alpha value alone?
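For reference, Cronbach's alpha is just the number of items scaled by one minus the ratio of the summed item variances to the total-score variance. A minimal pure-Python sketch, with invented response data for illustration:

```python
# Minimal sketch of Cronbach's alpha from raw item responses.
# `items` is a list of per-item score lists, one entry per respondent;
# the data in the usage example are invented for illustration.
def cronbach_alpha(items):
    k = len(items)        # number of items
    n = len(items[0])     # number of respondents

    def var(xs):          # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    totals = [sum(item[i] for item in items) for i in range(n)]
    return k / (k - 1) * (1 - sum(var(item) for item in items) / var(totals))

# Three perfectly parallel items give the maximum internal consistency.
print(cronbach_alpha([[1, 2, 3, 4, 5]] * 3))  # ≈ 1.0
```

Note that a high alpha only shows the items hang together as a scale; it says nothing about whether they load on the intended factors.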
Imtiaz Ahmad
University of Jhang
Definitely, you have to perform CFA in order to confirm the construct's validity in your local settings.

Similar questions and discussions

What does a CFA factor loading greater than 1.00 with standardised loadings in RStudio mean?
Discussion
3 replies
  • Nynke van Ketel
Hi,
I am running a CFA in RStudio and would like to get standardised factor loadings. However, I am still receiving a factor loading greater than 1.00 (fb13 under f4 in the output below). Is this possible?
My thoughts:
- I thought I had used the right formula by adding standardized=TRUE as the last argument.
- I thought the factor loadings can be identified by looking under Latent Variables - Estimates.
Am I typing the wrong formula, looking at the wrong place, or is my assumption that standardised factor loadings can not be greater than 1.00 wrong? (or something else)?
Thanks!
INPUT
#cfa part 2, 18 items, 6 factors, ordinal data, unit variance identification
model <- "f1 =~ fb1 +fb2 + fb3 +fb14
f2 =~ fb9 + fb4
f3 =~ fb6 + fb7
f4 =~ fb12 + fb13
f5 =~ fb15 + fb16 +fb21
f6 =~ fb18 + fb19 +fb20 +fb22 + fb23"
fit <- cfa(model, std.lv = TRUE, data = p2, ordered = items)
#standardised factor loadings
summary(fit,standardized=TRUE)
OUTPUT
lavaan 0.6-11 ended normally after 44 iterations
Estimator DWLS
Optimization method NLMINB
Number of model parameters 69
Number of observations 328
Model Test User Model:
Standard Robust
Test Statistic 174.908 245.633
Degrees of freedom 120 120
P-value (Chi-square) 0.001 0.000
Scaling correction factor 0.872
Shift parameter 45.025
simple second-order correction
Parameter Estimates:
Standard errors Robust.sem
Information Expected
Information saturated (h1) model Unstructured
Latent Variables:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
f1 =~
fb1 0.557 0.075 7.412 0.000 0.557 0.477
fb2 0.446 0.077 5.775 0.000 0.446 0.453
fb3 0.788 0.076 10.437 0.000 0.788 0.669
fb14 0.514 0.080 6.459 0.000 0.514 0.527
f2 =~
fb9 0.886 0.160 5.543 0.000 0.886 0.836
fb4 0.646 0.120 5.390 0.000 0.646 0.618
f3 =~
fb6 0.905 0.058 15.548 0.000 0.905 0.790
fb7 1.013 0.054 18.597 0.000 1.013 0.827
f4 =~
fb12 0.588 0.092 6.430 0.000 0.588 0.523
fb13 1.146 0.117 9.792 0.000 1.146 1.038
f5 =~
fb15 0.879 0.059 14.886 0.000 0.879 0.743
fb16 0.866 0.057 15.177 0.000 0.866 0.767
fb21 0.798 0.064 12.463 0.000 0.798 0.678
f6 =~
fb18 0.831 0.051 16.323 0.000 0.831 0.694
fb19 0.761 0.055 13.729 0.000 0.761 0.694
fb20 0.785 0.056 13.960 0.000 0.785 0.676
fb22 0.681 0.058 11.655 0.000 0.681 0.624
fb23 0.586 0.063 9.292 0.000 0.586 0.518
Covariances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
f1 ~~
f2 -0.004 0.086 -0.043 0.966 -0.004 -0.004
f3 0.053 0.086 0.619 0.536 0.053 0.053
f4 0.542 0.081 6.699 0.000 0.542 0.542
f5 0.267 0.086 3.112 0.002 0.267 0.267
f6 0.244 0.085 2.861 0.004 0.244 0.244
f2 ~~
f3 0.300 0.077 3.904 0.000 0.300 0.300
f4 0.002 0.068 0.036 0.971 0.002 0.002
f5 0.205 0.085 2.409 0.016 0.205 0.205
f6 0.177 0.088 2.012 0.044 0.177 0.177
f3 ~~
f4 0.097 0.071 1.369 0.171 0.097 0.097
f5 0.433 0.067 6.490 0.000 0.433 0.433
f6 0.711 0.048 14.976 0.000 0.711 0.711
f4 ~~
f5 0.262 0.074 3.537 0.000 0.262 0.262
f6 0.264 0.074 3.583 0.000 0.264 0.264
f5 ~~
f6 0.681 0.058 11.657 0.000 0.681 0.681
Intercepts:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.fb1 2.628 0.065 40.726 0.000 2.628 2.249
.fb2 3.497 0.054 64.223 0.000 3.497 3.546
.fb3 3.314 0.065 50.901 0.000 3.314 2.811
.fb14 3.869 0.054 71.807 0.000 3.869 3.965
.fb9 1.774 0.059 30.321 0.000 1.774 1.674
.fb4 1.649 0.058 28.602 0.000 1.649 1.579
.fb6 3.210 0.063 50.792 0.000 3.210 2.805
.fb7 2.814 0.068 41.563 0.000 2.814 2.295
.fb12 3.308 0.062 53.250 0.000 3.308 2.940
.fb13 3.271 0.061 53.645 0.000 3.271 2.962
.fb15 2.802 0.065 42.871 0.000 2.802 2.367
.fb16 3.055 0.062 48.993 0.000 3.055 2.705
.fb21 2.662 0.065 40.935 0.000 2.662 2.260
.fb18 2.741 0.066 41.450 0.000 2.741 2.289
.fb19 3.421 0.061 56.444 0.000 3.421 3.117
.fb20 3.192 0.064 49.787 0.000 3.192 2.749
.fb22 2.994 0.060 49.650 0.000 2.994 2.741
.fb23 3.497 0.062 56.016 0.000 3.497 3.093
f1 0.000 0.000 0.000
f2 0.000 0.000 0.000
f3 0.000 0.000 0.000
f4 0.000 0.000 0.000
f5 0.000 0.000 0.000
f6 0.000 0.000 0.000
Variances:
Estimate Std.Err z-value P(>|z|) Std.lv Std.all
.fb1 1.056 0.096 11.050 0.000 1.056 0.773
.fb2 0.773 0.082 9.421 0.000 0.773 0.795
.fb3 0.769 0.107 7.165 0.000 0.769 0.553
.fb14 0.688 0.080 8.649 0.000 0.688 0.722
.fb9 0.339 0.265 1.278 0.201 0.339 0.302
.fb4 0.674 0.165 4.071 0.000 0.674 0.618
.fb6 0.492 0.076 6.493 0.000 0.492 0.375
.fb7 0.476 0.086 5.553 0.000 0.476 0.317
.fb12 0.919 0.111 8.298 0.000 0.919 0.726
.fb13 -0.094 0.260 -0.363 0.717 -0.094 -0.077
.fb15 0.628 0.080 7.819 0.000 0.628 0.448
.fb16 0.525 0.073 7.211 0.000 0.525 0.412
.fb21 0.750 0.090 8.330 0.000 0.750 0.541
.fb18 0.744 0.071 10.503 0.000 0.744 0.519
.fb19 0.625 0.062 10.119 0.000 0.625 0.519
.fb20 0.732 0.067 10.985 0.000 0.732 0.543
.fb22 0.729 0.070 10.346 0.000 0.729 0.611
.fb23 0.935 0.079 11.804 0.000 0.935 0.732
f1 1.000 1.000 1.000
f2 1.000 1.000 1.000
f3 1.000 1.000 1.000
f4 1.000 1.000 1.000
f5 1.000 1.000 1.000
f6 1.000 1.000 1.000
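One reading of the output above: the standardised loading of 1.038 for fb13 goes together with the negative standardised residual variance of -0.077 for .fb13. With standardised variables and a single factor behind an item, the implied residual variance is 1 minus the squared loading, so a standardised loading above 1.00 forces it negative (a Heywood case). That usually points to a mis-specified model or an unstable estimate rather than to a wrong formula or a wrong place in the output. A quick check of the arithmetic:

```python
# With standardised variables and a single factor behind fb13,
# implied residual variance = 1 - loading**2.
loading_fb13 = 1.038           # Std.all loading from the output above
residual_var = 1 - loading_fb13 ** 2
print(round(residual_var, 3))  # → -0.077, matching the .fb13 Std.all variance
```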
