Question
Asked 2nd Apr, 2015
  • Kathmandu University School of Management

Is it mandatory to have the value of Cronbach's alpha above .70?

In my study using a questionnaire survey, I have four variables, and job satisfaction is one independent variable. I measured job satisfaction with a 3-item scale. Responses were captured on a 5-point Likert scale. The value of Cronbach's alpha is .65. However, the values of Cronbach's alpha for the other variables (also captured with 5-point Likert scales) are above .70. Can I use the data (with job satisfaction having an alpha value below .70) to test my hypotheses?

Most recent answer

Asad Abbas
Tecnológico de Monterrey
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297-334. doi:10.1007/bf02310555
2 Recommendations

Popular answers (1)

  • Nunnally (1978) recommends a minimum level of .7. Cronbach's alpha values are dependent on the number of items in the scale.
  • When there is a small number of items in the scale (fewer than 10), Cronbach's alpha values can be quite small.
  • In this situation, it may be better to calculate and report the mean inter-item correlation for the items (a short computation sketch follows the references below).
  • Optimal mean inter-item correlation values range from .2 to .4 (as recommended by Briggs & Cheek, 1986).
  • Source: Pallant, J. (2013). SPSS survival manual. McGraw-Hill Education (UK).
  • Please look at Table 1, "Selected Recommended Reliability Levels", in Peterson (1994, p. 382).
  1. Starkweather, J. (2012). Step out of the past: Stop using coefficient alpha; there are better ways to calculate reliability. University of North Texas. Research and statistical support.
  2. Serbetar, I., & Sedlar, I. (2016). Assessing Reliability of a Multi-Dimensional Scale by Coefficient Alpha. Revija za Elementarno Izobrazevanje, 9(1/2), 189.
  3. http://data.library.virginia.edu/using-and-interpreting-cronbachs-alpha/
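For anyone who wants to check these two quantities on their own data, here is a minimal sketch (Python with NumPy; the response matrix is invented, rows = respondents, columns = items) that computes Cronbach's alpha and the mean inter-item correlation mentioned above:

import numpy as np

# Invented example: 6 respondents answering a 3-item, 5-point scale
items = np.array([
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 4],
    [3, 3, 2],
    [1, 2, 2],
    [4, 4, 5],
], dtype=float)

k = items.shape[1]                         # number of items
item_vars = items.var(axis=0, ddof=1)      # per-item variances
total_var = items.sum(axis=1).var(ddof=1)  # variance of the sum score
alpha = (k / (k - 1)) * (1 - item_vars.sum() / total_var)

r = np.corrcoef(items, rowvar=False)       # inter-item correlation matrix
mean_r = r[np.triu_indices(k, 1)].mean()   # mean of the off-diagonal correlations

print(f"alpha = {alpha:.2f}, mean inter-item r = {mean_r:.2f}")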
24 Recommendations

All Answers (77)

Subhash Chandra
Agriculture Victoria Research
Arjun - what is the hypothesis you intend to test? That Cronbach's alpha should be >0.70 is absolutely arbitrary. 
Manfred Hammerl
Karl-Franzens-Universität Graz
"with a 3-item scale"
Be aware that Cronbach's alpha depends not only on the correlations between the items in your scale but also on the number of items (here, 3).
With the same average correlation between the items, Cronbach's alpha would be higher if you had 4 items, 5 items, etc.
So I think .65 is quite OK for a 3-item scale.
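A rough back-of-the-envelope illustration of this point (using the standardized-alpha / Spearman-Brown form and assuming equally correlated items, with the average inter-item correlation of roughly .38 implied by an alpha of .65 on 3 items):

alpha = k * r_mean / (1 + (k - 1) * r_mean)
k = 3:  3 * .38 / (1 + 2 * .38)  ≈ .65
k = 4:  4 * .38 / (1 + 3 * .38)  ≈ .71
k = 5:  5 * .38 / (1 + 4 * .38)  ≈ .76

So with the same average inter-item correlation, simply lengthening the scale would push alpha over .70.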
Holger Steinmetz
Universität Trier
Hi folks,
a) The .70 cut-off is a myth; see
Lance, C. E., Butts, M. M., & Michels, L. C. (2006). The sources of four commonly reported cutoff criteria: What did they really say? Organizational Research Methods, 9(2), 202–220.
b) In most cases, a low alpha results from conceptual heterogeneity (the items measure different things) rather than from low reliability.
c) Alpha is downward biased if the assumption of tau-equivalence (i.e., that the items measure the same thing and the loadings are equal) fails - which is almost always the case.
However, although Manfred is correct about the role of the number of items, I would not concur with his conclusion :) If the assumptions hold (let's assume they do), then low reliability is low reliability, whether it concerns 10 items or 3. You (Manfred) are correct that this means the 3 items themselves are individually more reliable than the 10 items would be, but the conclusion for the whole scale is the same.
Best,
Holger
1 Recommendation
Manfred Hammerl
Karl-Franzens-Universität Graz
Of course, in scale construction the most important thing is to measure a construct in a valid way, i.e., to find content-related items which measure the construct in a broad sense.
If this job satisfaction scale is such a scale, it's OK.
But if it is a scale like "I'm happy with my job" / "I'm satisfied with my job" / "I'm not really satisfied with my job" (R), then it's not a good scale, and in that case Cronbach's alpha is not the main problem...
Holger Steinmetz
Universität Trier
Hi Manfred, here we *might* disagree again :)
You seem to have the goal of creating composites consisting of different things ("...broad sense"). I, in contrast, have the goal of *measuring* variables that reflect existing things. Hence, we might differ with regard to the ontological approach to constructs and measurement.
For my purposes/orientation, your proposed job satisfaction scale would indeed be a good scale, as each of these items is a valid measure of the same underlying phenomenon (i.e., the degree of job satisfaction), and hence deviations among the responses to these items (1 - r) would indicate nothing more than measurement error.
Although I also sometimes build composites, I always do this with some regret, because composite creation is like inventing something that did not exist before. This strongly contradicts my view of science (i.e., to investigate naturally existing phenomena).
In this regard, I am a huge fan of the work of John Cadogan (who is also on ResearchGate).
Lee, N., & Cadogan, J. W. (2013). Problems with formative and higher-order reflective variables. Journal of Business Research, 66(2), 242–247.
Best,
Holger
1 Recommendation
Manfred Hammerl
Karl-Franzens-Universität Graz
"For my purposes/orientation, your proposed job satisfation scale would indeed be a good scale as each of these items is a valid measure of the same underlying phenomenon"
In my opinion, this would be an "artificial" scale - one would not need such a scale but could just ask one of these items. At least that's what I learned from the psychologists (e.g., Markus Bühner, who is now in Munich).
Of course, some researchers develop such scales for practical reasons :-) (they need them for structural equation modelling).
Paul E. Spector
University of South Florida
Things to consider:
You can still use the scale, but the correlations are likely attenuated relative to what they would have been if the scale had higher reliability.
Was this an existing scale, and if so, is the reliability typically better? If you translated it, or even used it in a country that is culturally dissimilar to where the scale was developed, it might have lower reliability.
Spector, P. E., Liu, C., and Sanchez, J. I. (2015). Methodological and substantive issues in conducting multinational and cross-cultural research. Annual Review of Organizational Psychology and Organizational Behavior, 2:9, 1.9-31. [Early view]
If items are written in opposite directions, this could result in reduced alpha because item correlations are affected, especially if translated to another language or used in a culturally dissimilar setting.
Spector, P. E., Van Katwyk, P. T., Brannick, M. T., & Chen, P. Y. (1997). When two factors don’t reflect two constructs: How item characteristics can produce artifactual factors. Journal of Management, 23, 659-678.
2 Recommendations
Hello,
Cronbach's alpha is some kind of persisting joke: it gets better the more items you have.
Some people even invent items which are nearly the same as items already in a scale just to get better alpha scores...
Yes, you can use your data. An alpha of .65 is not so bad considering that you only have 3 items.
There is also reason to believe that job satisfaction is measurable with just one item: "Are you satisfied with your current job?"
Job satisfaction may not be much of a complex construct after all :D
Donaldo D Canales
University of New Brunswick
Hi Arjun.
As stated by others, your low alpha value could be a result of a low number of questions or poor interrelatedness between items (due to too much heterogeneity in the construct). Different researchers use different cut-off thresholds, and quite honestly, you could spend weeks reading differing opinions. I agree that the value of .70 is somewhat arbitrary, but for most practical purposes, minimum values between .70 and .80 are acceptable. You don't want values substantially lower than .70; they are indicative of an unreliable scale and high measurement error, and you may experience some real problems with your data.
Also, alpha values should be interpreted within the context of the research area. For example, minimum values of .80 are generally accepted in cognitive research, whereas in social research values of .70 are acceptable. Given your research area of job satisfaction, I think that an alpha of .65 is OK considering your scale has only 3 items.
1 Recommendation
Nathan Thompson
Assessment Systems Corporation
As others have said, the cutoff is totally arbitrary.  You certainly can't expect higher than that with only 3 items.  The number of items should be of far greater concern to you - you are unlikely to meet the needs of the assessment.
Han Suelmann
Van Hall Larenstein, Leeuwarden
The issue of internal consistency was addressed - and I would say: resolved - by Bollen in 1984. His answer is that it depends on the relationship between the construct that is measured and the items that make up the scale. If the items measure things that are caused by that construct, high internal consistency is required. If the items measure things that make up the construct that is measured, there is no necessary relationship, and correlations between items of a scale might even be negative. This is also known as the distinction between an indicative and a formative scale.
Of course, all this doesn't resolve the arbitrariness of any cutoff, nor the issue that Cronbach's alpha is a lower bound (not an unbiased estimate) of reliability. (A better lower bound had already been published by Guttman when Cronbach published his alpha.)
Reference: Bollen, K.A. (1984). "Multiple indicators: Internal consistency or no necessary relationship?" Quality and Quantity, 18, 377-385.
1 Recommendation
Israel Souza
Instituto Federal de Educação, Ciência e Tecnologia do Rio de Janeiro (IFRJ)
I agree with Manfred and Holger.
It depends on your goals. For 3 items, .65 is good.
Have you tried McDonald's omega?
Holger Steinmetz
Universität Trier
Hey guys,
some comments:
1) With regard to the similarity of items: I agree with Manfred insofar as items that are very similar in their wording can sometimes have problematic consequences: they may distort the factor model, as the items influence each other rather than an underlying construct influencing the items. Further, respondents may be quite annoyed by having to respond to nearly identical items.
Hence, the crux of item generation is to invent items SIMILAR ENOUGH that they reflect the same underlying latent variable (not "construct") but NOT TOO similar, to avoid those negative consequences. However, most scales that I have investigated in the last 10 years contain items that are far more heterogeneous and thus simply measure different things. No wonder that 99% of the "established scales" show bad results when tested with rigid methods (e.g., CFA).
2) With regard to single-indicator (SI) vs. multi-indicator (MI) measures: The ONLY thing relevant for enabling the use of SI measures is the amount of measurement error, which results from a) the clarity of the item formulation (--> random error) and b) conceptual closeness (--> systematic error) - that is, how directly the item focuses on the connotation of the focal latent variable. In this regard, SI measures of job satisfaction work well, as the item "I am satisfied with my job" fulfills both criteria.
When researchers have the impression that SI measures cannot reflect the "complexity of the construct", this is a sign of poor construct conceptualization, as THE concept has a single term and is often interpreted as a singular phenomenon, whereas the complexity (= multidimensionality) is pushed into the measurement model. Multidimensionality belongs on the conceptual level.
3) Different cut-offs for reliability: I would be interested in why reliability should depend on the context. Measurement error (of an independent variable) leads to endogeneity, that is, bias in the effects on dependent variables. Why should the severity of this bias depend on the context?
4) The fact that alpha rises with the number of items is no joke but makes total sense: as long as the assumptions hold (tau-equivalence), adding more error-containing variables reduces the random error of the total sum. The problem is that the assumptions fail so quickly. In my experience, it is nearly impossible to create more than 3 items that are dissimilar enough in wording yet still measure the same latent variable.
Cheers
Holger
1 Recommendation
Arjun Kumar Shrestha
Kathmandu University School of Management
Hi All
Thank you all for your answers/comments. Let me make my point clearer. This scale, as Manfred pointed out, has three items: "All in all, I am satisfied with my job", "In general, I don't like my job" (R), and "In general, I like working here". From your discussions, what I have inferred is that it is better to have an alpha value above .70, but this is not an absolute cut-off value. When we have a smaller number of items, an alpha value of less than .70 could be acceptable.
Thanks once again.
Holger Steinmetz
Universität Trier
Hi Arjun,
I would regard "all in all, I am satisfied with my job" as the FOCAL indicator, because exactly this aspect is the connotation of your intended latent variable.
For example, these two studies show that this item is the most valid version and is sufficient for measuring job satisfaction:
Nagy, M. S. (2002). Using a single-item approach to measure facet job satisfaction. Journal of Occupational and Organizational Psychology, 75, 77–86.
Wanous, J. P., Reichers, A. E., & Hudy, M. J. (1997). Overall job satisfaction: How good are single-item measures? Journal of Applied Psychology, 82(2), 247–252.
It seems that one of the other two items adds something different, leading to a lack of correlation between them. Take a look at the correlation matrix: what's the outlier? I would suspect that the negatively worded item is the bad boy. I tend to avoid negatively worded items. There are tons of studies showing that such inverted items measure different things than the positively formulated version of the same item. See, e.g.,
Green, D. P., and J. Citrin (1994). Measurement Error and the Structure of Attitudes: Are Positive and Negative Judgments Opposites? American Journal of Political Science, 38(1), 256-281.
Melnick, S.A. and Gable, R.K. (1990). The use of negative item stems: A cautionary note. Educational Research Quarterly, 14(3), 31–36.
Horan, P. M., DiStefano, C., & Motl, R. W. (2003). Wording Effects in Self-Esteem Scales: Methodological Artifact or Response Style? Structural Equation Modeling, 10(3), 435–455.
Miller, T. R., and T. A. Cleary (1993). Direction of Wording Effects in Balanced Scales. Educational and Psychological Measurement, 53, 51-60.
Marsh H. W. (1996). Positive and Negative Global Self-Esteem: A Substantively Meaningful Distinction or Artifactors? Journal of Personality and Social Psychology, 70(4), 810-819.
HTH
Holger
José Vasconcelos-Raposo
Universidade de Trás-os-Montes e Alto Douro
So far you have received very good advice, as it is consolidated in the literature.
However, one should not lose track of the fact that, in the end, what really matters is the researcher's intellectual integrity. Thus, if you have obtained an alpha of .65, report this number and complement it with other relevant data details, as previous colleagues have suggested in their answers.
It is yet to be demonstrated that humans are organized in a decimal manner. Mathematics (a cultural construction or human invention) is only good as far as it helps us systematically gather and report data.
Provide rich methodological details in your publication so that others may be able to replicate your study. This is the overall purpose of publishing.
4 Recommendations
Frederick Dayour
SD Dombo University of Business and Integrated Development Studies
I agree that Cronbach's alpha coefficients tell whether or not the items under the various constructs effectively measure those constructs. Different books recommend different threshold values for such reliability analysis. Pallant (2005) recommends that for a be
1 Recommendation
Kodzo Awoenam Adedzi
Université Laval
If the value of the Cronbach's alpha coefficient is greater than or equal to 0.70, the result is acceptable, since the coefficient meets or exceeds the minimum threshold of 0.70 (Nunnally, 1978). But the threshold is arbitrary, even though it is widely accepted by the scientific community as indicating satisfactory internal consistency.
Note that certain authors set the minimum threshold at 0.75 or 0.80, while others set it at 0.60. However, when the alpha value is 0.70, the standard error of measurement equals more than half (0.55) of the standard deviation of the distribution of the total score.
1 Recommendation
Richard Windsor
George Washington University
Dear Colleague,
Using 0.70, or even lower, represents poor measurement of any construct. I strongly recommend, have routinely applied in my NIH-funded evaluation studies, and have taught my graduate students that a study needs to set > 0.80 as a base Cronbach's alpha. In Chapter 4, "Measurement and Analyses in Evaluation", of my graduate-level textbook "Evaluation of Health Promotion-Disease Prevention and Management Programs", 5th Edition, Oxford University Press, 2015, a detailed discussion and multiple examples of scales with factor and psychometric analyses confirming validity and reliability, including item analyses, internal consistency analyses, and stability analyses, are presented.
Meta-analyses of the scales used in evaluation studies confirm that > 75% DID NOT produce evidence of the validity and reliability of the scale used in the evaluation.
2 Recommendations
Hi Arjun
In general, I would say that .65 for a three-item index or scale is sufficient, but unfortunately it would be a bit hard to substantiate in the literature. Most reliable sources, e.g. those cited above by other colleagues, legitimate the .70 threshold. However, we have to remember that Cronbach's alpha is a measure of the homogeneity (vs. heterogeneity) of the scale/index. In fact, it is all about the indicators' (items') inter-correlations and their partial correlations with the whole scale/index you'd like to name after your construct. Nothing more, nothing less. As far as I know, it doesn't matter what kind of construct you try to measure and what kind of indicators you use. It is just mathematics.
If you'd like to justify the use of an index whose alpha is below .70, you can try to cite Field, A. P. (2009). Discovering statistics using SPSS (3rd ed.). London: Sage Publications, who analyses the issue of using Cronbach's alpha extensively. For example, Field discusses there the problem of alpha rising as items are added to an index/scale; in other words, the more items, the higher the alpha, even if the 'real' reliability (I mean reliability based on the logic behind the set of indicators) is not rising.
I would also suggest checking the so-called corrected item-total correlations (item-rest correlations) for each of your scale/index items. They should be over .50, which means that the item is at least moderately correlated with the whole scale/index.
The third problem is, in my opinion: if you asked literally these three questions - "All in all, I am satisfied with my job", "In general, I don't like my job" (R), and "In general, I like working here" - and the Cronbach's alpha is below .70, you should double-check the data, looking for typos and other errors (unnatural outliers, etc.). It seems highly unlikely that answers to three such nearly identical questions were not nearly perfectly correlated... The Cronbach's alpha should be over .85 here, or even over .90, as each of these three seems to be a focal indicator (as Holger Steinmetz said above). I would bet that the problem occurred somewhere in the reversed item, either in data input (the reversed item was not reversed properly) or because participants had a problem answering with a double negation ("I don't like..." and then the Likert-type response "not agree"...).
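A minimal sketch of the two checks suggested above (recoding the reversed item and computing corrected item-total, i.e. item-rest, correlations); the data and array names are invented, Python with NumPy:

import numpy as np

# Invented responses to the three items (5-point scale); the third column is the reversed item
data = np.array([
    [5, 4, 2],
    [4, 4, 2],
    [2, 3, 4],
    [5, 5, 1],
    [3, 3, 3],
    [4, 5, 2],
], dtype=float)

# Recode the reversed item so that high values mean high satisfaction
data[:, 2] = 6 - data[:, 2]

# Corrected item-total (item-rest) correlation for each item
for j in range(data.shape[1]):
    rest = np.delete(data, j, axis=1).sum(axis=1)   # sum of the remaining items
    r_item_rest = np.corrcoef(data[:, j], rest)[0, 1]
    print(f"item {j + 1}: item-rest r = {r_item_rest:.2f}")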
1 Recommendation
Arjun Kumar Shrestha
Kathmandu University School of Management
Hi Michal
While going back to the data entry, I found some errors. After correcting the errors and removing outliers, the value of Cronbach's alpha increased to .77. However, when I subjected these data to CFA using LISREL, I didn't get values for the fit indices; rather, the output showed "The Model is Saturated, the Fit is Perfect". I am wondering how to interpret this. Is something wrong with my analysis?
I have attached the output herewith. 
Holger Steinmetz
Universität Trier
Hi Arjun,
you should read a bit about identification in CFA/SEM models. Briefly, models are identified if there exist at least as many observed elements (i.e., covariances and variances of the observed variables) as estimated parameters. If there are more observed elements than parameters, the model is overidentified and can be tested. Your model is "just identified" - that means you have six observed elements (3 variances and 3 covariances) and also 6 parameters (3 error variances and 3 loadings; the latent variance is fixed to one).
Hence, you cannot test the model, as the df are 0. Enlarge the model by incorporating other observed/latent variables; then you will get a fit. As you have already moved into the CFA world, you can easily calculate the composite reliability, which is better than alpha as it does not require equal loadings; see
Fornell, C., & Larcker, D. F. (1981). Evaluating structural equation models with unobservable variables and measurement error. Journal of Marketing Research, 18, 39–50.
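A small counting sketch of this degrees-of-freedom arithmetic (the parameter list is the one given above for a single-factor, three-indicator model with the latent variance fixed to 1):

# Degrees of freedom of a CFA model: observed moments minus free parameters
p = 3                                # observed indicators
observed_moments = p * (p + 1) // 2  # variances + covariances = 6
free_parameters = 3 + 3              # 3 loadings + 3 error variances (latent variance fixed to 1)
df = observed_moments - free_parameters
print(df)                            # 0 -> just identified; the "perfect fit" is true by construction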
Best,
Holger
3 Recommendations
David E Drehmer
DePaul University
Arjun:  Let me try to answer your question.  The following formula specifies the relationship between validity and reliability.
     Rxy <= Sqrt(Rxx)
where Rxy is the validity of the relationship between your independent and dependent variable and Rxx is the correlation of your measure with itself, i.e., the reliability.
Stated very simply, reliability is a prerequisite for validity. Reliability is a necessary condition for validity. Conversely, satisfactory validity means that the reliability was good enough.
The point is that when you do your analysis, if you find a relationship between your IV and your DV, then the reliability of your predictor was sufficient (to detect the relationship). If you do not detect the relationship, then the reliability of your measure may be one of many reasons.
As a sidelight, one can solve the Spearman-Brown formula for the reliability of a single item, given the reliability of the measure and the total number of items. In your case, the reliability of a single item would be estimated to be 0.3823. If we round that off to .4, a very simple formula can be derived to predict the reliability of a test when only the number of items is known. Assuming the reliability of a single item is 0.4, the reliability of a test consisting of N parallel items is given by the formula
     Rxx = N / (1.5+N). 
Since the estimated reliability of your items is just a little bit under 0.4, the estimate provided by this expression will be biased on the high side. Nevertheless, it may be useful as a quick way to get a crude estimate of the reliability. For example, this formula predicts the following reliabilities for particular numbers of parallel items (a short script reproducing these numbers appears after the table):
Items Reliability
1      0.400
2      0.571
3      0.667
4      0.727
5      0.769
6      0.800
7      0.824
8      0.842
9      0.857
10    0.870
11    0.880
12    0.889
13    0.897
14    0.903
15    0.909
16    0.914
17    0.919
18    0.923
19    0.927
20   0.930
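A short sketch (a hypothetical helper, not part of the original post) that reproduces these numbers via the Spearman-Brown formula, and also the validity ceiling Rxy <= Sqrt(Rxx) mentioned above:

import math

def spearman_brown(r_single, n_items):
    # Projected reliability of a test of n_items parallel items,
    # given the reliability of a single item
    return n_items * r_single / (1 + (n_items - 1) * r_single)

# Single-item reliability implied by alpha = .65 on 3 items (Spearman-Brown solved backwards)
alpha, k = 0.65, 3
r_single = alpha / (k - (k - 1) * alpha)
print(round(r_single, 3))                       # ~0.382, matching the 0.3823 above

# Projected reliabilities for longer scales (matches the table, with r_single rounded to .4)
for n in (1, 3, 5, 10, 20):
    print(n, round(spearman_brown(0.4, n), 3))

# Upper bound on validity: Rxy <= sqrt(Rxx)
print(round(math.sqrt(alpha), 2))               # ~0.81 with alpha = .65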
I hope this helps.
1 Recommendation
Frederick Dayour
SD Dombo University of Business and Integrated Development Studies
The stronger the Cronbach's alpha coefficient, the better the inter-correlation between items and constructs. However, different books give different cut-off points for it, which should be considered.
1 Recommendation
Arjun Kumar Shrestha
Kathmandu University School of Management
Hi David
I'm not clear about this. Would you please suggest books or articles?
Larisa Nikitina
University of Malaya
     I agree with those who stated that it is arbitrary to demand that the Cronbach's alpha value exceed .70. Acceptable values depend on the nature of the study (e.g., in an exploratory study it is OK if Cronbach's alpha is lower than .70 - see Hair et al., 2006) and also on the academic discipline (e.g., I have noticed that in education and business studies values above .70 are preferred). As some researchers here have commented, the length of the scale also affects the value, and a 3-item scale with a Cronbach's alpha of .70 would be quite OK for me (if I were reviewing a paper).
     As to some literature on this topic, which you requested, Nunnally (1967) maintained that in theoretical studies even modest reliabilities of .60 or .50 may be acceptable. In agreement, Hair et al. (2006) proposed that although the "generally agreed" lower limit for the Cronbach's alpha value is .70, it may decrease to .60 and still be acceptable, especially in exploratory studies and in research in the social sciences. Furthermore, Aron and Aron (1999) proposed that in psychological research a Cronbach's alpha of .60 or even lower could be adequate; however, values exceeding .7 are preferable (Aron & Aron, 1999).
References
Aron, A., & Aron, E. (1999). Statistics for psychology (2 ed.). Upper Saddle River, NJ: Prentice Hall. 
Hair, J. F., Black, W. C., Babin, B. J., Anderson, R. E., & Tatham, R. L. (2006). Multivariate data analysis. Prentice Hall Pearson Education.
Nunnally, J. C. (1967). Psychometric theory. New York: McGraw-Hill.
Best luck with your research!
3 Recommendations
Richard Windsor
George Washington University
I believe that a Cronbach's alpha < 0.70 is an inadequate measure of any "construct" (#1). Calling a project an "exploratory study" does not eliminate or reduce the salience of measurement principles and standards. Multiple published meta-analyses (MA) of the validity and reliability of psycho-social scales consistently confirm that > 75% DO NOT meet acceptable standards of measurement. A critical issue, defined by the MAs and raised in this discussion, is the typical lack of representativeness and the inadequacy of the samples (small sample size/no power) presented by most social and behavioral studies (SBS). It is an illusion that a "3-question scale", based on the responses of a small (< 100 subjects), non-representative (convenience) sample, with an alpha = 0.60, produces any true measure of any psycho-social construct.
Based on 35 years of NIH-DHHS funded behavioral research as a Principal Investigator, I fully recognize that it is very challenging to measure a "construct" for which there is no "objective criterion measure". However, defining and applying such low standards of measurement to document the status of a person in a defined population at risk eliminates the opportunity to make a serious contribution to the SBS evidence base. The basic biological/health sciences have created valid measures of "health status" by setting the highest standards of measurement validity. Social and behavioral studies need to define, and far more consistently apply, much higher standards to produce valid measures of anything.
#1: Windsor, R. (MS, PhD, MPH), Evaluation of Health Promotion-Disease Prevention Programs: Improving Population Health Through Evidence-Based Practices, 5th Edition, Oxford University Press, New York, April 2015
(See Chapter 4: Measurement and Analyses in Evaluation)
1 Recommendation
Larisa Nikitina
University of Malaya
It is a fact that psychology and behavioural science research mostly (at least for now) relies on small populations, which is well accepted and well justified (Kaplan, 2000; Kassin, Fein, & Markus, 2011; Taylor, Peplau, & Sears, 2003; Whitley & Kite, 2013). Besides, there are laboratory experiments that simply cannot include a statistically representative population.
Each method and each statistical procedure has its own caveats and weaknesses. There are also wider epistemological issues. To claim that one has achieved a "true measure" with a Cronbach's alpha of .60 is a folly. As it would be with a Cronbach's alpha of .89.
1 Recommendation
Bruno Campello de Souza
Federal University of Pernambuco
The basic point to be addressed in this discussion is the meaning of the Cronbach's alpha score, and that has to do with reliability, which, in turn, is really a measure of how precisely you are measuring a construct. In a sense, one can interpret it as an estimate of how much "signal", as opposed to "noise", is in the measurement. A value of zero means that there is only "noise", whereas a value of one would indicate that a perfect assessment of the "signal" is being made.
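In classical test theory terms (a plain restatement of this point, under the usual assumptions), reliability is the share of "signal" in the observed variance:

reliability = true-score ("signal") variance / (true-score variance + error ("noise") variance)

So, reading alpha as a reliability estimate, roughly (1 - alpha) of the observed variance is noise: about 35% for an alpha of .65 and about 30% for an alpha of .70.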
That being the case, the question becomes one of how much "noise" one is willing to put up with, something which, of course, is quite subjective. In hypothesis testing there is a similar issue regarding the probability of the null hypothesis, which was solved by means of a consensus over an arbitrary value of p<.05. However, there is no such formal and explicit consensus that I know of regarding Cronbach Alpha scores.
Perhaps as important as the alpha score one finds is how useful the construct being measured is shown to be. If one obtains a relatively low score for a construct, but then shows that this construct is still statistically associated with relevant dependent variables, it can be argued that, in spite of the high level of "noise" involved in the measurement, said construct has a robust enough relationship to the dependent variable so as to still be observable. On the other hand, if a low level of reliability is found and there are no significant associations with relevant variables, one cannot say whether this happened because such a relationship does not exist or simply because there is too much noise in the measurement.
Therefore, maybe the best way to deal with this would be to:
A) Accept any constructs with a Cronbach Alpha score of .60 or more, also accepting any associations, or lack thereof, with them;
B) Accept constructs with a lower Cronbach Alpha score, but still, say, of .50 or more, if they are shown to be statistically associated to relevant variables.
C) Reject any constructs with a Cronbach's alpha score of less than .60 if they are not shown to be statistically associated with relevant variables.
Of course, the hard part is to get the publishers and reviewers of scientific journals and seminars on board with this. Regardless, it is what makes sense to me.
1 Recommendation
Dear Sir,
Cronbach’s Alpha is a commonly employed index of test reliability.
Nunnally (1967) defined reliability as "the extent to which [measurements] are repeatable and that any random influence which tends to make measurements different from occasion to occasion is a source of measurement error" (p. 206). Source Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and applications. Journal of applied psychology, 78(1), 98.
Alpha coefficient ranges in value from 0 to 1 and may be used to describe the reliability of factors extracted from dichotomous (that is, questions with two possible answers) and/or multi-point formatted questionnaires or scales (i.e., rating scale: 1 = poor, 5 = excellent).
The higher the score, the more reliable the generated scale is.
Nunnally (1978) indicated 0.7 to be an acceptable reliability coefficient, but lower thresholds are sometimes used in the literature. Source: Santos, J. R. A. (1999). Cronbach's alpha: A tool for assessing the reliability of scales. Journal of Extension, 37(2), 1-5.
Nunnally changed his reliability recommendations between the 1967 and 1978 editions of Psychometric Theory. In 1967, he recommended that the minimally acceptable reliability for preliminary research should be in the range of .5 to .6, whereas in 1978 he increased the recommended level to .7 (without explanation). Source: Peterson, R. A. (1994). A meta-analysis of Cronbach's coefficient alpha. Journal of Consumer Research, 21(2), 381-391.
See Table 1, "Selected Recommended Reliability Levels", on page 382 of Peterson (1994).
The coefficient alphas in that meta-analysis ranged from .06 to .99, with a mean of .77 and a median of .79 (Peterson, 1994).
Reliability is concerned with the ability of an instrument to measure consistently.
For example, if a test has a reliability of 0.70, there is 0.51 error variance (random error) in the scores (0.70×0.70 = 0.49; 1.00 – 0.49 = 0.51). As the estimate of reliability increases, the fraction of a test score that is attributable to error will decrease. Source Tavakol, M., & Dennick, R. (2011). Making sense of Cronbach's alpha. International journal of medical education, 2, 53.
Regards,
Chalamalla.Srinivas
1 Recommendation
Frederick Dayour
SD Dombo University of Business and Integrated Development Studies
I think 0.7 or more is often recommended as the best index for reliability, albeit some studies use 0.6 or more.
1 Recommendation
Richard Windsor
George Washington University
A Cronbach's alpha value of < 0.70 implies more than 50% error by the calculation cited above. How to produce reliability coefficients of > 0.80, both internal consistency and test-retest, has been established for > 50 years. I agree with Dr Srinivas about item r values.
I strongly believe that the "Nunnally" recommendation of r > 0.70 should be increased to > 0.80. Why would you want to produce a set of questions (not a scale?) with an r < 0.80? Ask yourself: do you really think you are measuring any construct with 3-4 questions with such a high level of error? And where is the discussion about "validity"?
As a US-NIH Principal Investigator and Professor in academic health sciences centers for 40+ years, I see this as one of many continuing reasons why the "social sciences (SS)" are held in such low esteem by the biological and basic sciences.
Ali B. Mahmoud
St. John's University
Some authors have considered alpha values of .6 or above acceptable for a multi-item measure to be internally consistent.
1 Recommendation
David E Drehmer
DePaul University
There is nothing mandatory about a reliability of .7 to test hypotheses.  The real issue in hypothesis testing of a relationship is validity.  Reliability of your measurement instruments, whether IV or DV, creates an upper bound on validity, however.  The relationship between validity and reliability can be expressed as follows:
                rxy <= sqrt(rxx)
If you find the relationship, then you have validity and the reliability was sufficient. The only situation that comes to mind where reliability estimates are an issue is when you fail to find the relationship. In that case you may not know whether no relationship exists or whether it simply failed to be detected due to low reliability.
A much more modern treatment is provided by Item Response Theory, specifically Rasch models. The Rasch model gets at the conceptual issue of reliability by examining the ability of the scale to separate persons and/or items. Much more information is available through Rasch analyses than in classical test theory. You might find it useful to consult one of the texts below or to explore http://www.winsteps.com. The Engelhard book does a really nice job of comparing test score theory and scaling theory as applied to measurement. Boone et al. will walk you through the construction and interpretation of graded response items, such as Likert-format items, in building and testing invariant scales. The two Wright books provide a very accessible introduction to Rasch models in the dichotomous case, in the rating scale case, and in other multiple-response graded-item cases.
Hope this is helpful,
David Drehmer
References:
Boone, W. J., Staver, J. R., & Yale, M. S. (2014). Rasch analysis in the human sciences. Springer.
Engelhard, G. (2012). Invariant measurement: Using Rasch models in the social, behavioral, and health sciences. New York, N.Y. : Psychology Press
Wright, B. D., & Masters, G. N. (1982). Rating Scale Analysis. Chicago: MESA Press.
Wright, B. D., & Stone, M. H. (1979). Best Test Design. Chicago: MESA Press.
14 Recommendations
Bruno Campello de Souza
Federal University of Pernambuco
It would seem that the basic point is what you want to achieve. For the goal of having a psychometric instrument that is precise and trustworthy enough for clinical use (or something analogous to that in a non-psychological context), it would probably be best to go for higher values of Cronbach's alpha (say, .70 or .80+). However, if the aim is to measure a trait with enough accuracy to establish the existence of a relationship with other traits, for research purposes (e.g., to test a model or to find results to guide the drafting of one), then you could very well accept lower values (something like .60 or even less) IF an acceptable correlation occurs. On the other hand, if the alpha score is low AND you have no significant correlation with other variables of interest, then you have no way of determining whether the absence of the correlation was due to the lack of an association or simply due to measurement error.
Bottom line is that, as always, the saying from George Edward Pelham Box applies:
Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful.
One might also remember another important quote from Box:
Since all models are wrong the scientist cannot obtain a "correct" one by excessive elaboration. On the contrary following William of Occam he should seek an economical description of natural phenomena. Just as the ability to devise simple but evocative models is the signature of the great scientist so overelaboration and overparameterization is often the mark of mediocrity.
Basically, you go with what can be shown to work.
4 Recommendations
Bruno Campello de Souza
Federal University of Pernambuco
Adding to what I said before, it is also important to distinguish between agreeing with an author such as Nunnally, Hair et al, or whomever that might be and blindly accepting said author's statements based only on his/her personal authority. One has to have in mind what is being done mathematically and logically, plus the scientific implications.
A completely separate matter is adherence to a strong consensus when there is one, as is the case with p < .05. Here, a community of scientists has agreed upon some standard, usually based on convenience and practical use, and one has to comply in order for one's contributions to be accepted by said community. It seems to me that a Cronbach's alpha score of .70 is not that strong a consensus, as one could infer from the discussion we are all having on the subject right now.
2 Recommendations
Pedro Callado
University of Lisbon
Also, remember that a very high alpha might indicate that the variables are measuring exactly the same question. If there is a correlation between different variables (i.e., variables significantly different in their content and formulation), I don't see a problem with an alpha below .7.
1 Recommendation
Z. A. Al-Hemyari
University of Nizwa
Yes, and please refer to:
1. Landau, S. and Everitt, B. S.(2004). A Handbook of Statistical analyses using SPSS.
2. Howitt, D. and Cramer, D. (2008). Introduction to SPSS.
Regards,
Zuhair
Shahah ALTAMMAR
Public Authority for Applied Education and Training
"There is no sacred level of acceptable or unacceptable level of alpha. In some cases, measures with (by conventional standards) low levels of alpha may still be quite useful" Ref: Use and Abuses of Coefficient Alpha, Schmitt 1996
2 Recommendations
Laura Hämmäinen
University of Turku
Would anyone have a source to cite regarding the acceptability of a >.6 alpha?
3 Recommendations
Holger Steinmetz
Universität Trier
Hi Laura,
unfortunately, even the .70-"threshold" seems to be a myth, see
Lance, C. E., Butts, M. M., & Michels, L. C. (2006). The sources of four commonly reported cutoff criteria: What did they really say? Organizational Research Methods, 9(2), 202-220. doi:10.1177/1094428105284919
Having said that, most examples of low alpha simply result from divergences from its basic assumption (essential tau-equivalence, which means - in factor parlance - that a) the items reflect the same latent variable and b) their loadings are equal).
If "a" holds but not "b", computing the composite reliability may be a better solution; it gives a more realistic and often substantially higher reliability estimate than alpha. Most cases of low alpha, however, are a result of not even "a" being true - that is, the scale is a mess of different things lumped together.
Hence, check the factor structure of the scale.
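A minimal sketch of the composite reliability (omega) computation referred to here, assuming you already have standardized loadings from a fitted one-factor model (the loadings below are invented):

# Composite reliability (omega) from standardized loadings of a one-factor model
loadings = [0.80, 0.70, 0.55]                 # invented standardized loadings
error_vars = [1 - l ** 2 for l in loadings]   # residual variances under standardization

omega = sum(loadings) ** 2 / (sum(loadings) ** 2 + sum(error_vars))
print(round(omega, 2))                        # ~0.73 for these loadings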
Best,
Holger
2 Recommendations
Robert Trevethan
Independent author and researcher
I suspect that some notions about Cronbach's alphas fall into the category of myths. One of these is that the higher an alpha value, the better. A high alpha value can be quite meaningless if it is based on a large number of items, say, more than 20, yet researchers frequently cite high alpha values as if those values are somehow great achievements or reassurances (reassurances of who knows what).
It would appear that many researchers cite Cronbach's alpha values simply because many other researchers do so. In my view, this is a practice, based primarily on momentum, that urgently requires re-examination, particularly because many alpha values are likely to be meaningless.
If alpha values are very low, depending on the number of items in scales, they could indicate an undesirable lack of association among individual items - so they can serve a useful purpose.
I notice that a reference to the following article has already been provided above. Here is the full reference.
Schmitt, N. (1996). Uses and abuses of coefficient alpha. Psychological Assessment, 8, 350–353. http://dx.doi.org/10.1037/1040-3590.8.4.350
I strongly recommend it.
The following references might also be helpful:
Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and applications. Journal of Applied Psychology, 78, 98–104. http://dx.doi.org/10.1037/0021-9010.78.1.98
Sijtsma, K. (2009). On the use, the misuse, and the very limited usefulness of Cronbach’s alpha. Psychometrika, 74, 107–120.
Both authors provide revealing insights about the nature of alpha as well as helpful background information.
3 Recommendations
Abdulwakil Olawale Saba
Lagos State University
Very beneficial responses. Thanks all.
If your alpha is too low (< .70), you then delete items until you get to about .70.
Srikant Manchiraju
Florida State University
You can proceed ahead with testing study hypotheses. See also, Nunnally (1968/69) for other alpha values.
1 Recommendation
Veloo Doraisamy
Universiti Utara Malaysia
I am studying safety performance in manufacturing. There are 7 variables, and the alpha is lower than 0.6 for two of them. However, I have reverse-coded some of the items (negatives) and removed outliers based on the box plot. If I remove some of the negative items, then I can definitely achieve greater than 0.7, but I do not intend to remove them. How do I go about achieving an alpha greater than 0.7? The required sample is 360; however, I managed to get 465.
Najam Sahar
University of Malaya
Having 0.7 is good reliability, but there is disagreement about considering it the gold standard. A number of factors can affect the reliability value of a newly developed scale, such as the number of items in the instrument, the spread of response options, and also the theoretical model you are using to operationally define the construct. In psychological measurement you can find some references supporting < .7 reliability. Here I am sharing one reference for your convenience; I hope it is helpful.
Shrout, P. E. (1998). Measurement reliability and agreement in psychiatry. Statistical Methods in Medical Research, 7, 301–317.
Holger Steinmetz
Universität Trier
Veloo,
could you post the exact question wordings of the items? The No. 1 reason for low internal consistency measures is lack of homogeneity, not lack of reliability. Only under very rigid circumstances (all items measure the same factor, and with equal strength) is alpha a reasonable measure of reliability. Especially the latter (equal strength/loadings) is seldom the case. Hence, OMIT calculating alpha and calculate alternatives instead, like omega (if homogeneity/unidimensionality holds).
McNeish, D. (2018). Thanks coefficient alpha, we’ll take it from here. Psychological Methods, 23(3), 412.
Deng, L., & Chan, W. (2017). Testing the difference between reliability coefficients alpha and omega. Educational and Psychological Measurement, 77(2), 185-203.
Best,
Holger
3 Recommendations
John-Kåre Vederhus
Sørlandet Hospital
This picture says more than a thousand words.
Best
John-Kåre
Adeel Luqman
Shenzhen University
It is actually a lack of consistency among items, not a lack of reliability. A generally accepted rule is that an α of 0.6-0.7 indicates an acceptable level of reliability, and 0.8 or greater a very good level. However, values higher than 0.95 are not necessarily good, since they might be an indication of redundancy.
1 Recommendation
Veloo Doraisamy
Universiti Utara Malaysia
Mr. Holger, those questions are adopted from a current journal. Some items are reverse-coded (negative). If I drop those negative items, my alpha goes as high as 0.78 to 0.85. Are the negative questions redundant?
Holger Steinmetz
Universität Trier
Veloo, this is not informative.
I asked for the exact posting of the question wordings.
Beyond that, if you have reverse-coded items, you (of course) have to recode them in such a way that high numerical values correspond with the dimension of interest. For instance, assume I want to measure job satisfaction, and I have two items:
a) I am satisfied with my job (with categories 1 ("disagree") to 5 ("agree"))
b) I hate my job (reversed; again with categories 1 ("disagree") to 5 ("agree"))
Of course, both will be strongly negatively correlated. In a set of other items *of the same job satisfaction dimension*, this will push alpha down. Hence, you would recode the second item so that it becomes
"I hate my job" (1 (AGREE) to 5 (DISAGREE)).
Such a recode would lead to a positive correlation, because for both items a high numerical value of, say, 5 would correspond to high satisfaction and high "non-hate" (= satisfaction).
Whether both items are really measures of the same underlying job satisfaction (that is, whether "high non-hatred" is a reflection of high satisfaction) is a totally different animal and concerns validity, not reliability. I doubt it, because the absence of a negative feeling does not imply the presence of a positive feeling. Consequently, factor models containing both forms of items almost never fit. Yes, mixing differently worded versions reduces the chance of stepping into the acquiescence-bias trap, but that is curing a skin rash by cutting off your leg.
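A small sketch of the recoding step described above (5-point scale; the responses and array names are invented):

import numpy as np

satisfied = np.array([5, 4, 2, 5, 3])   # "I am satisfied with my job", 1-5
hate = np.array([1, 2, 4, 2, 3])        # "I hate my job", 1-5 (reverse-keyed)

print(np.corrcoef(satisfied, hate)[0, 1])          # strongly negative before recoding
hate_recoded = 6 - hate                            # 1 <-> 5, 2 <-> 4, 3 stays 3
print(np.corrcoef(satisfied, hate_recoded)[0, 1])  # strongly positive after recoding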
Best,
Holger
M. Nazmul Islam
BRAC University
A Cronbach's alpha value of 0.7 and above is better. However, an alpha value of less than 0.7 is also acceptable (Hair et al., 2006) when you are measuring a latent variable with three items and the LVs are correlated. Thank you.
Robert Trevethan
Independent author and researcher
Ping Chen, in case you don't hear back in relation to your query, the following might be helpful:
Hair, J. F., Jr., Black, W. C., Babin, B. J., & Anderson, R. E. (Eds.). (2014). Multivariate data analysis (7th ed.). Harlow, UK: Pearson.
There might be a more recent edition.
Robert Trevethan
Independent author and researcher
Ping Chen, you're welcome.
In case you're interested in some of the ins and outs of coefficient alpha (which is what it is more appropriately called than Cronbach's alpha), the following might help.
Cho, E. (2016). Making reliability reliable: A systematic approach to reliability coefficients. Organizational Research Methods, 19(4), 651–682. https://doi.org/10.1177/1094428116656239
Cho, E., & Kim, S. (2015). Cronbach's coefficient alpha: Well known but poorly understood. Organizational Research Methods, 18(2), 207–230. https://doi.org/10.1177/1094428114555994
Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and applications. Journal of Applied Psychology, 78(1), 98–104. https://doi.org/10.1037/0021-9010.78.1.98
Schmitt, N. (1996). Uses and abuses of coefficient alpha. Psychological Assessment, 8(4), 350–353. https://doi.org/10.1037//1040-3590.8.4.350
Sijtsma, K. (2009). On the use, the misuse, and the very limited usefulness of Cronbach’s alpha. Psychometrika, 74(1), 107–120. https://doi.org/10.1007/s11336-008-9101-0 [There are commentaries additional to this article in the same issue of the journal.]
Taber, K. S. (2018). The use of Cronbach’s alpha when developing and reporting research instruments in science education. Research in Science Education, 48(6), 1273–1296. https://doi.org/10.1007/s11165-016-9602-2
There is more to alpha than meets the eye, and there are a lot of misconceptions concerning it. In my view, it is worshipped much more than it should be - particularly by researchers who, like sheep, just do what everyone else is doing. As a result, naivety and misconceptions are consolidated and perpetuated.
All the best with your research!
1 Recommendation
Holger Steinmetz
Universität Trier
Hi Ping,
I will hit the same note as Robert and add a few things:
1) I looked up the famous Hair et al. reference everyone seems to adore like a religious manifesto. First, the authors *report* that .7 is usually seen as acceptable. Thus, this is a description of usual perception and practice, not a recommendation (all the more so, as they present no arguments). Second, while I don't doubt the authors' competency, I was really puzzled by their definition of reliability as the "degree of consistency between multiple measurements of a variable" (2014, p. 123). Then they say that one form is re-test reliability. It is the other way around: reliability is the ability to measure the thing of interest twice and to get the same result, of which re-test reliability is the most direct operationalization. Internal consistency matches this form IFF (if and only if) certain rigid and unrealistic assumptions are met (convergent measurement and equal factor loadings, called "essential tau-equivalence").
2) The .70 threshold is a rule of thumb like so many others, and there is no justification for it. I know we need decision criteria, but this does not mean we should blindly follow them. Rather, you should cultivate a mindset that reflects the nonsensical nature of these rules. This holds for alpha, loadings, multicollinearity, and p-values alike. The fact is: if an independent variable is unreliable (i.e., has random measurement error), the effect of this variable in a regression or path model will be underestimated, with the bias increasing steadily as reliability decreases. There is no magical border where problems start.
3) If the dependent (instead of the independent) variable is unreliable, the unstandardized estimate will NOT be biased, as the random error becomes part of the error term. However, the standard error of the effect will increase, thus reducing the power of the estimate and enlarging the confidence interval. If you are interested in standardized estimates, these will be lowered, as the variance of the DV gets larger (due to the added error variance), and this reduces the standardized estimate, since Beta = B * SD_IV / SD_DV.
4) The differentiation between the role of a certain alpha in exploratory vs. confirmatory research is nonsensical. If an explored regression model fails due to lack of reliability, then the research area being explored will probably die. Why is this less relevant than failed hypothesis tests that have a better founding and will probably be repeated or followed up anyway (given the resistance of people/researchers to giving up beloved ideas)?
5) A latent variable model is often treated as a remedy for problems of unreliability, but this is only the case if the model is correctly specified. Hence, the precondition for estimating anything (even composite "reliability") is that the model is correct, and a super-duper-clean fit shows that. That's one reason why the combination of "ignoring misfit" and "relying on the AVE" is nonsensical: if the model does not fit, estimation of all the quantities on which the AVE rests is useless.
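A tiny simulation sketch of points 2 and 3 (all numbers are invented; it only illustrates how random measurement error in the independent variable attenuates a regression slope):

import numpy as np

rng = np.random.default_rng(1)
n, true_slope = 5000, 0.5

x_true = rng.normal(size=n)
y = true_slope * x_true + rng.normal(scale=0.5, size=n)

# Add random measurement error to the independent variable (reliability ~ 1 / (1 + 0.5**2) = .8)
x_obs = x_true + rng.normal(scale=0.5, size=n)

b_clean = np.polyfit(x_true, y, 1)[0]
b_noisy = np.polyfit(x_obs, y, 1)[0]
print(round(b_clean, 2), round(b_noisy, 2))   # the slope on the noisy IV is attenuated toward zero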
HTH
--Holger
5 Recommendations
Ahmed Ali Jaleel
Management and Science University
Dear Ping Chen
Here is a useful article based on empirical findings on the question of whether there is a particular level of alpha that is desired or adequate.
An empirical analysis of alleged misunderstandings of coefficient alpha
R. Hoekstra, J. Vugteveen, M. J. Warrens & P. M. Kruyen
To cite this article: R. Hoekstra, J. Vugteveen, M. J. Warrens & P. M. Kruyen (2019) An empirical analysis of alleged misunderstandings of coefficient alpha, International Journal of Social Research Methodology, 22:4, 351-364, DOI: 10.1080/13645579.2018.1547523
Sincere best wishes
2 Recommendations
Z. A. Al-Hemyari
University of Nizwa
Dear Arjun
I would like to mention the following points:
1. The reliability measures could be developed for the whole questionnaire or for each dimension of the questionnaire, and the acceptable value should be greater than or equal 0.7.
2. Reliability consists of several measures:
Item alpha reliability and split-half reliability assess the internal consistency of the items in a questionnaire – that is, do the items tend to be measuring much the same thing?
Split-half reliability in SPSS refers to the correlation between scores based on the first half of the items you list for inclusion and the second half of the items. This correlation can be adjusted statistically to match the original questionnaire length.
Coefficient alpha is merely the average of all possible split-half reliabilities for the questionnaire and so may be preferred, as it does not depend on how the items are ordered. Coefficient alpha can be used as a means of shortening a questionnaire while maintaining or improving its internal reliability. (A small split-half computation sketch follows the references below.)
Inter-rater reliability (here assessed by kappa) is essentially a measure of agreement between the ratings of two different raters. Thus it is particularly useful for assessing codings or ratings by 'experts' of aspects of open-ended data; in other words, the quantification of qualitative data. It involves the extent of exact agreement between raters on their ratings compared with the agreement that would be expected by chance. Note then that it is different from the correlation between raters, which does not require exact agreement to achieve high correlations but merely that the ratings agree relatively for both raters.
3. For more details, please refer to
1. Cronbach, L.J. (2004), ‘My current thoughts on coefficient alpha and successor procedures’, Educational and Psychological Measurement, Vol. 64, No. 3, pp.391-418.
2. Howitt and Cramer (2008). Introduction to SPSS, Pages 249-258.
3. Z. A. Al-Hemyari and A. M. Al-Sarmi (2016). Validity and Reliability of Students and Academic Staff’s Surveys to Improve Higher Education. Educational Alternatives, Journal of International Scientific Publications, Vol.14, pp. 242-263
4.A. M. Al-Sarmi and Z. A. Al-Hemyari (2014). Quantitative and qualitative indicators to assess
the performance of higher education institutions. Int. J. of Information and Decision sciences,
Vol.6, No. 4, pp.369-392 (Inderscience).
5. Z. A. Al-Hemyari and A. M. Al-Sarmi (2014). Statistical characteristics of performance indicators. Int. J. of Quality and Innovation. Vol.2, No.3-4, pp. 385-309 (Inderscience).
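As a small illustration of the split-half idea in point 2 above (the data are invented, and the odd/even split is just one of many possible splits):

import numpy as np

items = np.array([
    [4, 5, 4, 3, 4, 5],
    [2, 2, 3, 2, 1, 2],
    [5, 4, 4, 5, 5, 4],
    [3, 3, 2, 3, 4, 3],
    [1, 2, 2, 1, 2, 1],
], dtype=float)                       # invented responses to a 6-item questionnaire

half1 = items[:, 0::2].sum(axis=1)    # odd-numbered items
half2 = items[:, 1::2].sum(axis=1)    # even-numbered items
r_halves = np.corrcoef(half1, half2)[0, 1]

# Spearman-Brown adjustment back to the full questionnaire length
split_half_reliability = 2 * r_halves / (1 + r_halves)
print(round(split_half_reliability, 2))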
Regards,
Zuhair
Z. A. Al-Hemyari
University of Nizwa
Holger Steinmetz, please note that my answer was addressed to the main question asked by Arjun Kumar Shrestha.
Holger Steinmetz
Universität Trier
Yes, that was my point. You directly answered the (years-old) question of Arjun without considering the discussion. This is why threads go back and forth for years and turn in circles. If this sounds rude, it is not meant so!
All the best,
--Holger
1 Recommendation
Robert Trevethan
Independent author and researcher
Z. A. Al-Hemyari, please permit me to comment on your post about four posts above here.
I begin by noting that within your first point, you mentioned calculating coefficient alpha across a whole questionnaire (by which I suspect you mean scale). This can often be quite misguided. It is quite possible to have a high alpha value despite subscales within a scale being highly unrelated to each other, simply because of the large number of items involved. So calculating alpha under those circumstances is meaningless and can be very deceptive.
Also within your first point, I think it's important to realise that the alpha value of .70 being acceptable is often a misleading rule of thumb. Under some circumstances (e.g., when there are only a few items) a lower alpha value might be acceptable, and often an alpha above .90 can be produced from a large number of items (say, more than 25), many of which are not much related to each other at all.
Within your second point, you have dealt with interrater reliability, which is unrelated to the context in which coefficient alpha is used.
Like Holger Steinmetz, I do not want to be rude, but I think it's important to avoid either misleading or confusing people who are seeking authoritative information about a topic.
Z. A. Al-Hemyari
University of Nizwa
Holger Steinmetz Thanks for your comment (not your answer) ... so logical!!
Z. A. Al-Hemyari
University of Nizwa
Robert Trevethan please let me mention the following:
Thank you for reminding me of some well-known information on Cronbach's alpha. I hope that the topic does not become a debate, and it should not be taken up for the purpose of argument.
My answer was simple and clear, to a simple and clear question; it is based on my experience and on some of the references given below, and it will surely benefit any reader. In addition, I don't think that there is a need to expand on the topic.
1. Bendermacher, Nol (2010) "Beyond Alpha: Lower Bounds for the Reliability of Tests," Journal of Modern Applied Statistical Methods:
Vol. 9 : Iss. 1 , Article 11.
2. Douglas G. Bonett and Thomas A. Wright (2014). Cronbach’s alpha reliability: Interval estimation, hypothesis testing, and sample size planning. Journal of Organizational Behavior, J. Organiz. Behav.
3. Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297-334.
4. Lee J. Cronbach, Richard J. Shavelson (2004). My Current Thoughts on Coefficient Alpha and Successor Procedures. https://doi.org/10.1177/0013164404266386
5. Keith S. Taber (2016). The Use of Cronbach’s Alpha When Developing and Reporting Research Instruments in Science Education. Res Sci Educ.
6. Osburn, H.G. (2000). Coefficient alpha and related internal consistency reliability coefficients. Psychological Methods 5(3):343-55.
.....
Omid Mahdieh
University of Zanjan
I agree with Robert Trevethan.
Taha Anwar Taha
Gifted School in ALAnbar
0.70 or above
Holger Steinmetz
Universität Trier
Please: If you post, read the damned thread first
1 Recommendation
Binod Kumar Singh
University of Petroleum & Energy Studies
0.6 and above is OK.
There are also reports that consider 0.65 acceptable. A generally accepted rule is that an α of 0.6-0.7 indicates an acceptable level of reliability, and 0.8 or greater a very good level. However, values higher than 0.95 are not necessarily good, since they might be an indication of redundancy (Addinsoft, 2021).
ref: XLSTAT | Statistical Software for Excel
2 Recommendations
Holger Steinmetz
Universität Trier
I would strive for .80-.90, PRESUMING that a preceding factor model fitted in a CFA has shown decent fit with a non-significant chi-square and EQUAL factor loadings.
If the factor loadings are not equal, use Omega. If the factor model is inappropriate, AVOID internal consistency estimates and rely on the good old test-retest reliability.
Using thresholds and rules of thumb is --as almost always-- nonsensical:
a) If the independent variable has non-perfect reliability, decreasing alphas will continuously increase the bias of your regression estimate. THERE IS NO MAGICAL BORDER. It all depends on your decision as to which bias is acceptable for you.
b) If the dependent variable has non-perfect reliability, decreasing alphas will NOT bias the unstandardized estimate but will increase the standard errors. Hence you lose power. The standardized estimates will be affected, as they result from the standard deviations of the IV and DV. Hence, again, the size of alpha to strive for will depend on your N and your goals with respect to standardization.
Of course, all of this does not mean that one should strive for high alphas at all costs. As I said: the validity of the factor model is essential.
Marián Čvirik
University of Economics in Bratislava
Cronbach's alpha is not suitable for hypothesis testing. It only gives us the internal consistency of the individual statements. Interpretively, it can be used mainly for the reliability of the research tool. In scientific practice, a value above 0.700 is stated to be acceptable, but the number of statements must be taken into account.
1 Recommendation
Many methodologists recommend a minimum alpha coefficient between 0.65 and 0.8 (or higher in many cases); 0.65 is acceptable but not good or excellent. Addinsoft, 2021
ref: XLSTAT | Statistical Software for Excel
4 Recommendations
Asad Abbas
Tecnológico de Monterrey
Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297-334. doi:10.1007/bf02310555
2 Recommendations
