Question
Asked 4th Jan, 2022

What statistical tool (data analysis method) should I use when I would like to see the relationship between a yes-no and a Likert scale variable?

Hi! I'm a fourth-year college student. This is my first time doing quantitative research with nominal and ordinal data.
I would like to ask for your help and/or advice regarding the statistical tool I should use to examine the relationship between students' technology access, measured by 21 statements answerable with yes or no, and student attitudes, measured by statements rated on a Likert scale.
In addition, I would like to ask for any ideas on how I could interpret the results or their relationship, given that both variables (technology access and student attitude) have three indicators each.
Thank you so much.

Most recent answer

Qijia Liao
University of Liverpool
If your dependent variable is a binary (yes/no) question, choose binary logistic regression; if your dependent variable is continuous (or an ordinal variable treated as continuous), use a linear regression model.
Next, for your independent variables: if the data are nominal, you can convert yes to 1 and no to 0 (dummy coding); for the remaining IVs with ordinal data (a 1-5 scale), you don't need to do anything extra. For the ordinal data, however, you should first check data quality with a reliability analysis, e.g. the SPSS reliability test (rule of thumb: Cronbach's alpha greater than 0.7).
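To make the regression route concrete, here is a minimal sketch in Python (numpy only) of fitting a binary logistic regression by Newton-Raphson, treating a yes/no access item as the DV and a 1-5 Likert score as the IV. All data are simulated, and the variable names and the effect size are assumptions for illustration, not part of the original question.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Simulated data: a 1-5 Likert attitude score (IV) and a yes/no
# access item (DV) generated with a mild positive relationship.
attitude = rng.integers(1, 6, size=n).astype(float)
p_true = 1 / (1 + np.exp(-(-1.5 + 0.5 * attitude)))
access = rng.binomial(1, p_true)

# Fit logistic regression by Newton-Raphson (the log-likelihood is concave).
X = np.column_stack([np.ones(n), attitude])   # intercept + predictor
beta = np.zeros(2)
for _ in range(25):
    mu = 1 / (1 + np.exp(-X @ beta))          # fitted probabilities
    grad = X.T @ (access - mu)                # score vector
    hess = X.T @ (X * (mu * (1 - mu))[:, None])
    beta += np.linalg.solve(hess, grad)

print(f"intercept = {beta[0]:.2f}, slope = {beta[1]:.2f}")
```

The slope is on the log-odds scale, so exp(slope) is the odds ratio for answering yes associated with a one-point increase in attitude.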

All Answers (24)

Sal Mangiafico
Rutgers, The State University of New Jersey
Are you combining the 21 yes-no questions into a single number? Or looking at each question separately? ... Are you combining the Likert-type items into a single number (scale)?
David L Morgan
Portland State University
I agree with Salvatore S. Mangiafico that whether you are combining your items into a scale is an important question. At a minimum, you could assess the feasibility of combining the Likert-scored items into a scale by using coefficient alpha.
Note, however, that alpha does not apply to binary items. In that case, you could appeal to "face validity" to argue that these items make up a score on something like "technology usage."
D. Eastern Kang Sim
University of California, San Diego
If you are interested in response patterns, consider latent class (profile) analysis. It might yield deeper insight into how participants fall into heterogeneous subgroups.
Ronán Michael Conroy
Royal College of Surgeons in Ireland
David L Morgan – I've heard this before, that alpha does not apply to binary items. In fact, the KR20 coefficient, which was developed for binary items, is mathematically equivalent to alpha.
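This equivalence is easy to verify numerically. The sketch below (Python; the 0/1 responses are simulated, so the numbers are only illustrative) computes Cronbach's alpha using population item variances and KR-20 from the item proportions; for binary items the two coincide, because a 0/1 item's variance is exactly p(1 - p).

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical 0/1 responses: 100 respondents x 21 yes/no items,
# all driven by one latent trait plus noise.
latent = rng.normal(size=(100, 1))
items = (latent + rng.normal(size=(100, 21)) > 0).astype(float)

k = items.shape[1]
total = items.sum(axis=1)

# Cronbach's alpha with population (ddof=0) item variances
alpha = k / (k - 1) * (1 - items.var(axis=0).sum() / total.var())

# KR-20 replaces each item variance with p*(1-p), which for a
# 0/1 item is the same quantity, so the results match exactly.
p = items.mean(axis=0)
kr20 = k / (k - 1) * (1 - (p * (1 - p)).sum() / total.var())

print(f"alpha = {alpha:.6f}, KR-20 = {kr20:.6f}")
```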
David L Morgan
Portland State University
Ronán Michael Conroy thank you for that information
Mohialdeen Alotumi
Sana'a University
For correlating your ordinal scale (i.e., student attitude) with your binary scale (i.e., technology access), you could report the Spearman correlation coefficient. The following might be of interest.
Chalmers, R. P. (2018). On misconceptions and the limited usefulness of ordinal alpha. Educational and Psychological Measurement, 78(6), 1056–1071. https://doi.org/10.1177/0013164417727036
de Winter, J. C. F., Gosling, S. D., & Potter, J. (2016). Comparing the Pearson and Spearman correlation coefficients across distributions and sample sizes: A tutorial using simulations and empirical data. Psychological Methods, 21(3), 273–290. https://doi.org/10.1037/met0000079
Good luck,
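As a quick sketch of that suggestion (Python with scipy; the data are simulated and the variable names are mine, not from the original survey), dummy-code the yes/no item as 1/0 and feed both variables to a Spearman correlation:

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(2)
n = 120
access = rng.binomial(1, 0.6, size=n)   # yes/no item dummy-coded as 1/0
# Simulated 1-5 attitude score, nudged upward for the "yes" group
attitude = np.clip(rng.integers(1, 6, size=n) + access, 1, 5)

rho, pval = spearmanr(access, attitude)
print(f"Spearman rho = {rho:.3f}, p = {pval:.4g}")
```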
Sal Mangiafico
Rutgers, The State University of New Jersey
Mohialdeen Alotumi , I don't know whether this applies to the original question, but on the topic of measuring the association between a dichotomous variable and an ordinal variable, I would recommend the Glass rank biserial correlation (Rg) over measures like the Spearman or Kendall correlation.
One simple reason is that Rg is designed for this purpose, whereas correlation is typically used to, well, determine the correlation between two continuous or ordinal variables.
I think this can cause some confusion when using correlation for the effect size of two groups. If we have values for two groups, say, A and B, the natural inclination is to find the correlation between A and B. Whereas to use correlation as the effect size for the difference between the groups, we would need to find the correlation between the combined values of A and B, and the numeric equivalent of the two groups.
Rg is also directly related to the probability that an observation in one group is larger than an observation in the other group. (Compare Cliff's delta, Vargha and Delaney's A, and the common language effect size). So it's actually quite easy to interpret.
Finally, using correlation in this manner returns a result that has a sign opposite of the usual sign of effect size statistics and similar statistics. Typically, if the second group has larger values than the first group, the statistic is negative. You can see this with the t statistic, z statistic, and signed effect size statistics like Cohen's d. Results for Rg should be in accord with this convention, whereas correlation will return the opposite of this convention.
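Since Rg, Cliff's delta, and Vargha and Delaney's A all summarize the same pairwise comparisons between the two groups, the links between them can be checked directly. A minimal simulation (Python; the attitude scores for the two access groups are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)
# Made-up attitude scores for the "no access" and "yes access" groups
no_access = rng.integers(1, 6, size=40)
yes_access = rng.integers(2, 6, size=50)   # shifted upward

# Count every (yes, no) pair: greater, smaller, tied
diff = yes_access[:, None] - no_access[None, :]
greater = int((diff > 0).sum())
less = int((diff < 0).sum())
ties = int((diff == 0).sum())
n_pairs = diff.size

# Cliff's delta; numerically the same as the rank biserial in the
# two-group case. A is the probability-of-superiority version.
delta = (greater - less) / n_pairs
A = (greater + 0.5 * ties) / n_pairs     # Vargha-Delaney A
# The two are exact linear transforms of each other: delta = 2A - 1
print(f"delta = {delta:.3f}, A = {A:.3f}")
```

A is directly interpretable: it is the probability that a randomly chosen "yes" student scores higher than a randomly chosen "no" student, counting ties as half.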
Muhammad Zia Aslam
Superior University
Elaine Robledo I think you can simply combine Student Attitude, take it as a continuous variable, and perform a comparison of means (t-test) across your Yes/No groups. The results might support the common perception that those who have access to technology show better attitudes toward goal achievement. Tq.
Ronán Michael Conroy
Royal College of Surgeons in Ireland
Muhammad Zia Aslam thinks that "you can simply combine Student Attitude and take it as a continuous variable". I don't. You have no reason to believe that the attitudes form a unidimensional scale. (You may have hopes – we all have – but beliefs require data.)
If you want to explore the structure of student attitudes ( I have a fondness for exploring the structure of questionnaires) I recommend Mokken scaling, which is a nonparametric procedure for building one or more unidimensional scales from a pool of items. I'm afraid that the best way of doing it is using R, but the R package concerned – called mokken – is really easy to use and has splendid documentation.
1 Recommendation
David L Morgan
Portland State University
Elaine Robledo There are many ways to assess whether a set of items form a scale. The classic approach is to begin with coefficient alpha, which is a conservative test of whether all the items measure the same underlying construct (i.e., they are highly inter-correlated).
Ronán Michael Conroy
Royal College of Surgeons in Ireland
David L Morgan 's advocacy of alpha must be taken with a caveat. Alpha is the average of all possible split-half correlations. For that reason, alpha can be high when the items are made up of several uncorrelated unidimensional scales. It does not guarantee unidimensionality, and, indeed, assumes unidimensionality. In the case of a scale that measures several constructs, the interpretation of alpha is problematic.
So alpha is like a friend of mine, who claimed that he knew nothing about good music, but could instantly recognise bad music. A low alpha is a useful indicator that your items lack internal consistency, but a high alpha is not an indicator that your items are a meaningful scale.
1 Recommendation
David L Morgan
Portland State University
I agree that alpha is not a guarantee of unidimensionality, but the way I see it, the separate unidimensional scales would have to be highly correlated with each other before they could generate a high value of alpha (I personally consider .8 the best cut-off for alpha, but most journals will accept .7).
1 Recommendation
Ronán Michael Conroy
Royal College of Surgeons in Ireland
This isn't true. Imagine a scale made up of equal numbers of items from two completely uncorrelated scales.
Now imagine a split-half reliability. Each of the halves will contain a number of items from each of the subscales – in fact, only one possible split half will produce a correlation of zero because it will separate the two sets of items perfectly. You can see the problem! The scores on each of the halves will tend to correlate well. See
Huysamen, G.K., 2006. Coefficient alpha: Unnecessarily ambiguous; unduly ubiquitous. SA Journal of Industrial Psychology, 32(4)
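A quick simulation illustrates the point: ten items built from two completely uncorrelated factors, with no cross-loadings, still yield a high alpha. (Sketch in Python; the loadings and noise level are arbitrary choices, not from any real instrument.)

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
f1 = rng.normal(size=(n, 1))            # factor 1
f2 = rng.normal(size=(n, 1))            # factor 2, independent of f1
# Five items load on each factor; there are no cross-loadings,
# so the item pool is clearly two-dimensional.
items = np.hstack([
    f1 + 0.5 * rng.normal(size=(n, 5)),
    f2 + 0.5 * rng.normal(size=(n, 5)),
])

k = items.shape[1]
alpha = k / (k - 1) * (1 - items.var(axis=0).sum() / items.sum(axis=1).var())
print(f"alpha = {alpha:.3f}")  # high, despite the two uncorrelated factors
```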
As for the threshold values of alpha, they too are folklore. Researchers frequently invoke the authority of Nunnally (Nunnally & Bernstein 1994) to justify an alpha of 0.7 or more as indicating an acceptable level of scale reliability. As Lance points out, Nunnally simply didn't say this (Lance et al. 2006). And it is worth quoting what Nunnally did say:
"In the early stages of research… one saves time and energy by working with instruments that have only modest reliability, for which purpose reliabilities of ·70 or higher will suffice… In contrast to the standards in basic research, in many applied settings a reliability of ·80 is not nearly high enough… In many applied problems, a great deal hinges on the exact score made by a person on a test… In such instances it is frightening to think that any measurement error is permitted. Even with a reliability of ·90, the standard error of measurement is almost one-third as large as the standard deviation of the test scores."
Lance, C.E., Butts, M.M. & Michels, L.C., 2006. The Sources of Four Commonly Reported Cutoff Criteria: What Did They Really Say? Organizational Research Methods, 9(2), pp.202–220.
1 Recommendation
Muhammad Zia Aslam
Superior University
Respected Prof. Ronán Michael Conroy , I really commend your statistically rigorous approach to the issue, BUT how would a fourth-year degree student digest and apply the Mokken scaling procedure using an R package for what is possibly an end-of-semester research assignment? As a commonly accepted measure of the reliability of an existing scale for a latent variable, I still think the alpha coefficient would be good enough to move forward to the main objectives of the study. Tq.
1 Recommendation
David Eugene Booth
Kent State University
Cut to the chase and use logistic regression with the Likert variable as your IV. Follow David L Morgan on scale construction. Good luck, David Booth
2 Recommendations
Ronán Michael Conroy
Royal College of Surgeons in Ireland
David Eugene Booth – thank you for introducing a little clarity into what had become a pretty arcane discussion! And thank you, Muhammad Zia Aslam for pointing out that this poor student has probably enough to do without getting involved in Mokken scaling!
Elaine Robledo
University of Southeastern Philippines
Sal Mangiafico Sorry for the late response.
I would like to look at each question (binary item) separately, since this is also for profiling purposes, e.g. what percentage of the students have smartphones, computers... use only mobile data, or have internet access at home, etc.
I would then like to look into whether these have an effect on the overall attitude of the students regarding online learning (measured by the Likert scale).
I don't have much experience or knowledge in doing stats, so I am hoping that you could suggest a simple analysis method for this kind of situation, or for handling these kinds of data.
Thank you.
Elaine Robledo
University of Southeastern Philippines
Thank you so much for your inputs and suggestions — David L Morgan , D. Eastern Kang Sim , Ronán Michael Conroy , Mohialdeen Alotumi , Muhammad Zia Aslam , Oluwaseyi Ayorinde Mohammed , and David Eugene Booth . I would look into these data analysis methods and try to get back to you if I have found the data analysis method that I would use or if I have further questions.
Again, thank you so much. You are all a big help to the success of our thesis.
Sal Mangiafico
Rutgers, The State University of New Jersey
Elaine Robledo , your question isn't entirely clear to me, but I'll try to make some comments.
  1. Obviously for each of the yes/no questions you can calculate and report the percentage of "yes" answers, e.g. percentage of people who use a computer, smartphone, and so on. It is usually a good idea to report data like this, almost as if it were demographic data.
  2. How you analyze the connection between the yes/no questions and the Likert-type items depends on whether you will combine the Likert-type items into a single scale or treat them individually.
  3. With a binary independent variable and an ordinal or continuous dependent variable, a Wilcoxon-Mann-Whitney test will work well for a hypothesis test, and any of Cliff's delta, Vargha and Delaney's A, or the Glass rank biserial correlation will work as an effect size statistic. (In the end, these three measure the same thing.) ... The tests and effect size statistics here all assess whether an observation in one group is likely to be greater than an observation in the other group. They don't usually address means or medians.
  4. If you are interested in means or medians, there are tests and effect size statistics that may be applicable.
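For point 3 above, here is a minimal sketch (Python with scipy; the scores are simulated and the group labels are hypothetical) that runs the Wilcoxon-Mann-Whitney test and recovers Vargha and Delaney's A and the rank-biserial correlation from the U statistic:

```python
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(5)
# Simulated attitude-scale scores, split by a yes/no access item
yes_group = rng.integers(2, 6, size=120)   # shifted upward
no_group = rng.integers(1, 5, size=100)

u, pval = mannwhitneyu(yes_group, no_group, alternative="two-sided")

# Effect sizes recovered from U: A = P(yes > no) + 0.5 * P(tie)
n1, n2 = len(yes_group), len(no_group)
A = u / (n1 * n2)                # Vargha-Delaney A
r_rb = 2 * A - 1                 # rank biserial (= Cliff's delta)
print(f"U = {u:.0f}, p = {pval:.4g}, A = {A:.3f}, r_rb = {r_rb:.3f}")
```

Here A reads directly as the probability that a randomly chosen student from the yes group scores higher than one from the no group, counting ties as half.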
Maamir Abdellatif
University of Oran 2 Mohamed Ben Ahmed
The best method is SEM (structural equation modelling), which depends on how the constructs underlying the study variables are interpreted.
Bachir Abdelhamid
Université Mohamed Chérif Messaadia de Souk-Ahras
Hello everybody, I think a t-test.
