Sex Differences in Intelligence: Developmental Origin Yes, Jensen Effect No

September 2017
Mankind Quarterly 58(1):101-108

DOI:10.46469/mq.2017.58.1.8

MANKIND QUARTERLY 2017 58:1 101-108

Gerhard Meisenberg*

Ross University Medical School, Dominica

*Address for correspondence: Gmeisenberg@rossu.edu

Richard Lynn’s developmental theory of sex differences in

intelligence is evaluated using the administration of the Armed

Services Vocational Aptitude Battery in the NLSY79. Score

increases between age 15 and age 23 are found to be greater in

males than in females, supporting an essential element of the theory.

On the other hand, neither the sex differences themselves nor their

developmental changes are related in any consistent way to the g

loadings of the subtests. Therefore sex differences should not be

conceptualized as differences in “general” intelligence (g).

Key Words: ASVAB, NLSY, Intelligence, Sex differences, g

loadings, Development

The theory outlined by Richard Lynn in his target article makes two important

testable assumptions. First, it proposes that there are cognitive sex differences

that can be conceptualized meaningfully as differences in general intelligence.

The concept is operationalized either as an IQ calculated as the average (or,

more bombastically, a “unit-weighted factor score”) of subtest scores on a

complex test battery such as the Wechsler tests, or as the unrotated first factor

or first principal component from a factor analysis or principal components

analysis on the subtest scores. The second claim is that sex differences are age-

dependent, with minimal and inconsistent differences in childhood and a male

advantage developing gradually from about age 15 or 16. This developmental

trend is assumed to be related to the later timing of puberty in males than females,

which is associated with later and more prolonged male brain maturation as well

as physical maturation. In the following, I will examine these claims using the

1980 administration of the Armed Services Vocational Aptitude Battery (ASVAB)

in the National Longitudinal Survey of Youth 1979 (NLSY79).

Materials

1. The NLSY79 sample

The National Longitudinal Survey of Youth was launched as a prospective

longitudinal survey by the US Department of Labor in 1979. Subjects aged 14-22

years were enrolled. The sample is not entirely representative of the US

population because those from lower socioeconomic backgrounds and some

ethnic/racial minorities were oversampled. However, males and females were

sampled in proportion from each group. The Armed Services Vocational Aptitude

Battery (ASVAB) was administered to the entire cohort in 1980. Complete test

results are available for 5975 males and 5939 females.

2. Properties of the ASVAB

The ASVAB is a vocational aptitude test that is used for screening of

prospective recruits and for assignment to diverse military duties and training

programs in the US armed forces. It is composed of 10 subtests:

1. General Science: Knowledge of physical and biological sciences.

2. Arithmetic Reasoning: Word problems that emphasize reasoning rather than

mathematical knowledge.

3. Word Knowledge: Understanding the meaning of words.

4. Paragraph Comprehension: Understanding the meaning of paragraphs.

5. Numerical Operations: A speed test of mental addition, subtraction,

multiplication and division.

6. Coding Speed: A speed test to match words and numbers.

7. Auto and Shop Info: Knowledge of automobiles, shop practices and use of

tools.

8. Mathematics Knowledge: Knowledge and skills in algebra, geometry and

fractions.

9. Mechanical Comprehension: Understanding of mechanical principles such

as gears, pulleys and hydraulics.

10. Electronics Info: Knowledge of electricity, radio principles and electronics.

These descriptions of the subtests are from Maier & Grafton (1981).

Psychometrically, the ASVAB is a test of crystallized intelligence: acquired

knowledge and skills rather than context-free reasoning ability. As such, it is

closely related to tests of literacy (Marks, 2010).

MEISENBERG, G. DEVELOPMENTAL ORIGIN YES, JENSEN EFFECT NO

3. Scaling of scores

Because scores on all subtests increased with age in an approximately linear

fashion, the subtest raw scores were residualized for age and scaled to the IQ

metric, with a mean of 100 and standard deviation of 15. Principal components

analysis of these age-residualized scaled scores produced an unrotated first

principal component (g factor) accounting for 66.1% of the total variance. This g

factor, scaled to the IQ metric, was used as a measure of general intelligence.
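The scaling step described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used for the NLSY79 data: the scores and ages are hypothetical, the function names `residualize` and `to_iq_metric` are invented for the example, and the use of the population standard deviation is an assumption (with roughly 12,000 cases the choice makes no practical difference). The principal components step is not shown.

```python
import statistics


def residualize(scores, ages):
    """Return the residuals of a simple linear regression of scores on ages."""
    mean_age = statistics.fmean(ages)
    mean_score = statistics.fmean(scores)
    sxx = sum((a - mean_age) ** 2 for a in ages)
    sxy = sum((a - mean_age) * (s - mean_score) for a, s in zip(ages, scores))
    slope = sxy / sxx
    intercept = mean_score - slope * mean_age
    return [s - (intercept + slope * a) for a, s in zip(ages, scores)]


def to_iq_metric(values, mean=100.0, sd=15.0):
    """Rescale values linearly to the requested mean and standard deviation."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)  # assumption: population SD
    return [mean + sd * (v - m) / s for v in values]


# Hypothetical raw subtest scores for eight test takers (not NLSY79 data).
ages = [15, 16, 17, 18, 19, 20, 21, 22]
raw = [12, 15, 14, 18, 17, 21, 20, 24]

iq_scores = to_iq_metric(residualize(raw, ages))
```

After this transformation the scores have a mean of 100, a standard deviation of 15, and no remaining linear association with age, which is what allows subtests to be compared across the age range of the sample.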

Results

1. The (un)importance of g

Table 1 shows the g loadings (correlations with g) of the scaled subtest

scores, and the male and female means and standard deviations on each subtest

and the g factor. Because the sample contains nearly equal numbers of males and

females, the scaling implies that male and female means average out to 100. Because of the

large sample sizes almost all sex differences (d) are statistically significant.

Therefore interpretation of the results should be based on the magnitude of the

differences rather than their statistical significance.

Table 1. g loadings of ASVAB subtests, and sex differences. N = 5,975 males

and 5,939 females. d = standardized sex difference: (♂ mean - ♀ mean) /

averaged standard deviation; ** p<.01; *** p<.001, two-tailed. Δ age trend is the

extent to which males gain more than females per year, expressed on the IQ

scale.

Subtest                g loading   ♂ mean ± SD   ♀ mean ± SD   d          Δ age trend
1. Science             0.887       102.0±15.9    98.0±13.8     0.27***    0.229
2. Arithmetic          0.872       101.5±15.5    98.5±14.3     0.20***    0.354
3. Words               0.891       99.8±15.4     100.2±14.6    -0.03      0.112
4. Comprehension       0.839       98.6±15.4     101.4±14.5    -0.19***   0.294
5. Numerical Ops.      0.737       98.3±14.9     101.7±14.9    -0.23***   0.587
6. Coding              0.673       96.9±14.3     103.1±15.0    -0.42***   0.286
7. Auto & Shop         0.732       106.7±15.6    93.3±10.7     1.02***    1.029
8. Math knowledge      0.833       100.4±15.4    99.6±14.6     0.05**     0.458
9. Mechanical Compr.   0.806       104.7±15.9    95.3±12.3     0.67***    0.707
10. Electronics Info   0.830       104.5±15.7    95.5±12.7     0.63***    0.518
g factor                           101.7±16.0    98.3±13.8     0.23***    0.549
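The d values in Table 1 can be recomputed from the tabled means and standard deviations. The sketch below assumes that the "averaged standard deviation" in the caption is the arithmetic mean of the male and female standard deviations; this reading reproduces the tabled values to two decimals, but it is an interpretation of the caption. The function name `standardized_difference` is introduced only for this illustration.

```python
def standardized_difference(male_mean, male_sd, female_mean, female_sd):
    """d = (male mean - female mean) / mean of the two standard deviations."""
    averaged_sd = (male_sd + female_sd) / 2
    return (male_mean - female_mean) / averaged_sd


# (male mean, male SD, female mean, female SD), copied from Table 1.
table1 = {
    "Science": (102.0, 15.9, 98.0, 13.8),
    "Arithmetic": (101.5, 15.5, 98.5, 14.3),
    "Words": (99.8, 15.4, 100.2, 14.6),
    "Comprehension": (98.6, 15.4, 101.4, 14.5),
    "Numerical Ops.": (98.3, 14.9, 101.7, 14.9),
    "Coding": (96.9, 14.3, 103.1, 15.0),
    "Auto & Shop": (106.7, 15.6, 93.3, 10.7),
    "Math knowledge": (100.4, 15.4, 99.6, 14.6),
    "Mechanical Compr.": (104.7, 15.9, 95.3, 12.3),
    "Electronics Info": (104.5, 15.7, 95.5, 12.7),
    "g factor": (101.7, 16.0, 98.3, 13.8),
}

d_values = {
    name: round(standardized_difference(*row), 2) for name, row in table1.items()
}

for name, d in d_values.items():
    print(f"{name:<20} d = {d:+.2f}")
```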

The results confirm that males do indeed have higher g than females.

However, we also see that sex differences on 5 of the 10 subtests are larger, and

in some cases far larger, than the differences in g. Average absolute sex

differences are 0.37d (5.6 IQ points) on the subtests, as opposed to 0.23d (3.5

IQ points) on the general factor. This is not expected if the sex differences are

only or even mainly on g.

Another prediction of the hypothesis that sex differences are mainly on

general intelligence is that the sex differences favoring males are larger on those

tests that are the best measures of the general factor, meaning those that

correlate best with g. The actual correlation of the subtest g loadings with the d

values is +.142, which is in the expected direction but far from significant:

essentially, a null result. Inspection of the ASVAB subtests shows the nature of

the sex differences. There are five tests with primarily academic content: Science,

Arithmetic, Words, Comprehension, and Math Knowledge; two tests of

psychomotor speed: Numerical Operations, and Coding; and three tests of

vocational knowledge and skills: Auto & Shop Info, Mechanical Comprehension,

and Electronics Info. Sex differences favor males on the vocational tests, females

on the speed tests, and sex differences are small on the academic tests.
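The comparisons in this section can be approximated from the rounded entries of Table 1. The sketch below recomputes the average absolute d, the correlation between g loadings and d across the ten subtests, and the mean d within each content group. Because the tabled d values are rounded to two decimals, the correlation comes out as a small positive value in the vicinity of the reported +.142 rather than matching it exactly. The helper `pearson_r` and the variable names are introduced only for this illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Subtests 1-10 in the order of Table 1.
g_loadings = [0.887, 0.872, 0.891, 0.839, 0.737, 0.673, 0.732, 0.833, 0.806, 0.830]
d_values = [0.27, 0.20, -0.03, -0.19, -0.23, -0.42, 1.02, 0.05, 0.67, 0.63]

mean_abs_d = sum(abs(d) for d in d_values) / len(d_values)
r_all = pearson_r(g_loadings, d_values)

# Zero-based positions of the subtests in each content group.
groups = {
    "academic": [0, 1, 2, 3, 7],
    "speed": [4, 5],
    "vocational": [6, 8, 9],
}
mean_d = {
    name: sum(d_values[i] for i in idx) / len(idx) for name, idx in groups.items()
}

print(f"mean |d| = {mean_abs_d:.2f}, r(g loading, d) = {r_all:+.3f}")
for name, value in mean_d.items():
    print(f"{name:<11} mean d = {value:+.2f}")
```

The subset analyses reported in the text cannot be reproduced this way, because there the g factor is extracted anew from the reduced battery and the g loadings therefore differ from those in Table 1.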

Because it can be argued that the vocational subtests are related to specific

experiences and knowledge that men are more exposed to than women, let’s see

what happens to the sex differences when these three tests are omitted. In that

case, the remaining 7 subtests produce a g factor on which females outscore

males by 0.8 IQ points. However, this time the correlation between g loadings and

sex differences is +.693, which comes close to conventional statistical

significance (p=.084). This suggests that males tend to do better on highly g-

loaded tests, and females do better on tests with lower g loadings.

However, we can also argue that psychomotor speed is conceptually

different from intelligence. In dual-processing theories of cognition, quick

responses require automatic processing while intelligence is a property of a slow

processing system (Evans, 2008). What happens to sex differences when the two

speed tests are removed but all others are retained? As expected, the male

advantage on the g factor extracted from the remaining eight subtests is

enhanced: from 3.4 points in the complete ASVAB to 5.2 points when the speed

tests are deleted. In addition, the sign of the correlation between g loadings and

sex differences reverses, to -.447. Thus the answer to the question of whether

tests with higher g loadings favor males or females depends very much on the

composition of the test battery.

2. Changes with age

Let us now examine the developmental trajectory that is proposed by Lynn’s

theory. Table 2 shows how sex differences on the general factor, extracted from

all 10 subtests, change with age. There is no sex difference at age 15, but males

pull ahead of females as they get older. At age 20 and beyond they outscore

females by almost one third of a standard deviation, or 5 IQ points.

Table 2. Sex differences on the general factor extracted from the ASVAB

subtests, by age. d = standardized sex difference: (♂ mean - ♀ mean) / averaged

standard deviation.

Age   ♂ mean ± SD   ♂ N   ♀ mean ± SD   ♀ N   d
15    99.7±14.6     488   100.3±12.7    431   -0.04
16    101.6±15.2    784   98.5±12.6     725   0.22
17    102.0±16.1    750   98.8±13.0     752   0.22
18    100.8±16.0    705   99.0±13.3     721   0.12
19    100.4±16.7    761   97.5±14.0     757   0.19
20    102.4±16.3    741   97.5±14.4     813   0.32
21    103.4±16.1    751   99.0±14.2     774   0.29
22    102.0±16.2    765   97.2±14.9     798   0.31
23    102.9±15.2    230   96.7±14.2     168   0.42

We saw before that, ignoring age, the pattern of sex differences on the

subtests shows no consistent relationship with the subtests’ g loadings. It is

nevertheless possible that, for example, prenatal androgen action creates the

strengths and weaknesses of the sexes on specific subtests while continued brain

development after the age of 15 years creates an omnibus male advantage that

is strongest on tests with higher g loadings. To test whether the greater male than

female improvement in test performance after age 15 is related to the subtests’ g

loadings, simple regressions were performed predicting subtest score with age,

separately for males and females. The unstandardized B coefficients were

recorded for each regression, and the female B coefficient was subtracted from

the male B coefficient. This difference score is taken as the difference in score

gains between males and females, expressed as IQ points gained or lost per

year.
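The difference score described above can be sketched as follows. The (age, score) pairs are hypothetical and serve only to show the computation; the function name `ols_slope` is likewise introduced for this illustration and is not taken from the original analysis.

```python
def ols_slope(ages, scores):
    """Unstandardized B coefficient from a simple regression of score on age."""
    n = len(ages)
    mean_age = sum(ages) / n
    mean_score = sum(scores) / n
    sxy = sum((a - mean_age) * (s - mean_score) for a, s in zip(ages, scores))
    sxx = sum((a - mean_age) ** 2 for a in ages)
    return sxy / sxx


# Hypothetical scaled scores at five ages (not NLSY79 data).
ages = [15, 17, 19, 21, 23]
male_scores = [99.0, 100.5, 101.0, 102.5, 103.0]
female_scores = [100.0, 100.0, 100.5, 100.5, 101.0]

b_male = ols_slope(ages, male_scores)      # IQ points gained per year, males
b_female = ols_slope(ages, female_scores)  # IQ points gained per year, females

# Positive values mean that males gain more per year than females.
delta_age_trend = b_male - b_female
print(f"B male = {b_male:.3f}, B female = {b_female:.3f}, "
      f"difference = {delta_age_trend:.3f}")
```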

The last column in Table 1 shows the results. On each subtest and the

common factor, the signs are positive, indicating that the increase in performance

with rising age is greater in males than in females. The extent of this sex

difference is smallest for Word Knowledge, where it amounts to 0.112 IQ points

per year. This means that between the ages of 15 and 23 years, males gain 0.112

* 8 = 0.896 points relative to females. At the other extreme, male gains on Auto &

Shop Info exceed female gains by as much as 8.232 points. In other words,

between these ages males and females acquire new word knowledge at similar

rates, but males acquire auto and shop knowledge at much higher rates than do

females. Gains on the other subtests are in between, and between the ages of

15 and 23 years males gain 4.39 IQ points relative to females on the common

factor.
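The cumulative figures quoted here follow from multiplying the yearly difference in the last column of Table 1 by the eight years between ages 15 and 23. A short sketch of that arithmetic, using three of the tabled values (the variable names are chosen only for this illustration):

```python
YEARS = 23 - 15  # span of the age range covered by the sample

# Yearly male-minus-female gain in IQ points, from the last column of Table 1.
delta_age_trend = {
    "Word Knowledge": 0.112,
    "Auto & Shop Info": 1.029,
    "g factor": 0.549,
}

cumulative_gain = {
    name: round(per_year * YEARS, 3) for name, per_year in delta_age_trend.items()
}

for name, gain in cumulative_gain.items():
    print(f"{name:<17} {gain:.3f} IQ points over {YEARS} years")
```

This linear extrapolation rests on the assumption, stated in the section on scaling, that scores increase with age in an approximately linear fashion.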

When the sex difference in score gains in the last column of Table 1 is

correlated with the g loadings of the subtests, we obtain a Pearson’s r of -.492,

which is non-significant. As before, we can exclude the vocational tests and the

speed tests from the analysis. Without the three vocational tests we obtain r =

-.357, and without the two speeded tests we obtain r = -.785. The last of these

correlations is statistically significant at p = .021 with a sample size of 8 tests. The

negative signs of these correlations show that, if anything, the extent to which

score gains of males outpace those of females between the ages of 15 and 23

years tends to be greater on tests with lower g loading. This contradicts the view

that males gain on females in general intelligence during late adolescence.
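The full-battery correlation and the significance levels quoted in this comment can be checked with a short sketch that uses only the rounded values of Table 1. The two-tailed p value is obtained from the t distribution with n - 2 degrees of freedom, here integrated numerically with Simpson's rule so that no statistics library is required. The function names `pearson_r`, `t_density` and `two_tailed_p` are introduced for this illustration. The subset correlations are taken as reported, because the g loadings of the reduced batteries are not tabled.

```python
from math import gamma, pi, sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def t_density(t, df):
    """Probability density of Student's t distribution."""
    norm = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return norm * (1 + t * t / df) ** (-(df + 1) / 2)


def two_tailed_p(r, n, steps=2000):
    """Two-tailed p value for a correlation r based on n observations."""
    df = n - 2
    t = abs(r) * sqrt(df) / sqrt(1 - r * r)
    # Simpson's rule for the area under the density between 0 and |t|.
    h = t / steps
    total = t_density(0.0, df) + t_density(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(i * h, df)
    central_area = total * h / 3
    return 1 - 2 * central_area


# Subtests 1-10 in the order of Table 1.
g_loadings = [0.887, 0.872, 0.891, 0.839, 0.737, 0.673, 0.732, 0.833, 0.806, 0.830]
delta_age_trend = [0.229, 0.354, 0.112, 0.294, 0.587, 0.286, 1.029, 0.458, 0.707, 0.518]

r_trend = pearson_r(g_loadings, delta_age_trend)
print(f"all 10 subtests: r = {r_trend:+.3f}, p = {two_tailed_p(r_trend, 10):.3f}")
print(f"reported r = -.785 with 8 tests: p = {two_tailed_p(-0.785, 8):.3f}")
print(f"reported r = +.693 with 7 tests: p = {two_tailed_p(0.693, 7):.3f}")
```

With only seven to ten subtests as data points, even large correlations have wide margins of error, which is why the sign and the pattern across analyses carry more weight here than any single p value.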

A look at the first and last data columns in Table 1 shows the reasons for the

negative signs obtained in this exercise. We see that the vocational tests are

those on which males gain much faster than females in late adolescence. These

tests have g loadings that are rather low (Auto & Shop Info) or middling

(Mechanical Comprehension, Electronics Info). After excluding the vocational

tests, the low-g speeded tests show somewhat greater male-versus-female gains

than the academic tests; and when the speeded tests are excluded but the

vocational tests are retained, there is a fairly consistent pattern of vocational tests

having larger male-versus-female gains with age while also having somewhat

lower g loadings.

Conclusions

The results presented in this comment illustrate two aspects of Richard

Lynn’s developmental theory of sex differences in intelligence. The first is that sex

differences are small and/or variable up to the age of about 15 years but that

males tend to pull ahead of females after that age. This part of the theory is

supported, as indicated by the d values in Table 2. Even the final magnitude of

the sex difference, of nearly 5 IQ points, agrees well with results from many other

studies compiled by Lynn. Furthermore, there is some generality to the greater

male than female gains between the ages of 15 and 23 years, in the sense that

these are observed on all subtests (last column of Table 1).

At first sight, the results confirm Lynn’s conclusion that in adulthood, males

outscore females by 4 to 5 points in general intelligence. However, a closer look

at the results shows that the male advantage on the ASVAB is due to the

presence of three subtests that concern vocational skills and knowledge. Without

these three tests, the sex difference is virtually zero. Even in the 20-23 years age

group, where males outscore females by 4.8 points on the complete test, they

score only a negligible 0.3 points higher than females when the vocational tests

are omitted. Furthermore, there is no consistent relationship between the g

loadings of subtests and their sex differences. Sex differences do not show a

Jensen effect. Spearman’s hypothesis, which proposes that score differences

between racial and ethnic groups are largest on the most g-loaded tests (Jensen,

1985), does not apply to sex differences. Therefore sex differences cannot be

explained as differences in a general ability factor, but only as differences in

specialized abilities, at least for the range of abilities that are tested with the

ASVAB.

The sex differences in subtest score gains, presented in the last column

of Table 1, likewise do not show a Jensen effect. This sex difference is general only in the

sense that males gain faster than females on all subtests, but it cannot be

conceptualized as g. Specifically, we observe that the extent to which yearly gains

are greater in males than females is most pronounced on the three vocational

tests and on numerical operations (mental arithmetic). This suggests that

accelerated male development in these domains is not only the result of faster

overall brain maturation, which would presumably affect all abilities in proportion

to their g loadings. It is better explained by content-specific factors such as greater

male than female exposure to or interest in tools, engines, gears, hydraulics and

numbers.

On the other hand, we observe that male gains with age exceed those of

females also on the other ASVAB subtests. This indicates that there is a general

component to the sex difference in cognitive trajectories during late adolescence,

although this general component cannot be conceptualized as g. We saw that on

the g factor extracted from all 10 subtests, this difference in developmental

progression accounts for 4.39 IQ points between the ages of 15 and 23 years.

When the g factor is extracted from the seven non-vocational subtests only, this

developmental difference is reduced to 3.13 points. These results suggest that

the true developmental component in Lynn’s developmental theory amounts to

approximately 3 IQ points that males gain on females between the ages of 15 and

23 years, at least on a composite of those abilities that are tested with the ASVAB.

References

Evans, J.S.B.T. (2008). Dual-processing accounts of reasoning, judgment, and social

cognition. Annual Review of Psychology 59: 255-278.

Jensen, A.R. (1985). The nature of the black-white difference on various psychometric

tests: Spearman’s hypothesis. Behavioral and Brain Sciences 8: 193-263.

Maier, M.H. & Grafton, F.C. (1981). Aptitude Composites for ASVAB 8, 9, and 10 (No.

ARI-RR-1308). Army Research Institute for the Behavioral and Social Sciences,

Alexandria VA.

Marks, D.F. (2010). IQ variations across time, race, and nationality: An artifact of

differences in literacy skills. Psychological Reports 106: 643-664.
