Work xx (20xx) x–xx
DOI:10.3233/WOR-192964
IOS Press
Historical perspective on physical
employment standards
Deborah L. Gebhardt∗
Human Resources Research Organization (HumRRO), Alexandria, VA, USA
Received 01 April 2019
Accepted 01 April 2019
Abstract.
BACKGROUND: When one thinks of jobs with physical employment standards, the first thoughts typically center around
firefighting, law enforcement, and military jobs. However, there are 100s of arduous jobs that exist in the public and private
sectors that range from moderately demanding to strenuous. The Bureau of Labor Statistics reported that 28% of the workforce
in the United States performs physically demanding jobs that involve construction, machinery installation and repair, public
safety, and other professions.
OBJECTIVE: This paper provides a historical perspective of physical employment standards for hiring workers into
these arduous jobs, how we arrived at our current knowledge base, and the challenges faced today when determining and
implementing physical employment standards.
METHOD: This narrative review draws on evidence from 62 published sources.
RESULTS: This paper focuses on the need for a multidisciplinary approach to identifying job requirements, the professions
(e.g., medical, psychology, physiology) that underpin the methodologies, and the knowledge used by current researchers.
Descriptions of test and cut score development, legal issues, and challenges for the future also are highlighted.
Keywords: Physical employment standards, physical test
1. Introduction
When one thinks of jobs with physical employ-
ment standards, the first thoughts typically center
around firefighting, law enforcement, and military
jobs. However, there are 100s of arduous jobs that
exist in the public and private sectors that range
from moderately demanding to strenuous. Many jobs
with physical demand have become more complex in
that workers need computer and high-level technical
skills to install, troubleshoot, calibrate, operate, and
repair all types of equipment and monitoring devices.
Although automation has made some jobs less arduous, physical demand
is present for many other jobs. For instance, jobs in the electric and
telecommunications industries require climbing to heights over
30.5 meters (m) above the ground, digging holes in the ground,
crawling in attics, and lifting heavy objects, along with installing
equipment to ensure transmission of electrical signals to residential
and commercial properties.
∗Address for correspondence: Deborah L. Gebhardt, Human
Resources Research Organization (HumRRO), 66 Canal Center
Plaza, Alexandria, VA 22314, USA. Tel.: +1 703 706 5636; E-mail:
dgebhardt@humrro.org.
The Bureau of Labor Statistics reported that 28% of the workforce in
the United States performs physically demanding jobs that involve
construction, machinery installation and repair, public safety, and
other professions. In many instances these are the highest paying jobs
in a geographic location. Figure 1 shows the percent of jobs with
medium and heavy physical demand across industries [1]. Of the jobs in
the construction, installation/maintenance, and transportation
industries, 32.3% to 45.5% have heavy physical demand. Over 50% of the
jobs in the food preparation/serving, building and grounds
maintenance, and production occupations have medium physical demand.
Thus, physically demanding work remains common in the United States.
1051-9815/19/$35.00 © 2019 – IOS Press and the authors. All rights reserved
Corrected Proof
D.L. Gebhardt / Historical perspective on physical standards
Fig. 1. Percentage of civilian jobs requiring different strength levels in selected United States occupations in 2016. Bureau of Labor
Statistics, U.S. Department of Labor. (2017), The Economics Daily: Physical strength required for jobs in different occupations in 2016.
[online], Available: http://www.bls.gov/opub/ted/2017/physical-strength-required-for-jobs-in-different-occupations-in-2016.htm.
The purpose of this paper is to provide a his-
torical perspective related to physical employment
standards, how we arrived at our current knowl-
edge levels, and the challenges faced today when
determining and implementing physical employment
standards. Although many organizations use the term
fitness standards, a more appropriate terminology
is physical employment standards (PES) or physi-
cal performance standards. Use of the word fitness
is related to general physical fitness and is not as
accurate in terms of setting job-related employment
standards.
2. Physical performance assessment –
The early years
Assessment of work and physical performance has
a historical base in the fields of industrial/mechanical
engineering, industrial-organizational psychology,
medicine, applied/exercise physiology, and biomechanics/ergonomics.
During the 1800s, many countries promoted general physical fitness as
preparation for military activities. The fitness centered on a
gymnastics approach with well-known historical figures such as
Friedrich Jahn in Germany with the Turnvereins (German gymnastics),
Franz Nachtegall in Denmark with the Institute of Military Gymnastics,
and Pehr Henrik Ling in Sweden at the Royal Military School [2].
However, these assessments were general in nature and not specific to
job tasks.
Some of the first workplace assessments were
completed in the early 1900s by Frank and Lillian
Gilbreth, who were both mechanical engineers. Frank
Gilbreth initially worked as a bricklaying helper
and observed differences in task performance across
workers [3]. After attaining his engineering degree,
he started his own consulting firm with his wife,
which led to observational and time and motion stud-
ies targeted at improving work performance from an
ergonomic perspective. The Gilbreths evaluated work
in manufacturing and clerical settings and developed
work aids such as vertical scaffolding that allowed
bricklayers immediate access to the bricks. Frank
Gilbreth developed techniques used by armies around
the world to quickly disassemble and reassemble
weapons. He and Lillian also addressed fatigue fac-
tors in the workplace due to inefficient movement
patterns [4]. Their studies improved work productiv-
ity by defining best practices for performing work
tasks, redesigning the workplace, and developing
work aids.
In the early 1900s, railroads in the United States
sought to increase worker efficiency. Frederick Tay-
lor, a contemporary of the Gilbreths, developed
the scientific management approach, which included
time and motion studies [4]. His studies found a pro-
ductivity relationship between time spent under load,
such as lifting and carrying objects, and time spent
at rest. He found that workers could lift and/or carry
pieces of pig iron weighing 41.7 kg for 43% of the day
before they had to revert to lighter pieces. However,
if the pig iron weight was reduced, the worker could
lift 20.9 kg for 58% of the day. Taylor strove for accurate
workplace measurement, a practice that continues today.
For example, grocery and product distribution cen-
ters worldwide are engineered to provide the greatest
efficiency in picking and transporting products to a
truck for delivery to a store. Workers in the distri-
bution centers must achieve a specific percentage of
the center’s engineered standard for productivity per
shift [5].
3. Assessment instrumentation
Several pioneers were responsible for develop-
ing methods to assess physical performance. Dudley
Sargent, a physician, developed the vertical jump
test that is still used today in many contexts [6].
Further, he contended that there needed to be a
means to equitably compare people’s performance
and laid out test criteria. The tests encompassed
measures of strength, speed, and “endurance” that
included elbows to knees (straight leg sit-ups), mod-
ified pull-ups, push-ups, squats, and other tests. To
provide an overall assessment of an individual’s fit-
ness level, Sargent converted the test scores to joules
and summed the joules across tests to classify the
minimum, average, and maximum percent of work
completed during the testing.
Physical measurement continued to evolve with
development of instruments such as the universal
dynamometer. In the late 1800s, Kellogg [7] used
the dynamometer to measure strength deficits in his
patients after orthopedic surgery and for assessment
of infantile paralysis. E. G. Martin [7] expanded
dynamometer usage to testing muscles of the feet,
hips, knees, shoulders, forearms, wrists, fingers, and
thumb, along with identifying the best order for test-
ing. Up until this time, medical doctors were the
primary researchers and inventors of instruments to
measure physical performance.
Static strength testing was taken to another level by
H. Harrison Clarke [8] who used cable tensiometers
to measure strength in a more precise manner. The
cable tensiometer was an adaptation of an instrument
that measured the tension of aircraft control cables.
Using the cable tensiometer, he developed procedures
to measure strength in 38 muscle groups impacted
by orthopedic disabilities in hospitals and Veterans
Administration centers. Clarke expanded his work to
compare the cable tensiometer to other measurement
devices such as a strain gauge or spring scale. Thus,
these developments contributed to the types of instru-
ments we use today for measuring force. Specifically,
most force platforms use strain gauge technology and
dynamometers are now interfaced with software that
records the data instantaneously.
The historical measurement tool for gathering
expired gases to determine aerobic capacity and
other parameters was the Douglas Bag [9]. Gordon
Douglas, a British physiologist and physician, devel-
oped the bag to collect and measure gas respiratory
exchange for medical purposes. Robert Bruce, Bruno
Balke, and others expanded the use of the Douglas
Bag to sport and work settings and standardized pro-
tocols to assess cardiac function and maximal oxygen
consumption [10]. Applied or exercise physiologists
have continued work in the areas of strength and aerobic
assessment and have made great strides in measure-
ment precision.
In the 1950s and 1960s physiologists and psychol-
ogists identified dimensions of physical performance
related to work and sport performance. Psycholo-
gists’ interest in physical performance waned until the
early 1960s when Edwin Fleishman identified a tax-
onomy of physical factors that contributed to job task
performance such as static and dynamic strength [11].
At the same time exercise physiologists such as A.
Jackson [12], J. W. Borchart [13], and T. Baumgartner
and M. Zuidema [14] conducted similar work identi-
fying the physical abilities that were later targeted in
the work setting. Similarly, industrial engineers such
as Stover Snook and others assessed the strength and
aerobic demands of the workplace [15]. Per-Olof
Åstrand, Irma Åstrand, and Karl Rodahl were some
of the first applied physiologists to gather data related
to job task performance in the fishing, steel, and other
industries [16, 17].
In summary, PES research emerged from five dif-
ferent professions: industrial/mechanical engineer-
ing, industrial-organizational psychology, medicine,
applied physiology, and biomechanics/ergonomics.
Researchers in these professions provided the foun-
dation for the current multidisciplinary approaches
used in PES research today.
4. Job analysis – The foundation of physical
employment standards
Defining job requirements was critical to early
researchers and laid the foundation for gathering,
organizing, analyzing, and documenting information
about the workplace. The framework for job analysis
was initially conceptualized by psychologists
Lillian and Frank Gilbreth and Frederick Taylor, who
observed work and wanted to improve efficiency and
productivity in the early 1900s [4]. This approach
was expanded over the years by many industrial-
organizational (I-O) psychologists who developed the
job analysis methods used today to identify physi-
cal job requirements in terms of essential/critical job
tasks, worker requirements, physical abilities, and
ergonomic parameters.
John Flanagan [18] developed the critical incident
technique that involves a set of procedures for
observing and gathering information about a specific
human activity that occurs for a purpose and has
consequences related to a worker’s action or inaction.
Flanagan’s initial research was used to select aviators
during World War II and the Korean War and
was expanded to addressing pilot selection and
classification in relation to aircraft requirements. We use
the critical incident technique today as part of the
job analysis to collect information about job tasks
and the consequences if not performed properly. For
instance, workers will explain the physical demand of
driving railroad spikes and repairing track. However,
it is the interviewer’s responsibility to elicit informa-
tion about the consequences of not performing these
tasks successfully, such as train derailment.
Sidney Fine [19] created a structure for task state-
ments by defining a task as an action or action
sequence designed to contribute to a specified result
within a time period. He described job analysis in
terms of data, people, and things in a hierarchical
manner ranging from simple to complex actions. The
task or task sequence may be primarily physical such
as carrying objects or mental such as analyzing data.
First, the action the worker is performing should be
defined such as lift a carton. Second, one should
include the result of the worker action such as lift
and load cartons onto a truck for delivery. In other
words, use action verbs and define the “to do what”
purpose. An example from the shipbuilding industry
contains Fine’s task structure: Use open/closed end
wrenches and socket sets to tighten and loosen bolts
on machinery and equipment (e.g., pumps) [20].
Edwin Fleishman [11] took another approach and
created a job analysis taxonomy that was ability
oriented. His book, Structure and Measurement
of Physical Fitness [11], was cited worldwide and
formed the basis of a larger taxonomy that included
physical abilities such as static strength, along with
psychomotor (e.g., reaction time), cognitive, and fine
motor abilities. Unlike other researchers who defined
physical abilities during the same time period, Fleishman
generated 7-point Likert scales for each ability
that allow workers to identify a level of ability
demand that corresponds to everyday tasks [21]. For
instance, a moderate level of static strength (4 on a
7-point scale) equates to lifting 18.1 kg. These scale
ratings form the basis for classifying all jobs across
the physical abilities and other abilities (e.g., cogni-
tive, psychomotor) in the U.S. Department of Labor
O*NET system. In 2013 the European Centre for
the Development of Vocational Training (Cedefop)
used the O*NET and European, German, Italian, and
Czech skills and social surveys to generate the Euro-
pean skills, competencies and occupations taxonomy
(ESCO) [22].
Use of O*NET and Cedefop taxonomies provides
an avenue for developing physical employment stan-
dards for multiple jobs within an organization by
grouping jobs with similar demands. Organizations
that institute PES for multiple jobs typically want
the same assessments for these jobs. Use of indi-
vidual tests for each job would be inefficient and
costly. Thus, they opt for an abilities approach and
use the same cut scores for jobs with moderate phys-
ical demand and different cut scores for jobs with
higher demand. For example, a study in the ship-
building industry for over 30 jobs found that workers
lift and carry 9.1–22.7 kg, drag heavy welding and
air lines, and climb ladders and scaffolding multiple
times daily [23]. Creating a master task list across
all jobs as illustrated in Fig. 2 allowed for use of job
specific tasks, while equating tasks with comparable
physical demand. For instance, in Figure 2, if a
job (e.g., Rust Machine Operator-16) has “none”
recorded for a task, that task is not critical for that job.
If a color is recorded, the task is critical to the job. This
approach facilitated use of a physical ability taxonomy
to identify the levels of the abilities for each
job and classify 30+ jobs by physical demand (e.g.,
strength, aerobic capacity).
In summary, job analysis provides the founda-
tion for physical employment standards by defining
the purpose and outcomes of a job, along with
the worker functions, performance techniques, and
equipment used to perform job tasks. Information
may be gathered from incumbents, supervisors, job
standard operating procedures and policies, and train-
ing materials. Although the approach for gathering
job analysis information may vary by profession (e.g.,
I/O psychology, ergonomics), multiple standardized
methods (e.g., interviews, surveys) should be used
to ensure legal defensibility of the physical employment
standards. The published literature outlines a
variety of job analysis techniques ranging from job
observations and questionnaire design to data analysis and
critical task identification. The goal is to identify job
requirements that are critical to successful job performance
and define levels of physical performance,
where feasible.
[Figure 2: matrix of task statements (rows) by job (columns). A cell is
color-coded when the task is critical to the job and marked "none" when
it is not.
Jobs shown: Burner Carbon Arc (1), Cableman (2), Carpenter (3),
Electric Maint (4), Joiner (8), Machin Outside (10), Mech Heavy Duty (19),
Mech Heav Equip Tech (21), Pipefitter (25), Pipewelder (26),
Rust Machine Operator (16), Shipfitter (28), Test & Trials Mech (31),
Welder (32).
Task statements shown:
PUSH/PULL
Pull welding lines/cables to work location for cutting or welding.
Drag air lines/cable to work location for operation of pneumatic tools
or removal of dust/debris.
CLIMB
Climb wooden ladders (6-20 feet) in the shop, bays, or on the ship to
access work area.
Climb vertical metal ladders up to 20 feet in the shop, bays, or on the
ship to access work area.]
Fig. 2. Example of tasks with the same movement pattern and physical demand across jobs but written to reflect specific criteria for a job.
Gebhardt, DL, Baker, TA, Volpe, EK, St. Ville, KA. Job analysis of Huntington Ingalls shipbuilding jobs. Volume 1: Job analysis. Beltsville,
MD: Human Performance Systems, Inc.; 2015.
5. Work-related physical performance
research
The second segment of job analysis involves the
identification of physical work demands by exer-
cise physiologists, biomechanists, and ergonomists.
Much of the continuing work to determine the physical
demand of tasks in a variety of jobs occurred
in Germany, the United Kingdom (UK), and the United
States (U.S.) after World War II. This was partially
due to the war effort and women working in a variety
of male-dominated workplaces (e.g., munitions
plants).
In the 1950s Turner [24] determined the energy
costs of selected light and heavy industrial jobs that
involved working with plastic and hard rubber molds.
The energy expenditures for heavy jobs ranged from
6.0 kilocalories per minute (kcal·min⁻¹) for loading
chemicals into a mixer to 4.6 and 3.6 kcal·min⁻¹ for
straightening lead contact bars and working with hard
rubber molds, respectively. During the same time
frame, British researchers found the energy expenditures
of Scottish coal miners ranged from 3.8 to
7.1 kcal·min⁻¹ for tasks involving use of picks and
shovels [25].
Others, such as Per-Olof and Irma Åstrand, contributed
to this early research. Irma Åstrand et al. [16] determined
the energy output for fishermen using oxygen
uptake and heart rate. The demand of the tasks
ranged from 2.5–5.0 kcal·min⁻¹ for handling lines,
baiting lines, and steering to 10.5–14.4 kcal·min⁻¹
for pulling in nets. This research showed the average
energy expenditure during work on board the ship was
approximately 39% of the fishermen’s VO2max with
some activities reaching 80% of maximum oxygen
uptake.
Table 1
Classification of workloads
                         Oxygen Uptake (liter·min⁻¹)
Light Work               0.0–0.5
Moderate Work            0.5–1.0
Heavy Work               1.0–1.5
Very Heavy Work          1.5–2.0
Extremely Heavy Work     2.0+
Adapted from Åstrand PO, Rodahl K. Textbook of work physiology:
Physiological basis of exercise. New York, NY: McGraw-Hill;
1977. p. 462.
Researchers who conducted early aerobic demand
studies laid the groundwork for identifying physio-
logical work demands. In addition, their investiga-
tions focused on assessing different quantities and
rates of work in relation to an individual’s ability to
safely perform a job without undue fatigue. This led
to classifications of industrial work demand (Table 1)
in the late 1970s [17]. Thus, physiological measures
such as oxygen uptake can be combined with other
job analysis information to accurately classify jobs
by physical demand across a job family or a total
organization.
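As a rough illustration of how the classification in Table 1 can be applied to the energy costs reported above, the following Python sketch converts an energy expenditure in kcal·min⁻¹ to an approximate oxygen uptake and assigns a workload class. Two points are assumptions rather than part of the original studies: the conversion uses the common approximation of about 5 kcal per liter of oxygen consumed, and a value that falls exactly on a class boundary is assigned to the lower class. The function names are hypothetical.

```python
# Sketch only: classify work demand using the Table 1 oxygen uptake bands.
# Assumption: about 5 kcal of energy is expended per liter of oxygen consumed.
KCAL_PER_LITER_O2 = 5.0

# (upper bound in liter/min, label), taken from Table 1.
# Assumption: a value exactly on a boundary falls in the lower class.
WORKLOAD_CLASSES = [
    (0.5, "Light Work"),
    (1.0, "Moderate Work"),
    (1.5, "Heavy Work"),
    (2.0, "Very Heavy Work"),
]


def kcal_to_oxygen_uptake(kcal_per_min):
    """Approximate oxygen uptake (liter/min) for an energy cost in kcal/min."""
    return kcal_per_min / KCAL_PER_LITER_O2


def classify_workload(oxygen_uptake_l_min):
    """Return the Table 1 workload class for an oxygen uptake in liter/min."""
    if oxygen_uptake_l_min < 0:
        raise ValueError("oxygen uptake cannot be negative")
    for upper_bound, label in WORKLOAD_CLASSES:
        if oxygen_uptake_l_min <= upper_bound:
            return label
    return "Extremely Heavy Work"


if __name__ == "__main__":
    # Energy costs quoted in the text (kcal/min).
    examples = {
        "loading chemicals into a mixer": 6.0,
        "working with hard rubber molds": 3.6,
        "pulling in nets (upper value)": 14.4,
    }
    for task, kcal in examples.items():
        vo2 = kcal_to_oxygen_uptake(kcal)
        print(f"{task}: {vo2:.2f} liter/min -> {classify_workload(vo2)}")
```

Because the caloric equivalent of oxygen varies with the fuel mix being metabolized, the resulting class should be read as an estimate and not as a measured value.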
On-the-job injuries triggered much of the initial
work that identified physical workplace requirements
that were costly to both the employer and employee.
Manual materials handling jobs accounted for a high
percentage of low back and other musculoskeletal
injuries. As worker injuries increased, physical
abilities such as strength and coordination, prevalent
in most arduous jobs, became the focus of
studies by industrial engineers, biomechanists, and
ergonomists. In the 1970s, researchers quantified the
forces required to move 4-wheel carts, lift objects,
and perform other push/pull tasks using dynamometers,
load cells, and force platforms to identify the
strength requirements and limitations. They quantified
manual materials handling factors and the impact
on the musculoskeletal system. Snook and Ciriello
[26] developed tables that indicated the maximum
acceptable lifting weight for percentages of several
male and female populations. Table 2 contains val-
ues from the Snook tables that show 75% of women
in industrial jobs can lift objects weighing 10 kg at 2-
minute intervals throughout the work day compared
to 75% of industrial men who lift 19 kg objects at 2-
minute intervals. However, when the object weight
is 25 kg, only 50% of males can lift it every two
minutes from floor level to knuckle height, while
50% of women can lift 12 kg objects at the same
rate and height. Although these tables are very helpful
in changing the workplace requirements to fit
the worker, they cannot be imposed as employment
standards with different requirements for males and
females due to employment statutes and the work-
place requirements.
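The way such tables are read can be sketched in a few lines of Python. The example below holds only the Table 2 values for floor level to knuckle height at one lift every 2 minutes, and returns the largest tabled percentage of a population for whom a given load is acceptable. The function name and data layout are hypothetical, and the sketch does not interpolate between tabled percentages.

```python
# Sketch only: look up Table 2 values for floor-to-knuckle lifts performed
# once every 2 minutes. Keys are population percentages; values are the
# maximum acceptable weight in kg for that percentage.
MAX_ACCEPTABLE_KG = {
    "male": {90: 13, 75: 19, 50: 25, 25: 31, 10: 37},
    "female": {90: 8, 75: 10, 50: 12, 25: 14, 10: 15},
}


def percent_accommodated(sex, load_kg):
    """Largest tabled percentage of the population that can lift load_kg.

    Returns None when the load exceeds every tabled value, meaning fewer
    than 10% of that population would find the load acceptable.
    """
    table = MAX_ACCEPTABLE_KG[sex]
    acceptable = [pct for pct, limit_kg in table.items() if load_kg <= limit_kg]
    return max(acceptable) if acceptable else None


if __name__ == "__main__":
    for sex, load in [("male", 25), ("female", 12), ("female", 25)]:
        print(sex, load, "kg ->", percent_accommodated(sex, load))
```

Applying one load to both tables, as in the last call, shows why a single job requirement can accommodate very different proportions of each group.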
Chaffin and associates investigated the impact of
manual materials handling on low back pain and
determined the magnitude of the compressive force
on the L5/S1 discs (e.g., 650 kg) that was hazardous
when lifting objects [27]. They used this information
in a biomechanical model to identify variations
in load and the locations in relation to the center of
mass that resulted in lower L5/S1 compressive forces
[28]. Figure 3 illustrates the decrease in acceptable
lift weight in relation to vertical and horizontal dis-
tances from the selected body markers (e.g., ankle)
and height off the floor. Chaffin and associates’ work
resulted in a battery of static strength tests for indus-
trial workers that the United States National Institute
for Occupational Safety and Health (NIOSH) published
for use by industry in evaluating workers’
strength capacities [29]. This document also included
lifting guidelines for males and females based on the
height and frequency of a lift and horizontal distance
of the object from center of mass for different lift dis-
tances such as floor to knuckle, knuckle to shoulder,
and shoulder to overhead reach. Ayoub, Garg, and
associates expanded the NIOSH studies by develop-
ing dynamic lifting models that addressed time, force,
and torque, and strength norms for men and women
[30, 31].
Table 2
Example of maximum acceptable weight lifted by various percentages of male and female industrial workers
                                           Floor Level to Knuckle Height      Knuckle Level to Shoulder Height
                                           One lift every                     One lift every
                                           seconds      minutes         hr    seconds      minutes         hr
Sex     Width(a)  Distance(b)  Percent(c)    5   9  14    1   2   5  30   8     5   9  14    1   2   5  30   8
Male    0         75           90            6   7   9   11  13  14  14  17     8  10  12   13  14  14  16  17
                               75            9  11  13   16  19  20  21  24    10  14  16   18  18  19  21  23
                               50           12  15  17   22  25  27  28  32    13  17  20   22  23  24  26  29
                               25           15  18  21   28  31  34  35  41    16  21  24   27  27  28  32  35
                               10           18  22  25   33  37  40  41  48    19  24  28   31  32  33  37  40
Female  0         75           90            5   6   7    7   8   8   9  12     5   6   7    9   9   9  10  12
                               75            7   8   9    9  10  10  11  14     6   7   8   10  11  11  12  14
                               50            8  10  10   11  12  12  13  17     7   8   9   11  12  12  13  16
                               25            9  11  12   13  14  14  15  21     8   9  10   13  14  14  15  18
                               10           11  13  14   14  15  16  17  23     9  10  11   14  15  15  17  20
(a) Width = distance from body in cm. (b) Distance = vertical distance of lift in cm. (c) Percent = industrial population percentage (e.g., males) who
can lift specific weight at a given frequency. (d) Number = weight in kg. Adapted from: Snook SH, Ciriello VM. Maximum weights and
workload acceptable to female workers. Journal of Occupational Medicine. 1974;16(8):527-34. Snook SH, Irvine C, Bass SF. Maximum
weights and workloads acceptable to male industrial workers. American Industrial Hygiene Association Journal. 1970;31:579-86.
Fig. 3. Changes in lifting capacity related to vertical and horizontal
distance from selected body markers. Note. Adapted from Work
practice guide for manual lifting, by National Institute for Occupational
Safety and Health (NIOSH), 1981, p. 75, Copyright 1981
by the U.S. Department of Health and Human Services.
This body of research resulted in the NIOSH
Revised Lifting Equation that used variables such as
object weight, hand position, vertical distance from
the ankle, angle of movement, lift frequency, duration
of lifts, and object coupling to evaluate whether
asymmetrical lifting tasks were within acceptable
ranges [32]. The equation incorporated biomechanical,
physiological, and psychophysical criteria to
determine whether the lift or movement is within
safe parameters. Although use of this equation assists
organizations in limiting weights lifted by workers
and redesigning the workplace, it may not be
viable for use in physical employment standards
because employers cannot always modify the workplace.
They can stipulate that an object weighing over
22.7 kg requires two people, but this is not always
feasible.
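For readers unfamiliar with the Revised Lifting Equation, the Python sketch below shows its multiplicative form in metric units as published by NIOSH: a 23 kg load constant reduced by multipliers for horizontal distance, vertical height, travel distance, and asymmetry angle. The frequency and coupling multipliers come from NIOSH lookup tables that are not reproduced here, so the sketch takes them as caller-supplied parameters; the default of 1.0 is an assumption that represents the most favorable case. The function names and the example lift geometry are hypothetical.

```python
# Sketch only: recommended weight limit (RWL) from the NIOSH Revised Lifting
# Equation in metric form. The frequency (fm) and coupling (cm) multipliers
# come from NIOSH lookup tables and must be supplied by the caller.
LOAD_CONSTANT_KG = 23.0


def recommended_weight_limit(h_cm, v_cm, d_cm, a_deg, fm=1.0, cm=1.0):
    """Return the RWL in kg.

    h_cm: horizontal distance of the hands from the midpoint between the ankles
    v_cm: vertical height of the hands above the floor at the start of the lift
    d_cm: vertical travel distance of the lift
    a_deg: asymmetry angle (twisting) in degrees
    """
    h_cm = max(h_cm, 25.0)  # distances under 25 cm are treated as 25 cm
    d_cm = max(d_cm, 25.0)  # travel under 25 cm is treated as 25 cm
    hm = 25.0 / h_cm                      # horizontal multiplier
    vm = 1.0 - 0.003 * abs(v_cm - 75.0)   # vertical multiplier
    dm = 0.82 + 4.5 / d_cm                # distance multiplier
    am = 1.0 - 0.0032 * a_deg             # asymmetric multiplier
    return LOAD_CONSTANT_KG * hm * vm * dm * am * fm * cm


def lifting_index(load_kg, rwl_kg):
    """Ratio of the actual load to the RWL; values above 1.0 exceed the limit."""
    return load_kg / rwl_kg


if __name__ == "__main__":
    # Hypothetical lift: hands 40 cm out, starting 30 cm above the floor,
    # raised 60 cm, with 30 degrees of twisting.
    rwl = recommended_weight_limit(h_cm=40, v_cm=30, d_cm=60, a_deg=30)
    print(f"RWL = {rwl:.1f} kg, lifting index for 22.7 kg = {lifting_index(22.7, rwl):.2f}")
```

Because every multiplier is at most 1.0, any departure from the ideal lift geometry lowers the recommended limit below 23 kg, which is why the equation is better suited to workplace redesign than to setting an applicant cut score.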
The European Union legislated a Council Direc-
tive in 1990 (90/269/EEC of 29 May 1990) to reduce
the risk of back injuries (fourth individual Directive
within the meaning of Article 16 (1) of Directive
89/391/EEC) and provide minimum health and safety
requirements for manual materials handling [33]. The
directive stipulated that employers shall use mechan-
ical equipment when at all possible to avoid the need
for manual materials handling by workers. The Euro-
pean approach was more directive to employers than
the U.S. approach. Although these government bod-
ies provided guidelines that would hopefully reduce
injuries, the work setting does not always allow for
changes in worker dynamics. For example, in the
shipbuilding industry large 500 to 3,000 metric ton
gantry and tower cranes lift large sections of a ship
into place. However, riggers lift and move shackles,
chain falls, come-a-longs, slings, and chains weigh-
ing 13.6 kg to 73.0 kg when rigging a ship section to
a crane [20]. Although the technology has eliminated
some of the arduous tasks, heavy lifting, pushing,
and pulling tasks remain present in the workplace.
This fact is seen in recent research in the Netherlands
that addressed the sequence of bricklaying and how
to implement ergonomic measures for effective task
performance [34], which was similar to Gilbreth’s
bricklaying research [3].
In summary, the physiological, biomechanical,
and ergonomic parameters provide detailed informa-
tion related to critical job tasks and overall work
demands and have been used to increase produc-
tivity and reduce some of the physical demand in
the workplace. As instrumentation to measure these
parameters advanced, reanalysis of job tasks has
expanded our knowledge of work demands. Although
these studies add to our knowledge base, gener-
ating employment tests and physical employment
standards that evaluate an individual’s aerobic and
strength capabilities posed more challenges.
6. Physical tests for employment purposes
In the mid to late 1970s physical performance test-
ing became more prevalent in employment selection
and resulted in new employment laws, guidelines,
and litigation. The military, fire service and law
enforcement agencies in Australia, Canada, U.S., and
several European countries were the predominant
organizations using physical assessments to deter-
mine whether an individual was qualified for arduous
jobs. Employers used two types of tests to assess
applicants’ physical capabilities in relation to job
demands. These were basic ability and job simulation
assessments, which remain in use today. Basic abil-
ity tests evaluate a single physical ability or construct
associated with performance of job tasks such as mus-
cular strength, muscular endurance, aerobic capacity,
anaerobic power, flexibility, equilibrium, and coordination
[35]. This type of test has three advantages:
(a) it assesses individual abilities, (b) it can be used for
multiple jobs, and (c) it is practical when space is limited
or a test must be transported to multiple locations.
Alternatively, job simulations include essential components
of the job and can include tools and objects
used by workers but cannot include actions or actual
tasks that would be learned during training or on the
job (e.g., handcuffing). The advantages of job simula-
tions include a resemblance to the job and the ability
to develop the test directly from job analysis data.
When developing or selecting basic ability or job
simulation assessments, there are several parameters
one should consider. The first three parameters address
statistical properties: reliability, validity,
and adverse impact. The reliability of basic ability
tests such as arm lift, dynamic lift, 300-meter run,
and beep test ranged from 0.40 to 0.95 [35, 36], while
the reliability of job simulations (e.g., pursuit run,
carton lift) ranged from 0.50 to 0.91 [35, 37]. In studies
that compared both basic ability and job simulation
physical tests to a job performance measure (criterion
measure) such as picking products for an order in a
warehouse or pursuing and handcuffing a perpetrator,
basic ability tests had predominantly higher validities
(0.02–0.81) than job simulations (0.37–0.63) [35].
However, some of the low validities occurred for basic
ability tests associated with measures of flexibility and
equilibrium.
Adverse impact in physical assessment typically
occurs in relation to male-female differences. Some
test developers state that job simulations have less
adverse impact than basic ability tests. However,
recent research demonstrated that basic ability and
job simulations have comparable levels of adverse
impact. A meta-analysis study coded physical tests
based on Gebhardt and Baker’s [35, 37] classifi-
cation approach to investigate sex differences and
adverse impact [38]. This study found basic ability
tests involving muscular strength (e.g., grip strength,
push-ups, shuttle run) had slightly less adverse impact
on women (δ = 1.60) than job simulations (e.g., hose
drag, casualty transportation) (δ = 1.94), where δ is
the sample-size-weighted effect size corrected
for sampling error and for measurement error in the criterion
[38]. Conversely, the level of adverse impact
for cardiovascular assessments was nearly the same between
basic ability tests (e.g., treadmill, step test) (δ = 1.87)
and job simulations (e.g., emergency response circuit,
shoveling) (δ = 1.93). A large-scale study of males
(n = 50,000+) that investigated ethnic differences
between basic ability tests and job simulations found
that White males tended to perform better than
African Americans on both basic ability and job
simulation tests that involved continuous movement.
However, the performance of African Americans and Whites
was similar for basic ability tests and job
simulations involving muscular strength [39]. These
studies demonstrated that basic ability tests involving
muscular strength and muscular endurance may have
less adverse impact on women than job simulations,
while basic ability tests and job simulations of car-
diovascular endurance had the same level of adverse
impact. Further, there are racial differences for tests
involving continuous movement.
The fourth parameter centers on practical issues
such as cost, logistics, test administration, and scor-
ing paradigms. Considerations for basic ability tests
focus on the cost of test equipment and who will
administer the tests. For job simulations the issues
involve obtaining a test location that is not cost
prohibitive, ensuring that task simulations do not
include trainable skills, constructing the test to allow
for set-up in multiple locations, and generating scor-
ing procedures that reflect minimum job performance
and individual differences.
In addition, environmental parameters such as tem-
perature, protective clothing, and work location can
affect physical test composition and cut scores. Heat
and cold stress occur in outdoor and indoor physical
jobs and can be exacerbated by protective equipment
worn by workers. High temperatures and humidity
in the workplace result in longer times to complete
tasks, heat stress and illness, and mortality
[40]. Further, the use of protective clothing increases
metabolic rate in relation to its thickness and number
of layers which can escalate heat stress [41, 42]. Cold
environments affect the respiratory system and can
lead to pain in the extremities and musculoskeletal
and tissue injuries, along with decreased mobility and
manual dexterity [41]. Thus, to ensure that the assess-
ment accurately reflects the job demands, clothing
and equipment worn on the job may be incorporated
into the physical test.
Workplace location can affect the demand of a job.
For example, Fig. 4 shows that airport security per-
sonnel at larger airports (i.e., Cat X, Cat 1) handle
a greater quantity of heavy baggage than at smaller
airports (i.e., Cat 2, Cat 3) [43]. Thus, some locations
may require different employment standards.
Fig. 4. Effect of airport size in relation to size and quantity of
baggage handled. Cat X and Cat 1 = large airports; Cat 2 and Cat
3 = smaller airports. Whetzel DL, Gebhardt DL, Baker TA, Erk
RT, Fleisher MS, Volpe EK, St. Ville KA, Oliver JT, Geimer JL,
Chang T. Job analysis of transportation security officer job series.
Alexandria, VA: Human Resources Research Organization; 2012.
7. Enactment of employment guidelines and
statutes
After the passage of the Civil Rights Act of 1964
in the U.S., two landmark cases related to employment
discrimination (Griggs v. Duke Power, 1971;
Albemarle Paper Co. v. Moody, 1975) led to the
United States’ Equal Employment Opportunity Commission
(EEOC) publishing the Uniform Guidelines
on Employee Selection Procedures, which established standards
for applicant selection procedures, addressed adverse
impact, and prohibited employment discrimination
based on race, color, religion, sex, or national origin
[44]. This document had a profound effect on the
drafting of employment requirements in other coun-
tries such as Canada, UK, and Australia from 1988
to 2010 [45, 46]. For example, the Canadian Psycho-
logical Association adopted the Uniform Guidelines
in the development of their principles and poli-
cies for employment practices, which influenced the
Canadian human rights codes and commissions in
multiple provinces such as Ontario’s Human
Rights Commission [46, 47]. Likewise, countries
such as South Africa (South African Employment
Act 1988) adopted the Uniform Guidelines valid-
ity section, while the UK in concert with European
Discrimination law (1990) used the Uniform Guide-
lines premises and expanded them to include age,
gender reassignment, disability, marriage, pregnancy,
and sexual orientation [33] and later combined sepa-
rate employment statutes to form the British Equality
Act of 2010 [48]. However, Australia enacted sepa-
rate statutes to address employment discrimination
(Racial Discrimination Act 1975, Sex Discrimina-
tion Act 1984, Disability Discrimination Act 1992,
Age Discrimination Act 2004) [46]. The employment
statutes and guidelines across these countries applied
to all assessments (e.g., physical tests, interviews,
job evaluations) and employment decisions including
selection, promotion, retention, and training.
Besides addressing development of assess-
ment procedures (e.g., job analysis, validation
approaches), these guidelines and statutes focused
on methods for assessing adverse impact and the
obligation of the employer and/or test developer to
reduce adverse impact. The most common method
to assess whether adverse impact exists against a
protected group (e.g., race, sex, ethnic group) is the
4/5ths or 80% rule, which indicates adverse impact is
present if the minority (protected) group’s passing
rate is less than 80% of the majority group’s passing
rate [44]. In physical testing the minority group of
concern is typically women. Other methods should
be used to confirm the 4/5ths rule result, which
can be affected by sample size. One method is the
Standard Deviation or Z test that investigates whether
differences in passing rates are due to chance at a
probability value of 0.05 [49]. A difference of 2 or more
standard deviations between the expected number of
passes and the actual number indicates adverse impact.
A second method is Fisher’s
Exact Test, which calculates all possible 2 × 2 combinations to
determine whether differences in passing rates are
due to chance [49]. Adverse impact is present if the
probability value is significant at the 0.05 level.
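The three checks described above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure prescribed in [44] or [49]: the function names and pass/fail counts are hypothetical, the Z test is written in its common two-sample form with a pooled pass rate, and the two-sided Fisher p-value sums the probabilities of all 2 × 2 tables that are no more likely than the observed one.

```python
from math import comb, erfc, sqrt


def impact_ratio(min_pass, min_n, maj_pass, maj_n):
    """4/5ths rule: minority pass rate divided by majority pass rate."""
    return (min_pass / min_n) / (maj_pass / maj_n)


def z_test(min_pass, min_n, maj_pass, maj_n):
    """Two-sample Z test on pass rates using the pooled pass rate.

    Returns (z, two-sided p-value); |z| >= 2 is the "2 standard deviations" flag.
    Assumes the pooled pass rate is strictly between 0 and 1.
    """
    pooled = (min_pass + maj_pass) / (min_n + maj_n)
    se = sqrt(pooled * (1 - pooled) * (1 / min_n + 1 / maj_n))
    z = (maj_pass / maj_n - min_pass / min_n) / se
    return z, erfc(abs(z) / sqrt(2))


def fisher_exact(min_pass, min_n, maj_pass, maj_n):
    """Two-sided Fisher's Exact Test p-value for the 2 x 2 pass/fail table."""
    total, passes = min_n + maj_n, min_pass + maj_pass

    def prob(k):
        # Hypergeometric probability that exactly k minority candidates pass.
        return comb(passes, k) * comb(total - passes, min_n - k) / comb(total, min_n)

    k_low = max(0, min_n - (total - passes))
    k_high = min(min_n, passes)
    observed = prob(min_pass)
    # Sum every table with the same margins that is no more likely than the observed one.
    return sum(p for p in map(prob, range(k_low, k_high + 1)) if p <= observed * (1 + 1e-9))


# Hypothetical counts: 20 of 50 women and 80 of 100 men pass the test.
ratio = impact_ratio(20, 50, 80, 100)
z, p_z = z_test(20, 50, 80, 100)
p_fisher = fisher_exact(20, 50, 80, 100)
print(f"impact ratio = {ratio:.2f} (adverse impact if below 0.80)")
print(f"Z = {z:.2f}, two-sided p = {p_z:.6f}")
print(f"Fisher's exact p = {p_fisher:.6f}")
```

With small samples the 4/5ths ratio can cross 0.80 on the strength of one or two candidates, which is why the significance tests are used as a confirmation rather than a replacement.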
8. Impact of case law
Although most organizations and test developers
follow the criteria for developing a valid assessment,
litigation abounds in relation to physical employ-
ment standards with the U.S. having more physical
employment standards litigation than other coun-
tries. The UK and Australia have much lower rates
of assessment related litigation, with Canada having
seminal legal cases that shaped their Human Rights
laws [46].
In Berkman v. City of New York [50], Brenda Berk-
man failed a physical test in 1978 for a firefighter
position. Using the 4/5ths rule, the court found the
test discriminated against women and ordered a new
test be developed. Berkman and other women passed
the new test and were hired by the city. This case set a
precedent for all physical employment standards in
the U.S., emphasizing that the test must reflect job
standards. In Canada, the Meiorin case shaped seg-
ments of future employment decisions [51]. Tawney
Meiorin was employed as a contract firefighter in
British Columbia in 1989 and hired by the British
Columbia government in 1992. In 1994 she failed
the physical employment test and was terminated
from her job. The Labour Arbitration Board ruled in
favor of Meiorin, but the Court of Appeal overturned
this decision. Subsequently, the Canadian Supreme
Court overturned the appeals court decision, rein-
stated Meiorin, and issued a three-part test to evaluate
whether a discriminatory standard is an occupational
requirement. The three-part test stated that “(a) the
standard was adopted for a purpose that is ratio-
nally connected to job performance; (b) the particular
standard was adopted in an honest and good faith
belief that it was necessary to the fulfilment of that
legitimate work-related purpose; and (c) the stan-
dard is reasonably necessary to the accomplishment
of that legitimate work-related purpose” (i.e., it is impossible
to accommodate the employee without undue employer hardship)
[51]. These two cases set legal precedent in relation
to physical employment standards. More detailed
examinations of past legal cases in physical testing are
in reviews by Gebhardt and Baker [35] and Hogan and
Quigley [52].
Recent litigation in the U.S. addressed one of
the current issues in physical employment standards,
which is the use of a single cut score or gender-normed
scores. In Bauer v. Holder [53] a trainee in the Federal
Bureau of Investigation (FBI) academy challenged
the use of gender-normed physical standards as a
graduation requirement from the academy (30 push-
ups for men, 14 push-ups for women). The district
court found for the plaintiff and stated that gender-
normed tests were discriminatory because female law
enforcement personnel perform the same physical
tasks as their male counterparts. During the appeals
process, the FBI stressed that the assessments were
fitness tests and the Court of Appeals (4th Circuit)
upheld gender-normed standards as a novel issue but
did not address whether the level of physical per-
formance was a bona fide occupational qualification
[54]. In a third review the District Court upheld the
gender-normed standard from the Court of Appeals
[55]. Although this is only one of a limited number
of gender-normed cases, it is important to note
that other litigation upheld single cut scores, with
rulings stating that the male and female physical
employment standards must be equal. To date,
gender-normed standards have been relevant only to law
enforcement positions using basic ability tests. Private
sector industries and fire service organizations
use physical assessments with single cut scores.
9. Identification of cut scores
For the past 50 years, cut scores or performance
standards have been used to identify individuals who
can perform or be trained to perform the essential
job tasks or a segment of a job. A variety of methods
have been used to determine cut scores that are rea-
sonable, useful, and consistent with acceptable job
performance. The methods used depend upon the
type of validity data available (e.g., content validity,
criterion-related validity)¹ and range from expert
judgment to comparison of test and job performance
data. In 1939 Taylor and Russell developed a method
to estimate the percentage of new employees who
would perform a job successfully and identify a cut
score based on a validity coefficient, the number of
applicants needed to fill vacant positions, and the
percentage of current employees who perform the
job successfully [56]. This method used data from
a criterion-related validity study and employer hiring
needs. In the 1990s a judgmental cut score approach,
bookmarking, was introduced in which subject matter
experts identify the test score that indicates a specified
likelihood that an individual would be successful (e.g.,
probability of 0.67). However, relying on judgment
and history of past performance did not account for
changes in applicant populations or job demands.
Empirical methods such as expectancy tables, con-
tingency tables, ergonomic data, pass/fail tables, and
Pareto analysis use validity (e.g., test and job perfor-
mance measures), job analysis, and adverse impact
data to identify cut scores [56].
Expectancy tables show the percentage of individ-
uals meeting or exceeding a specific score point, their
expected level of job performance, and differences in
job performance across test scores. For example, for a
test score of 72 the table would show that 90% of test
takers met or exceeded this score and these test takers
had a mean job performance score of 24. For a score of
78, 80% of the test takers met or exceeded this score
and had a mean job performance score of 29. This 5-point
jump in job performance suggests that an applicant
with a score of 78 or higher would have better job per-
formance than one with a score of 72. Test scores with
larger increases in job performance point to potential
cut scores.
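A minimal sketch of how such an expectancy table could be assembled from a validation sample is shown below. The data, the function name `expectancy_table`, and the choice of score points are hypothetical and serve only to illustrate the layout described above: for each score point, the percentage of test takers at or above it, their mean job performance, and the gain over the next lower score point.

```python
from statistics import mean


def expectancy_table(test_scores, job_scores, score_points):
    """For each score point, report the percentage of test takers at or above it
    and the mean job performance of those test takers."""
    n = len(test_scores)
    rows = []
    for point in sorted(score_points):
        perf = [job for test, job in zip(test_scores, job_scores) if test >= point]
        if perf:
            rows.append({
                "score": point,
                "pct_at_or_above": 100 * len(perf) / n,
                "mean_job_perf": mean(perf),
            })
    # Gain in mean job performance relative to the next lower score point.
    for prev, row in zip(rows, rows[1:]):
        row["gain"] = row["mean_job_perf"] - prev["mean_job_perf"]
    return rows


# Hypothetical validation sample: paired test scores and job performance ratings.
test_scores = [60, 70, 72, 75, 78, 80, 85, 90, 92, 95]
job_scores = [10, 20, 22, 24, 30, 31, 33, 35, 36, 38]

for row in expectancy_table(test_scores, job_scores, [70, 75, 78, 85]):
    gain = row.get("gain")
    gain_text = "" if gain is None else f", gain = {gain:.1f}"
    print(f"score >= {row['score']}: {row['pct_at_or_above']:.0f}% of test takers, "
          f"mean job performance = {row['mean_job_perf']:.1f}{gain_text}")
```

Score points with the largest gains are the candidates a practitioner would examine further as potential cut scores.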
¹ Content validity shows the assessment is a representative sample
of significant parts of the job as obtained in the job analysis.
Construct validity involves identifying an ability or trait that underpins
successful job performance. Selection procedures measure the
candidate’s level of a characteristic/ability that is important to
job success. Criterion-related validity demonstrates a statistical
relationship between test scores and measures of job performance.
Contingency tables have been used to determine
cut score accuracy by determining the percentage of
correct (true passes, true failures) and incorrect (false
passes, false failures) decisions [56]. Combined with
an expectancy table, one can ascertain whether an
increase in job performance with a specific test score
coincides with an acceptable level of correct decisions,
as shown in the formula below for a sample of
165 individuals with 143 true passes, 8 true failures,
7 false passes, and 7 false failures.
\[
\text{Correct Decisions} = \frac{\text{true passes} + \text{true failures}}{\text{true passes} + \text{true failures} + \text{false passes} + \text{false failures}}
\]
\[
\text{Correct Decisions} = \frac{143 + 8}{143 + 8 + 7 + 7} = \frac{151}{165} = 0.915 \text{ or } 91.5\%
\]
Thus, if the expectancy table showed a marked
increase in job performance at a specific score (e.g.,
78 from above example) with a high level of cor-
rect decisions (e.g., 91.5%), then that score would be
selected as a cut score.
With the increased level of scrutiny in physi-
cal testing as seen in the litigation starting in the
1970s, additional cut score assessments have evolved.
One common approach is to generate pass/fail tables
that evaluate the impact of potential cut scores on
protected groups by showing the passing rate of a
minority group (e.g., women) in relation to the major-
ity group (e.g., men). This approach hinges on the
4/5ths rule for evaluating adverse impact. For exam-
ple, if 91% of the majority group and 74% of the
minority group achieve a selected cut score (e.g., 78
from the above example), there is no adverse impact
(74%/91% = 0.81) since this value (0.81) is greater
than 4/5 or 0.80. However, if the minority group pass
rate is 70%, adverse impact is present (70%/91% = 0.77) since
this value is below 0.80.
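A pass/fail table of this kind can be generated directly from applicant scores. The sketch below assumes higher scores are better and uses invented score lists chosen so that the two cut scores reproduce the 74%/91% and 70%/91% examples; the function name `pass_fail_table` is illustrative only.

```python
def pass_fail_table(majority_scores, minority_scores, cut_scores):
    """Pass rates by group and the 4/5ths impact ratio for each candidate cut score."""
    rows = []
    for cut in cut_scores:
        maj_rate = sum(s >= cut for s in majority_scores) / len(majority_scores)
        min_rate = sum(s >= cut for s in minority_scores) / len(minority_scores)
        # The ratio is undefined when no majority-group member passes.
        ratio = min_rate / maj_rate if maj_rate else float("nan")
        rows.append({
            "cut": cut,
            "majority_rate": maj_rate,
            "minority_rate": min_rate,
            "impact_ratio": ratio,
            "adverse_impact": ratio < 0.80,
        })
    return rows


# Hypothetical samples of 100 men and 100 women.
men = [82] * 91 + [75] * 9
women = [82] * 70 + [79] * 4 + [75] * 26

for row in pass_fail_table(men, women, [78, 80]):
    flag = "adverse impact" if row["adverse_impact"] else "no adverse impact"
    print(f"cut {row['cut']}: men {row['majority_rate']:.0%}, "
          f"women {row['minority_rate']:.0%}, ratio {row['impact_ratio']:.2f} ({flag})")
```

Reading the rows side by side shows how a small change in the cut score can move the ratio across the 0.80 threshold.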
As far back as the early 1900s researchers such as
Sargent advocated use of multiple physical tests to
evaluate individuals [6]. Today physical test batter-
ies typically consist of three or more assessments and
can be scored individually (multiple hurdle approach)
or as a composite (compensatory approach). When
using a compensatory approach, the tests can be
unit weighted with each test contributing equally to
the overall score or weighted based on the statisti-
cal analysis (e.g., regression). If adverse impact is
present, Pareto analysis provides a method to investi-
gate the changes in job performance and diversity
using optimal weighting factors [57]. The Pareto
weighting approach optimizes two variables such as
job performance and diversity at the same time to
locate the optimal weighting factors for the test
battery components that yield the greatest job per-
formance and increase in diversity. This analysis
requires test and job performance data from a mini-
mum sample size of 100 and was shown to provide
better predicted job performance with a decrease in
adverse impact [57]. The Pareto-optimum weighting
occurs at the point where one variable (e.g.,
job performance) cannot be improved without a
worse outcome for the second variable (e.g., adverse
impact). This statistical analysis has potential to
reduce adverse impact in physical testing. Greater
detail about Pareto analysis and other methods
for setting cut scores is found in Song et al.
[57] and Gebhardt [56].
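The idea of a Pareto-optimal weighting can be illustrated with a simple filter. The sketch below is not the optimization procedure used by Song et al. [57]; it assumes that several candidate weightings of a two-test battery have already been evaluated, and it keeps only those for which neither outcome can be improved without worsening the other. All weights, validity values, and impact ratios are hypothetical.

```python
def pareto_front(candidates):
    """Return the candidates that are not dominated on (validity, impact_ratio).

    A candidate is dominated when another candidate is at least as good on both
    outcomes and strictly better on at least one. Higher is better for both.
    """
    front = []
    for c in candidates:
        dominated = any(
            o["validity"] >= c["validity"]
            and o["impact_ratio"] >= c["impact_ratio"]
            and (o["validity"] > c["validity"] or o["impact_ratio"] > c["impact_ratio"])
            for o in candidates
        )
        if not dominated:
            front.append(c)
    return front


# Hypothetical weightings (strength test, endurance test) with their predicted
# validity and the 4/5ths impact ratio obtained under each weighting.
candidates = [
    {"weights": (1.0, 0.0), "validity": 0.45, "impact_ratio": 0.55},
    {"weights": (0.7, 0.3), "validity": 0.44, "impact_ratio": 0.68},
    {"weights": (0.5, 0.5), "validity": 0.40, "impact_ratio": 0.75},
    {"weights": (0.3, 0.7), "validity": 0.38, "impact_ratio": 0.72},
    {"weights": (0.0, 1.0), "validity": 0.30, "impact_ratio": 0.82},
]

for c in pareto_front(candidates):
    print(c["weights"], c["validity"], c["impact_ratio"])
```

In this invented example the (0.3, 0.7) weighting drops out because the (0.5, 0.5) weighting is better on both outcomes; choosing among the remaining weightings is still a judgment about the trade-off.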
In summary, no single best approach exists to set
cut scores and human judgment is involved in all
methods. The soundest approach for setting legally
defensible cut scores involves integrating multiple
methods and sources of information that lead to a
preponderance of evidence that a cut score is useful
(e.g., predicts job performance of new hires) and fair.
10. Benefits of physical employment
standards
Physical tests have existed for a long time, but
it was only in the late 1970s that a greater focus
was placed on the validity of the tests in employ-
ment settings. Past research identified the demands
of job tasks and organizations implemented physical
assessments to select workers who could safely and
effectively perform arduous job tasks with a minimal
risk of injury. Due to the proprietary nature of per-
sonnel selection research many of the studies were
not published. However, there are published studies
related to use of pre-employment physical tests
in personnel selection and their efficacy in terms
of injury reduction, decreased worker compensation
costs, improved productivity, and increased profit
margins.
Table 3
Cost and injury reductions in railroad industry with use of physical employment testing
Railroad industry (5 years of Injury Data)                                     Test        No Test
Number Hired                                                                   12,741      15,794a
Number Injured                                                                 648         3,898
Musculoskeletal Injuries (% of Total Injuries)                                 74.8%       71.1%
Costs for Tested v. Not Tested [Covariates: Age, Job Tenure, & Year Injured]   $15,316b    $66,147
Days Lost [Covariates: Age, Job Tenure, & Year Injured]                        79.1        142.1
a Sample estimated from total workers due to lack of accurate hiring data. b p < 0.01.
Arnold, Rauschenberger, Soubel, and Guion [58]
developed and validated a strength test battery
that exhibited high correlations between muscular
endurance tests and a simulation of steel worker job
tasks. They implemented the test to select steel workers
and, after a 6-month period, found that work
productivity doubled for new hires selected
using the physical test, which equated to increased
productivity for an individual worker of $5,000 in
1982 dollars or $13,113 in 2018.
Baker and Gebhardt [59] validated a test bat-
tery that included muscular strength and muscular
endurance tests for selection of railroad train ser-
vice workers. After implementing the physical test
to select train service workers, the railroad acquired
another railroad that serviced the same geographical
areas. Thus, injury data were available for one rail-
road that used a physical test for selection and one that
did not. To determine the effectiveness of the test bat-
tery, they conducted prospective utility analyses that
included days lost from work, restricted duty days,
gross settlement costs, legal expenses and adminis-
trative costs [60]. Data for the utility analysis were
obtained for new hires in original and acquired rail-
roads for a 5-year period. Table 3 shows that 648 of the
original railroad’s new hires (test group; n= 12,714)
sustained injuries, while 3,898 of the acquired railroad’s
new hires (no test group) were injured during the same
5-year period. Controlling statistically for age, job
tenure, and year injured (ANCOVA), these results
showed injury costs and days lost from work were
significantly lower (p< .01) for workers tested prior
to entry into the job than for workers hired with-
out a pre-employment physical test. The increased
cost to replace a single worker in the acquired rail-
road (no test group) for the additional lost days
(142.1 − 79.1 = 63 days) compared to the original railroad
would be $17,438 in 2018 at an hourly wage of
$34.60 ($10,574 in 1995). Thus, substantial
savings were achieved by screening applicants for the
train service job.
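The lost-time cost figure can be reproduced with simple arithmetic. The sketch below assumes an 8-hour workday, which is not stated in the text but yields the reported $17,438; the function name is illustrative. It also derives the share of new hires injured in each group from the Table 3 counts, bearing in mind that the number hired in the no test group was itself an estimate.

```python
def added_lost_time_cost(days_no_test, days_test, hourly_wage, hours_per_day=8):
    """Extra days lost per worker and their wage cost (8-hour day is an assumption)."""
    extra_days = days_no_test - days_test
    return extra_days, extra_days * hours_per_day * hourly_wage


extra_days, cost_2018 = added_lost_time_cost(142.1, 79.1, 34.60)
print(f"{extra_days:.0f} additional days lost, about ${cost_2018:,.0f} per worker")

# Share of new hires injured over the 5-year period, using the Table 3 counts.
for group, injured, hired in [("test", 648, 12_741), ("no test", 3_898, 15_794)]:
    print(f"{group}: {100 * injured / hired:.1f}% of new hires injured")
```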
Anderson and Briggs [61] showed that workers in
manual materials handling jobs who passed a physical
selection test had a 47% lower injury rate and a
21% higher retention rate. Legge [62] implemented
a functional capacity assessment for security personnel
prior to entry into annual defensive tactics training.
This testing and remedial training reduced
annual injury costs from $187,000 to almost
zero over a 2-year period. Knapik et al. [63] analyzed
injuries in a law enforcement academy over
a 6-year period and found higher injury rates for
recruits with lower scores on a physical test battery
with most injuries associated with defensive tactics
and fitness training. As is evident, use of physical
employment standards has the following benefits: (a)
decreased injury risk; (b) decreased cost to employer;
(c) improved productivity; and (d) increased profit
margin.
11. Conclusions
The history of physical assessment and employ-
ment standards demonstrated that arduous jobs
remain in the workplace today. Approximately 28%
of the workforce performs jobs with moderate to heavy
physical demand [1]. Thus, individuals with the capabilities
to perform arduous jobs in an effective and safe
manner are needed to ensure productivity and injury
reduction in industrial, law enforcement, fire and res-
cue, and military settings. PES meet the need to hire
workers that can perform job tasks effectively and
safely. These standards and the accompanying phys-
ical assessments provide valid predictions related to
performance of arduous job tasks [23, 35, 38]. Fur-
ther, use of physical assessments in the selection
setting resulted in employer benefits ranging from
reduction in lost work time, injuries, and turnover to
increases in productivity [35, 58–61].
The challenges for the future remain similar to those
of the past. PES must be job related and cut scores
must be reasonable in relation to the demands
of the job. Ensuring the validity of the physical
tests and cut scores will help avoid litigation, as
will further research into methods to optimize test
utility while decreasing adverse impact (e.g., Pareto-
optimization). As more women enter physical jobs,
we can increase our knowledge base related to
their job and test performance. Continued efforts to
demonstrate the utility of physical assessments and
return on investment in terms of increased productiv-
ity and decreased costs related to injuries, lost time
from work, and turnover will entice employers to
adopt PES in their organizations.
Conflict of interest
None to report.
References
[1] Bureau of Labor Statistics, U.S. Department of Labor.
The Economics Daily: Physical strength required for jobs
in different occupations in 2016. [online 2017]. Available:
http://www.bls.gov/opub/ted/2017/physical-strength-required-for-jobs-in-different-occupations-in-2016.htm.
[2] East WB. A historical review and analysis of Army physical
readiness training and assessment. Fort Leavenworth, KS:
Combat Studies Institute Press; 2013.
[3] Gilbreth FB. Bricklaying system. New York, NY: The M.C.
Clark Publishing Co.; 1909.
[4] Hogan JC. Physical abilities. In Dunnette M, editor. Handbook
of industrial and organizational psychology (Vol. 2).
Palo Alto, CA: Consulting Psychologists Press, Inc.; Chap-
ter 11, 1991, pp. 754-831.
[5] Gebhardt DL, Baker TA, Volpe EK, Billerbeck KT. Devel-
opment and validation of physical performance tests for
selection of Walmart orderfillers. Beltsville, MD: Human
Performance Systems, Inc.; 2009.
[6] Sargent D. Universal test for strength, speed and endurance
of the human body. Cambridge, MA: Powell Press; 1902.
[7] Martin EG. Tests of muscular efficiency. Physiological
Reviews. 1921;3:454-75.
[8] Clark HH. A manual: Cable-tension strength tests.
Chicopee, MA: Brown-Murphy Co.; 1953.
[9] Douglas CG. A method for determining the total respiratory
exchange in man. Journal of Physiology. 1911;42:17-8.
[10] Froelicher VF, Thompson AJ, Noguera I, Davis G, Stewart
AJ, Triebwasser JH. Prediction of maximal oxygen con-
sumption. Chest. 1975;68(3):331-6.
[11] Fleishman EA. Structure and measurement of physical fit-
ness. Englewood Cliffs, NJ: Prentice-Hall; 1964.
[12] Jackson AS. Factor analysis of selected muscular
strength and motor performance tests. Research Quarterly.
1971;42(2):164-72.
[13] Borchart JW. A cluster analysis of static strength tests.
Research Quarterly. 1968;39:258-61.
[14] Baumgartner TA, Zuidema MA. Factor analysis of physical
fitness tests. Research Quarterly. 1972;43(4):443-50.
[15] Snook SH, Irvine C, Bass SF. Maximum weights and
workloads acceptable to male industrial workers. American
Industrial Hygiene Association Journal. 1970;31:579-86.
[16] Åstrand I, Fugelli P, Carlsson CG, Rodahl K, Vokac Z.
Energy output and work stress in coastal fishing. Scandi-
navian Journal of Clinical and Laboratory Investigations.
1973;31:1105-13.
[17] Åstrand PO, Rodahl K. Textbook of work physiology:
Physiological basis of exercise. New York, NY: McGraw-Hill;
1977.
[18] Flanagan JC. The critical incident technique. Psychological
Bulletin. 1954;51(4):327-58.
[19] Fine SA, Wiley WW. An introduction to functional job
analysis. Washington, DC: W. E. Upjohn Institute for
Employment Research; 1971.
[20] Gebhardt DL, Baker TA, Volpe EK, St. Ville KA. Job analysis
of Huntington Ingalls shipbuilding jobs. Volume 1:
Job analysis. Beltsville, MD: Human Performance Systems,
Inc.; 2015.
[21] Fleishman EA, Quaintance MK. Taxonomies of human per-
formance. Orlando, FL: Academic Press; 1984.
[22] European Centre for the Development of Vocational
Training (CEDEFOP). Quantifying skill needs in Europe.
Luxembourg: Publications Office of the European Union;
2013.
[23] Gebhardt DL, Baker TA, Volpe EK, St. Ville KA.
Development and validation of physical assessments for
Huntington-Ingalls shipyard jobs: Volume 2: Test develop-
ment and validation. Alexandria, VA: HumRRO; 2016.
[24] Turner D. Energy cost of some industrial operations. British
Journal of Industrial Medicine. 1955;12:237-41.
[25] Passmore RC, Durnin JVGA. Human energy expenditure.
Physiological Reviews. 1955;35(4):801-40.
[26] Snook SH, Ciriello VM. Maximum weights and work-
load acceptable to female workers. Journal of Occupational
Medicine. 1974;16(8):527-34.
[27] Chaffin DB, Park KS. A longitudinal study of low-back pain
as associated with occupational lifting factors. American
Industrial Hygiene Association Journal. 1973;34:513-25.
[28] Chaffin DB, Herring GD, Keyserling WM. Pre-employment
strength testing-An updated position. Journal of Occupa-
tional Medicine. 1978;20:403-8.
[29] National Institute for Occupational Safety and Health
(NIOSH). Work practices guide for manual lifting. Cincin-
nati, OH: NIOSH; 1981.
[30] Ayoub MM, Mital A, Bakken GM, Asfour SS, Bethea N.
Development of strength and capacity norms for manual
materials handling activities: The state of the art. Human
Factors. 1980;22:271-83.
[31] Garg A, Sharma D, Chaffin D, Schmidler JM. Biomechanical
stresses as related to motion trajectory lifting. Human
Factors. 1983;25:527-39.
[32] National Institute for Occupational Safety and Health
(NIOSH). Applications manual for the revised NIOSH lift-
ing equation. Cincinnati, OH: NIOSH; 1994.
[33] EUR-Lex. Council Directive 90/269/EEC of 29 May 1990 on
the minimum health and safety requirements for the manual
handling of loads. Article 16 (1) of Directive 89/391/EEC.
1990. Available from
https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:31990L0269
[34] van der Molen HF, Sluiter JK, Frings-Dresen MHW.
Behavioural change phases of different stakeholders
involved in the implementation process of ergonomics mea-
sures in bricklaying. Applied Ergonomics. 2005;36:S449-
59.
[35] Gebhardt DL, Baker TA. Physical performance tests. In Farr
JL, Tippins NT, editors. Handbook of Employee Selection.
New York, NY: Routledge; 2017. Chapter 12, pp. 277-97.
[36] Myers DC, Gebhardt DL, Crump CE, Fleishman EA. The
dimensions of human physical performance: Factor analy-
sis of strength, stamina, flexibility, and body composition
measures. Human Performance. 1993;6(4):309-44.
[37] Gebhardt DL, Baker TA. Physical Performance. In Scott
J, Reynolds D, editors. Handbook of Work Assess-
ment. San Francisco, CA: Jossey-Bass; 2010, Chapter 7,
pp. 165-196.
[38] Courtright SH, McCormick BW, Postlethwaite BE, Reeves
CJ, Mount MK. A meta-analysis of sex differences in phys-
ical ability: Revised estimates and strategies for reducing
differences in selection contexts. Journal of Applied Psy-
chology. 2013;98:623-41.
[39] Baker TA. Physical performance test results across eth-
nic groups: Does the type of test have an impact? Paper
presented at the Society of Industrial and Organizational
Psychology; New York, NY, 2007.
[40] Cheung SS, Lee JK, Oksa J. Thermal stress, human per-
formance, and physical employment standards. Applied
Physiology, Nutrition, and Metabolism. 2016;41(6 Suppl
2):S148-64.
[41] Dorman LE, Havenith G. The effects of protective clothing
on energy consumption during different activities.
European Journal of Applied Physiology. 2009;105:463-70.
[42] McLellan TM. Sex-related differences in thermoregulatory
responses while wearing protective clothing. European
Journal of Applied Physiology and Occupational Physiology.
1998;78(1):28-37.
[43] Whetzel DL, Gebhardt DL, Baker TA, Erk RT, Fleisher MS,
Volpe EK, St. Ville KA, Oliver JT, Geimer JL, Chang T.
Job analysis of transportation security officer job series.
Alexandria, VA: Human Resources Research Organization;
2012.
[44] Equal Employment Opportunity Commission, Civil Ser-
vice Commission, Department of Labor, and Department
of Justice. Uniform Guidelines on Employee Selection Pro-
cedures. Washington, DC: Bureau of National Affairs, Inc.;
1978.
[45] Payne W, Harvey J. A framework for the design and
development of physical employment tests and standards.
Ergonomics. 2010;53(7):858-71.
[46] Adams EM. Human rights at work: Physical standards for
employment and human rights law. Applied Physiology,
Nutrition, and Metabolism. 2016;41(6 Suppl 2):S63-73.
[47] Latham GP, Sue-Chan C. A meta-analysis of the situational
interview: An enumerative review of reasons for its validity.
Canadian Psychology. 1999;40:56-67.
[48] Equality Act 2010. The National Archives. [retrieved Mar
2013]. Available from:
https://www.equalityhumanrights.com/en/equality-act-2010/what-equality-act.
[49] Cohen DB, Aamodt MG, Dunleavy EM. Technical advi-
sory committee report on best practices in adverse impact
analysis. Washington, DC: Center for Corporate Equality;
2010.
[50] Berkman v. City of New York, 536 F. Supp. 117 (E. D. N.Y.
1982).
[51] British Columbia (Public Service Employee Relations
Commission) v. British Columbia Government and Service
Employees’ Union (B.C.G.S.E.U.) (Meiorin Grievance).
1999.
[52] Hogan JC, Quigley AM. Physical standards for employment
and the courts. American Psychologist. 1986;41:1193-217.
[53] Bauer v. Holder, 25 F. Supp. 3d 842, 865 (E.D. Va. 2014).
[54] Bauer v. Lynch, 812 F.3d 340 (4th Cir. 2016).
[55] Bauer v. Sessions, No. 1:13-cv-93 (E.D. Va. Mar. 24, 2017).
[56] Gebhardt DL. Establishing performance standards. In Con-
stable S, Palmer B, editors. The process of physical fitness
standards development – State of the art report. Wright-
Patterson AFB, OH: Human Systems Information Analysis
Center (HSIAC-SOAR); Chapter 6, 2000, pp. 179-99.
[57] Song QC, Wee S, Newman DA. Diversity shrinkage:
Cross-validating Pareto-optimal weights to enhance diver-
sity via hiring practices. Journal of Applied Psychology.
2017;102(12):1636-57.
[58] Arnold JD, Rauschenberger JN, Soubel WG, Guion RM.
Validation and utility of strength test for selecting steel-
workers. Journal of Applied Psychology. 1982;67:588-604.
[59] Baker TA, Gebhardt DL. Development and validation
of physical performance tests for train service positions.
Hyattsville, MD: Human Performance Systems, Inc.; 1994.
[60] Baker TA, Gebhardt DL. Utility of physical performance
tests in reduction of days lost and injuries in railroad train
service positions. Beltsville, MD: Human Performance Sys-
tems, Inc.; 2001.
[61] Anderson C, Briggs J. A study of the effectiveness of
ergonomically based functional screening tests and their
relationship to reducing worker compensation injuries.
Work. 2008;31(1):27-37.
[62] Legge J. Job-specific functional assessment associated with
reduction in musculoskeletal injuries in building security
workers. Paper presented at Physical Employment Stan-
dards, Canmore, Alberta Canada, 2015.
[63] Knapik J, Spiess A, Swedler D, Tyson G, Hauret K, Yoder J,
Jones B. Retrospective examination of injuries and physical
fitness during Federal Bureau of Investigation new agent
training. Journal of Occupational Medicine and Toxicology.
2011;6:26-37.