The Reliability of the AOSpine Thoracolumbar
Classification System in Children: Results of a
Multicenter Study
Andrew Z. Mo, MD,* Patricia E. Miller, MS,† Michael P. Glotzbecker, MD,† Ying Li, MD,‡
Nicholas D. Fletcher, MD,§ Vidyadhar V. Upasani, MD,∥ Anthony I. Riccio, MD,¶
Michael T. Hresko, MD,† Walter F. Krengel III, MD,# David Spence, MD,**
Sumeet Garg, MD,†† and Daniel J. Hedequist, MD†
Background: The purpose of this study was to determine whether
the new AOSpine thoracolumbar spine injury classification sys-
tem is reliable and reproducible when applied to the pediatric
population.
Methods: Nine POSNA (Pediatric Orthopaedic Society of North
America) member surgeons were sent educational videos and sche-
matic papers describing the AOSpine thoracolumbar spine injury
classification system. The material also contained magnetic resonance
imaging and computed tomography imaging of 25 pediatric patients
with thoracolumbar spine injuries organized into cases to review and
classify. The evaluators classified injuries into 3 primary categories:
A, B, and C. Interobserver reliability was assessed for the initial
reading by the Fleiss kappa coefficient ($\kappa_F$) along with the 95% confidence
interval (CI). For A and B type injuries, subclassification was
conducted including A0 to A4 and B1 to B2 subtypes. Interobserver
reliability across subclasses was assessed using Krippendorff alpha
($\alpha_K$) along with a bootstrapped 95% CI. Imaging was reviewed a
second time by all evaluators ~1 month later. All imaging was blinded
and randomized. Intraobserver reproducibility was assessed for the
primary classifications using Fleiss kappa, and subclassification
reproducibility was assessed by Krippendorff alpha ($\alpha_K$) along with
95% CI. Interpretations for reliability estimates were based on Landis
and Koch (1977): 0 to 0.2, slight; 0.2 to 0.4, fair; 0.4 to 0.6, moderate;
0.6 to 0.8, substantial; and >0.8, almost perfect agreement.
Results: Twenty-five cases were read for a total of 225 initial and
225 repeated evaluations. Adjusted interobserver reliability was
almost perfect ($\kappa_F$ = 0.82; CI, 0.77-0.87) across all raters.
Subclassification reliability was substantial ($\alpha_K$ = 0.79; CI, 0.62-0.90).
Adjusted intraobserver reproducibility was almost perfect both for
primary classifications ($\kappa_F$ = 0.81; CI, 0.71-0.90) and for
subclassifications ($\alpha_K$ = 0.81; CI, 0.73-0.86).
Conclusions: The reliability for the AOSpine thoracolumbar spine
injury classification system was high among POSNA surgeons
when applied to pediatric patients. Given a lack of a uniform
classification in the pediatric population, the AOSpine thor-
acolumbar spine injury classification system has the potential to be
used as the first universal spine fracture classification in children.
Level of Evidence: Level III.
Key Words: AOSpine, thoracolumbar, pediatric, spine, trauma
(J Pediatr Orthop 2020;40:e352–e356)
In the adult population, there have been significant en-
deavors in developing classification systems of thor-
acolumbar injuries. These systems evolved from simple
morphologic classifications to more complex systems based
on fracture morphology (injury mechanism), evaluation of
posterior ligamentous integrity, and neurological status of
the patient.1–3 There does not yet exist a dedicated classification
system for pediatric thoracolumbar fractures despite
the discordance in presentation. Pediatric thoracolumbar
fractures vary in morphology, severity, and morbidity.
Treatments vary from observation to surgery depending on
factors such as fracture stability, displacement, and neuro-
logical status.
The AOSpine Classification Group has developed the
AOSpine Injury Classification System (https://aospine.aofoundation.org/clinical-library-and-tools/aospine-injury-classification-system).4
This classification incorporates fracture
morphology (injury mechanism), evidence of posterior ligamentous
integrity, neurological status of the patient, and patient-specific
modifiers to classify injuries. This classification
system also has 4 separate classification systems related to
anatomic location: upper cervical, subaxial, thoracolumbar,
From the *Department of Orthopaedic Surgery, Lenox Hill Hospital,
New York, NY; †Department of Orthopaedic Surgery, Boston Child-
ren’s Hospital, Boston, MA; ‡Department of Orthopaedic Surgery,
University of Michigan, Ann Arbor, MI; §Department of Orthopaedic
Surgery, Children’s Healthcare of Atlanta, Atlanta, GA; ∥Department
of Orthopaedic Surgery, Rady Children’s Hospital, San Diego, CA;
¶Department of Orthopaedic Surgery, Texas Scottish Rite Hospital for
Children, Dallas, TX; #Department of Orthopedics and Sports Medi-
cine, Seattle Children’s Hospital, Seattle, WA; **Department of Or-
thopaedic Surgery, University of Tennessee-Campbell Clinic, Le
Bonheur Children’s Hospital, Memphis, TN; and ††Department of
Orthopedics, University of Colorado School of Medicine, Aurora, CO.
No external funding was received for any aspect of this work.
The authors declare no conflicts of interest.
Reprints: Daniel J. Hedequist, MD, Department of Orthopaedic Surgery,
Boston Children’s Hospital, 300 Longwood Avenue, Boston, MA
02115. E-mail: daniel.hedequist@childrens.harvard.edu.
Copyright © 2020 Wolters Kluwer Health, Inc. All rights reserved.
DOI: 10.1097/BPO.0000000000001521
ORIGINAL ARTICLE
www.pedorthopaedics.com J Pediatr Orthop Volume 40, Number 5, May/June 2020
and sacral. Independent evaluations have validated interobserver
and intraobserver reliability of this classification.5,6 The
AOSpine thoracolumbar spine injury classification system was
developed from elements of its predecessors: the Magerl
classification system, the Denis classification system, and the
Thoracolumbar Injury Classification System (TLICS).7,8 In addition,
TLICS incorporates a point system designed to provide
treatment guidelines for surgeons. Although TLICS has been
validated as a reliable classification system in the pediatric
population, there are currently no studies directly assessing the
newest AOSpine TL Classification System in children.9–11
The purpose of this study was to conduct a multicenter
study testing the interobserver reliability and intraobserver
reproducibility of the AOSpine Injury Classification System
when applied to the pediatric population.
METHODS
A retrospective institutional review was performed
utilizing an internal trauma database at a single in-
stitution. Approval for this study was obtained from the
institutional review board. Patients under the age of
18 years who had been treated operatively for a thoracolumbar
fracture between 2006 and 2016 were identified.
Inclusion criteria were age younger than 18 years and available
computed tomography (CT) scans and magnetic resonance imaging
(MRI). Given
the nature of the database, all included cases were oper-
ative with available complex imaging. Nonoperative cases
were unavailable.
Imaging records of patients who fulfilled study inclusion
criteria were collected and deidentified. Each patient case included
plain film radiographs, CT, and MRI. CT and MRI
were exported as cine clips, utilizing a native function within the
hospital picture archiving and communication system (Synapse
PACS, Fujifilm Medical Systems USA Inc., Stamford, CT).
These files were uploaded to an online survey interface (Google
Forms, Alphabet Inc., Mountain View, CA) and divided
into 3 forms, each consisting of a set of patient cases to be
reviewed and classified according to the AOSpine thoracolumbar
spine injury classification system. The online form allowed
preset entry options corresponding to the AOSpine
thoracolumbar spine injury classification system.
Radiographic assessment of TL spinal injuries using
the AOSpine thoracolumbar spine injury classification
system (A0 to A4, B1 to B2, or C) was conducted by 9
pediatric orthopaedic surgeons. Each evaluator is a
member of the Pediatric Orthopaedic Society of North
America and based at a level 1 pediatric trauma center in
the United States. Each rater has extensive experience with
pediatric spine trauma patients in addition to elective
practice patients. The 9 evaluators classified injuries into 3
primary categories: A, B, or C. Injury morphology was
classified as an A injury (compression), B injury (dis-
traction), or C injury (translation). For each patient case,
if multiple injuries were present, the most severe injury was
recorded and classified. Type A fractures were graded in
increasing severity as follows: A0 (simple), A1 (compression),
A2 (pincer), A3 (burst involving 1 endplate), and A4 (burst
involving both endplates) (Fig. 1). Type B fractures include
classic bony Chance (B1), failure of the posterior tension band
such as horizontal fracture lines through the posterior elements
or evidence of posterior ligamentous disruption (B2) (Fig. 2),
and hyperextension injuries (B3). Type C fractures/injuries
demonstrate dissociation between cranial and caudal segments
(Fig. 3).
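The grading rules above can be sketched as a small lookup. This is only an illustration: the A0 to A4 order comes from the text, whereas ranking every B subtype above every A subtype and C above all others is an assumption made here, and the names `SEVERITY_ORDER`, `primary_type`, and `most_severe` are hypothetical.

```python
# Ordered from least to most severe. The A-subtype order is stated in the text;
# placing B above A and C above B is an assumption made for this sketch.
SEVERITY_ORDER = ["A0", "A1", "A2", "A3", "A4", "B1", "B2", "B3", "C"]

# Injury morphology attached to each primary category in the text.
PRIMARY_MECHANISM = {"A": "compression", "B": "distraction", "C": "translation"}


def primary_type(subtype: str) -> str:
    """Return the primary category (A, B, or C) for a subtype such as 'A4'."""
    if subtype not in SEVERITY_ORDER:
        raise ValueError(f"unknown AOSpine subtype: {subtype!r}")
    return subtype[0]


def most_severe(subtypes: list[str]) -> str:
    """Pick the single subtype to record when a case shows several injuries."""
    if not subtypes:
        raise ValueError("at least one injury is required")
    return max(subtypes, key=SEVERITY_ORDER.index)


# A case with both an A4 and a B2 injury is recorded under one subtype.
recorded = most_severe(["A4", "B2"])
print(recorded, primary_type(recorded), PRIMARY_MECHANISM[primary_type(recorded)])
```

Under the assumed ordering, a case showing both a burst fracture and a posterior tension band failure is recorded once, under the B subtype.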
Twenty-five patient cases that met inclusion criteria
were divided into 3 forms consisting of 8, 8, and 9 sets of
patient imaging. These were disseminated to each reviewer
in 1-week intervals. Each reviewer completed each of the
three forms. Before initiation of the study, a test form was
sent to familiarize raters with the interface. The raters
were each provided a poster illustration of the classi-
fication system and a video tutorial. Classification scores
for each patient case were automatically recorded and
uploaded to a digital spreadsheet. Patient cases were
randomized and distributed into 3 new sets and redis-
tributed at 1-week intervals. Each case was reviewed a
second time by all 9 evaluators 1 month from the
initial read.
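The distribution scheme described above can be sketched as follows. The study does not report how the randomization was performed, so the seeded shuffle, the case identifiers, and the function name `split_into_forms` are assumptions for illustration only.

```python
import random


def split_into_forms(case_ids, sizes, seed):
    """Shuffle case IDs and deal them into consecutive forms of the given sizes."""
    if sum(sizes) != len(case_ids):
        raise ValueError("form sizes must add up to the number of cases")
    shuffled = list(case_ids)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is repeatable
    forms, start = [], 0
    for size in sizes:
        forms.append(shuffled[start:start + size])
        start += size
    return forms


cases = list(range(1, 26))  # 25 deidentified cases; the IDs are hypothetical
read_1 = split_into_forms(cases, [8, 8, 9], seed=1)
read_2 = split_into_forms(cases, [8, 8, 9], seed=2)  # reshuffled for the second read
print([len(form) for form in read_1])
```

Using a different seed for the second read gives each rater the same 25 cases in a new order and grouping.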
The classification of each patient case was compared
across raters for interobserver reliability. The classification
of each patient case per reviewer was analyzed for intraobserver
reliability. Intraobserver reproducibility was assessed
for the primary classifications using Fleiss kappa,
and subclassification reproducibility was assessed by
Krippendorff alpha ($\alpha_K$) along with the 95% confidence interval
(CI). Fleiss kappa and Krippendorff alpha are considered
FIGURE 1. CT sagittal image demonstrating A4 complete
burst of the L2 vertebra. CT indicates computed tomography.
adjusted measures of agreement as they adjust for the
ratings of multiple raters and for ratings that would occur
simply by chance. For granularity, the exact percent
of agreement was also reported for primary and sub-
classifications. It is important to clarify that the raw
percent of agreement typically overestimates or underestimates
the true agreement because it does not take into account
agreements made by chance alone or the information provided
by multiple raters. Therefore, interpretations of agreement
should be made with respect to adjusted agreement statistics
only (ie, kappa and alpha coefficients). Interpretations for
reliability estimates were based on Landis and Koch12:
0 to 0.2, slight; 0.2 to 0.4, fair; 0.4 to 0.6, moderate; 0.6 to
0.8, substantial; and >0.8, almost perfect agreement.
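The difference between raw and chance-adjusted agreement can be illustrated with a minimal sketch of the standard Fleiss kappa formula. This is not the authors' analysis code; the toy ratings are invented, and treating each band's upper bound as inclusive and labeling negative values "poor" are assumptions not stated in the text.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """ratings: one list per case holding the category chosen by each rater."""
    n_raters = len(ratings[0])
    if any(len(case) != n_raters for case in ratings):
        raise ValueError("every case needs the same number of ratings")
    counts = [Counter(case) for case in ratings]
    total = len(ratings) * n_raters
    # Observed agreement: mean share of agreeing rater pairs per case.
    p_obs = sum(
        (sum(c * c for c in cnt.values()) - n_raters) / (n_raters * (n_raters - 1))
        for cnt in counts
    ) / len(ratings)
    # Chance agreement from the pooled category proportions.
    pooled = Counter()
    for cnt in counts:
        pooled.update(cnt)
    p_exp = sum((c / total) ** 2 for c in pooled.values())
    if p_exp == 1:
        raise ValueError("kappa is undefined when only one category is used")
    return (p_obs - p_exp) / (1 - p_exp)


def exact_agreement(ratings):
    """Share of cases on which every rater chose the same category."""
    return sum(len(set(case)) == 1 for case in ratings) / len(ratings)


def landis_koch(value):
    """Label a coefficient with the Landis and Koch bands quoted in the text."""
    if value < 0:
        return "poor"  # negative band is not quoted in the text (assumption)
    bands = [(0.2, "slight"), (0.4, "fair"), (0.6, "moderate"), (0.8, "substantial")]
    for upper, label in bands:
        if value <= upper:  # inclusive upper bound is an assumption
            return label
    return "almost perfect"


# Invented toy data: 4 cases, 3 raters each.
toy_ratings = [
    ["A", "A", "A"],
    ["B", "B", "B"],
    ["B", "B", "A"],
    ["C", "C", "C"],
]
kappa = fleiss_kappa(toy_ratings)
print(f"kappa = {kappa:.2f} ({landis_koch(kappa)})")
print(f"exact agreement = {exact_agreement(toy_ratings):.0%}")
```

In this toy set the raters agree exactly on 3 of 4 cases, while the chance-adjusted coefficient works out to 35/47, which shows that the two measures answer different questions.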
Interobserver reliability was assessed for the initial reading
across all 9 raters by the Fleiss kappa coefficient ($\kappa_F$)
along with the 95% CI. For A and B type injuries,
subclassification was conducted including A0 to A4 and B1
to B2 subtypes. Interobserver reliability across subclasses
was assessed using Krippendorff alpha ($\alpha_K$) along with a
bootstrapped 95% CI.
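A minimal sketch of Krippendorff alpha for nominal categories with a bootstrapped interval follows. The study does not describe its bootstrap procedure, so the percentile method with resampling of cases, the default of 2000 resamples, and the toy ratings are all assumptions; the function names are hypothetical.

```python
import random
from collections import Counter


def krippendorff_alpha_nominal(ratings):
    """Nominal alpha for complete data: one list of rater labels per case."""
    totals = Counter()
    observed = 0.0
    for case in ratings:
        m = len(case)
        if m < 2:
            continue  # a case with a single rating carries no pairing information
        cnt = Counter(case)
        totals.update(cnt)
        # Ordered pairs of unlike labels within the case, weighted by 1 / (m - 1).
        observed += (m * m - sum(c * c for c in cnt.values())) / (m - 1)
    n = sum(totals.values())
    # Unlike pairs expected if all pooled labels were paired at random.
    expected = (n * n - sum(c * c for c in totals.values())) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one category is used")
    return 1.0 - observed / expected


def bootstrap_ci(ratings, stat, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval obtained by resampling cases with replacement."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        sample = [rng.choice(ratings) for _ in ratings]
        try:
            estimates.append(stat(sample))
        except ValueError:
            continue  # skip degenerate resamples that contain a single category
    estimates.sort()
    lower = estimates[int((1 - level) / 2 * len(estimates))]
    upper = estimates[int((1 + level) / 2 * len(estimates)) - 1]
    return lower, upper


# Invented toy data: 4 cases, 3 raters each.
toy_ratings = [
    ["A", "A", "A"],
    ["B", "B", "B"],
    ["B", "B", "A"],
    ["C", "C", "C"],
]
alpha = krippendorff_alpha_nominal(toy_ratings)
lower, upper = bootstrap_ci(toy_ratings, krippendorff_alpha_nominal, n_boot=500)
print(f"alpha = {alpha:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Resampling whole cases rather than individual ratings keeps each case's set of rater labels together, which is one common way to respect the dependence among ratings of the same patient.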
RESULTS
Twenty-five patients met inclusion criteria. The mean
age at injury was 13.4 years (range, 3.6 to 17.8 y). Demo-
graphics are included in Table 1. Utilizing the AOSpine TL
Spine Injury Classification System: 6 patients had type A
injuries, 15 patients had type B injuries, and 4 patients had C
injuries.
Six out of 9 of the subtypes were characterized by at
least 1 rater on the initial review and 7 of 9 were detected
on the second review (Table 2).
Interobserver Reliability
Adjusted interobserver agreement, examining only primary
classifications (A, B, and C), was 82% ($\kappa_F$ = 0.82; CI,
0.77-0.87), suggesting almost perfect agreement across 9 raters,
with exact, unadjusted interobserver agreement occurring in
56% (14/25) of cases (Table 3). Adjusted subclassification
agreement was 79% ($\alpha_K$ = 0.79; CI, 0.62-0.90), indicating
substantial agreement, with exact agreement occurring in 32%
(8/25) of cases (Table 3).
Intraobserver Reproducibility
Adjusted intraobserver agreement for primary classification
was 81% ($\kappa_F$ = 0.81; CI, 0.71-0.90), indicating
almost perfect agreement, with exact intraobserver
agreement exhibited in 88% (197/225) of ratings.
Intraobserver agreement for each rater ranged from 0.51 to
FIGURE 3. CT sagittal cut demonstrating C type injury with
T3-T4 translation. CT indicates computed tomography.
TABLE 1. Patient, Injury, and Surgical Characteristics (N = 25)
Characteristics                    Frequency (%)
Age (y; mean ± SD)                 13.6 ± 3.61
Mechanism of injury
  Motor vehicle accident           15 (60)
  Fall                             7 (28)
  Sports related                   3 (12)
Procedure type
  Posterior                        25 (100)
Injury classification (AOSpine)
  A                                6 (24)
  B                                14 (56)
  C                                5 (20)
FIGURE 2. MRI STIR sagittal cut demonstrating T12-L1 B2 and
T12 A4 injury with posterior ligamentous complex disruption.
MRI indicates magnetic resonance imaging. STIR indicates
short tau inversion recovery.
1.00. Adjusted intraobserver agreement for subclassifications
was 81% ($\alpha_K$ = 0.81; CI, 0.73-0.86), suggesting almost perfect
agreement, with exact intraobserver agreement for
subclassifications exhibited in 68% (152/225) of cases across all
9 raters.
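The counts and percentages reported in this section can be checked with a few lines of arithmetic. The helper `percent` is hypothetical, and rounding half up to a whole number is an assumption about how the reported values were rounded.

```python
n_cases, n_raters = 25, 9
n_ratings = n_cases * n_raters  # ratings collected in each read


def percent(numerator, denominator):
    """Whole-number percentage, rounded half up (an assumed rounding rule)."""
    return int(100 * numerator / denominator + 0.5)


# (numerator, denominator) pairs taken from the Results text.
reported = {
    "interobserver exact agreement, primary": (14, n_cases),
    "interobserver exact agreement, subclass": (8, n_cases),
    "intraobserver exact agreement, primary": (197, n_ratings),
    "intraobserver exact agreement, subclass": (152, n_ratings),
}
for label, (num, den) in reported.items():
    print(f"{label}: {num}/{den} = {percent(num, den)}%")
```

The interobserver figures are counted per case (25), whereas the intraobserver figures are counted per rating (25 cases by 9 raters).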
DISCUSSION
Thoracolumbar spine trauma classifications have evolved
and expanded significantly from the original studies by Denis
and Magerl. Numerous studies and classification systems have
been created through a series of modifications. TLICS was a
milestone, unifying many of these systems and their principles
through its emphasis on fracture morphology, posterior ligamentous
complex integrity, and neurological status. It was also
pivotal in providing treatment guidelines via a point system,
although it provided definitive recommendations only for
clear cases.
There is currently no uniform classification system in
the pediatric population for fractures of the spine, and no
definitive operative guidelines for surgeons taking care of
children with spinal trauma. Pediatric thoracolumbar
fractures are commonly grouped by morphology into
compression fractures, burst fractures, Chance injuries,
and injuries with translation. Many surgeons use the
principles of the TLICS to guide treatment. To date, there
are a handful of studies validating TLICS in children.9–11
The AOSpine Injury Classification System is the next
step in classification development. Through the work of
surgeons in the AOSpine Classification Group in the AOSpine
Knowledge Forum, the fracture classification has evolved with
the newer AOSpine TL Classification System. Integrating
fracture morphology, posterior ligamentous integrity, and the
neurological status of the patient similar to TLICS, this
classification is more comprehensive than prior classifications
and ranges from simple avulsion fractures of the spine (A0) in
a patient with no neurological injury to severe translational
injuries (Type C) with complete neurological loss.2,5,6 Evaluation
of posterior ligamentous complex (PLC) integrity is accomplished by reviewing CT scans,
which can be supplemented with MRI as well. The greater the
evidence of PLC injury (the posterior tension band), the more
severe the fracture/ligamentous injury is thought to be with
subsequent potential for instability, deformity, and neuro-
logical compromise.
The AOSpine TL spine injury classification system has
previously been found to have good interobserver and
intraobserver reliability and reproducibility in the adult
population.4,5 The results of this study demonstrated high
interobserver reliability ($\kappa_F$ = 0.82; CI, 0.77-0.87) and
intraobserver reproducibility ($\kappa_F$ = 0.81; CI, 0.71-0.90), further
supporting its use in pediatric spine fractures. Spinal stability
and neurological preservation remain the hallmark of treat-
ment. The AOSpine TL spine injury classification system
augments communication to guide treatment.
The AOSpine TL spine injury classification system incorporates
neurological status of patients.4 The focus of this study
was solely on injury morphology. This study was also limited
by decreased representation of certain subtypes, specifically a
lack of cases demonstrating A1, A2, A3, and B3 subtypes. In
addition, there are anatomic differences in our patient pop-
ulation secondary to varying states of skeletal maturity. Future
studies could address this limitation through increased cohort
sizes. However, this should not affect interobserver and intra-
observer reliability with the cases matching inclusion criteria.
CONCLUSIONS
The benefits of a standard classification include
consistent physician communication regarding fracture
type, accurate data classification for research studies, and
ideally consensus treatment recommendations. Rather
than creating a new classification de novo, surgeons can adopt the AOSpine
TL Classification System, which shows considerable promise in
the pediatric population. Common injuries in this population
consist of compression fractures, burst fractures,
Chance fractures, and more severe translation injuries. The
AOSpine TL Classification System applies well to these
morphology patterns. Our results show high interobserver
reliability and intraobserver reproducibility in applying
the AOSpine TL Classification System to the pediatric
population, further strengthening its application to pediatric
spine trauma. Furthermore, the classification system was
readily learnable by a group of pediatric orthopaedists who
had no prior experience using it. The
system was easily learned and applied using
materials available to all surgeons and providers on the
TABLE 2. Distribution of Thoracolumbar Injuries (N = 225) for Each Read
AO Class    Read 1, n (%)    Read 2, n (%)
A0          0 (0)            0 (0)
A1          0 (0)            4 (2)
A2          3 (1)            3 (1)
A3          22 (10)          22 (10)
A4          56 (25)          62 (28)
B1          16 (7)           25 (11)
B2          80 (36)          61 (27)
B3          0 (0)            0 (0)
C           48 (21)          48 (21)
TABLE 3. Interobserver Reliability and Intraobserver Reproducibility Across 9 Raters
                                         Interobserver Reliability    Intraobserver Reproducibility
Primary Classifications (A, B, and C)    $\kappa_F$ (95% CI)          $\kappa_F$ (95% CI)
  Read 1                                 0.82 (0.77-0.87)
  Read 2                                 0.78 (0.74-0.83)
  All reads                              0.80 (0.77-0.84)             0.81 (0.71-0.90)
Subclassifications (A0-A4, B1-B2, C)     $\alpha_K$ (95% CI)          $\alpha_K$ (95% CI)
  Read 1                                 0.79 (0.62-0.90)
  Read 2                                 0.75 (0.56-0.87)
  All reads                              0.77 (0.65-0.86)             0.81 (0.73-0.86)
$\alpha_K$ indicates Krippendorff alpha coefficient; $\kappa_F$, Fleiss kappa coefficient; CI,
confidence interval.
AOSpine website (https://aospine.aofoundation.org/clinical-
library-and-tools/aospine-injury-classification-system). These
resources will be invaluable to providers who are managing
spine trauma patients when communicating to other surgeons
and important to conducting further research in spine trauma.
Further studies are needed to determine the transferability of
the AOSpine classification systems in the other anatomical
regions of the spine.
REFERENCES
1. Sethi MK, Schoenfeld AJ, Bono CM, et al. The evolution of
thoracolumbar injury classification systems. Spine J. 2009;9:780–788.
2. Schroeder GD, Harrop JS, Vaccaro AR. Thoracolumbar Trauma
Classification. Neurosurg Clin N Am. 2017;28:23–29.
3. Patel AA, Vaccaro AR. Thoracolumbar spine trauma classification.
J Am Acad Orthop Surg. 2010;18:63–71.
4. Vaccaro AR, Oner C, Kepler CK, et al. AOSpine Thoracolumbar Spine
Injury Classification System. Spine (Phila Pa 1976). 2013;38:2028–2037.
5. Urrutia J, Zamora T, Yurac R, et al. An independent interobserver
reliability and intraobserver reproducibility evaluation of the new
AOSpine Thoracolumbar Spine Injury Classification System. Spine
(Phila Pa 1976). 2015;40:E54–E58.
6. Schroeder GD, Vaccaro AR, Kepler CK, et al. Establishing the injury
severity of thoracolumbar trauma. Spine (Phila Pa 1976). 2015;40:
E498–E503.
7. Magerl F, Aebi M, Gertzbein SD, et al. A comprehensive classification of
thoracic and lumbar injuries. Eur Spine J. 1994;3:184–201.
8. Vaccaro AR, Lehman RA, Hurlbert RJ, et al. A new classification of
thoracolumbar injuries: the importance of injury morphology, the
integrity of the posterior ligamentous complex, and neurologic status.
Spine (Phila Pa 1976). 2005;30:2325–2333.
9. Savage JW, Moore TA, Arnold PM, et al. The reliability and validity
of the Thoracolumbar Injury Classification System in pediatric spine
trauma. Spine (Phila Pa 1976). 2015;40:E1014–E1018.
10. Sellin JN, Steele WJ, Simpson L, et al. Multicenter retrospective
evaluation of the validity of the Thoracolumbar Injury Classification
and Severity Score system in children. J Neurosurg Pediatr. 2016;
18:164–170.
11. Dawkins RL, Miller JH, Ramadan OI, et al. Thoracolumbar Injury
Classification and Severity Score in children: a reliability study. J
Neurosurg Pediatr. 2018;21:284–291.
12. Landis JR, Koch GG. The measurement of observer agreement for
categorical data. Biometrics. 1977;33:159–174.