Caldicellulosiruptor Core and Pangenomes Reveal Determinants for Noncellulosomal Thermophilic Deconstruction of Plant Biomass

American Society for Microbiology
Journal of Bacteriology
Sara E. Blumer-Schuette (a,d), Richard J. Giannone (b,d), Jeffrey V. Zurawski (a,d), Inci Ozdemir (a,d), Qin Ma (c,d), Yanbin Yin (c,d), Ying Xu (c,d), Irina Kataeva (c,d), Farris L. Poole II (c,d), Michael W. W. Adams (c,d), Scott D. Hamilton-Brehm (b,d), James G. Elkins (b,d), Frank W. Larimer (b), Miriam L. Land (b,d), Loren J. Hauser (b,d), Robert W. Cottingham (b,d), Robert L. Hettich (b,d), and Robert M. Kelly (a,d)
(a) Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, North Carolina, USA; (b) Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA; (c) Department of Biochemistry and Molecular Biology, University of Georgia, Athens, Georgia, USA; and (d) BioEnergy Science Center, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Extremely thermophilic bacteria of the genus Caldicellulosiruptor utilize carbohydrate components of plant cell walls, including
cellulose and hemicellulose, facilitated by a diverse set of glycoside hydrolases (GHs). From a biofuel perspective, this capability
is crucial for deconstruction of plant biomass into fermentable sugars. While all species from the genus grow on xylan and acid-
pretreated switchgrass, growth on crystalline cellulose is variable. The basis for this variability was examined using microbiolog-
ical, genomic, and proteomic analyses of eight globally diverse Caldicellulosiruptor species. The open Caldicellulosiruptor pan-
genome (4,009 open reading frames [ORFs]) encodes 106 GHs, representing 43 GH families, but only 26 GHs from 17 families
are included in the core (noncellulosic) genome (1,543 ORFs). Differentiating the strongly cellulolytic Caldicellulosiruptor spe-
cies from the others is a specific genomic locus that encodes multidomain cellulases from GH families 9 and 48, which are associ-
ated with cellulose-binding modules. This locus also encodes a novel adhesin associated with type IV pili, which was identified in
the exoproteome bound to crystalline cellulose. Taking into account the core genomes, pangenomes, and individual genomes,
the ancestral Caldicellulosiruptor was likely cellulolytic and evolved, in some cases, into species that lost the ability to degrade
crystalline cellulose while maintaining the capacity to hydrolyze amorphous cellulose and hemicellulose.
Interest in cellulosic biofuels (29) has sparked efforts to isolate
microorganisms capable of both hydrolysis and fermentation of
plant biomass, a process referred to as consolidated bioprocessing
(CBP) (49,50). Since plant biomass deconstruction could be ac-
celerated at elevated temperatures, thermophilic microorganisms
have been considered catalysts for CBP (8). Of particular note in
this regard are members of the extremely thermophilic genus Cal-
dicellulosiruptor that inhabit globally diverse, terrestrial hot
springs (12,27,56,57,61,69,80,98) and thermally heated mud
flats (31). Caldicellulosiruptor species are Gram-positive bacteria
and typically associate with plant debris; consequently, all isolates
characterized to date hydrolyze certain complex carbohydrates
characteristic of plant cell walls (8,97). As such, members of the
genus Caldicellulosiruptor are excellent genetic reservoirs of en-
zymes for plant biomass degradation and, pending the develop-
ment of functional genetics systems, are potential metabolic hosts
for CBP (9).
Currently, there are two main paradigms described for micro-
bial degradation of crystalline cellulose: cellulosomal (3) and non-
cellulosomal (48,54). Enzymatically, both systems require the
concerted efforts of cellobiohydrolases, endocellulases, and β-glucosidases
(49). Crystalline cellulose deconstruction via cell membrane-bound
cellulosomes was first described in the thermophile
Clostridium thermocellum and has since been described in other
mesophilic Firmicutes, such as Clostridium cellulolyticum, Acetivibrio
cellulolyticus, and Ruminococcus flavefaciens (3). Analysis
of genome sequence data from biomass-degrading microorgan-
isms has helped to identify noncellulosomal bacteria that also lack
identifiable cellobiohydrolases, such as Cytophaga hutchinsonii
(96) and Fibrobacter succinogenes (77), both of which require close
attachment to cellulose for efficient hydrolysis, and Saccharophagus
degradans (95), which uses processive endocellulases (94), indicating
that there is great diversity in strategies used for crystalline
cellulose hydrolysis. As members of the phylum Firmicutes, Caldicellulosiruptor
species are distinct from the thermophilic, anaerobic
clostridia in that they secrete free and S-layer-bound cellulases
and hemicellulases (9,23,24,43,44,58,60,63,75,84,89,90)
that are not assembled into cellulosomes (85,89). In this respect,
their strategy for crystalline cellulose deconstruction is similar to
that for noncellulosomal biomass-degrading aerobic fungi, such
as Trichoderma reesei (54), the thermophilic fungi Myceliophthora
thermophila and Thielavia terrestris (7), or the thermophilic aer-
obe Thermobifida fusca (48).
The noncellulosomal strategy used by the genus Caldicellulo-
siruptor for plant biomass deconstruction involves novel, multi-
domain, carbohydrate-active enzymes (23,24,58,60,75,84,89).
However, while all Caldicellulosiruptor species hydrolyze hemicel-
lulose, not all can degrade crystalline cellulose. This gives rise to
significant disparity across the genus with respect to the capacity
to deconstruct plant cell walls. To date, only limited information is
available on the genus Caldicellulosiruptor, especially with respect
to the diversity within the genus and the characteristics of individ-
ual species. Given the variability within the genus for cellulose
deconstruction, insights into this important environmental and
biotechnological capability could be obtained by comparative examination
of weakly to strongly cellulolytic Caldicellulosiruptor
species. To this end, here we examine the core genomes and pangenomes
of eight members of this genus, in conjunction with exoproteomics
analysis, to identify common and distinctive determinants
that drive plant biomass deconstruction.
Received 2 March 2012 Accepted 17 May 2012
Published ahead of print 25 May 2012
Address correspondence to Robert M. Kelly, rmkelly@eos.ncsu.edu.
Supplemental material for this article may be found at http://jb.asm.org/.
Copyright © 2012, American Society for Microbiology. All Rights Reserved.
doi:10.1128/JB.00266-12
Journal of Bacteriology, August 2012, Volume 194, Number 15, p. 4015–4028
MATERIALS AND METHODS
Cultivation of Caldicellulosiruptor spp. Seven Caldicellulosiruptor spe-
cies were revived from freeze-dried cultures provided by the German Col-
lection of Microorganisms and Cell Cultures (DSMZ [http://www.dsmz
.de]) in the recommended culture medium, after which they were
transferred to modified DSMZ medium 640 (Trypticase, resazurin, cysteine-HCl,
and FeCl3·6H2O were not added; the reducing agent was 10%
[wt/vol] Na2S·9H2O added to a final concentration of 0.5% in prepared
medium) (9). The eighth strain of the species examined in this study, C.
obsidiansis, was isolated from the Obsidian Pool, Yellowstone National
Park (27). Complex substrates used as carbon sources for growth included
microcrystalline cellulose (Avicel; PH-101; FMC), birchwood xylan
(Sigma), and acid-pretreated switchgrass (Panicum virgatum [70]), all
added to growth medium at 5 g/liter and, in the case of biomass, 5 g/liter
wet weight. Cell density measurements at 24 h were averages from two
biological replicates in 50-ml cultures. Enumerations of cell densities were
conducted under epifluorescence microscopy using acridine orange
(Kodak) as a fluorescent dye (30). Cellulose hydrolysis ability was rated
qualitatively by whether the organism shredded Whatman no. 1 filter
paper during growth in Hungate tubes, as described previously (9).
Those species capable of shredding filter paper were designated cellulo-
lytic. Those that were noted to grow on microcrystalline cellulose (Avicel)
but not shred filter paper were designated weakly cellulolytic.
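The qualitative designation described above amounts to a two-step decision rule. The following minimal Python sketch is illustrative only; the function name and the "unclassified" fallback for organisms that neither shred filter paper nor grow on Avicel are assumptions and do not appear in the original methods.

```python
def classify_cellulolytic(shreds_filter_paper: bool, grows_on_avicel: bool) -> str:
    """Apply the qualitative rating rule described in the text.

    The "unclassified" label is a hypothetical fallback; the source only
    defines the first two categories.
    """
    if shreds_filter_paper:
        # Shredding Whatman no. 1 filter paper -> designated cellulolytic.
        return "cellulolytic"
    if grows_on_avicel:
        # Growth on Avicel without shredding filter paper -> weakly cellulolytic.
        return "weakly cellulolytic"
    return "unclassified"


if __name__ == "__main__":
    print(classify_cellulolytic(True, True))
    print(classify_cellulolytic(False, True))
```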
Genomic DNA isolation and quality control. High-molecular-weight
genomic DNA (gDNA) from five Caldicellulosiruptor species was harvested as
described before (9). Overall, the cultures were grown to early stationary
phase on DSMZ culture medium as recommended by DSMZ (without resazurin),
using DSMZ medium 671 with cellobiose (for C. hydrothermalis, C.
kristjanssonii, C. kronotskyensis, and C. lactoaceticus) and DSMZ medium 144
with glucose (for C. owensensis). Cultures were harvested by centrifugation at
5,000 rpm for 15 min, and gDNA was isolated as previously described (20),
except with an additional step requiring lysozyme (100 mg/ml), and the final
precipitation of gDNA in isopropanol was collected by a flamed glass hook
and then gently washed in 70% (vol/vol) ethanol. Dried gDNA was resuspended
in Tris-EDTA (TE) buffer to roughly 1 µg/µl and checked for quality
on a 1% (wt/vol) agarose gel in 1× Tris-acetate-EDTA (TAE) buffer. Molecular
size standards and the protocol for assessing gDNA quality using agarose
gel electrophoresis were both provided by the DOE Joint Genome
Institute (JGI) (http://my.jgi.doe.gov/general/protocols/20100809-Genomic
-DNA-QC.doc).
Genome sequencing. The finished genome sequences of C. bescii (60,
98), C. obsidiansis (17), and C. saccharolyticus (89) were completed prior
to this project. For C. hydrothermalis, C. kristjanssonii, C. kronotskyensis,
C. lactoaceticus, and C. owensensis, a combination of 454 Titanium (51)
and Illumina (5) technologies was used (10), similar to the sequencing
strategy for C. obsidiansis. Detailed protocols explaining these methods
can be found at http://www.jgi.doe.gov/.
Genome assembly and annotation. For the five genome sequences
that were completed for this project, assembly has been previously de-
scribed (10). Genes were identified using Prodigal (33) as part of the Oak
Ridge National Laboratory genome annotation pipeline, followed by a
round of manual curation using the JGI GenePRIMP pipeline (64). The
predicted open reading frames (ORFs) were translated and used to search
the National Center for Biotechnology Information (NCBI) nr database
(6), UniProt (88), TIGRFAM (26), Pfam (68), PRIAM (14), KEGG (34),
COG (83), and InterPro (100) databases. These data sources were com-
bined to assert a product description for each predicted protein. Noncod-
ing genes and miscellaneous features were predicted using tRNAscan-SE
(46), RNAmmer (39), Rfam (25), TMHMM (36), and SignalP (v3.0) (4).
The C. saccharolyticus annotation was updated using the same pipeline,
except that manual curation was done without GenePRIMP. Further an-
notation of selected proteins included molecular size and isoelectric point
(pI) prediction (19), signal peptide prediction (SignalP v4.0) (67), and
transmembrane (TMHMM) (36) prediction.
Phylogenetic analysis. All three copies of 16S rRNA genes were used
in the construction of a phylogenetic tree. ClustalW (86) was used to align
16S sequences from all sequenced Caldicellulosiruptor species, plus one
copy of a 16S rRNA gene from three distantly related species. Pairwise
distance calculations were done using the MEGA4 software package (82)
with the Tajima-Nei substitution model. These distance calculations were
then used to construct dendrograms based on neighbor joining and as-
sessed with 1,000 bootstraps. Average nucleotide identity was used to
assess the relatedness of species, taking their whole-genome sequences
into consideration. All eight sequenced Caldicellulosiruptor species and
the same three outliers mentioned above were uploaded into the Jspecies
package using the ANIb BLASTn option (71). Average nucleotide identity
(ANI), reported as percent identity, was represented using the cellplot
feature of JMP (v9) to create a heat plot. ANIb percentages can be found in
Table S1 in the supplemental material.
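As a rough illustration of what a BLAST-based ANI value summarizes, the sketch below averages the percent identity of per-fragment best hits that pass two filters. It is not the JSpecies implementation: the function name, the input layout, and the default cutoffs (30% identity, 70% aligned fraction, modeled on commonly cited ANIb criteria) are assumptions and are not taken from this article.

```python
from statistics import mean


def ani_from_fragment_hits(hits, min_identity=30.0, min_aligned_fraction=0.7):
    """Average percent identity over query fragments whose best hit passes both cutoffs.

    hits: iterable of (percent_identity, aligned_fraction) tuples, one per
    query-genome fragment (hypothetical input layout).
    Returns None when no fragment passes the filters.
    """
    kept = [
        identity
        for identity, aligned_fraction in hits
        if identity >= min_identity and aligned_fraction >= min_aligned_fraction
    ]
    if not kept:
        return None
    return mean(kept)


if __name__ == "__main__":
    # Toy values only; these are not data from the study.
    toy_hits = [(98.0, 0.95), (99.0, 0.90), (60.0, 0.20)]
    print(ani_from_fragment_hits(toy_hits))
```

The third toy fragment is dropped because only 20% of its length aligns, so it does not contribute to the average.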
Prediction of orthologous and functional groups of proteins. Using
all eight finished genomes, orthologous groups of proteins were predicted
by OrthoMCL (42). Parameters were selected at a P value of 1E−5, percent
identity cutoff of 0, percent match cutoff of 0, Markov clustering
algorithm (MCL) inflation of 1.5, and maximum weight of 316. OrthoMCL
output (see Data Set S1 in the supplemental material), based on
protein-protein homology, was used to compute the core genome and
pangenome according to Tettelin et al. (85). Top-ranked similarity
searches against genomes in the KEGG database (34) used BLASTp (1).
Functional classification of proteins was determined based on searches
against databases from NCBI (COG) (83), CAZy (13), integrated micro-
bial genomes (IMG) (53), and InterProScan sequence search (100). Pre-
dictions of carbohydrate transporters were done as previously described
(91) and also utilized the Find Functions database of the IMG portal (53).
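Once orthologous groups are assigned, the core genome and pangenome can be counted with simple set operations. The sketch below is a simplified illustration, not the authors' pipeline: the input layout (group identifier mapped to the set of genomes containing a member) is an assumption, and genes without orthologs are assumed to be represented as single-genome groups so that they count toward the pangenome.

```python
def core_and_pan(groups, genomes):
    """Split ortholog groups into core (present in every genome) and pan (all groups).

    groups: dict mapping a group id to the set of genome names that contain
            at least one member of that group (hypothetical layout).
    genomes: iterable of all genome names under comparison.
    """
    all_genomes = set(genomes)
    pan = set(groups)
    # A group is core when every genome is among its members.
    core = {gid for gid, members in groups.items() if all_genomes <= members}
    return core, pan


if __name__ == "__main__":
    # Toy example with three genomes; not data from the study.
    toy_groups = {
        "og1": {"A", "B", "C"},   # shared by all -> core
        "og2": {"A", "B"},        # accessory
        "og3": {"C"},             # unique to one genome
    }
    core, pan = core_and_pan(toy_groups, ["A", "B", "C"])
    print(len(core), len(pan))
```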
Fractionation of substrate-bound proteins, extracellular proteins,
and intracellular proteins. Seven Caldicellulosiruptor species, four cellu-
lolytic and three weakly cellulolytic, were selected for proteomic analysis.
Samples were transferred on Avicel PH-101 three times prior to inoculation
of two independent 500-ml cultures, each in 1,000-ml, 45-mm-diameter
screw-top Pyrex bottles. A starting inoculum of 1 × 10^6 cells/ml
was used for all cultures, and growth proceeded in batch for 24 h. After 24
h, biological repeats were combined for processing. Spent Avicel with
substrate-bound (SB) proteins was isolated by decanting the supernatant
(SN) and whole cells (WC) and washing the SB fraction twice with ice-
cold DSMZ medium 640, following which the medium was decanted and
combined with the SN-plus-WC fraction. Further washing of the SB fraction
was done, as described previously (72), with TBS-Ca-T buffer (25
mM Tris-HCl, pH 7.0, 150 mM NaCl, 1 mM CaCl2, and 0.05% [vol/vol]
Tween 20). Cell-free SN fraction was obtained by centrifugation at 4°C
and 5,000 rpm for 15 min, followed by bottle-top filtration through a
0.22-µm-pore-size filter (Millipore). The resulting WC pellet after centrifugation
was washed once with ice-cold DSMZ medium 640 and collected
by centrifugation as described above.
Proteomic measurements of Avicel-induced protein fractions. Each
fraction for proteomic analysis (WC, SN, and SB) was independently pre-
pared for bottom-up, two-dimensional liquid chromatography-tandem
mass spectrometry (2D-LC-MS/MS) to retain fractional protein localiza-
tion. Proteins in each fraction were first isolated and denatured by one of
the following related methodologies. (i) Cells in the WC fraction were
lysed by a combination of boiling and sonication (Branson sonifier) in
SDS lysis buffer (4% SDS, 100 mM Tris-HCl, 50 mM dithiothreitol
[DTT]). Released proteins (2 mg of crude lysate as measured by bicin-
choninic acid [BCA] assay) were then isolated via 20% trichloroacetic acid
(TCA) and resuspended in 250 µl of 8 M urea as previously described (21).
(ii) SN proteins were concentrated to 1 ml by centrifugal membrane filtration
(Vivaspin 20 PES; 5-kDa cutoff; GE Healthcare), TCA precipitated,
acetone washed, and resuspended in 250 µl of 8 M urea. (iii) Proteins
bound to Avicel (SB) were first stripped from the spent substrate
(10 ml) with 10 ml SDS lysis buffer plus boiling and sonication. Samples
were then centrifuged at 4,500 × g and supernatant collected. Proteins in
this crude SB fraction were then concentrated, precipitated, washed, and
resuspended as described for method ii. Following isolation and denaturation,
proteins obtained from each fraction, now in 250 µl of 8 M urea,
were reduced (dithiothreitol), alkylated (iodoacetamide), digested
(two additions of trypsin), and prepared for 2D-LC-MS/MS analysis as
previously described (21). Peptide concentrations were measured by
BCA assay.
To reveal the protein complement of each fraction, 25, 50, or 100 µg of
peptides (SB, SN, and WC, respectively) was bomb loaded onto a biphasic
MudPIT back column (55,93) packed with 5 cm strong cation exchange
(SCX) resin, followed by 3 cm reversed-phase (RP) resin (Luna and
Aqua resins, respectively; Phenomenex). Loaded peptides were then
washed/desalted offline, placed in line with an in-house pulled nanospray
emitter packed with 15 cm RP resin, and analyzed via MudPIT 2D-LC-
MS/MS analysis as previously described (21). Briefly, for WC analysis, a
full 24-h MudPIT was employed (11 salt cuts at 5, 7.5, 10, 12.5, 15, 17.5,
20, 25, 35, 50, and 100% of 500 mM ammonium acetate followed by a
100-min organic gradient). For both SN and SB peptide fractions, a mini-
MudPIT was utilized (4 salt cuts at 10, 20, 35, and 100% of 500 mM
ammonium acetate followed by a 100-min organic gradient). Peptide
fragmentation data were collected via a hybrid LTQ XL-Orbitrap mass
spectrometer (Thermo Fisher Scientific) operating in data-dependent
mode. Full MS1 scans (2 microscans; 5 MS/MS per MS1) were obtained
using the Orbitrap mass analyzer set to 15K resolution, while MS/MS
scans (2 microscans) were obtained/performed in the LTQ mass spec-
trometer.
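The salt cuts above are given as percentages of a 500 mM ammonium acetate stock. As a small worked example, the sketch below converts each step to an absolute concentration; the variable and function names are illustrative and not part of the published method.

```python
STOCK_MM = 500.0  # ammonium acetate stock concentration, as stated in the text

# Salt cut percentages as listed for the full (WC) and mini (SN, SB) MudPIT runs.
FULL_MUDPIT_PERCENT = [5, 7.5, 10, 12.5, 15, 17.5, 20, 25, 35, 50, 100]
MINI_MUDPIT_PERCENT = [10, 20, 35, 100]


def salt_cut_concentrations(percent_steps, stock_mm=STOCK_MM):
    """Convert percent-of-stock salt cuts to millimolar concentrations."""
    return [percent * stock_mm / 100.0 for percent in percent_steps]


if __name__ == "__main__":
    for label, steps in (("full", FULL_MUDPIT_PERCENT), ("mini", MINI_MUDPIT_PERCENT)):
        print(label, len(steps), salt_cut_concentrations(steps))
```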
Resultant peptide fragmentation data (MS/MS) obtained from
each fraction/organism were scored against their respective annotated
proteomes, which were downloaded from NCBI (Table 1) using the
SEQUEST database-searching algorithm (18). Peptide-sequenced MS/MS
spectra were filtered (XCorr [cross-correlation score]: +1, ≥1.8; +2, ≥2.5;
+3, ≥3.5; DeltCN [normalized difference between first and second match
scores], ≥0.08) and assembled into protein loci by DTASelect (81). Peptide
spectral counts (SpC) resulting from intraspecies, nonunique peptides
were balanced across their shared protein source (21) to prevent overes-
timation of protein abundance that could occur between proteins with
high degrees of homology, i.e., glycoside hydrolases. Once balanced, SpC
for each fraction (SB, SN, and WC) were converted to normalized spectral
abundance factors (NSAF) (101) by applying a fractional SpC shift (0.33)
to all proteins as described in reference 43. Normalized data from each
species and fraction were grouped together based on OrthoMCL to iden-
tify trending by orthologous proteins. Using the NSAF values, enrichment
scores for both SB (SBE = NSAF_SB/[NSAF_WC + NSAF_SN]) and EC
(ECE = [NSAF_SB + NSAF_SN]/NSAF_WC) fractions were calculated. Subcellular
and extracellular partitioning was calculated (50% indicates equal
partitioning) to visualize in Excel how the NSAF was split between frac-
tions.
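The scoring steps in this paragraph can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' code. It assumes that the listed XCorr values are minimum thresholds per peptide charge state, that the fractional shift of 0.33 is added to every protein's spectral count before length normalization, and that the enrichment expressions are simple ratios of summed NSAF values. All function and variable names are hypothetical.

```python
# Assumed minimum XCorr per charge state and minimum DeltCN, as read from the text.
XCORR_MIN = {1: 1.8, 2: 2.5, 3: 3.5}
DELTCN_MIN = 0.08


def passes_filter(charge, xcorr, deltcn):
    """Return True if a peptide-spectrum match meets both score thresholds."""
    threshold = XCORR_MIN.get(charge)
    return threshold is not None and xcorr >= threshold and deltcn >= DELTCN_MIN


def nsaf(spectral_counts, lengths, shift=0.33):
    """Normalized spectral abundance factor for every protein in `lengths`.

    Each protein's (shifted) spectral count is divided by its length, then
    all values are scaled so they sum to 1 within the fraction.
    """
    saf = {
        protein: (spectral_counts.get(protein, 0.0) + shift) / length
        for protein, length in lengths.items()
    }
    total = sum(saf.values())
    return {protein: value / total for protein, value in saf.items()}


def enrichment_scores(nsaf_sb, nsaf_sn, nsaf_wc):
    """Substrate-bound (SBE) and extracellular (ECE) enrichment for one protein."""
    sbe = nsaf_sb / (nsaf_wc + nsaf_sn)
    ece = (nsaf_sb + nsaf_sn) / nsaf_wc
    return sbe, ece


if __name__ == "__main__":
    # Toy numbers only; not data from the study.
    fraction = nsaf({"protA": 10, "protB": 0}, {"protA": 100, "protB": 200})
    print(fraction)
    print(enrichment_scores(0.2, 0.1, 0.1))
```

Because the shift gives every protein a nonzero count, the NSAF values in each fraction stay positive, which keeps the enrichment ratios defined for proteins absent from one fraction.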
RESULTS AND DISCUSSION
General genome characteristics. Eight closed Caldicellulosiruptor
genome sequences were examined to seek out determinants for
the capacity to degrade plant biomass, defined by the ability to
hydrolyze the various polysaccharide components of biomass, in-
cluding crystalline cellulose (Table 1). These species represent
globally diverse, terrestrial isolation sites (North America, Ice-
land, Russia, and New Zealand) (Fig. 1A). Genome sizes for the
Caldicellulosiruptor species range from 2.43 to 2.97 Mb, with an
average genome size of 2.74 Mb and average GC content of
35.5% across the genus (Table 1). Previous work has demon-
strated a range in cellulolytic capacity for this genus of closely
related members (9). No one feature of the Caldicellulosiruptor
genomes appears to correlate with location or phenotype; how-
ever, the two North American strains have the smallest genomes,
both by nucleotide number and ORF count (Table 1). Phyloge-
netic analysis based on the three 16S rRNA loci found in each
genome (Fig. 1B) confirms previous reports that the species are
closely related to each other, with C. saccharolyticus, an isolate
from New Zealand, being the most divergent among this group.
In dendrograms built from the 16S rRNA phylogeny, species
sharing common biogeography form location-specific clades, regardless
of phenotype, such as the isolates from North America,
Iceland, and Russia (Fig. 1A and B). Using members from the
orders Clostridiales, Thermoanaerobacterales, and Dictyoglomales
as outgroups, C. saccharolyticus appears to be the oldest member
of its genus due to greater divergence from the other species, hav-
ing branched off earliest in the Caldicellulosiruptor clade (Fig. 1A).
The ancestral nature of C. saccharolyticus is reinforced by con-
sidering the whole genome using average nucleotide identity
(ANI) (Fig. 1C) (71). Since location-specific clades formed when
we used 16S rRNA sequences, we explored whether or not this
same trend would hold true once entire genomes were considered.
This proved to be the case with the Icelandic species, which are
highly related (98% shared identity), and the North American
species (92% shared identity) (Fig. 1C; also see Table S1 in the
supplemental material). Interestingly, one species isolated from
Russia, C. hydrothermalis, is slightly more related to an Icelandic
species, C. lactoaceticus, when ANI is considered (Fig. 1C; also see
Table S1). Furthermore, when the closed genome sequences are
aligned based on geographical location, areas of macrosynteny are
apparent, again regardless of cellulolytic phenotype (see Fig. S1).
These areas of macrosynteny are not apparent when all eight
genomes are aligned due to genetic rearrangement during evolution
of the genus (data not shown). While 16S phylogeny and ANI
are widely used for taxonomic classification of species, they are not
appropriate metrics to assign phenotypes within the genus Caldicellulosiruptor,
especially with respect to cellulolytic capability.
TABLE 1 General Caldicellulosiruptor genome characteristics
Species              Culture accession no.   Genome accession no.   Genome size (Mb)   No. of protein-coding genes   GC content (%)   Reference
C. bescii            DSM 6725                CP001393               2.93               2,776                         35.2             15
C. hydrothermalis    DSM 18901               CP002219               2.77               2,625                         36.5             10
C. kristjanssonii    DSM 12137               CP002326               2.80               2,648                         36.0             10
C. kronotskyensis    DSM 18902               CP002330               2.84               2,583                         35.0             10
C. lactoaceticus     DSM 9545                CP003001               2.62               2,492                         36.1             10
C. obsidiansis       ATCC BAA2073            CP002164               2.53               2,331                         35.2             17
C. owensensis        DSM 13100               CP002216               2.43               2,264                         35.4             10
C. saccharolyticus   DSM 8903                CP000679               2.97               2,760                         35.2             89
Growth on plant biomass and complex carbohydrates differ-
entiates between weakly and strongly cellulolytic Caldicellulo-
siruptor species. To explore the relationship between genome
content and growth on complex carbohydrates, the eight Caldicel-
lulosiruptor species were cultured on crystalline cellulose (Avicel),
xylan, acid-pretreated switchgrass (SWG), and filter paper (Fig. 2). While all
species grew well on acid-pretreated SWG, which contains hemi-
cellulose, cellulose, and pectin (70), more variability was noted for
growth on Avicel (Fig. 2A). All species also grew well on xylan, as
expected, based on the core genome (Fig. 2). However, growth on
Avicel (Fig. 2A) and filter paper (Fig. 2B) differentiated the
strongly from weakly cellulolytic species across the genus. In general,
C. bescii, C. kronotskyensis, C. saccharolyticus, and C. obsidiansis
grew best on Avicel and filter paper, with C. lactoaceticus
growth being at a somewhat lower level. C. hydrothermalis, C.
kristjanssonii, and C. owensensis, however, grew less on these substrates
and did not break down filter paper to any visible extent
(Fig. 2B). These growth experiments provided a perspective for
comparative genomic analysis with respect to crystalline cellulose
hydrolyzing capability.
FIG 1 Biogeography of sequenced Caldicellulosiruptor species. (A) Global distribution of cellulolytic and weakly cellulolytic species. Squares denote cellulolytic species and circles denote weakly cellulolytic species. Colors shading the shapes indicate common isolation locations. (B) Phylogenetic tree using 16S rRNA sequences from sequenced species plus related outliers. MEGA4 was used to calculate distances and build the phylogenetic tree (82). (C) Phylogenomic heat plot using ANI as a measure of relatedness. Red indicates more closely related species, while gray to blue indicates more distantly related species; the percentages of homology for each pairing of species can be found in Table S1 in the supplemental material. Abbreviated species names are after the assigned locus tags and are the following: Cbes, C. bescii; Calhy, C. hydrothermalis; Calkr, C. kristjanssonii; Calkro, C. kronotskyensis; Calla, C. lactoaceticus; COB47, C. obsidiansis; Calow, C. owensensis; Csac, C. saccharolyticus; Cthe, Clostridium thermocellum; Dtur, Dictyoglomus turgidum; and Teth39, Thermoanaerobacter pseudethanolicus.
Caldicellulosiruptor core and pangenomes. To identify specific
determinants among the Caldicellulosiruptor genomes that
would enable some species and not others to hydrolyze crystalline
cellulose, a baseline view of the genomes is required. The Caldicellulosiruptor
core and pangenomes (see Fig. S1 and Data Set S1 in
the supplemental material), based on these eight sequenced species,
contain 1,580 and 4,009 genes, respectively (42,85). The pangenome
was found to be open, such that the projected number of
orthologous genes discovered with each new species sequenced
reaches an asymptote of 125 genes. This result is not surprising,
given that these species are isolated from dynamic environments,
specifically environments with variable nutrient types for organotrophic
growth (27). Functional characterization of the core Caldicellulosiruptor
genome using COG analysis indicated that while
translation and amino acid transport families are enriched in the
core versus pangenome, carbohydrate metabolism and transport
remain the major features of the genus Caldicellulosiruptor (see
Fig. S2 and Table S2). For the core genome, approximately 120
genes are involved in carbohydrate transport and metabolism ac-
cording to COG classification (see Table S2), which includes the
main glycolysis pathway, 6 ABC transporters, and 30 CAZy-re-
lated proteins (see Fig. S3). This suggests that the core Caldicellu-
losiruptor genome is capable of extracellular xylan, glucan, and
starch hydrolysis, xylan deacetylation, and the import of the re-
sulting oligosaccharides and their catabolism through central me-
tabolism (Fig. 3; also see Fig. S4). Interestingly, all six of the core
ABC transporters are from the CUT1 group (see Table S4) (91),
which forms the basis for Caldicellulosiruptor organotrophic im-
port of oligosaccharides (74,76), which are then further processed
to their respective monosaccharides. Of additional interest is the
colocalization of glycoside hydrolases and carbohydrate ABC
transporters, especially among those included in the Caldicellulo-
siruptor core genome (see Fig. S3). A previous study (91) also
observed this phenomenon in C. saccharolyticus and may be indic-
ative of synergy between centralized carbohydrate-hydrolyzing
and import systems. However, the core genome suggests that not
all Caldicellulosiruptor species are capable of crystalline cellulose
hydrolysis, given that GHs belonging to families known to exhibit
these biocatalytic capabilities are not identifiable in several ge-
nomes.
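The statement earlier in this section that the pangenome is open rests on counting how many previously unseen orthologous groups each additional genome contributes. The sketch below illustrates that counting step only, averaged over every possible order of genome addition. It is a simplified illustration under assumed inputs (a mapping from genome name to its set of ortholog-group identifiers); it does not fit an asymptote and does not reproduce the value reported in the text.

```python
from itertools import permutations
from statistics import mean


def mean_new_groups(genome_groups):
    """Mean number of newly observed ortholog groups at each addition step.

    genome_groups: dict mapping genome name -> set of ortholog-group ids
                   (hypothetical layout).
    Returns a list whose n-th entry is the average count of groups first seen
    when the n-th genome is added, over all addition orders.
    """
    names = list(genome_groups)
    per_step = [[] for _ in names]
    for order in permutations(names):
        seen = set()
        for step, name in enumerate(order):
            # Groups in this genome that no earlier genome in the order contained.
            per_step[step].append(len(genome_groups[name] - seen))
            seen |= genome_groups[name]
    return [mean(counts) for counts in per_step]


if __name__ == "__main__":
    # Toy example: two shared groups plus one unique group per genome.
    toy = {"A": {1, 2, 3}, "B": {1, 2, 4}, "C": {1, 2, 5}}
    print(mean_new_groups(toy))
```

With eight genomes there are 40,320 addition orders, which is still small enough to enumerate exhaustively; larger collections would normally be sampled at random instead.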
FIG 2 Capacity for crystalline cellulose deconstruction and growth of Caldicellulosiruptor species on complex substrates. (A) Cell density (cells/ml) for each species after 24 h of growth on the following: Avicel, microcrystalline cellulose; Xylan, birchwood xylan; and SWG, acid-treated switchgrass. Standard deviations are equal to one-third or less of the cell density. Abbreviations are after the assigned locus numbering system and are the following: C, control; 1, Cbes, C. bescii; 2, Calhy, C. hydrothermalis; 3, Calkr, C. kristjanssonii; 4, Calkro, C. kronotskyensis; 5, Calla, C. lactoaceticus; 6, COB47, C. obsidiansis; 7, Calow, C. owensensis; and 8, Csac, C. saccharolyticus. (B) Microbial deconstruction of Whatman number 1 filter paper during growth. Fibers released from the substrate at the bottom of the Hungate culture tube are indicative of enzymatic activity against crystalline cellulose.
FIG 3 Core carbohydrate-active enzymes and carbohydrate-binding motif-containing proteins from all eight Caldicellulosiruptor species. (A) Core glycoside hydrolases (GH), polysaccharide lyases (PL), carbohydrate esterases (CE), and carbohydrate-binding motifs (CBM). Numbers refer to protein families established and curated by CAZy (http://www.cazy.org) (13). (B) Core glycoside hydrolases for strongly cellulolytic species. Dashed-line parentheses indicate gene truncations in C. lactoaceticus (†) and C. saccharolyticus (‡); solid-line parentheses indicate an additional CBM family 3 domain in C. lactoaceticus.
The convergence of the number of orthologs in the core
genome and the open nature of the Caldicellulosiruptor pangenome
indicates that each species is endowed with a set of specific
genes beyond the core that relate to the types of carbohydrates
present in their environment. Comparisons between the frequencies
of the unique Caldicellulosiruptor KEGG BLASTp hits in the
Caldicellulosiruptor core genome versus pangenome showed an
increase in unique proteins in the pangenome versus the core
genome, with C. bescii possessing the largest number and highest
frequency of unique Caldicellulosiruptor proteins (see Table S3 in
the supplemental material). Analysis of the top-ranked BLASTp
hits from strongly cellulolytic versus weakly cellulolytic species
did not exhibit any trends based on cellulolytic capability. Top-ranked
KEGG BLASTp hits do highlight the major phyla with
homologs to proteins from the genus Caldicellulosiruptor, including
Firmicutes, Dictyoglomi, Thermotogae, Proteobacteria, and
Euryarchaeota. Since the genus Caldicellulosiruptor is classified
under the phylum Firmicutes, identifying the majority of best
BLASTp hits from Firmicutes was expected (47). Members of the
phyla Dictyoglomi, Thermotogae, Proteobacteria, and Euryarchaeota
are often isolated or identified from the same locations as
the genus Caldicellulosiruptor (32,37). As such, Caldicellulosirup-
tor proteins that are direct homologs to proteins from the above-
mentioned phyla are likely the result of historical horizontal gene
transfer (HGT) in their environment (52). Common biogeogra-
phy influencing 16S rRNA and ANI-based phylogenetic analyses
was not necessarily observed in the context of the number of dis-
tinct phyla from KEGG best BLASTp hits, indicative of HGT that
is not otherwise detected by phylogenetic analyses. For example,
the highly related species C. kristjanssonii and C. lactoaceticus
(ANI, 98 to 98.1%; see Table S1) share similar frequencies of best
BLASTp hits from the major related phyla (see Table S3); how-
ever, C. kristjanssonii had BLASTp best hits to a total of 31 phyla,
while C. lactoaceticus had best hits to 22 phyla. Due to the open
nature of the Caldicellulosiruptor pangenome, HGT events are im-
portant for the evolution of Caldicellulosiruptor species capable of
succeeding in their dynamic environments. Increasing the num-
ber of Caldicellulosiruptor genome sequences available would also
further identify unique genes acquired through HGT, a fraction of
which map back to specific aspects of carbohydrate hydrolysis,
transport, and metabolism.
Relationship between ABC carbohydrate transporter inven-
tory and growth substrate range. Since noncore genes appear to
be involved in a species’ ability to hydrolyze crystalline cellulose,
the inventory of carbohydrate transporters was first considered.
Overall, the genus Caldicellulosiruptor has 6 core ATP-binding
cassette (ABC) transporters out of 45 in the pangenome (see Table
S4 in the supplemental material). Substrate preferences for five of
these core transporters have previously been assigned based on
transcriptomic analysis of C. saccharolyticus (91). Only C. hydrothermalis,
C. kronotskyensis, and C. saccharolyticus contain unique
transporters not found in the other sequenced Caldicellulosiruptor
species. As mentioned above, all of the core ABC transporters are
of the CUT1 type, which are typically involved in oligosaccharide
import (74,76), although some of these CUT1 transporters from
C. saccharolyticus will respond to monosaccharides (91). These
transporters appear to import some, but not all, oligosaccharides
that are generated by plant biomass hydrolysis. As there is a wide
variety of CAZy-related genes found in the Caldicellulosiruptor
genomes, there are also particular ABC transporters used by indi-
vidual species to support growth on various types of plant bio-
mass.
A connection between ABC transporter number, diversity, and
substrate range was evident in examining the genomes. C. lacto-
aceticus has the most restricted carbohydrate preferences (9,57)
and also has the fewest carbohydrate-related ABC transporters,
one-third of those found in C. hydrothermalis. This further sup-
ports the point that C. lactoaceticus has evolved as a specialist on
higher-chain plant polysaccharides and cannot use glucose to support
growth due to the lack of a transporter for glucose. The next
closest related species to C. lactoaceticus, C. kristjanssonii, has only
three more transporters than C. lactoaceticus and is capable of
growth on glucose (9,12), strongly implicating one of those three
additional transporters in glucose uptake. Two of these transport-
ers have previously been implicated in glucose import for C. sac-
charolyticus, and this finding further confirms that result (91).
C. hydrothermalis contains the most transporters of any mem-
ber of the genus, with 39 ABC transporters predicted to be in-
volved in carbohydrate transport. On the whole, the GC content
of C. hydrothermalis is higher than that of the rest of the genus
(Table 1), implying that it has obtained genes through HGT. Indeed,
seven ABC transporters from C. hydrothermalis are unique
within the genus and could be the result of HGT. Interestingly, C.
hydrothermalis grows weakly on Avicel (Fig. 2A) and does not
visibly deconstruct filter paper (Fig. 2B), indicating that trans-
porter inventory does not correlate with the ability to hydrolyze
crystalline cellulose. Instead, it appears that C. hydrothermalis has
evolved by either importing diverse types of carbohydrates into
the cell or using multiple transporters to maximize the uptake of
specific carbohydrates.
Overall, no common transporter set could be identified that
was present in the strongly cellulolytic but absent from the weakly
cellulolytic Caldicellulosiruptor species (see Table S4 in the
supplemental material).
This finding seems reasonable, given that all isolated species have
been described as having the ability to grow on cellobiose (12,27,
31,61,69,80,98). Since these bacteria are assumed to live in plant
biomass-degrading communities, even if a species is lacking
strong cellulolytic machinery it would be beneficial to maintain
the ability to import cellulose hydrolysis products. In addition, no
correlation between the number of transporters and cellulolytic
ability was evident (Table 2). However, the diversity of carbohy-
drate transporters in weakly cellulolytic species merits further
consideration for the design of a biocatalyst for CBP. By incorporating
a large number of diverse carbohydrate transporters, flux
through many different catabolic pathways could be maintained,
supported by the fact that the genus does not exhibit carbon
catabolite repression (CCR) (89,91).
TABLE 2 Carbohydrate-related domains and transporter inventory
                      No. of ORFs with(a):
Species               GH   CBM  PL   CE   GT   Total(b)  SigP(c)  CT(d)
C. bescii             52   22   4    7    29   68        23       20
C. hydrothermalis     62   17   1    6    28   74        15       39
C. kristjanssonii     37   15   3    5    31   48        14       15
C. kronotskyensis     77   28   4    9    35   93        32       28
C. lactoaceticus      44   18   4    4    27   54        17       12
C. obsidiansis        47   18   2    5    29   59        16       20
C. owensensis         51   16   4    8    31   67        19       18
C. saccharolyticus    59   17   1    6    30   70        17       25
(a) GH, glycoside hydrolase; CBM, carbohydrate binding module; PL, polysaccharide lyase; CE, carbohydrate esterase; GT, glycosyl transferase. Numbers of carbohydrate-active protein domains were retrieved from the CAZy database at http://www.cazy.org (13).
(b) Indicates the total number of ORFs that contain either glycoside hydrolases, carbohydrate-binding modules, polysaccharide lyases, or carbohydrate esterases.
(c) SigP, number of signal peptides.
(d) CT, number of ATP binding cassette (ABC) carbohydrate transporters.
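The transporter comparisons made in this section can be checked directly against the CT column of Table 2. The following is a minimal sketch, not part of the original analysis; the grouping of species into "weakly cellulolytic" and "other" is an assumption taken from how the text describes each species, and the variable names are illustrative only.

```python
# ABC carbohydrate transporter (CT) counts per species, copied from the
# CT column of Table 2.
CT_COUNTS = {
    "C. bescii": 20,
    "C. hydrothermalis": 39,
    "C. kristjanssonii": 15,
    "C. kronotskyensis": 28,
    "C. lactoaceticus": 12,
    "C. obsidiansis": 20,
    "C. owensensis": 18,
    "C. saccharolyticus": 25,
}

# Assumption for this sketch: species the text describes as weakly
# cellulolytic. Every other species is treated as cellulolytic.
WEAKLY_CELLULOLYTIC = {"C. hydrothermalis", "C. kristjanssonii", "C. owensensis"}


def mean_ct(species):
    """Mean number of carbohydrate ABC transporters over a set of species."""
    names = list(species)
    return sum(CT_COUNTS[name] for name in names) / len(names)


# C. lactoaceticus relative to C. hydrothermalis ("one-third" in the text).
ratio = CT_COUNTS["C. lactoaceticus"] / CT_COUNTS["C. hydrothermalis"]
# Additional transporters in C. kristjanssonii over C. lactoaceticus.
extra = CT_COUNTS["C. kristjanssonii"] - CT_COUNTS["C. lactoaceticus"]
# Group means, to illustrate that transporter number does not track
# cellulolytic ability.
weak_mean = mean_ct(WEAKLY_CELLULOLYTIC)
other_mean = mean_ct(set(CT_COUNTS) - WEAKLY_CELLULOLYTIC)

print(f"lactoaceticus/hydrothermalis ratio: {ratio:.2f}")
print(f"extra transporters in kristjanssonii: {extra}")
print(f"mean CT, weakly cellulolytic: {weak_mean:.1f}; others: {other_mean:.1f}")
```

Under this grouping the weakly cellulolytic species do not carry fewer transporters on average, which is consistent with the statement that transporter number and cellulolytic ability are not correlated.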
Similarities and subtle differences in core metabolism influ-
ence carbohydrate preferences. Since carbohydrate transporter
diversity did not appear to correlate with specific determinants for
cellulolytic ability, an examination of the metabolic capacity
should be considered. However, based on the information re-
ported here and for the previously sequenced Caldicellulosiruptor
genomes (15,89), the core metabolic pathways across the genus
appear to be highly conserved. All species are capable of glycolysis
through the Embden-Meyerhof-Parnas (EMP) pathway, fermen-
tation of xylose through a nonoxidative pentose phosphate path-
way (PPP), uronic acid metabolism, oxidation of acetate-coen-
zyme A (CoA), and reduction of pyruvate through an incomplete
citric acid cycle (TCA) (see Fig. S4 in the supplemental material).
The highly conserved EMP pathway would be responsible for ox-
idation of glucose liberated from cellulose or starch and highlights
the importance of both α-D- and β-D-glucose as energy sources.
Aside from the metabolism of cellulose and pectin, there are
some differences between Caldicellulosiruptor species with respect
to various monosaccharide metabolic pathways involved in hemi-
cellulose metabolism. One subtle difference concentrates on the
xylose isomerase (XI) of C. saccharolyticus, which is a class I XI, in
contrast to the other species, which use a class II XI (28,40). The
significance of this is unknown; however, all Caldicellulosiruptor
species grow well on xylose (9) and the analogous complex poly-
saccharide xylan (Fig. 2A), indicating that both types of XI are able
to catalyze efficient xylose metabolism for the genus Caldicellulo-
siruptor.
Three other alternative pathways that feed into the PPP involve
other aldopentoses: D-ribose, L-arabinose, and D-arabinose. To
metabolize L-arabinose, a component of pectin and arabinoxylan,
a putative L-fucose isomerase (MCL group 1847; see Data Set S1 in
the supplemental material) appears to be used by most species (see
Fig. S4). In contrast, the Icelandic Caldicellulosiruptor species lack
the genes to metabolize any of these aldopentoses, which also ex-
plains their inability to grow on D/L-arabinose and ribose (12,57).
This apparent lack of D/L-arabinose-specific isomerases and ki-
nases would then theoretically reduce their capacity to metabolize
a portion of the hydrolysis products from arabinoxylan.
Another example of upstream carbohydrate conversion path-
ways influencing carbohydrate growth profiles is the metabolism
of deoxy-sugars, such as L-fucose and L-rhamnose. The plant cell
wall component pectin can contain L-rhamnose, and xyloglucans
can also be fucosylated (99), making the catabolism of deoxy-
sugars important for the complete metabolism of all biomass-
related carbohydrates. While some species possess complete path-
ways to metabolize deoxy-sugars, not all species have been
described as being capable of growth on them; for example, C.
bescii was described as being unable to grow on fucose (80). In
addition, other species with incomplete deoxy-sugar pathways
have been described as being capable of growth on rhamnose, with
C. owensensis being one such example (31). This highlights the
overall need for a better understanding of the alternate upstream
carbohydrate conversion pathways.
GH inventory reflects the capacity for crystalline cellulose
hydrolysis. Ultimately, the answer to what makes an organism
weakly or strongly cellulolytic rests to a large extent on its enzy-
matic inventory. As discussed above, the inventory of carbohy-
drate transporters and metabolic pathways only gives clues about
the metabolic capacity of the organism after deconstruction of
plant biomass. Therefore, a comparative analysis of their glycoside
hydrolase (GH) inventory should highlight distinct determinants
for cellulose deconstruction. The pangenome of the genus Caldi-
cellulosiruptor encodes 134 carbohydrate-active enzymes (CAZy)
(13), here classified as GHs, carbohydrate esterases (CEs), poly-
saccharide lyases (PLs), and carbohydrate binding modules
(CBMs); 48 of these contain signal peptides and are predicted to
be extracellular (Table 2). Carbohydrate-active enzyme inventory
of the pangenome constitutes the collective capacity of the genus
to metabolize complex and simple carbohydrates, including vari-
ous types of plant biomass. In a preliminary screen of carbohy-
drate-active enzyme inventory from the genus based on draft
genome sequence data, GH family 48 and CBM family 3 were
implicated as essential elements for crystalline cellulose hydrolysis
by Caldicellulosiruptor species (9). With eight finished genome
sequences, a more complete assessment can be done.
As might be expected of microorganisms capable of plant biomass
degradation, each Caldicellulosiruptor species contains a
significant number of GH and CBM domains in its
genome, from 38 and 26, respectively, for C. kristjanssonii up to
84 and 63, respectively, for C. kronotskyensis (Table 2). These
numbers are higher than those for other thermophilic anaerobes
but are smaller than those for fungi, such as Trichoderma reesei
(200) (15,54). The genome of C. kronotskyensis contains 84 GH
domains that represent 38 different GH families, which is also the
highest diversity of GH domains found in an anaerobic thermo-
phile (13,60). This is about 50% more GH domains than many
other Caldicellulosiruptor species (Table 2). However, the diversity
of GH families does not necessarily map back to cellulolytic capa-
bility, as C. hydrothermalis and C. saccharolyticus possess 60 and 61
families, respectively, and have vastly different plant biomass de-
construction capabilities (Fig. 2B).
Approximately one-fourth of the 121 CAZy-related ORFs are
conserved across all eight sequenced Caldicellulosiruptor genomes
and constitute the core. These 30 ORFs include 26 enzymes con-
taining GH domains, three containing CE domains, and one with
only a single CBM domain (Fig. 3A). Four ORFs from this core are
predicted to be extracellular, including Csac_0678 and its or-
thologs, a bifunctional GH5 domain enzyme (63), a putative
xylanase, a putative pullulanase, and a carbohydrate esterase (Fig.
3A). In theory, these genes represent the minimal set of CAZy-
related genes required for biomass deconstruction by a Caldicel-
lulosiruptor species. While it may be tempting to use this list as an
indication of the minimal set of extracellular enzymes required by
the genus to support a plant biomass-degrading lifestyle, func-
tional homology of non-core enzymes must also be considered.
Indeed, C. kristjanssonii has 11 GH domain-containing enzymes
above the core Caldicellulosiruptor set, the lowest number of total
GH domain-containing enzymes in the genus. Note that the min-
imal set of carbohydrate-active enzymes in the genus does not
equip the microbe for crystalline cellulose hydrolysis, although the
GH5-containing enzyme does allow for random cleavage of amor-
phous cellulose (63). C. lactoaceticus, a species closely related to C.
kristjanssonii, is cellulolytic and possesses only 6 more CAZy-re-
lated ORFs above that of C. kristjanssonii (Table 2). Comparison
to core cellulolytic enzymes will highlight which of these six addi-
tional CAZy-related ORFs are important for cellulose hydrolysis.
It appears that both species isolated from Iceland are minimalists
with respect to gene inventory for carbohydrate hydrolysis.
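The core and pangenome notions used here reduce to set operations over ortholog groups: the core is the intersection across all genomes and the pangenome is the union. The sketch below illustrates this with a made-up toy data set (the species and group names are hypothetical) and then repeats the arithmetic quoted in the text for the CAZy-related core. It is an illustration of the concept, not the pipeline used in the study.

```python
def core_and_pan(genomes):
    """Return (core, pan) for a mapping of genome name -> set of ortholog groups.

    core: groups present in every genome (set intersection)
    pan:  groups present in at least one genome (set union)
    """
    groups = list(genomes.values())
    if not groups:
        raise ValueError("at least one genome is required")
    core = set.intersection(*groups)
    pan = set.union(*groups)
    return core, pan


# Hypothetical toy input; names are placeholders, not real ortholog groups.
toy_genomes = {
    "species_A": {"og1", "og2", "og3", "og5"},
    "species_B": {"og1", "og2", "og4"},
    "species_C": {"og1", "og2", "og3", "og6"},
}
core, pan = core_and_pan(toy_genomes)
print(sorted(core), len(pan))

# Arithmetic from the text: 26 GH-containing, 3 CE-containing, and 1 CBM-only
# ORF make up the CAZy-related core, out of 121 CAZy-related ORFs.
core_cazy_orfs = 26 + 3 + 1
core_cazy_fraction = core_cazy_orfs / 121
print(core_cazy_orfs, round(core_cazy_fraction, 2))
```

Because the pangenome is open, adding another genome to the input can only shrink or preserve the core and grow or preserve the pangenome.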
Seven additional genes conserved among the cellulolytic spe-
cies comprise the core cellulolytic carbohydrate-active enzyme list
(Fig. 3B). This set includes full or partial homologs of enzymes
with GH9 and GH48 domains linked by CBM3 modules, GH74
and GH48 domains linked by CBM3 modules, and GH9 and GH5
domains linked by CBM3 modules (Fig. 3B). Indeed, as a previous
preliminary analysis suggested, those species that are strongly cel-
lulolytic also possess enzymes with GH9 and GH48 catalytic do-
mains and CBM3 modules (9) (see Tables S5 and S6 in the sup-
plemental material). In fact, these enzymes are colocalized in loci
that contain anywhere from four to seven modular multidomain
enzymes, all of which possess CBM3 modules (Fig. 4). One weakly
cellulolytic species, C. kristjanssonii, also has some CBM3-linked
enzymes; however, none also has a GH48 domain, which appears
to be the absolute determinant for crystalline cellulose hydrolysis
in the genus (see Table S5). In the comparison between C. krist-
janssonii and C. lactoaceticus, where six additional ORFs are pres-
ent in the cellulolytic C. lactoaceticus, three are conserved among
cellulolytic species and only two multidomain multifunctional
ORF products are encoded by cellulolytic Caldicellulosiruptor spe-
cies, the GH74:GH48 enzyme and CelA, a GH9:GH48 enzyme
(Fig. 3B). Family 48 GHs are often characterized as cellobiohydro-
lases (2), supporting the theory that this particular family is re-
sponsible for the strong cellulolytic phenotype. Indeed, mutations
in GH48-containing enzymes have disrupted the cellulolytic ability
of Ruminococcus albus 8 (16) and reduced the cellulolytic efficiency
of Clostridium thermocellum (60) and Clostridium cellulolyticum
(66). There are cases, however, where the sole presence of
a GH48 domain is not enough to promote a strong cellulolytic
phenotype, as is the case for cellulosomal Clostridium acetobutyli-
cum (59,73), even though the GH48 enzyme was expressed and
secreted as part of the cellulosome (45). Evidently, even in the case
of the strongly cellulolytic Caldicellulosiruptor species, additional
determinants beyond the presence of GH domains and CBM
modules most likely exist that promote crystalline cellulose hy-
drolysis.
Identification of secreted proteins provides insights into
substrate attachment and hydrolysis. To further probe what de-
terminants exist beyond the cellulolytic GH family containing en-
zymes in the genus Caldicellulosiruptor, Avicel-induced proteins
were identified via bottom-up proteomics. Avicel was used as a
model plant biomass substrate due to the large proportion of cel-
lulose in plant cell walls and previous studies on T. reesei cellulase
systems demonstrating strong affinity of cellulases for Avicel (38,
62). A strong, potentially irreversible interaction between Caldi-
cellulosiruptor proteins and Avicel would be ideal for proteomic
screening to identify substrate-bound proteins, since their affinity
for Avicel would have to withstand washing steps to remove cells.
Previous proteomic screens from members of the genus focused
on the cell-free extracellular and whole-cell fractions of cellulo-
lytic Caldicellulosiruptor species (15,43,44). We previously
reported on differential two-dimensional SDS-PAGE profiles of
cell-free supernatant from cells grown on Avicel in an attempt to
capture protein-level differences of weakly to strongly cellulolytic
Caldicellulosiruptor species (9). To fully capture differential pro-
tein expression between weakly and strongly cellulolytic Caldicel-
lulosiruptor species, an expanded proteomic screen was employed.
Here, we describe the first comprehensive genus-wide screen of
Avicel-induced proteins identified not only from the supernatant (SN)
and whole-cell lysate (WC) but also from the Avicel-bound (SB)
fraction from four selected strongly cellulolytic and three weakly
cellulolytic Caldicellulosiruptor species.
Overall, between 36 and 48% of the total protein-coding sequences
predicted from Caldicellulosiruptor genomes were detected as
peptides from the SB, SN, and WC fractions using mass spectrometry
(see Data Set S2 in the supplemental material). This is lower than
the 54% detection for C. bescii (15) or 65% detection for C. obsidi-
ansis (44); however, previous experiments included two or more
growth substrates analyzed and/or measurements at various
growth stages, whereas this study only included one growth sub-
strate, Avicel. Peptides identified in the SB fraction ranged from 16
to 24% of total protein-coding sequences detected; however, the
numbers could be inflated by the presence of intracellular proteins
released by cells adhered irreversibly to Avicel. Weakly cellulolytic
Caldicellulosiruptor species had the lowest frequency (16.7 and
20.1% for C. owensensis and C. hydrothermalis, respectively) of
proteins detected in the SB fraction. This result is not unexpected:
a weakly cellulolytic species would not be expected to produce many
proteins that interact with cellulose, including the above-mentioned
multidomain modular enzymes with CBM family 3 motifs. However,
another weakly cellulolytic species, C. kristjanssonii, had the largest
frequency of protein-coding sequences detected in the SB fraction,
again potentially from intracellular leakage. In fact, the average
substrate-bound enrichment score (SBE) for C. kristjanssonii is lower
than the average SBE for the entire genus, indicative of intracellular
protein contamination of the SB fraction.
FIG 4 Gene clusters of CBM3-containing glycoside hydrolases. Locus tags
are the following: Cbes, Athe_1867 to Athe_1853; Calkr, Calkr_0017,
Calkr_1847 to Calkr_1849, Calkr_2455, and Calkr_2522; Calkro,
Calkro_0850 to Calkro_0864; Calla, Calla_0015 to Calla_0018, Calla_1251
to Calla_1249, Calla_2311 to Calla_2308, and Calla_2385; COB47,
COB47_1673 to COB47_1662; Csac, Csac_1076 to Csac_1085. CBM3 modules are
denoted by white diamonds, and dashed lines mean that orthologs possess
the CBM3 module; green ovals, GH5; red ovals, GH9; lilac ovals, GH10;
blue ovals, GH44; gray ovals, GH48; purple ovals, GH74; blue rectangles,
polysaccharide lyase; beige arrow, GT39; and brown rectangle, AraC
transcriptional regulator.
Identification of glycoside hydrolases bound to cellulose.
Peptides classified as CAZy-related ORFs were detected at higher
frequencies than the complete proteome, ranging from 54 to 83%
detection (see Data Set S2 in the supplemental material). As ex-
pected, one of the most detected fractions of extracellular peptides
corresponded to proteins encoded by the gene cluster containing
enzymes with CBM3 modules (MCL cluster 4; see Data Set S2).
These GHs were also enriched in the SB fraction more so than in
the SN fraction (weighted percentages of 88, 3, and 9% total NSAF
for SB, SN, and WC, respectively), agreeing with the cellulose-
binding function of CBM family 3 modules (87). One particular
group, orthologs of CelA (GH9-CBM3-CBM3-CBM3-GH48)
(Fig. 3B), was the most abundant CBM3-containing enzyme de-
tected in the SB fraction. Previous studies identifying extracellular
proteins in C. bescii and C. obsidiansis grown on cellulose also
found that CelA is the most abundant CBM3-containing enzyme
produced by cellulolytic Caldicellulosiruptor species (44). Enrich-
ment of cellobiohydrolases bound to Avicel has been noted before;
in competitive binding assays using T. reesei cellulases, including
cellobiohydrolases and endoglucanases, the cellobiohydrolases
bound with a higher affinity to Avicel (38). This observation fur-
ther highlights the association of modular multidomain enzymes
containing both GH48 and CBM3 domains to crystalline cellulose
and emphasizes their important role in its hydrolysis.
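Protein abundances in this section are reported as normalized spectral abundance factors (NSAF). As a reminder of what that quantity measures, the sketch below implements the commonly used definition, in which each protein's spectral count is divided by its length and the result is normalized by the sum over all proteins in the sample. It assumes this standard definition applies; the exact normalization used in the study is given in its methods. The protein names, counts, and lengths are invented for illustration.

```python
def nsaf(spectral_counts, lengths):
    """Normalized spectral abundance factor for each protein in one sample.

    spectral_counts: protein -> number of spectra assigned (SpC)
    lengths:         protein -> protein length in amino acids
    NSAF_k = (SpC_k / L_k) / sum_i (SpC_i / L_i)
    """
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectra were assigned in this sample")
    return {p: value / total for p, value in saf.items()}


# Hypothetical example values (not taken from the study).
toy_counts = {"protA": 120, "protB": 30, "protC": 10}
toy_lengths = {"protA": 600, "protB": 300, "protC": 100}
toy_nsaf = nsaf(toy_counts, toy_lengths)
for protein, value in toy_nsaf.items():
    print(protein, round(value, 3))
```

Dividing by length keeps long multidomain enzymes, such as the CBM3-containing cellulases, from appearing more abundant simply because they yield more peptides.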
One benefit of identifying proteins in the SB fraction is the
discovery of previously overlooked enzymes, such as a modular
multidomain mannanase (GH26) that was enriched on
cellulose (22). This enzyme contains two CBM families, CBM27 and
CBM35, and is encoded in the genomes of C. hydrothermalis,
C. kristjanssonii, C. lactoaceticus, and C. obsidiansis
(MCL group 2116; see Data Set S2 in the supplemental material).
Enrichment of this enzyme in the SB fraction was significantly
higher in two weakly cellulolytic species, C. hydrothermalis and C.
kristjanssonii (NSAF of 1.57 × 10^-2 and 4.82 × 10^-3, respectively),
and significantly lower (almost nonexistent) in the cellulolytic
C. lactoaceticus (NSAF of 2.35 × 10^-4). Furthermore, there
was no detection of this protein in the SN or WC fractions of
strongly cellulolytic C. obsidiansis grown on cellobiose, cellulose,
or switchgrass, as shown in another study (43). At a minimum,
this indicates that there are different regulatory mechanisms for
weakly versus strongly cellulolytic species; those species lacking
CBM3 protein loci are likely compensating with other enzymes. As
mentioned above, previous reports using an orthologous enzyme
from Caldicellulosiruptor sp. Rt8B.4 (22) characterized this en-
zyme as a mannanase, and there has been no further description of
enzyme activity beyond that on gluco- and galactomannans (78).
However, when the carbohydrate-binding specificity of the CBM
motifs was investigated, it was noted that the N terminus of the
protein, comprised of the CBM motifs, demonstrated affinity for
not only mannan but also glucans, such as soluble cellulose and
β-glucan (79). It is not unusual for noncellulolytic enzymes to be
targeted to cellulose to decouple cellulose from surrounding poly-
saccharides, as is the case for some of the multimodular enzymes
containing CBM3 motifs (MCL group 4; see Data Sets S1 and S2 in
the supplemental material).
Noncatalytic proteins bound to cellulose. Other proteins that
have been theorized to be involved in microbe-substrate interac-
tions were also enriched in the substrate-bound fraction (Fig. 5).
The major protein that forms the S layer (MCL group 219; see
Data Set S2 in the supplemental material) was found in the extra-
cellular fractions of all species in significant amounts. In fact, this
protein alone constituted more than 9% of the total spectra collected
across all organisms, with overall fractional partitioning of
35, 53, and 12% (SB, SN, and WC, respectively). However, as
previously observed with 2-dimensional SDS-PAGE analysis (9),
the supernatants of C. saccharolyticus and C. owensensis are
enriched with the S-layer protein (see Data Set S2) relative to the
other Caldicellulosiruptor spp. The recently characterized,
S-layer-located cellulolytic enzyme Csac_0678 (63) was also enriched in
the SB fraction (MCL group 1342; see Data Set S2), as expected.
Interestingly, only the ortholog from C. owensensis was strongly
enriched in the SN fraction, potentially as a result of the truncated
CBM28 motif, highlighting the importance of this particular
CBM family in adherence to noncrystalline portions of
Avicel (11). A role for four other S-layer-associated proteins in
substrate attachment also can be assigned from their observed
binding to Avicel (Fig. 5). Although the majority of these proteins do
not have identifiable carbohydrate-binding modules, they all
strongly partition toward the SB fraction (86% of their total
SpC collected overall).
FIG 5 Extracellular, cell membrane-bound proteins involved in
microbe-cellulose interactions of strongly cellulolytic
Caldicellulosiruptor. Highlighted proteins were detected in the
supernatant or substrate-bound proteome. Proteins found enriched in the
substrate proteome are shaded red, those enriched in the supernatant are
shaded green, and proteins shaded blue indicate enrichment in the cell
lysate. Noted proteins shaded gray were detected in all three protein
fractions and were not determined to be enriched in one fraction over
another. Numbers in parentheses above proteins are nominal labels given
to orthologous families of proteins as determined by the OrthoMCL
program (42). Exact locus tag numbers for each orthologous protein family
are found in Data Set S1 in the supplemental material, and NSAF for each
MCL group are found in Data Set S2.
Another group of proteins potentially involved in substrate
attachment are those assembled into flagella (Fig. 5). Surprisingly,
proteins comprising the flagella were detected primarily in the SN
fraction for strongly cellulolytic species (22, 67, and 11% for SB,
SN, and WC, respectively, based on total NSAF), while for the
weakly cellulolytic species the proteins were enriched in the SB
fraction (54, 37, and 9% for SB, SN, and WC, respectively). Enrichment
of the flagella in the SN fraction of strongly cellulolytic
species indicates that cellulose induces expression of flagellar
genes, although in this case the flagella do not appear to play a
role in cellular adhesion. In contrast, enrichment of flagellum
components in the SB fraction indicates a more important role for
flagella in cellulose adhesion for weakly cellulolytic species. A
two-stage mechanism for cell surface attachment has been proposed
for the proteobacterium Caulobacter crescentus, with the re-
versible primary surface attachment mechanism involving the
flagella, followed by attachment by type IV pili prior to biofilm
formation in the irreversible secondary phase (41). Clearly,
there are differing mechanisms for cellulose attachment even
within the genus Caldicellulosiruptor, and the enrichment of
flagellum-related proteins in the SB fraction from weakly cel-
lulolytic species may be indicative of an extended reversible
attachment phase.
Formation of a cellulolytic biofilm by the strongly cellulolytic
species C. obsidiansis on cellulose surfaces has been shown previ-
ously (92). Since this irreversible secondary stage of cell surface
attachment occurs with a strongly cellulolytic species, we looked at
the abundance of type IV pilus-related proteins to determine if
these structures play a role. Indeed, fewer prepilin subunits were
detected for two of the weakly cellulolytic species than for strongly
cellulolytic species. In addition, the prepilin subunits were
enriched in the SN fraction for all species (5, 93, and 2% for SB, SN,
and WC, respectively). However, almost 7-fold fewer of these pro-
teins were detected for the weakly cellulolytic species (MCL
groups 443, 1652, and 1819; Fig. 5; also see Data Set S2 in the
supplemental material).
Proteins from the type IV pilus genomic region that were en-
riched in the SB fraction (82% of total NSAF for MCL group 1820
and 97% of total NSAF for MCL group 1653) belonged almost
exclusively to the strongly cellulolytic species (Table 3). Annotated
as hypothetical proteins, they have no significant homology to
proteins outside the genus. Orthologs from highly cellulolytic species
(C. bescii, C. kronotskyensis, C. obsidiansis, and C. saccharolyticus)
had identity scores ranging from 81 to 95% (MCL group
1820) and 85 to 99% (MCL group 1653), whereas orthologs from
species isolated in Iceland were highly homologous to each other
(99% identity) yet were much more divergent from the highly
cellulolytic group, with identity scores ranging from 36 to 37%
(MCL group 1820) and 40% (MCL group 1653). Indeed, when predicted
parameters such as molecular size and isoelectric point are compared
within MCL groups 1820 and 1653, orthologs from C. lactoaceticus
and C. kristjanssonii are the smallest proteins, and in the
case of MCL group 1820 they are the most positively charged, with
a predicted pI of more than 8 (Table 3).
TABLE 3 Caldicellulosiruptor adhesins located downstream of type IV pilus gene clusters
MCL group(a) and/or   Protein property                                    Protein abundance(e) in:
gene locus            Length (aa)  Size(b) (kDa)  pI(b)  SigP(c)  TMD(d)  SB        SN        WC
1820
  Athe_1871           642          70.1           5.37   N        Y       2.26E-03  1.07E-04  3.16E-06
  Calkr_0826          634          68.9           8.3    N        Y       5.37E-03  8.54E-04  7.07E-05
  Calkro_0844         642          69.6           5.18   N        Y       4.41E-03  8.77E-06  2.83E-06
  Calla_1507          634          69.0           8.02   N        Y       9.80E-03  2.32E-03  5.45E-04
  COB47_1678          642          69.8           5.13   N        Y       NA(f)     NA        NA
  Csac_1073           642          69.9           5.13   N        Y       4.29E-03  2.49E-03  5.99E-05
1653
  Athe_1870           649          70.3           6.37   N        Y       2.04E-03  1.05E-05  3.13E-06
  Calhy_0908          638          71.0           5.8    Y        Y       ND(g)     ND        ND
  Calkr_0827          622          68.9           5.7    Y        N       ND        ND        ND
  Calkro_0845         649          70.5           7.02   N        Y       6.95E-04  8.67E-06  2.80E-06
  Calla_1506          628          69.5           6.01   N        Y       ND        ND        ND
  COB47_1675          649          70.3           5.72   N        Y       NA        NA        NA
  Csac_1074           649          70.4           5.58   N        Y       1.84E-04  1.75E-05  4.80E-05
No MCL group assigned(a)
  Calow_1589          667          71.7           9.23   Y        N       4.70E-03  1.60E-02  2.07E-04
  Calow_1590          900          100.2          5.12   N        Y       2.93E-04  7.64E-04  7.79E-05
(a) OrthoMCL group numbers for orthologous Caldicellulosiruptor proteins (see Data Set S1 in the supplemental material). No orthologous groups were assigned to the two proteins detected from C. owensensis.
(b) Predictions for molecular size and pI used the ExPASy Compute pI/Mw tool (http://web.expasy.org/compute_pi/) (19).
(c) SigP, signal peptide; predicted using SignalP (http://www.cbs.dtu.dk/services/SignalP/) (67).
(d) TMD, transmembrane domain; predicted using the TMHMM server (http://www.cbs.dtu.dk/services/TMHMM/) (36).
(e) Protein abundance is reported as NSAF for each fraction screened. SB, substrate bound; SN, supernatant; WC, whole-cell lysate.
(f) NA, protein abundance not available.
(g) ND, not detected in protein fractions using proteomics.
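The partitioning of each adhesin between fractions can be read off Table 3 by dividing each fraction's NSAF by the sum over the three fractions. The sketch below does this for the MCL group 1820 entries that have abundance data. It is an illustration only: the function and variable names are hypothetical, the exponents are assumed to be negative (NSAF values cannot exceed 1), and per-protein fractions computed this way need not reproduce the pooled group percentages quoted in the text, which may have been weighted differently.

```python
# NSAF values (SB, SN, WC) for MCL group 1820, transcribed from Table 3.
# Assumption: exponents are negative, e.g. "2.26E03" is read as 2.26e-3.
MCL_1820_NSAF = {
    "Athe_1871": (2.26e-3, 1.07e-4, 3.16e-6),
    "Calkr_0826": (5.37e-3, 8.54e-4, 7.07e-5),
    "Calkro_0844": (4.41e-3, 8.77e-6, 2.83e-6),
    "Calla_1507": (9.80e-3, 2.32e-3, 5.45e-4),
    "Csac_1073": (4.29e-3, 2.49e-3, 5.99e-5),
}


def partition(sb, sn, wc):
    """Return the share of total NSAF found in the SB, SN, and WC fractions."""
    total = sb + sn + wc
    if total == 0:
        raise ValueError("protein was not detected in any fraction")
    return sb / total, sn / total, wc / total


# Share of each protein's NSAF that was found bound to the substrate.
sb_share = {locus: partition(*values)[0] for locus, values in MCL_1820_NSAF.items()}
for locus, share in sb_share.items():
    print(f"{locus}: {share:.1%} of NSAF in the SB fraction")
```

For Athe_1871 and Calkro_0844 the substrate-bound share exceeds 90%, whereas the C. saccharolyticus ortholog is split more evenly between the bound and supernatant fractions.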
Orthologous MCL group 1820 is expressed by all species exam-
ined and was enriched in the SB fraction, in some cases being more
than 90% of total NSAF. Since an ortholog in MCL group 1820 is
also expressed and found enriched in the SB fraction from the
weakly cellulolytic C. kristjanssonii, these proteins do not impart a
strong cellulolytic phenotype. However, the ORF directly down-
stream, represented by orthologous MCL group 1653, was only
detected by proteomic screening in the highly cellulolytic species
examined and was also enriched in the SB fraction (Table 3). The
demonstrated differential expression of this MCL group during
growth on cellulose implicates MCL group 1653 in a Caldicellulo-
siruptor species’ ability to hydrolyze crystalline cellulose. Based on
genomic proximity of the ORFs to the type IV pilus locus and the
enrichment of these proteins in the SB fraction, we propose that
these proteins are novel adhesins that mediate attachment of type
IV pili to cellulose (MCL groups 1820 and 1653; Fig. 5; also see
Data Set S2 in the supplemental material). Gram-positive species
sequenced so far generally have one genomic locus that contains
the cluster of type IV pilus assembly genes, including hypothetical
proteins located adjacent to the locus (65). In the genomic neigh-
borhood of type IV pilus genes, it appears that the adhesins and
the type IV pilus locus also reside directly upstream of the cellulase
gene cluster in strongly cellulolytic species, lending evidence to a
synergistic expression pattern (see Table S7). No orthologs of
these adhesins are found in the genome of C. owensensis, a weakly
cellulolytic species, which instead possesses other adhesin-like
proteins located downstream of the type IV pilus locus (Table 3;
also see Table S7). However, both adhesins from C. owensensis
were enriched in the SN fraction, and the sole adhesin from C.
hydrothermalis was not detected in any of the protein fractions
(Table 3). A potential role for those adhesin-like proteins cannot
be ruled out, and indeed low levels of mRNA for Calhy_0908 were
detected when C. hydrothermalis was grown on cellobiose or
switchgrass (data not shown). In the case of C. owensensis, while
type IV pilus-proximate proteins were not enriched in the SB frac-
tion, these proteins are expressed in response to the detection of
cellooligosaccharides and may mediate attachment to other poly-
saccharides found in biomass, such as xylan, pectin, or mannans.
Determination of the polysaccharide specificity of these putative
adhesins, as well as further characterization of the interplay be-
tween neighboring adhesins, are the subjects of ongoing experi-
ments.
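The abundance comparisons above rest on NSAF values. As a minimal sketch, assuming the commonly used definition of NSAF (a protein's spectral count divided by its length, normalized to the sum of that ratio over all proteins detected in the same sample), the calculation can be illustrated as follows. The function name, protein identifiers, spectral counts, and lengths are hypothetical and are not taken from the data sets in this study.

```python
from typing import Dict, Tuple


def nsaf(proteins: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Compute NSAF for one sample (assumed definition, see lead-in).

    `proteins` maps a protein identifier to (spectral_count, length_in_residues).
    """
    for pid, (count, length) in proteins.items():
        if length <= 0:
            raise ValueError(f"protein {pid!r} has non-positive length")
        if count < 0:
            raise ValueError(f"protein {pid!r} has negative spectral count")

    # Spectral abundance factor: spectral count scaled by protein length.
    saf = {pid: count / length for pid, (count, length) in proteins.items()}
    total = sum(saf.values())
    if total == 0:
        # No spectra observed at all; every protein gets an NSAF of zero.
        return {pid: 0.0 for pid in saf}
    # Normalize so that the NSAF values of one sample sum to 1.
    return {pid: value / total for pid, value in saf.items()}


# Hypothetical substrate-bound fraction with three proteins.
example = {
    "protein_A": (120, 240),  # (spectral count, length)
    "protein_B": (25, 100),
    "protein_C": (0, 310),
}
for pid, value in nsaf(example).items():
    print(f"{pid}: {value:.3f}")
# protein_A: 0.667
# protein_B: 0.333
# protein_C: 0.000
```

Because the values within a sample sum to 1, a statement such as "more than 90% of total NSAF" corresponds to a single protein with an NSAF above 0.9 in that fraction.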
Was the ancestral Caldicellulosiruptor cellulolytic? The
genomic neighborhoods of type IV pilus- and CBM3-containing
enzymes present an interesting case of presumed genomic rear-
rangement of cellulases in a weakly cellulolytic species, C. krist-
janssonii, and the closely related strongly cellulolytic species, C.
lactoaceticus. Since the CBM3-containing enzymes of C. kristjans-
sonii and C. lactoaceticus are found in blocks throughout their
respective genomes instead of a single locus, genomic rearrange-
ment can explain the separation of the type IV locus and CBM3-
containing enzymes. Genomic rearrangement in this locus could
explain the lack of GH48-containing enzymes in C. kristjanssonii
and, hence, weak growth on crystalline cellulose (Fig. 2A). Since
these two species are nearly identical at the genomic level (ANI of
98%; see Table S1 in the supplemental material) yet differ markedly
in their phenotypes on cellulose, the question arises of which
phenotype came first in the Caldicellulosiruptor lineage: strongly
or weakly cellulolytic?
Two clusters of CAZy-related enzymes exist in the pangenome;
one cluster includes primarily glucan-degrading enzymes (GDL)
with CBM3 domains (Fig. 4), and the other contains xylan-de-
grading enzymes (XDL) and xylooligosaccharide transporters
(91). Since Caldicellulosiruptor species from more than one conti-
nental location contain one or both clusters, it is likely that the
ancestral Caldicellulosiruptor species contained both clusters. This
also suggests that the ancestral Caldicellulosiruptor species was
strongly cellulolytic and capable of crystalline cellulose decon-
struction, and that weakly cellulolytic species have lost that ability
through gene deletion events.
Members of the genus, except C. hydrothermalis and C.
owensensis, have at least one homolog contained within the GDL,
which means that C. hydrothermalis and C. owensensis either
branched off from the Caldicellulosiruptor lineage prior to acqui-
sition of those genes by the ancestral species or that they lost the
entire region after speciation. As mentioned before, the type IV
pilus operon is also located directly upstream of the GDL in
strongly cellulolytic species. The separation of these colocated re-
gions, in addition to further genomic rearrangements in the GDL
of Icelandic species, makes it likely that C. hydrothermalis and C.
owensensis lost the GDL after speciation. In addition to the loss of
the GDL, these two species also lost one or both cellulose-associ-
ating adhesins from the type IV pilus loci, indicating that gene loss
occurred further upstream than just the GDL. Furthermore, if the
weakly cellulolytic Caldicellulosiruptor species were the result of a
separate lineage in the genus, one would expect the weakly cellu-
lolytic species to be more genetically similar to one another, which
16S phylogeny and ANI both disprove (Fig. 1; also see Table S1 in
the supplemental material). It is also interesting that many genes
located in the GDL cluster of the strongly cellulolytic Caldicellu-
losiruptor species appear to be the result of various recombination
events after gene duplication of glycoside hydrolase domains with
CBM3 domains (23, 35, 60) (Fig. 4). Microsynteny in the GDL
between C. saccharolyticus and C. kronotskyensis, two geographi-
cally distinct species (Fig. 4), indicates that there has been addi-
tional rearrangement in the GDL of C. bescii after speciation.
Conclusions. Eight whole-genome sequences from the genus
Caldicellulosiruptor, ranging from weakly to strongly cellulolytic
species (Fig. 2A), were assessed for determinants of cellulolytic
capability. While biogeography was determined to play a role in
the level of relatedness between species based on 16S phylogeny
and ANI (Fig. 1), it was not a reliable metric to predict phenotype.
Using detailed comparative analysis of the genomes, carbohydrate
transport and catabolic pathways were indicative of carbohydrate
metabolic capabilities. However, genomic analysis is not enough
to predict cellulolytic capability. This is not to say that there is no
benefit of such an analysis, since metabolic engineering of a CBP
organism will require detailed characterization of the import and
metabolism of carbohydrates.
Further analysis of the CAZy-related gene inventory did reaf-
firm previously predicted determinants for cellulolytic ability,
namely, enzymes possessing GH family 48 domains with CBM
family 3 modules. Indeed, when the GDL for the cellulolytic species
C. lactoaceticus is compared to that of the highly related C.
kristjanssonii, the main differences are the presence of a
GH48-containing enzyme, a GH5-containing enzyme, and an additional
GH9 enzyme in C. lactoaceticus. Since C. kristjanssonii already
possesses a GH9 linked to CBM3 modules and other GH5-containing
enzymes in its genome, it is unlikely that these were the
determinants for a cellulolytic phenotype. Most likely, it is the presence of
a GH48-containing enzyme that makes the difference, since GH
family 48 members are most often characterized as cellobiohydro-
lases (13). Additionally, when species that grow better than C.
lactoaceticus on cellulose are considered (Fig. 2A), the enzyme
CelA, which links a GH9 and GH48 with three CBM3 modules
(Fig. 4B), appears to be the determinant for strong cellulolytic
growth. Lastly, proteomic-based identification of substrate-
bound extracellular proteins revealed additional determinants for
a strong cellulolytic phenotype, including two type IV pilus-asso-
ciated adhesins. As more Caldicellulosiruptor species genomes be-
come available, the insights discussed here can be further evalu-
ated.
ACKNOWLEDGMENTS
This work was supported by the Bioenergy Science Center (BESC), Oak
Ridge National Laboratory, a U.S. Department of Energy Bioenergy Re-
search Center funded by the Office of Biological and Environmental Re-
search in the DOE Office of Science (contract no. DE-PS02-06ER64304
[DOE 4000063512]).
We gratefully acknowledge the efforts of Lynne Goodwin (JGI-LANL)
and Karen Walston Davenport (LANL) on the Caldicellulosiruptor se-
quencing project. We also thank Dhaval Mistry and Dustin Nelson
(NCSU) for their technical assistance in gathering physiological data.
REFERENCES
1. Altschul SF, et al. 1997. Gapped BLAST and PSI-BLAST: a new generation
of protein database search programs. Nucleic Acids Res. 25:3389–3402.
2. Barr BK, Hsieh YL, Ganem B, Wilson DB. 1996. Identification of two
functionally different classes of exocellulases. Biochemistry 35:586–592.
3. Bayer EA, Lamed R, White BA, Flint HJ. 2008. From cellulosomes to
cellulosomics. Chem. Rec. 8:364–377.
4. Bendtsen JD, Nielsen H, von Heijne G, Brunak S. 2004. Improved
prediction of signal peptides: SignalP 3.0. J. Mol. Biol. 340:783–795.
5. Bennett S. 2004. Solexa Ltd. Pharmacogenomics 5:433–438.
6. Benson DA, Karsch-Mizrachi I, Lipman DJ, Ostell J, Sayers EW. 2011.
GenBank. Nucleic Acids Res. 39:D32–D37.
7. Berka RM, et al. 2011. Comparative genomic analysis of the thermo-
philic biomass-degrading fungi Myceliophthora thermophila and Thiela-
via terrestris. Nat. Biotechnol. 29:922–927.
8. Blumer-Schuette SE, Kataeva I, Westpheling J, Adams MW, Kelly RM.
2008. Extremely thermophilic microorganisms for biomass conversion:
status and prospects. Curr. Opin. Biotechnol. 19:210–217.
9. Blumer-Schuette SE, Lewis DL, Kelly RM. 2010. Phylogenetic, microbiological,
and glycoside hydrolase diversities within the extremely thermophilic,
plant biomass-degrading genus Caldicellulosiruptor. Appl. Environ.
Microbiol. 76:8084–8092.
10. Blumer-Schuette SE, et al. 2011. Complete genome sequences for the anaerobic,
extremely thermophilic plant biomass-degrading bacteria Caldicellulosiruptor
hydrothermalis, Caldicellulosiruptor kristjanssonii, Caldicellulosiruptor
kronotskyensis, Caldicellulosiruptor owensensis, and Caldicellulosiruptor
lactoaceticus. J. Bacteriol. 193:1483–1484.
11. Boraston AB, Ghaffari M, Warren RAJ, Kilburn DG. 2002. Identifica-
tion and glucan-binding properties of a new carbohydrate-binding mod-
ule family. Biochem. J. 361:35–40.
12. Bredholt S, Sonne-Hansen J, Nielsen P, Mathrani IM, Ahring BK.
1999. Caldicellulosiruptor kristjanssonii sp. nov., a cellulolytic, extremely
thermophilic, anaerobic bacterium. Int. J. Syst. Bacteriol. 49:991–996.
13. Cantarel BL, et al. 2009. The Carbohydrate-Active enZymes database
(CAZy): an expert resource for glycogenomics. Nucleic Acids Res. 37:
D233–D238.
14. Claudel-Renard C, Chevalet C, Faraut T, Kahn D. 2003. Enzyme-
specific profiles for genome annotation: PRIAM. Nucleic Acids Res. 31:
6633–6639.
15. Dam P, et al. 2011. Insights into plant biomass conversion from the
genome of the anaerobic thermophilic bacterium Caldicellulosiruptor
bescii DSM 6725. Nucleic Acids Res. 39:3240–3254.
16. Devillard E, et al. 2004. Ruminococcus albus 8 mutants defective in
cellulose degradation are deficient in two processive endocellulases,
Cel48A and Cel9B, both of which possess a novel modular architecture. J.
Bacteriol. 186:136–145.
17. Elkins JG, et al. 2010. Complete genome sequence of the cellulolytic
thermophile Caldicellulosiruptor obsidiansis OB47T. J. Bacteriol. 192:6099–6100.
18. Eng JK, McCormack AL, Yates JR. 1994. An approach to correlate
tandem mass spectral data of peptides with amino acid sequences in a
protein database. J. Am. Soc. Mass Spectrom. 5:976–989.
19. Gasteiger E, et al. 2003. ExPASy: The proteomics server for in-depth
protein knowledge and analysis. Nucleic Acids Res. 31:3784–3788.
20. Geslin C, et al. 2003. PAV1, the first virus-like particle isolated from a
hyperthermophilic euryarchaeote, Pyrococcus abyssi. J. Bacteriol. 185:
3888–3894.
21. Giannone RJ, et al. 2011. Proteomic characterization of cellular and
molecular processes that enable the Nanoarchaeum equitans-
Ignicoccus hospitalis relationship. PLoS One 6:e22942. doi:10.1371/
journal.pone.0022942.
22. Gibbs MD, Elinder AU, Reeves RA, Bergquist PL. 1996. Sequencing,
cloning and expression of a beta-1,4-mannanase gene, manA, from the
extremely thermophilic anaerobic bacterium, Caldicellulosiruptor Rt8B.4.
FEMS Microbiol. Lett. 141:37–43.
23. Gibbs MD, et al. 2000. Multidomain and multifunctional glycosyl hy-
drolases from the extreme thermophile Caldicellulosiruptor isolate
Tok7B.1. Curr. Microbiol. 40:333–340.
24. Gibbs MD, Saul DJ, Luthi E, Bergquist PL. 1992. The beta-mannanase
from Caldocellum saccharolyticum is part of a multidomain enzyme.
Appl. Environ. Microbiol. 58:3864–3867.
25. Griffiths-Jones S, Bateman A, Marshall M, Khanna A, Eddy SR. 2003.
Rfam: an RNA family database. Nucleic Acids Res. 31:439–441.
26. Haft DH, Selengut JD, White O. 2003. The TIGRFAMs database of
protein families. Nucleic Acids Res. 31:371–373.
27. Hamilton-Brehm SD, et al. 2010. Caldicellulosiruptor obsidiansis sp.
nov., an anaerobic, extremely thermophilic, cellulolytic bacterium iso-
lated from Obsidian Pool, Yellowstone National Park. Appl. Environ.
Microbiol. 76:1014–1020.
28. Hartley BS, Hanlon N, Jackson RJ, Rangarajan M. 2000. Glucose
isomerase: insights into protein engineering for increased thermostabil-
ity. Biochim. Biophys. Acta 1543:294–335.
29. Himmel ME, et al. 2007. Biomass recalcitrance: engineering plants and
enzymes for biofuels production. Science 315:804–807.
30. Hobbie JE, Daley RJ, Jasper S. 1977. Use of nucleopore filters for
counting bacteria by fluorescence microscopy. Appl. Environ. Micro-
biol. 33:1225–1228.
31. Huang CY, Patel BK, Mah RA, Baresi L. 1998. Caldicellulosiruptor
owensensis sp. nov., an anaerobic, extremely thermophilic, xylanolytic
bacterium. Int. J. Syst. Bacteriol. 48:91–97.
32. Hugenholtz P, Pitulle C, Hershberger KL, Pace NR. 1998. Novel
division level bacterial diversity in a Yellowstone hot spring. J. Bacteriol.
180:366–376.
33. Hyatt D, et al. 2010. Prodigal: prokaryotic gene recognition and trans-
lation initiation site identification. BMC Bioinformatics 11:119.
34. Kanehisa M, Goto S, Sato Y, Furumichi M, Tanabe M. 2012. KEGG for
integration and interpretation of large-scale molecular data sets. Nucleic
Acids Res. 40:D109–D114.
35. Kataeva I, Li X-L, Chen H, Choi S-K, Ljungdahl LG. 1999. Cloning and
sequence analysis of a new cellulase gene encoding CelK, a major cellu-
losome component of Clostridium thermocellum: evidence for gene du-
plication and recombination. J. Bacteriol. 181:5288–5295.
36. Krogh A, Larsson B, von Heijne G, Sonnhammer EL. 2001. Predicting
transmembrane protein topology with a hidden Markov model: applica-
tion to complete genomes. J. Mol. Biol. 305:567–580.
37. Kublanov IV, et al. 2009. Biodiversity of thermophilic prokaryotes with
hydrolytic activities in hot springs of Uzon Caldera, Kamchatka (Russia).
Appl. Environ. Microbiol. 75:286–291.
38. Kyriacou A, Neufeld RJ, MacKenzie CR. 1989. Reversibility and com-
petition in the adsorption of Trichoderma reesei cellulase components.
Biotechnol. Bioeng. 33:631–637.
39. Lagesen K, et al. 2007. RNAmmer: consistent and rapid annotation of
ribosomal RNA genes. Nucleic Acids Res. 35:3100–3108.
40. Lewis D. 2010. Functional genomics analysis of extremely thermophilic
fermentative microorganisms from the archaeal genus Pyrococcus and
bacterial genus Caldicellulosiruptor. North Carolina State University, Ra-
leigh, NC.
41. Li G, et al. 2012. Surface contact stimulates the just-in-time deployment
of bacterial adhesins. Mol. Microbiol. 83:41–51.
42. Li L, Stoeckert CJ, Roos DS. 2003. OrthoMCL: identification of or-
tholog groups for eukaryotic genomes. Genome Res. 13:2178–2189.
43. Lochner A, et al. 2011. Label-free quantitative proteomics for the ex-
tremely thermophilic bacterium Caldicellulosiruptor obsidiansis reveal
distinct abundance patterns upon growth on cellobiose, crystalline cel-
lulose, and switchgrass. J. Proteome Res. 10:5302–5314.
44. Lochner A, et al. 2011. Use of label-free quantitative proteomics to
distinguish the secreted cellulolytic systems of Caldicellulosiruptor bescii
and Caldicellulosiruptor obsidiansis. Appl. Environ. Microbiol. 77:4042–
4054.
45. Lopez-Contreras AM, et al. 2004. Substrate-induced production and
secretion of cellulases by Clostridium acetobutylicum. Appl. Environ. Mi-
crobiol. 70:5238–5243.
46. Lowe TM, Eddy SR. 1997. tRNAscan-SE: a program for improved de-
tection of transfer RNA genes in genomic sequence. Nucleic Acids Res.
25:955–964.
47. Ludwig W, Schleifer K-H, Whitman W. 2009. Revised road map to the
phylum Firmicutes. In De Vos P, et al. (ed), Bergey’s manual of systematic
bacteriology, vol 3. Springer, New York, NY.
48. Lykidis A, et al. 2007. Genome sequence and analysis of the soil cellu-
lolytic Actinomycete Thermobifida fusca YX. J. Bacteriol. 189:2477–2486.
49. Lynd LR, Weimer PJ, van Zyl WH, Pretorius IS. 2002. Microbial
cellulose utilization: fundamentals and biotechnology. Microbiol. Mol.
Biol. Rev. 66:506–577.
50. Lynd LR, Wyman CE, Gerngross TU. 1999. Biocommodity engineer-
ing. Biotechnol. Prog. 15:777–793.
51. Margulies M, et al. 2005. Genome sequencing in microfabricated high-
density picolitre reactors. Nature 437:376–380.
52. Markowitz VM, et al. 2010. The integrated microbial genomes system:
an expanding comparative analysis resource. Nucleic Acids Res. 38:
D382–D390.
53. Markowitz VM, et al. 2012. IMG: the integrated microbial genomes
database and comparative analysis system. Nucleic Acids Res. 40:D115–
D122.
54. Martinez D, et al. 2008. Genome sequencing and analysis of the bio-
mass-degrading fungus Trichoderma reesei (syn. Hypocrea jecorina). Nat.
Biotechnol. 26:553–560.
55. McDonald WH, Ohi R, Miyamoto DT, Mitchison TJ, Yates JR III.
2002. Comparison of three directly coupled HPLC MS/MS strategies for
identification of proteins from complex mixtures: single-dimension LC-
MS/MS, 2-phase MudPIT, and 3-phase MudPIT. Int. J. Mass Spectrom.
219:245–251.
56. Miroshnichenko ML, et al. 2008. Caldicellulosiruptor kronotskyensis sp.
nov. and Caldicellulosiruptor hydrothermalis sp. nov., two extremely ther-
mophilic, cellulolytic, anaerobic bacteria from Kamchatka thermal
springs. Int. J. Syst. Evol. Microbiol. 58:1492–1496.
57. Mladenovska Z, Mathrani IM, Ahring BK. 1995. Isolation and charac-
terization of Caldicellulosiruptor lactoaceticus sp. nov., an extremely ther-
mophilic, cellulolytic, anaerobic bacterium. Arch. Microbiol. 163:223–
230.
58. Morris DD, Gibbs MD, Ford M, Thomas J, Bergquist PL. 1999. Family
10 and 11 xylanase genes from Caldicellulosiruptor sp. strain Rt69B.1.
Extremophiles 3:103–111.
59. Nolling J, et al. 2001. Genome sequence and comparative analysis of the
solvent-producing bacterium Clostridium acetobutylicum. J. Bacteriol.
183:4823–4838.
60. Olson DG, et al. 2010. Deletion of the Cel48S cellulase from Clostridium
thermocellum. Proc. Natl. Acad. Sci. U. S. A. 107:17727–17732.
61. Onyenwoke RU, Lee YJ, Dabrowski S, Ahring BK, Wiegel J. 2006.
Reclassification of Thermoanaerobium acetigenum as Caldicellulosiruptor
acetigenus comb. nov. and emendation of the genus description. Int. J.
Syst. Evol. Microbiol. 56:1391–1395.
62. Otter DE, Munro PA, Scott GK, Geddes R. 1989. Desorption of
Trichoderma reesei cellulase from cellulose by a range of desorbents. Bio-
technol. Bioeng. 34:291–298.
63. Ozdemir I, Blumer-Schuette SE, Kelly RM. 2012. S-layer homology
(SLH) domain proteins Csac_0678 and Csac_2722 implicated in plant
polysaccharide deconstruction by the extremely thermophilic bacterium
Caldicellulosiruptor saccharolyticus. Appl. Environ. Microbiol. 78:768–777.
64. Pati A, et al. 2010. GenePRIMP: a gene prediction improvement pipe-
line for prokaryotic genomes. Nat. Methods 7:455–457.
65. Pelicic V. 2008. Type IV pili: e pluribus unum? Mol. Microbiol. 68:827–
837.
66. Perret S, Maamar H, Belaich J-P, Tardif C. 2004. Use of antisense RNA
to modify the composition of cellulosomes produced by Clostridium cellulolyticum.
Mol. Microbiol. 51:599–607.
67. Petersen TN, Brunak S, von Heijne G, Nielsen H. 2011. SignalP 4.0: discriminating
signal peptides from transmembrane regions. Nat. Methods 8:785–786.
68. Punta M, et al. 2012. The Pfam protein families database. Nucleic Acids
Res. 40:D290–D301.
69. Rainey FA, et al. 1994. Description of Caldicellulosiruptor saccharolyticus
gen. nov., sp. nov: an obligately anaerobic, extremely thermophilic, cel-
lulolytic bacterium. FEMS Microbiol. Lett. 120:263–266.
70. Raman B, et al. 2009. Impact of pretreated switchgrass and biomass
carbohydrates on Clostridium thermocellum ATCC 27405 cellulosome
composition: a quantitative proteomic analysis. PLoS One 4:e5271. doi:
10.1371/journal.pone.0005271.
71. Richter M, Rosselló-Móra R. 2009. Shifting the genomic gold standard
for the prokaryotic species definition. Proc. Natl. Acad. Sci. U. S. A.
106:19126–19131.
72. Rincon MT, et al. 2007. A novel cell surface-anchored cellulose-binding
protein encoded by the sca gene cluster of Ruminococcus flavefaciens. J.
Bacteriol. 189:4774–4783.
73. Sabathe F, Belaich A, Soucaille P. 2002. Characterization of the cellu-
lolytic complex (cellulosome) of Clostridium acetobutylicum. FEMS Mi-
crobiol. Lett. 217:15–22.
74. Saier MH. 2000. A functional-phylogenetic classification system for
transmembrane solute transporters. Microbiol. Mol. Biol. Rev. 64:354–411.
75. Saul DJ, et al. 1990. celB, a gene coding for a bifunctional cellulase from
the extreme thermophile Caldocellum saccharolyticum. Appl. Environ.
Microbiol. 56:3117–3124.
76. Schneider E. 2001. ABC transporters catalyzing carbohydrate uptake.
Res. Microbiol. 152:303–310.
77. Suen G, et al. 2011. The complete genome sequence of Fibrobacter
succinogenes S85 reveals a cellulolytic and metabolic specialist. PLoS One
6:e18814. doi:10.1371/journal.pone.0018814.
78. Sunna A. 2010. Modular organisation and functional analysis of dis-
sected modular beta-mannanase CsMan26 from Caldicellulosiruptor
Rt8B.4. Appl. Microbiol. Biotechnol. 86:189–200.
79. Sunna A, Gibbs MD, Bergquist PL. 2001. Identification of novel beta-
mannan- and beta-glucan-binding modules: evidence for a superfamily
of carbohydrate-binding modules. Biochem. J. 356:791–798.
80. Svetlichnyi VA, Svetlichnaya TP, Chernykh NA, Zavarzin GA. 1990.
Anaerocellum thermophilum gen. nov sp. nov: an extremely thermophilic
cellulolytic eubacterium isolated from hot springs in the Valley of Geysers.
Microbiology 59:598–604.
81. Tabb DL, McDonald WH, Yates JR. 2002. DTASelect and Contrast:
tools for assembling and comparing protein identifications from shot-
gun proteomics. J. Proteome Res. 1:21–26.
82. Tamura K, Dudley J, Nei M, Kumar S. 2007. MEGA4: Molecular
Evolutionary Genetics Analysis (MEGA) software version 4.0. Mol. Biol.
Evol. 24:1596–1599.
83. Tatusov RL, et al. 2003. The COG database: an updated version includes
eukaryotes. BMC Bioinformatics 4:41.
84. Te’o VS, Saul DJ, Bergquist PL. 1995. celA, another gene coding for a
multidomain cellulase from the extreme thermophile Caldocellum sac-
charolyticum. Appl. Microbiol. Biotechnol. 43:291–296.
85. Tettelin H, et al. 2005. Genome analysis of multiple pathogenic isolates
of Streptococcus agalactiae: implications for the microbial “pan-genome.”
Proc. Natl. Acad. Sci. U. S. A. 102:13950–13955.
86. Thompson JD, Higgins DG, Gibson TJ. 1994. CLUSTAL W: improving
the sensitivity of progressive multiple sequence alignment through se-
quence weighting, position-specific gap penalties and weight matrix
choice. Nucleic Acids Res. 22:4673–4680.
87. Tormo J, et al. 1996. Crystal structure of a bacterial family-III cellulose-
binding domain: a general mechanism for attachment to cellulose.
EMBO J. 15:5739–5751.
88. UniProt Consortium. 2010. Ongoing and future developments at the
Universal Protein Resource. Nucleic Acids Res. 39:D214–D219.
89. van de Werken HJ, et al. 2008. Hydrogenomics of the extremely ther-
mophilic bacterium Caldicellulosiruptor saccharolyticus. Appl. Environ.
Microbiol. 74:6720–6729.
90. VanFossen AL, Ozdemir I, Zelin SL, Kelly RM. 2011. Glycoside hydrolase
inventory drives plant polysaccharide deconstruction by the extremely
thermophilic bacterium Caldicellulosiruptor saccharolyticus. Biotechnol.
Bioeng. 108:1559–1569.
91. VanFossen AL, Verhaart MR, Kengen SM, Kelly RM. 2009. Carbohy-
drate utilization patterns for the extremely thermophilic bacterium Cal-
dicellulosiruptor saccharolyticus reveal broad growth substrate prefer-
ences. Appl. Environ. Microbiol. 75:7718–7724.
92. Wang Z-W, Lee S-H, Elkins JG, Morrell-Falvey JL. 2011. Spatial and temporal
dynamics of cellulose degradation and biofilm formation by Caldicellulosiruptor
obsidiansis and Clostridium thermocellum. AMB Express. 1:30.
93. Washburn MP, Wolters D, Yates JR. 2001. Large-scale analysis of the
yeast proteome by multidimensional protein identification technology.
Nat. Biotechnol. 19:242–247.
94. Watson BJ, Zhang H, Longmire AG, Moon YH, Hutcheson SW. 2009.
Processive endoglucanases mediate degradation of cellulose by Sacchar-
ophagus degradans. J. Bacteriol. 191:5697–5705.
95. Weiner RM, et al. 2008. Complete genome sequence of the complex car-
bohydrate-degrading marine bacterium, Saccharophagus degradans strain
2-40T. PLoS Genet. 4:e1000087. doi:10.1371/journal.pgen.1000087.
96. Xie G, et al. 2007. Genome sequence of the cellulolytic gliding bacterium
Cytophaga hutchinsonii. Appl. Environ. Microbiol. 73:3536–3546.
97. Yang SJ, et al. 2009. Efficient degradation of lignocellulosic plant bio-
mass, without pretreatment, by the thermophilic anaerobe Anaerocel-
lum thermophilum DSM 6725. Appl. Environ. Microbiol. 75:4762–
4769.
98. Yang SJ, et al. 2010. Reclassification of ‘Anaerocellum thermophilum’ as
Caldicellulosiruptor bescii strain DSM 6725T sp. nov. Int. J. Syst. Evol.
Microbiol. 60:2011–2015.
99. York WS, van Halbeek H, Darvill AG, Albersheim P. 1990. Struc-
tural analysis of xyloglucan oligosaccharides by 1H-n.m.r. spectros-
copy and fast-atom-bombardment mass spectrometry. Carbohydrate
Res. 200:9–31.
100. Zdobnov EM, Apweiler R. 2001. InterProScan–an integration platform
for the signature-recognition methods in InterPro. Bioinformatics 17:
847–848.
101. Zybailov B, et al. 2006. Statistical analysis of membrane proteome expression
changes in Saccharomyces cerevisiae. J. Proteome Res. 5:2339–2347.
... Their metabolism consists in the fermentation of monosaccharides, disaccharides and polysaccharides as carbon sources, including cellulose as the sole carbon source, which makes them attractive from a bioprospecting standpoint [60]. All species are capable of glycolysis through the Embden-Meyerhof-Parnas pathway, the fermentation of xylose following the non-oxidative pentose phosphate pathway and the reduction of pyruvate with an incomplete TCA cycle [61]. ...
Article
Full-text available
The northwest of Spain has an abundance of non-volcanic hot springs that, until recently, had only been used for thermalism activities. One of such hot springs, Muiño da Veiga, has now been explored using metagenomics to study the microbial community that inhabits these high-temperature circumneutral continental waters. Sequencing of the metagenome allowed the characterization of its composition, diversity, metabolic connections and potential as a source for thermozymes, as well as its ability to assemble MAGs. A diverse microbial community dominated by Bacteria domain members was revealed, particularly from the early-branching Aquificales group. The most abundant genus was Sulfurihydrogenibium, known for its implication in sulfur cycling and for forming mats that enable novel niches. The variety of primary producers with autotrophic pathways (and specifically the sulfur oxidizing pathway) expands the range of available nutrients, and the increase in biomass forms thicker mats, resulting in more available niches and broader microbial diversity. Nonetheless, certain metabolic pathways were attributed to less abundant members of the microbial community, reinforcing the idea that the rare biosphere plays important roles in the network of interactions present in an ecosystem and acts as genetic reservoirs. In addition, three of the assembled MAGs represent novel microbial diversity found in this hot spring. Moreover, the presence of enzymes and microorganisms with possible biotechnological applications was confirmed, including proteases, lipases and cell-wall degrading enzymes, pointing to the potential for the hot spring as a source for thermozymes.
... Several studies have investigated the inventory of GHs across the genus either broadly or with a focus on cellulose deconstruction (9,26,27). Since these studies, another member of the genus, Caldicellulosiruptor diazotrophicus, has been sequenced (28), and two species, C. kristjanssonii and C. lactoaceticus, have been reclassified as subspecies of C. acetigenus despite the lack of a closed genome for the latter (29). Furthermore, there are several Caldicellulosiruptor species that have larger inventories of GHs, but fewer cellulases, than C. bescii (27), which may provide insight into what is required by members of the genus to degrade more heterogeneous xylan effectively. ...
Article
Full-text available
Microbial deconstruction of lignocellulose for the production of biofuels and chemicals requires the hydrolysis of heterogeneous hemicelluloses to access the microcrystalline cellulose portion. This work extends previous in vivo and in vitro efforts to characterize hemicellulose utilization by integrating genomic reconstruction, transcriptomic data, operon structures, and biochemical characteristics of key enzymes to understand the deployment and functionality of hemicellulases by the extreme thermophile Caldicellulosiruptor bescii .
... Six large multidomain glycoside hydrolases (GHs), CelA (Athe_1867), CelB (Athe_1859), CelC (Athe_1857), CelD (Athe_1866), CelE (Athe_1865), and CelF (Athe_1860), present in the glucan degradation locus (GDL) of the C. bescii genome, play a key role in microcrystalline cellulose hydrolysis (17,18). These GDL enzymes are naturally optimized for synergistic cellulose degradation (19). ...
Article
Full-text available
Extremely thermophilic Caldicellulosiruptor species solubilize carbohydrates from lignocellulose through glycoside hydrolases (GHs) that can be extracellular, intracellular, or cell surface layer (S-layer) associated. Caldicellulosiruptor genomes sequenced so far encode at least one surface layer homology domain glycoside hydrolase (SLH-GH), representing six different classes of these enzymes; these can have multiple binding and catalytic domains. Biochemical characterization of a representative from each class was done to determine their biocatalytic features: four SLH-GHs from Caldicellulosiruptor kronotskyensis (Calkro_0111, Calkro_0402, Calkro_0072, and Calkro_2036) and two from Caldicellulosiruptor hydrothermalis (Calhy_1629 and Calhy_2383). Calkro_0111, Calkro_0072, and Calhy_2383 exhibited β-1,3-glucanase activity, Calkro_0402 was active on both β-1,3/1,4-glucan and β-1,4-xylan, Calkro_2036 exhibited activity on both β-1,3/1,4-glucan and β-1,4-glucan, and Calhy_1629 was active only on arabinan. Caldicellulosiruptor bescii, the only species with molecular genetic tools as well as already a strong cellulose degrader, contains only one SLH-GH, Athe_0594, a glucanase that is a homolog of Calkro_2036; the other 5 classes of SLH-GHs are absent in C. bescii. The C. bescii secretome, supplemented with individual enzymes or cocktails of SLH-GHs, increased in vitro sugar release from sugar cane bagasse and poplar. Expression of non-native SLH-GHs in vivo, either associated with the S-layer or as freely secreted enzymes, improved total carbohydrate solubilization of sugar cane bagasse and poplar by up to 45% and 23%, respectively. Most notably, expression of Calkro_0402, a xylanase/glucanase, improved xylose solubilization from poplar and bagasse by over 70% by C. bescii. While Caldicellulosiruptor species are already prolific lignocellulose degraders, they can be further improved by the strategy described here. 
IMPORTANCE Caldicellulosiruptor species hold promise as microorganisms that can solubilize the carbohydrate portion of lignocellulose and subsequently convert fermentable sugars into bio-based chemicals and fuels. Members of the genus have surface layer (S-layer) homology domain-associated glycoside hydrolases (SLH-GHs) that mediate attachment to biomass as well as hydrolysis of carbohydrates. Caldicellulosiruptor bescii, the most studied member of the genus, has only one SLH-GH. Expression of SLH-GHs from other Caldicellulosiruptor species in C. bescii significantly improved degradation of sugar cane bagasse and poplar. This suggests that this extremely thermophilic bacterium can be engineered to further improve its ability to degrade specific plant biomasses by inserting genes encoding SLH-GHs recruited from other Caldicellulosiruptor species.
... The genus Caldicellulosiruptor has been characterized as cellulose-degrading fermentative bacteria, with an optimum growth temperature of approximately 70 °C. The genus Caldicellulosiruptor is an ideal candidate for the bioprocessing of plant biomass [1][2][3]. The isolation and cultivation of Caldicellulosiruptor from geothermal environments in Japan, China, Thailand, New Zealand, Russia, Iceland and North America have been widely reported [4]; to date, nine species have been validly described [5][6][7][8][9][10][11][12]. ...
Article
A novel nitrogen-fixing fermentative bacterium, designated YA01ᵀ, was isolated from Nakabusa hot springs in Japan. The short-rod cells of strain YA01ᵀ were Gram-positive and non-sporulating. Phylogenetic trees of the 16S rRNA gene sequence and concatenated sequences of 40 single-copy ribosomal genes revealed that strain YA01ᵀ belonged to the genus Caldicellulosiruptor and was closely related to Caldicellulosiruptor hydrothermalis 108ᵀ, Caldicellulosiruptor bescii DSM 6725ᵀ and Caldicellulosiruptor kronotskyensis 2002ᵀ. The 16S rRNA gene sequence of strain YA01ᵀ shares less than 98.1 % identity with those of the known Caldicellulosiruptor species. The G+C content of the genomic DNA was 34.8 mol%. Strain YA01ᵀ shares low genome-wide average nucleotide identity (90.31–91.10 %), low average amino acid identity (91.45–92.10 %) and digital DNA–DNA hybridization values below 70 % (41.8–44.2 %) with the three related species of the genus Caldicellulosiruptor. Strain YA01ᵀ grew at 50–78 °C (optimum, 70 °C) and at pH 5.0–9.5 (optimum, pH 6.5). Strain YA01ᵀ mainly produced acetate when consuming D-(+)-glucose as a carbon source. The main cellular fatty acids were iso-C17:0 (35.7 %), C16:0 (33.3 %), DMA 16:0 (6.6 %) and iso-C15:0 (5.9 %). Based on its distinct phylogenetic position, biochemical and physiological characteristics, and the major cellular fatty acids, strain YA01ᵀ is considered to represent a novel species of the genus Caldicellulosiruptor for which the name Caldicellulosiruptor diazotrophicus sp. nov. is proposed (type strain YA01ᵀ = DSM 112098ᵀ = JCM 34253ᵀ).
Article
This work aimed to evaluate the feasibility of biohydrogen production from barley straw and Miscanthus. The primary obstacle in plant biomass decomposition is the recalcitrance of the biomass itself. Plant cell walls consist of cellulose, hemicellulose, and lignin, which make the plant robust to decomposition. However, the hyperthermophilic bacterium Caldicellulosiruptor bescii can efficiently utilize lignocellulosic feedstocks (barley straw and Miscanthus) for energy production, and C. bescii can now be metabolically engineered or isolated to produce more hydrogen and other biochemicals. In the present study, two strains, C. bescii JWCB001 (wild-type) and JWCB018 (ΔpyrFA Δldh ΔcbeI), were tested for their ability to increase hydrogen production from barley straw and Miscanthus. In JWCB018, carbon and electron (carried by NADH) flow was redirected from lactate production to acetate and hydrogen production. JWCB018 produced ~54% and 63% more acetate and hydrogen, respectively, from barley straw than its wild-type counterpart, JWCB001. JWCB018 also produced 25% more hydrogen and 33% more acetate from Miscanthus than JWCB001. These results support the use of engineered C. bescii strains such as JWCB018 as parental strains for obtaining more hydrogen and other biochemicals from various biomass sources.
Article
Caldicellulosiruptor is a genus of thermophilic to hyper-thermophilic microorganisms that express and secrete an arsenal of enzymes degrading lignocellulosic biomasses into fermentable sugars. Because of this distinguished feature, strains of Caldicellulosiruptor have been considered as promising candidates for consolidated bioprocessing. Although a few Caldicellulosiruptor strains with industrially relevant characteristics have been isolated to date, it is apparent that further improvement of the strains is essential for industrial application. The earlier identification of the HaeIII-like restriction-modification system in C. bescii strain DSM 6725 has formed the basis for genetic methods with the aim to improve the strain's lignocellulolytic activity and ethanol production. In this study, a novel SfaNI-like restriction-modification system was identified in Caldicellulosiruptor sp. strain BluCon085, consisting of an endonuclease and two methyltransferases that recognize the reverse-complement sequences 5'-GATGC-3' and 5'-GCATC-3'. Methylation of the adenine in both sequences leads to an asymmetric methylation pattern in the genomic DNA of strain BluCon085. Proteins with high percentage of identity to the endonuclease and two methyltransferases were identified in the genomes of C. saccharolyticus strain DSM 8903, C. naganoensis strain DSM 8991, C. changbaiensis strain DSM 26941 and Caldicellulosiruptor sp. strain F32, suggesting that a similar restriction-modification system may be active also in these strains and respective species. We show that methylation of plasmid and linear DNA by the identified methyltransferases, obtained by heterologous expression in Escherichia coli, is sufficient for successful transformation of Caldicellulosiruptor sp. strain DIB 104C. 
The genetic engineering toolbox developed in this study forms the basis for rational strain improvement of strain BluCon085, a derivative from strain DIB 104C with exceptionally high L-lactic acid production. The toolbox may also work for other species of the genus Caldicellulosiruptor that have so far not been genetically tractable.
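The relationship between the two recognition sequences quoted in the abstract above can be checked with a short sketch. This is only an illustration of what "reverse-complement" and "asymmetric" mean for a recognition site; the helper names `reverse_complement` and `is_palindromic` are hypothetical and are not taken from the cited study. GGCC, the site recognized by HaeIII, is included only as a contrasting palindromic example.

```python
# Complement table for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq: str) -> bool:
    """A site is palindromic if it reads the same 5'->3' on both strands."""
    return seq.upper() == reverse_complement(seq)


top_strand_site = "GATGC"  # sequence given in the abstract
bottom_strand_site = reverse_complement(top_strand_site)

print(bottom_strand_site)               # GCATC
print(is_palindromic(top_strand_site))  # False: the site is asymmetric
print(is_palindromic("GGCC"))           # True: a symmetric site for contrast
```

Because GATGC is not its own reverse complement, the two strands of a single site present different sequences, which is consistent with the abstract's description of two methyltransferases and an asymmetric methylation pattern.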
Article
Fermentative nitrogen-fixing bacteria have not yet been examined in detail in thermal environments. In the present study, we isolated the thermophilic fermentative bacterium, strain YA01 from a hot spring. This strain grew at temperatures up to 78°C. A phylogenetic analysis based on its 16S rRNA gene sequence indicated that strain YA01 belonged to the genus Caldicellulosiruptor, which are fermentative bacteria in the phylum Firmicutes, with 97.7–98.0% sequence identity to its closest relatives. Strain YA01 clearly exhibited N2-dependent growth at 70°C. We also confirmed N2-dependent growth in the relatives of strain YA01, Caldicellulosiruptor hydrothermalis 108 and Caldicellulosiruptor kronotskyensis 2002. The nitrogenase activities of these three strains were examined using the acetylene reduction assay. Similar activities were detected for all tested strains, and were slightly suppressed by the addition of ammonium. A genome analysis revealed that strain YA01, as well as other Caldicellulosiruptor, possessed a gene set for nitrogen fixation, but lacked the nifN gene, which encodes a nitrogenase iron-molybdenum cofactor biosynthesis protein that is commonly detected in nitrogen-fixing bacteria. The amino acid sequences of nitrogenase encoded by nifH, nifD, and nifK shared 92–98% similarity in Caldicellulosiruptor. A phylogenetic tree of concatenated NifHDK sequences showed that NifHDK of Caldicellulosiruptor was in the deepest clade. To the best of our knowledge, this is the first study to demonstrate the nitrogen-fixing ability of fermentative bacteria at 70°C. Caldicellulosiruptor may have retained an ancient nitrogen-fixing enzyme system.
Article
To develop functional metabolic engineering platforms for nonmodel microorganisms, a comprehensive understanding of the physiological and metabolic characteristics is critical. Caldicellulosiruptor bescii and other species in this genus have untapped potential for conversion of unpretreated plant biomass into industrial fuels and chemicals. The highly interactive and complex machinery used by C. bescii to acquire and process complex carbohydrates contained in lignocellulose was elucidated here to complement related efforts to develop a metabolic engineering platform with this bacterium.
Article
The potential to convert renewable plant biomasses into fuels and chemicals by microbial processes presents an attractive, less environmentally intense alternative to conventional routes based on fossil fuels. This would best be done with microbes that natively deconstruct lignocellulose and concomitantly form industrially relevant products, but these two physiological and metabolic features are rarely observed simultaneously in nature. Genetic modification of both plant feedstocks and microbes can be used to increase lignocellulose deconstruction capability and generate industrially relevant products. Separate efforts on plants and microbes are ongoing, but these studies lack a focus on optimal, complementary combinations of these disparate biological systems to obtain a convergent technology. Improving genetic tools for plants has given rise to the generation of low‐lignin lines that are more readily solubilized by microorganisms. Most focus on the microbiological front has involved thermophilic bacteria from the genera Caldicellulosiruptor and Clostridium, given their capacity to degrade lignocellulose and to form bio‐products through metabolic engineering strategies enabled by ever‐improving molecular genetics tools. Bioengineering plant properties to better fit the deconstruction capabilities of candidate consolidated bioprocessing microorganisms has potential to achieve the efficient lignocellulose deconstruction needed for industrial relevance.
Article
Economic production of lignocellulose degrading enzymes for biofuel industries is of considerable interest to the biotechnology community. While these enzymes are widely distributed in fungi, their industrial production from other sources, particularly by thermophilic anaerobic bacteria (growth Topt ≥ 60 °C), is an emerging field. Thermophilic anaerobic bacteria produce a large number of lignocellulolytic enzymes having unique structural features and employ different schemes for biomass degradation, which can be classified into four systems, namely the ‘free enzyme system’, ‘cell anchored enzymes’, ‘complex cellulosome system’, and ‘multifunctional multimodular enzyme system’. Such enzymes exhibit high specific activity and have a natural ability to withstand harsh bioprocessing conditions. However, achieving a higher production of these thermostable enzymes at current bioprocessing targets is challenging. In this review, the research opportunities for these distinct enzyme systems in the biofuel industry and the associated technological challenges are discussed. The current status of research findings is highlighted along with a detailed description of the categorization of the different enzyme production schemes. It is anticipated that high temperature-based bioprocessing will become an integral part of sustainable bioenergy production in the near future.
Article
The crystal structure of a family‐III cellulose‐binding domain (CBD) from the cellulosomal scaffoldin subunit of Clostridium thermocellum has been determined at 1.75 Å resolution. The protein forms a nine‐stranded beta sandwich with a jelly roll topology and binds a calcium ion. Conserved, surface‐exposed residues map into two defined surfaces located on opposite sides of the molecule. One of these faces is dominated by a planar linear strip of aromatic and polar residues which are proposed to interact with crystalline cellulose. The other conserved residues are contained in a shallow groove, the function of which is currently unknown, and which has not been observed previously in other families of CBDs. On the basis of modeling studies combined with comparisons of recently determined NMR structures for other CBDs, a general model for the binding of CBDs to cellulose is presented. Although the proposed binding of the CBD to cellulose is essentially a surface interaction, specific types and combinations of amino acids appear to interact selectively with glucose moieties positioned on three adjacent chains of the cellulose surface. The major interaction is characterized by the planar strip of aromatic residues, which align along one of the chains. In addition, polar amino acid residues are proposed to anchor the CBD molecule to two other adjacent chains of crystalline cellulose.
Article
We describe a program, tRNAscan-SE, which identifies 99-100% of transfer RNA genes in DNA sequence while giving less than one false positive per 15 gigabases. Two previously described tRNA detection programs are used as fast, first-pass prefilters to identify candidate tRNAs, which are then analyzed by a highly selective tRNA covariance model. This work represents a practical application of RNA covariance models, which are general, probabilistic secondary structure profiles based on stochastic context-free grammars. tRNAscan-SE searches at approximately 30 000 bp/s. Additional extensions to tRNAscan-SE detect unusual tRNA homologues such as selenocysteine tRNAs, tRNA-derived repetitive elements and tRNA pseudogenes.
Article
The primary mission of Universal Protein Resource (UniProt) is to support biological research by maintaining a stable, comprehensive, fully classified, richly and accurately annotated protein sequence knowledgebase, with extensive cross-references and querying interfaces freely accessible to the scientific community. UniProt is produced by the UniProt Consortium which consists of groups from the European Bioinformatics Institute (EBI), the Swiss Institute of Bioinformatics (SIB) and the Protein Information Resource (PIR). UniProt is comprised of four major components, each optimized for different uses: the UniProt Archive, the UniProt Knowledgebase, the UniProt Reference Clusters and the UniProt Metagenomic and Environmental Sequence Database. UniProt is updated and distributed every 4 weeks and can be accessed online for searches or download at http://www.uniprot.org.
Article
Co-utilization of hexoses and pentoses derived from lignocellulose is an attractive trait in microorganisms considered for consolidated biomass processing to biofuels. This issue was examined for the H2-producing, extremely thermophilic bacterium Caldicellulosiruptor saccharolyticus growing on individual monosaccharides (arabinose, fructose, galactose, glucose, mannose and xylose), mixtures of these sugars, as well as on xylan and xyloglucooligosaccharides. C. saccharolyticus grew at approximately the same rate (td ≈ 95 min) and to the same final cell density (1–3 × 10⁸ cells/ml) on all sugars and sugar mixtures tested. In the monosaccharide mixture, while simultaneous consumption of all monosaccharides was observed, not all were utilized to the same extent (fructose > xylose/arabinose > mannose/glucose/galactose). Transcriptome contrasts for monosaccharide growth revealed minimal changes in some cases (e.g., 31 ORFs changed ≥2-fold for glucose vs. galactose) while substantial changes occurred for cases involving mannose (e.g., 363 ORFs changed ≥2-fold for glucose vs. mannose). Evidence for catabolite repression was noted neither for growth on multi-sugar mixtures nor in the corresponding transcriptomes. Based on the whole-genome transcriptional response analysis and comparative genomics, carbohydrate specificities for transport systems could be proposed for most of the 24 putative carbohydrate ATP-binding cassette (ABC) transporters and single phosphotransferase system (PTS) identified in C. saccharolyticus. While most transporter genes responded to individual monosaccharides and polysaccharides, Csac_0692-0694 was up-regulated only in the monosaccharide mixture. The results here affirm the broad growth substrate preferences of C. saccharolyticus on carbohydrates representative of lignocellulosic biomass and suggest that this bacterium holds promise for biofuels applications.