
Quantitative Comparisons of 16S rRNA Gene Sequence Libraries from Environmental Samples

Applied and Environmental Microbiology

September 2001
67(9):4374-6

DOI:10.1128/AEM.67.9.4374-4376.2001

Source
PubMed

Authors:

David Singleton

Duke University

Stephen Rathbun

University of Georgia

William B Whitman

University of Georgia

To determine the significance of differences between clonal libraries of environmental rRNA gene sequences, differences between homologous coverage curves, CX(D), and heterologous coverage curves, CXY(D), were calculated by a Cramér-von Mises-type statistic and compared by a Monte Carlo test procedure. This method successfully distinguished rRNA gene sequence libraries from soil and bioreactors and correctly failed to find differences between libraries of the same composition.


APPLIED AND ENVIRONMENTAL MICROBIOLOGY, Sept. 2001, p. 4374–4376, Vol. 67, No. 9
0099-2240/01/$04.00+0 DOI: 10.1128/AEM.67.9.4374–4376.2001

Quantitative Comparisons of 16S rRNA Gene Sequence Libraries from Environmental Samples

DAVID R. SINGLETON, MICHELLE A. FURLONG, STEPHEN L. RATHBUN, AND WILLIAM B. WHITMAN

Departments of Microbiology and Statistics, University of Georgia, Athens, Georgia 30602-2605

Received 8 March 2001/Accepted 11 June 2001


The sequencing of 16S rRNA genes from clone libraries of DNAs from environmental samples has led to a wealth of information concerning prokaryotic diversity. However, in addition to methodological problems in producing libraries representative of the environmental sample (for a review, see reference 8), this approach is also limited by the difficulty in comparing libraries and determining if they are significantly different.

This problem can be addressed quantitatively by application of the formula for coverage as described by Good (4). Let $X$ be a collection of sequences, such as a library of 16S rRNA genes. Define the “homologous” coverage of $X$ (or $C_X$) by a sample from $X$ to be $C_X = 1 - (N_X/n)$, where $N_X$ is the number of unique sequences in the sample (i.e., sequences without a replicate) and $n$ is the total number of sequences. In practice, the definition of $N_X$ depends upon the criteria used to define uniqueness. For instance, McCaig et al. (6) considered sequences without a homolog of ≥97% similarity to be unique. Other authors have used ≥99% sequence similarity as the criterion. In principle, uniqueness can be defined at any level of sequence similarity or evolutionary distance ($D$) and a “homologous coverage curve,” or $C_X(D)$, can be generated by plotting $C_X$ versus $D$ (Fig. 1). The coverage curve then describes how well the sample represents the entire library $X$ at various levels of relatedness. Typically, coverage might be low at high levels of relatedness (low values of $D$), indicating that only a small fraction of the sequences representing unique species are, in fact, sampled. In contrast, coverage might be much higher at low levels of relatedness, indicating that representatives of most of the deep phylogenetic groups present in $X$ are found in the sample.
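The homologous coverage curve defined above can be sketched in a few lines of Python. This is only an illustrative sketch, not the LIBSHUFF implementation: the function name `homologous_coverage`, the use of NumPy, and the toy distance matrix are assumptions made for the example. A sequence is treated as unique at distance $D$ when its nearest neighbour in the same sample lies farther away than $D$.

```python
import numpy as np

def homologous_coverage(dist, d_values):
    """Return C_X(D) = 1 - N_X(D)/n for each D in d_values.

    dist is a square, symmetric matrix of pairwise distances within one
    sample (hypothetical input; at least two sequences are required).
    """
    dist = np.asarray(dist, dtype=float)
    n = dist.shape[0]
    # Ignore self-distances by placing infinity on the diagonal.
    masked = dist + np.diag(np.full(n, np.inf))
    nearest = masked.min(axis=1)
    # N_X(D): sequences with no other sequence within distance D.
    return np.array([1.0 - np.sum(nearest > d) / n for d in d_values])

# Toy example: two near-identical sequences, one moderately related,
# one distant (distances are invented for illustration).
toy = [
    [0.00, 0.02, 0.20, 0.40],
    [0.02, 0.00, 0.20, 0.40],
    [0.20, 0.20, 0.00, 0.40],
    [0.40, 0.40, 0.40, 0.00],
]
curve = homologous_coverage(toy, [0.0, 0.05, 0.25, 0.5])
print(curve)  # [0.   0.5  0.75 1.  ]
```

In the toy matrix, coverage rises from 0 at $D = 0$ to 1 at $D = 0.5$, mirroring the low-to-high trend described in the text.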

While $C_X$ is the “homologous coverage” of $X$ by a sample of $X$, it is also possible to calculate a “heterologous coverage” of $X$ (or $C_{XY}$) by a sample $Y$ from another collection of sequences by the following formula: $C_{XY} = 1 - (N_{XY}/n)$, where $N_{XY}$ is the number of sequences in a sample of $X$ that are not found in a sample of $Y$ and $n$ is the number of sequences in the sample of $X$. Similarly to $N_X$, $N_{XY}$ can also be defined at different levels of $D$ to generate a coverage curve, $C_{XY}(D)$. Moreover, if $X = Y$, one might expect the coverage curves $C_X(D)$ and $C_{XY}(D)$ [as well as $C_Y(D)$ and $C_{YX}(D)$] to be similar. Thus, a test for differences between these coverage curves is also a test for differences between $X$ and $Y$. To determine if the coverage curves $C_X(D)$ and $C_{XY}(D)$ are significantly different, the distance between the two curves is first calculated by using the Cramér-von Mises test statistic (7):

$$\Delta C_{XY} = \sum_{D=0.0}^{0.5} \left(C_X - C_{XY}\right)^2$$

where $D$ increases in increments of 0.01. If $X = Y$, then $\Delta C_{XY}$ should not be significantly different from a $\Delta C$ calculated after randomly shuffling sequences between the two samples, $X$ and $Y$. Typically, the sequences are randomly shuffled a large number ($N$) of times (e.g., $N = 999$) and $\Delta C_{XY}$ is calculated after each shuffling. The randomized values plus the empirical value of $\Delta C_{XY}$ are ranked from largest to smallest, and then the $P$ value is estimated to be $r/(N + 1)$, where $r$ denotes the rank of the empirical value of $\Delta C_{XY}$ (5). The two libraries are considered significantly different when $P < 0.05$. We have created a computer program (LIBSHUFF) that uses a sorted distance matrix containing both $X$ and $Y$ as input and returns the homologous and heterologous coverage curves, as well as the $P$ values for both $\Delta C_{XY}$ and $\Delta C_{YX}$, from the distribution of $\Delta C$. In addition, the distribution of $(C_X - C_{XY})^2$ with $D$ appears to be informative and is given as well (see below). The computer program LIBSHUFF was written in Perl and can be downloaded along with more detailed instructions on its use at http://www.arches.uga.edu/~whitman/libshuff.html.
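The test procedure described above can be sketched in Python as follows. This is a hedged illustration rather than a port of the Perl program: the function names (`coverage`, `delta_c`, `shuffle_test`), the NumPy-based layout, the tie handling in the rank, and the choice to keep the two sample sizes fixed while shuffling are all assumptions made for the example, and the toy data are invented.

```python
import numpy as np

# D = 0.00, 0.01, ..., 0.50, as in the summation limits of the statistic.
D_GRID = np.linspace(0.0, 0.5, 51)

def coverage(nearest, d_grid):
    """Fraction of sequences with a neighbour within D, i.e. 1 - N/n."""
    return (nearest[None, :] <= d_grid[:, None]).mean(axis=1)

def delta_c(dist, x_idx, y_idx, d_grid=D_GRID):
    """Sum over D of (C_X - C_XY)^2 for samples given by index arrays."""
    xx = dist[np.ix_(x_idx, x_idx)].copy()
    np.fill_diagonal(xx, np.inf)           # a sequence is not its own replicate
    xy = dist[np.ix_(x_idx, y_idx)]
    c_x = coverage(xx.min(axis=1), d_grid)   # homologous coverage curve
    c_xy = coverage(xy.min(axis=1), d_grid)  # heterologous coverage curve
    return float(np.sum((c_x - c_xy) ** 2))

def shuffle_test(dist, n_x, n_shuffles=999, seed=0):
    """Return (observed delta C, P = r / (N + 1)) for X = first n_x rows."""
    dist = np.asarray(dist, dtype=float)
    rng = np.random.default_rng(seed)
    idx = np.arange(dist.shape[0])
    observed = delta_c(dist, idx[:n_x], idx[n_x:])
    randomized = []
    for _ in range(n_shuffles):
        perm = rng.permutation(idx)        # reassign sequences to X and Y
        randomized.append(delta_c(dist, perm[:n_x], perm[n_x:]))
    # Rank from largest to smallest; ties are counted against the
    # observed value (a conservative assumption, not stated in the text).
    r = 1 + sum(v >= observed for v in randomized)
    return observed, r / (n_shuffles + 1)

# Toy data: two tight, well-separated clusters of 15 "sequences" each,
# placed on a line so that distance = absolute difference in position.
pos = np.concatenate([0.015 * np.arange(15), 2.0 + 0.015 * np.arange(15)])
toy_dist = np.abs(pos[:, None] - pos[None, :])
obs, p_value = shuffle_test(toy_dist, n_x=15)
print(obs, p_value)
```

For these deliberately extreme toy clusters the observed statistic is 49 and the estimated $P$ value is the smallest attainable with 999 shuffles, 0.001. Swapping the roles of the two index sets in `delta_c` gives the reciprocal statistic $\Delta C_{YX}$.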

A first test of this method was done to ensure that samples from the same library were not shown to be different. Thus, a collection of clonal sequences ($n = 275$) from a soil community study (6) was divided into two samples based upon accession numbers (138 odds and 137 evens). Although the study contained sequences from two sample sites (SL and SAF clones), sequences from both sites were placed in each data set to form nearly equivalent samples. A comparison of $\Delta C_{\mathrm{odds/evens}}$ to the $\Delta C$ values of the randomized samples resulted in $P = 0.871$, which indicated that the two samples were not significantly different (Fig. 1A). Similar results were obtained for $\Delta C_{\mathrm{evens/odds}}$ and other arbitrarily divided sequence libraries (Table 1). Thus, as expected, samples taken from the same library were not found to be different.

* Corresponding author. Mailing address: Department of Microbiology, University of Georgia, 527 Biological Sciences Bldg., Athens, GA 30602-2605. Phone: (706) 542-4219. Fax: (706) 542-2674. E-mail: whitman@arches.uga.edu.

To demonstrate that this procedure could correctly differentiate samples from different libraries, sequences of clones obtained from an activated sludge (SBR1; $n = 97$; reference 1) were compared to grassland soil SL clones. The SBR1 clones were found to be significantly different from the SL clones ($P = 0.001$; Fig. 1B). More information on the nature of this difference was obtained by examination of the distribution of $(C_X - C_{XY})^2$ with $D$ (Fig. 1B). At low $D$, the actual $(C_X - C_{XY})^2$ exceeded the comparable values at $P = 0.05$ obtained during the calculation of $\Delta C$. This result suggested that the libraries differed greatly at $D < 0.10$ but shared many deep taxa. However, smaller differences at $D > 0.3$ suggested that not all deep phylogenetic groups were found in both libraries. Similar results were also obtained for comparisons of other soil and bioreactor libraries (Table 1 and data not shown).
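The pointwise comparison just described, in which the observed $(C_X - C_{XY})^2$ at each $D$ is set against the value at $P = 0.05$ from the randomized samples, might be sketched as below. The function names and array layout are hypothetical, and the reading of the "950th value" as the 950th smallest of the randomized values at each $D$ is an assumption, not something the text spells out.

```python
import numpy as np

def pointwise_threshold(random_sq_diffs, rank=950):
    """Per-D threshold: the rank-th smallest randomized (C_X - C_XY)^2.

    random_sq_diffs has shape (n_shuffles, n_D); rank counts from 1.
    """
    ordered = np.sort(np.asarray(random_sq_diffs, dtype=float), axis=0)
    return ordered[rank - 1]

def distances_exceeding(observed_sq_diffs, random_sq_diffs, d_grid, rank=950):
    """Return the D values where the observed value exceeds the threshold."""
    threshold = pointwise_threshold(random_sq_diffs, rank)
    mask = np.asarray(observed_sq_diffs, dtype=float) > threshold
    return np.asarray(d_grid)[mask]

# Invented numbers: 999 randomized values per D, spread evenly from
# 0.000 to 0.998, so the 950th smallest value in every column is 0.949.
fake_random = np.tile(np.arange(999)[:, None], (1, 3)) / 1000
flagged = distances_exceeding([0.96, 0.50, 0.95], fake_random, [0.0, 0.01, 0.02])
print(flagged)  # [0.   0.02]
```

Distances flagged in this way indicate the levels of relatedness at which two libraries differ, which is how the $D < 0.10$ and $D > 0.3$ observations above are read from Fig. 1B.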

Three sequence collections consisting of multiple samples were analyzed to determine if differences between the samples could be detected (Table 1). Clonal libraries derived from the microbial populations of phosphate-removing (SBR1) and non-phosphate-removing (SBR2) bioreactors differed in the abundance of certain taxa (1). However, these differences were not shown to be significant by our method (Table 1). The compositions of libraries from the microbial communities of improved (SL) and unimproved (SAF) upland grass pasture soils were not found to be significantly different (6). We also obtained the same conclusion by our method (Table 1). Finally, comparisons of restriction fragment length types from C0 and S0, two clonal libraries derived from arid soils, suggested that C0 was more diverse than S0 (2). Our analysis of the sequences obtained from this study was consistent with this conclusion and further suggested that S0 was a subset of C0. $\Delta C_{\mathrm{S0/C0}}$ was not significant, which suggested that all of the taxa present in S0 were also present in C0 (Table 1). However, the reciprocal value $\Delta C_{\mathrm{C0/S0}}$ was significant; therefore, C0 also contained sequences of one or more taxa not found in S0. The distribution of $(C_X - C_{XY})^2$ with $D$ further indicated that the additional taxa in C0 represented moderately deep phylogenetic groups, $0.15 < D < 0.25$ (Fig. 1C).

FIG. 1. Results of selected LIBSHUFF comparisons. Homologous (○) and heterologous (●) coverage curves for 16S rRNA gene sequence libraries from environmental samples are shown. Solid lines indicate the value of $(C_X - C_{XY})^2$ for the original samples at each value of $D$. $D$ is equal to the Jukes-Cantor evolutionary distance determined by the DNADIST program of PHYLIP (3). Broken lines indicate the 950th value (or $P = 0.05$) of $(C_X - C_{XY})^2$ for the randomized samples. (A) Comparison of clones from grassland soils with odd (X) and even (Y) accession numbers. (B) Comparison of bioreactor clones SBR1 (X) and grassland soil SL clones (Y). (C) Comparison of C0 (X) and S0 (Y) clones from arid soils.

TABLE 1. Comparisons of environmental clone libraries

| Site (reference) | Homologous (X) clones | n | Heterologous (Y) clones | P^b |
|---|---|---|---|---|
| Grassland soils (6) | Odds^a | 138 | Evens^a | 0.871 |
| | Evens^a | 137 | Odds^a | 0.933 |
| | SAF | 138 | SL | 0.120 |
| | SL | 137 | SAF | 0.135 |
| Bioreactors (1) | Odds^a | 95 | Evens^a | 0.853 |
| | Evens^a | 94 | Odds^a | 0.623 |
| | SBR1 | 97 | SBR2 | 0.308 |
| | SBR2 | 92 | SBR1 | 0.824 |
| Arid soils (2) | Odds^a,c | 56 | Evens^a | 0.251 |
| | Evens^a | 56 | Odds^a,c | 0.516 |
| | C0 | 59 | S0 | 0.042 |
| | S0 | 53 | C0 | 0.398 |
| Grassland soil/bioreactor | SAF | 138 | SBR1 | 0.001 |
| | SBR1 | 97 | SAF | 0.002 |
| | SL | 137 | SBR1 | 0.001 |
| | SBR1 | 97 | SL | 0.001 |

a Sequences with odd or even accession numbers. Contains mixtures of both libraries described in the reference, and they are not expected to be different.
b Value of r/(N + 1) as described in the text.
c Accession number AF128647 could not be found and was not included.
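As a worked illustration of how the $P$ values in Table 1 relate to ranks, assume the $N = 999$ shuffles that the text gives only as an example (the actual number used for the table is not stated). Under that assumption, each tabulated $P$ corresponds to an integer rank $r$ of the empirical $\Delta C$ among the 1,000 ranked values:

$$P = \frac{r}{N + 1} = \frac{r}{1000}: \qquad r = 1 \Rightarrow P = 0.001, \qquad r = 42 \Rightarrow P = 0.042, \qquad r = 871 \Rightarrow P = 0.871$$

Under the same assumption, 0.001 is the smallest $P$ value the procedure can report, which is consistent with the minimum seen in the grassland soil/bioreactor comparisons.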


Sample size should have a major effect on comparisons of libraries. The minimum number of sequences necessary to distinguish two dissimilar libraries was expected to increase with the complexity of the libraries and decrease with the magnitude of the dissimilarity. This point was examined in detail by using two libraries of high diversity and dissimilarity. Variable numbers of clonal sequences were randomly selected from either library SBR1 or SL (Y) and compared to the opposite library (X), and $P$ values were determined for 10 replicates. Approximately 20 and 25 sequences from SBR1 and SL, respectively, were required to differentiate the two libraries ($P < 0.05$) when X was represented by 97 and 137 sequences, respectively (Fig. 2). Tests were also performed to investigate the required sample size of X (SBR1) when the size of Y (SL) was small. It was found that nearly all (≥90) of the sequences from the SBR1 library were required to distinguish these libraries when the SL library (Y) was represented by 20 sequences (data not shown). When the sizes of both libraries were varied, they were consistently detected as different when the SBR1 (X) and SL (Y) libraries were represented by ≥40 and ≥30 sequences, respectively (data not shown). While these results may not generalize to all environmental samples, they should be representative of comparisons of libraries from diverse communities, such as those found in soil and bioreactors. Importantly, these results suggest that modestly sized libraries from microbial communities similar in complexity to those used in this study will be distinguished by this method.
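The subsampling experiment described above could be organized along the following lines. This is a sketch under stated assumptions: `subsample_p_values` and its `p_value_fn` argument are hypothetical names, the comparison itself is passed in as a callable (any function taking the X indices and a subsampled set of Y indices and returning a $P$ value), and the use of the sample standard deviation is a guess, since the text does not say which form was plotted.

```python
import numpy as np

def subsample_p_values(x_idx, y_idx, m, p_value_fn, replicates=10, seed=0):
    """Mean and SD of P over replicate random subsamples of size m from Y.

    p_value_fn(x_idx, y_sub) is a user-supplied comparison (hypothetical
    interface) that returns a P value for X against the subsample of Y.
    """
    rng = np.random.default_rng(seed)
    x_idx = np.asarray(x_idx)
    y_idx = np.asarray(y_idx)
    p_values = []
    for _ in range(replicates):
        # Draw m sequences from Y without replacement.
        y_sub = rng.choice(y_idx, size=m, replace=False)
        p_values.append(p_value_fn(x_idx, y_sub))
    p_values = np.array(p_values, dtype=float)
    return p_values.mean(), p_values.std(ddof=1)

# Placeholder comparison used only to show the call pattern; a real run
# would plug in a shuffle-based test on the distance matrix instead.
def dummy_p(x_idx, y_sub):
    return 1.0 / len(y_sub)

mean_p, sd_p = subsample_p_values(np.arange(97), np.arange(97, 234), 20, dummy_p)
print(mean_p, sd_p)
```

Repeating the call over a range of `m` values and plotting the mean with its standard deviation against `m` would give a curve of the kind summarized in Fig. 2.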

We thank Kamyar Farahi and Rob Waldo for help with programming in Perl. We also thank Lihua Wang of the Statistical Consulting Office at the University of Georgia for help.

This work was supported in part by an award from the Division of Molecular and Cellular Biosciences at NSF (MCB-0084164).

REFERENCES

1. Bond, P. L., P. Hugenholtz, J. Keller, and L. L. Blackall. 1995. Bacterial community structures of phosphate-removing and non-phosphate-removing activated sludges from sequencing batch reactors. Appl. Environ. Microbiol. 61:1910–1916.

2. Dunbar, J., S. Takala, S. M. Barns, J. A. Davis, and C. R. Kuske. 1999. Levels of bacterial community diversity in four arid soils compared by cultivation and 16S rRNA gene cloning. Appl. Environ. Microbiol. 65:1662–1669.

3. Felsenstein, J. 1993. PHYLIP (phylogenetic inference package) version 3.5c. University of Washington, Seattle.

4. Good, I. J. 1953. The population frequencies of species and the estimation of population parameters. Biometrika 40:237–264.

5. Hope, A. C. A. 1968. A simplified Monte Carlo significance test procedure. J. Royal Statist. Soc. B 30:582–598.

6. McCaig, A. E., L. A. Glover, and J. I. Prosser. 1999. Molecular analysis of bacterial community structure and diversity in unimproved and improved upland grass pastures. Appl. Environ. Microbiol. 65:1721–1730.

7. Pettitt, A. N. 1982. Cramér-von Mises statistic, p. 220–221. In S. Kotz and N. L. Johnson (ed.), Encyclopedia of statistical sciences. Wiley-Interscience, New York, N.Y.

8. von Wintzingerode, F., U. B. Göbel, and E. Stackebrandt. 1997. Determination of microbial diversity in environmental samples: pitfalls of PCR-based rRNA analysis. FEMS Microbiol. Rev. 21:213–229.

FIG. 2. Effect of sample size on the discrimination of libraries. A comparison of the SL library from grassland soil (Y; n = variable) to the bioreactor library SBR1 (X; n = 97) (●) and a comparison of the SBR1 (Y; n = variable) library to the SL (X; n = 137) library (○) are shown. Each point represents an average of 10 replicates, and the error bars are 1 standard deviation. The broken line indicates $P = 0.05$.
