Genes 2019, 10, 1014; doi:10.3390/genes10121014 www.mdpi.com/journal/genes
Review
Conversion of DNA Sequences: From a Transposable
Element to a Tandem Repeat or to a Gene
Ana Paço 1,*, Renata Freitas 2,3,4 and Ana Vieira-da-Silva 1
1 MED-Mediterranean Institute for Agriculture, Environment and Development, University of Évora,
7002–554 Évora, Portugal; ana.vieiradsilva@gmail.com
2 IBMC-Institute for Molecular and Cell Biology, University of Porto, R. Campo Alegre 823,
4150–180 Porto, Portugal; Renata.Freitas@ibmc.up.pt
3 I3S-Institute for Innovation and Health Research, University of Porto, Rua Alfredo Allen, 208,
4200–135 Porto, Portugal
4 ICBAS-Institute of Biomedical Sciences Abel Salazar, University of Porto, 4050-313 Porto, Portugal
* Correspondence: apaco@uevora.pt; Tel.: +351-266-760-878
Received: 19 October 2019; Accepted: 29 November 2019; Published: 5 December 2019
Abstract: Eukaryotic genomes are rich in repetitive DNA sequences grouped in two classes
regarding their genomic organization: tandem repeats and dispersed repeats. In tandem repeats,
copies of a short DNA sequence are positioned one after another within the genome, while in
dispersed repeats, these copies are randomly distributed. In this review we provide evidence that
both tandem and dispersed repeats can have a similar organization, which leads us to suggest an
update to their classification based on the sequence features, concretely regarding the presence or
absence of retrotransposons/transposon specific domains. In addition, we analyze several studies
that show that a repetitive element can be remodeled into repetitive non-coding or coding
sequences, suggesting (1) an evolutionary relationship among DNA sequences, and (2) that the
evolution of the genomes involved frequent repetitive sequence reshuffling, a process that we have
designated as a “DNA remodeling mechanism”. The alternative classification of the repetitive DNA
sequences here proposed will provide a novel theoretical framework that recognizes the importance
of DNA remodeling for the evolution and plasticity of eukaryotic genomes.
Keywords: tandem repeats; dispersed sequences; origin of tandem repeats; mobilization of tandem
repeats; DNA remodeling mechanism
1. Introduction
Eukaryotic genomes contain a high diversity of repetitive DNA sequences [1,2]. The
amplification/deletion of these sequences has contributed significantly to the extraordinary variation
in genome size found between taxa [3–7]. In the animal kingdom, genome size can vary from 20
Mb to 130 Gb, which is mainly due to differences in the content of repetitive sequences [8]. The same
large variation has been identified in plants. For instance, the fraction of repetitive genomic DNA is
13–14% (125 Mb–157 Mb) in Arabidopsis thaliana but, in contrast, 77% (2.5 Gb) in Zea mays [9].
The biological role of this repetitive DNA fraction has been a topic of great interest, particularly
for understanding evolution and disease [10–13]. In summary, this fraction seems to be involved in DNA
packaging, genome evolution (by promoting DNA instability), gene
expression, and epigenetic mechanisms [14–25]. The analysis of the DNA sequences located in
chromosomal breakpoint regions strongly suggests that repetitive sequences act as a driving force
in the occurrence of chromosomal rearrangements, since these regions are extremely rich in these
sequences [26,27]. The repetitive nature, per se, of the different classes of repeats located in these
breakpoint regions favours recombinational events between homologous sequences in
non-homologous regions, which may culminate in chromosomal restructurings [21]. In addition, analyses
of the transcriptional activity of some repetitive sequences point to a role for these sequences in
the control of gene expression, the cellular response to stress, and centromeric function, specifically
through RNA interference mechanisms [28].
Despite the increasing evidence pointing to a functional significance of the repetitive DNA
fraction, there are still limitations to characterizing their role in different biological processes due to
their high diversity, genomic abundance, complex evolution mode, and difficulty in isolation and
sequencing [29–31]. Thus, the real biological relevance of this fraction of eukaryotic genomes is yet to
be revealed.
This review presents a compilation of evidence on the organization and evolutionary
relationship between different classes of repetitive sequences, and between repetitive sequences and
genes. This evidence shows that the classification of repetitive sequences based on their genomic
organization as “in tandem” or “dispersed repeats” is not “black and white”, which reinforces the
need for an updated classification. Further, this evidence also shows extensive remodelling of the
repetitive sequences in eukaryotic genomes, which could culminate in the origin of a new
sequence type, namely genes. This new approach to the classification and study of repetitive
sequences will contribute to a better understanding of genomic plasticity and its contribution to
the evolution of eukaryotic species and their adaptation to the environment.
2. Tandem Repeats and Dispersed Repeats: Is Their Organization So Different?
Repetitive sequences are classified in two classes, tandem and dispersed repeats, according to
the genomic organization of their copies (Figure 1) [32,33]. In turn, each class is divided into several
subclasses or families [3–6,34–37]. To facilitate the annotation and classification of the
repetitive elements in eukaryotic genomes, an open access database of repetitive sequence families
(Dfam) was built [37,38].
Figure 1. Repetitive DNA sequences in eukaryotic genomes. This scheme compiles information from
several works [16,20,33,39–42]. Only the largest subclasses of tandem and dispersed repeats are
represented; the genic repetitive DNA sequence families are not included, such as tandem paralogous
genes and ribosomal genes (tandem organization), and retropseudogenes, transfer RNA genes, and
dispersed paralogous genes (dispersed organization).
2.1. Tandem Repeats Organization
Traditionally, tandem repeats have been structurally characterized by a sequential arrangement
of repeat units, positioned one after the other in two possible repeat orientations, head-to-tail repeats
(direct repeats) or head-to-head repeats (inverted repeats) [33]. Excluding the genic repetitive DNA
sequences (e.g., ribosomal genes), three distinct subclasses are mainly recognized, namely
microsatellites, minisatellites, and satellites (satellite DNAs—satDNAs; nongenic tandem repeats)
(Figure 1).
It is generally considered that the main difference between micro, mini, and satellites relates to
the length of their array of repeat units in a chromosomal location [33,39,43]. Micro and minisatellites
are classified as short tandem repeats, composed of shorter arrays of copies: ranging from 10 to 100
repeat units for microsatellites, and up to 100 repeat units for minisatellites [39,44]. However, it is
important to mention that there is no consensus on the definition of microsatellites and minisatellites,
since there is no consensus either on the repeat unit size or on the minimal number of repeat units in
an array of copies that differentiates micro from minisatellites. The threshold for repeat unit size
varies between six and 10 nucleotides [33,39,45].
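As a rough illustration of how much such definitions depend on the chosen thresholds, the following Python sketch scans a sequence for perfect head-to-tail arrays and labels each array by its repeat unit size. The function names, the default cut-off of six nucleotides (`micro_max_unit`), and the minimum of three copies are assumptions made for this example only; they are not taken from the cited works, and real repeat finders also tolerate mismatches between copies.

```python
from typing import List, NamedTuple


class TandemRepeat(NamedTuple):
    start: int   # 0-based position of the first copy
    unit: str    # repeat unit (monomer)
    copies: int  # number of perfect, contiguous copies


def find_tandem_repeats(seq: str, max_unit: int = 12,
                        min_copies: int = 3) -> List[TandemRepeat]:
    """Find perfect head-to-tail arrays, trying the shortest unit first."""
    hits = []
    i = 0
    while i < len(seq):
        found = None
        for k in range(1, max_unit + 1):
            unit = seq[i:i + k]
            if len(unit) < k:
                break  # not enough sequence left for a unit of this size
            copies = 1
            while seq[i + copies * k:i + (copies + 1) * k] == unit:
                copies += 1
            if copies >= min_copies:
                found = TandemRepeat(i, unit, copies)
                break
        if found:
            hits.append(found)
            i += len(found.unit) * found.copies  # jump past the whole array
        else:
            i += 1
    return hits


def classify_by_unit_size(unit_len: int, micro_max_unit: int = 6) -> str:
    """Label a short tandem repeat by unit size (the cut-off is an assumption)."""
    return "microsatellite" if unit_len <= micro_max_unit else "minisatellite"


# Toy sequence: a (CA)5 array followed by three copies of a 7-nt unit.
example = "TTCACACACACATT" + "GATTACA" * 3
for hit in find_tandem_repeats(example):
    print(hit, classify_by_unit_size(len(hit.unit)))
```

Raising `micro_max_unit` from 6 to 10 relabels the 7-nt array as a microsatellite, which mirrors the lack of consensus described above.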
SatDNAs have traditionally been considered to be organized into long arrays, with millions of
copies, and have thus been named long tandem repeats [39,46]. However, satDNAs with a
different organization have already been reported. Louzada and colleagues (2015) [47] identified a
satDNA (PMSat) with high sequence conservation in the genomes of five rodents. However,
PMSat is not always organized into long arrays of repeat units, even in species where it is highly
abundant, as is the case of Peromyscus maniculatus bairdii. In this species the authors found short
PMSat arrays and even dispersed isolated monomers, which reinforces that the limitations of the
techniques used to isolate and analyse the repetitive fractions of genomes (such as sequencing,
assembly and mapping technologies) are a determining factor for the study of repetitive sequences,
as well as for their accurate classification. Other reports support the same conclusion, showing a
dispersed genomic organization of short arrays of satDNA repeat units [44,48].
Besides the length of arrays, the repeat units of satDNA (monomers) can also show great
variation in size, ranging from five nucleotides, as in human satellite III [49] (similar to a
micro or a minisatellite, but with longer arrays of repeat units), up to several hundred base pairs, as in
the Microtus MSAT-2570 [50]. However, for plants and animals, the most common lengths are 150–180
bp and 300–360 bp, respectively, which is believed to be associated with the length of DNA
wrapped around one or two nucleosomes [11,51,52].
The genomic distribution presented by micro, mini, and satDNAs is also traditionally
considered distinct in the literature. Generally, both microsatellites and minisatellites are distributed
throughout the genome (dispersed), in both euchromatic and heterochromatic regions [33,45].
Nevertheless, minisatellites are also characterized by their accumulations in (sub)telomeric regions
[43,53]. The satDNAs are mainly located in heterochromatic regions of the chromosomes, thus
satDNAs are preferentially found in and around centromeres [54–57]. However, once again,
exceptions to what is generally assumed have been reported. Indeed, satDNA can also be located
at interstitial and terminal positions of chromosomes [26,47,48].
2.2. Dispersed Repeats Organization
Dispersed repeats are mainly represented by transposable elements (TEs). With the recent
proliferation of genomic sequencing studies, TEs have emerged as highly diverse, ubiquitous and
abundant genomic elements, constituting approximately half of the human genome and up to 95%
of DNA in plants [58–61]. The diversity of TEs reflects their evolutionary mode [62]. By the
accumulation of mutations, TEs generate new families and subfamilies, “escaping” selection [63].
Despite this diversity, it has been possible to group them into two major classes according to their
modes of transposition (mobilization in genomes): retrotransposons (RE, class I elements) and DNA
transposons (class II elements) (Figure 1) [60,61].
Traditionally, the dispersed repeats consist of sequences represented several times in the
genome, whose copies are not clustered or are organized in short clusters, presenting a wide
distribution throughout the genome [39,43,64]. However, some works show dispersed repeats with
an accumulated genomic organization [65–70], which can lead to the supposition of some tandem
organization for these sequences. As referred to previously, the genomic accumulation of tandem
repeats is common in certain genomic locations, such as the telomeric sequences (microsatellites of
(TTAGGG/CCCTAA)n) in the genomes of all mammals at telomeric and interstitial positions [71–73],
and satDNAs at centromeric regions [57]. In fact, different works have specifically explored the
organization of tandem repeats at telomeric and centromeric regions [26,66]. Nevertheless, few
studies report how the TE blocks accumulated in certain regions of the genome are in fact organized.
Only a few works have revealed that TE clusters can be interrupted by other TEs or by genes, and
that these clusters may also contain a short tandem arrangement of some TEs [67,69].
2.3. Repetitive Sequences: A New Classification Based on the Presence or Absence of
Retrotransposons/Transposon Specific Domains
In this review, we suggest an alternative to the traditional classification of repetitive DNA
sequences, which is mainly based on the genomic organization of their copies. We suggest that
repetitive sequences should not be divided into tandem repeats and dispersed repeats, since repeats
organized in tandem can also present a dispersed organization, and vice versa. As mentioned
previously, several tandem repeats have a dispersed organization in distinct genomes. Conversely, some
TE copies may be positioned one after the other in a tandem organization, with some chromosomal
regions presenting several complete or incomplete copies of a specific TE. As such, these sequences
should be classified according to other characteristics, namely the presence or absence of
retrotransposon/transposon-specific domains in the sequence of their copies. Therefore, repetitive
sequences could perhaps be classified as repeats presenting retrotransposon/transposon-specific
domains or repeats not presenting retrotransposon/transposon-specific domains.
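The proposed criterion can be expressed as a simple rule, sketched below in Python. This is only an illustration under stated assumptions: the `RepeatFamily` record, the function name, the example families, and the short list of domain labels in `TE_SPECIFIC_DOMAINS` are hypothetical and not exhaustive, and in practice the domains would have to come from a sequence annotation step that is not shown here.

```python
from dataclasses import dataclass, field
from typing import FrozenSet

# Illustrative, non-exhaustive set of domain labels (an assumption of this sketch).
TE_SPECIFIC_DOMAINS = frozenset(
    {"reverse_transcriptase", "integrase", "transposase"}
)


@dataclass(frozen=True)
class RepeatFamily:
    name: str
    organization: str  # "tandem" or "dispersed" (the traditional criterion)
    domains: FrozenSet[str] = field(default_factory=frozenset)


def classify_by_domains(repeat: RepeatFamily,
                        te_domains: FrozenSet[str] = TE_SPECIFIC_DOMAINS) -> str:
    """Classify by presence or absence of TE-specific domains, ignoring organization."""
    if repeat.domains & te_domains:
        return "repeat presenting TE-specific domains"
    return "repeat not presenting TE-specific domains"


# Hypothetical examples: a satDNA found as dispersed monomers and a TE found in tandem.
families = [
    RepeatFamily("hypothetical_satDNA", "dispersed"),
    RepeatFamily("hypothetical_LTR_element", "tandem",
                 frozenset({"reverse_transcriptase", "integrase"})),
]
for family in families:
    print(family.name, family.organization, "->", classify_by_domains(family))
```

Note that the `organization` field plays no part in the decision, so a family is placed in the same group whether its copies are found in tandem or dispersed.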
3. Repetitive DNA Remodelling
Different studies show that a repetitive element can be remodelled into a different sequence,
repetitive or not, which indicates an evolutionary relationship among DNA sequences and suggests
that the evolution of the genome is characterized by frequent repetitive sequence reshuffling, a
process that we have called a “DNA remodelling mechanism”. In fact, some authors show that
tandem repeats can have their origin in TEs [74], and that satDNAs can also evolve into coding
sequences [75].
3.1. Transposable Elements in Origin and Genomic Distribution of Micro, Mini and Satellite DNAs
Sequence similarities between tandem repeats and TEs [76–78] indicate a strong evolutionary
relationship between these repetitive sequences. In addition, it is also believed that TEs are involved
in the origin of some sequence motifs that characterize some satDNAs, such as the CENP-B box,
which shows strong similarity with the terminal inverted repeats of pogo
transposons [79]. In fact, computer simulations have suggested that satDNA monomers could be
generated from a wide variety of non-satellite sequences and propagated into an array by unequal
crossing-over [80]. These non-satellite sequences are often TEs [26,31,76,81–88]. Table 1 lists
different examples known in the literature where TEs or parts of TEs were converted into other
repetitive sequences or altered genes.
Table 1. Transposable elements in the origin of other repetitive sequences or altered genes.
Transposable element | New sequence or new sequence variant | Reference
SINE-like elements | Satellite 1 of Xenopus laevis | [89]
LTR retrotransposons | RPCS satDNA of Ctenomys rodents | [81]
SINE-like elements | Hy/Pol III satDNA of European salamander | [82]
pDv mobile element | pvB370 satDNA of Drosophila virilis | [83]
LINE-1 elements | Common cetacean satDNA | [76]
TART and HeT-A retrotransposons | 18HT satDNA of Drosophila melanogaster | [90]
Atenspm2 transposons | Ensat1 of Arabidopsis thaliana | [84]
Crwydryn retrotransposon | E3900 satDNA of rye | [91]
MITE elements | D1100 satDNA of rye | [91]
SGM-IS transposons | SGM satDNA of Drosophila guanche | [92]
Ty3/gypsy-retroelement | 250 elements of satDNA of wheat | [93]
MITE-like elements | Xstir satDNA of Xenopus laevis | [94]
MITE elements | HindIII satDNA of oysters | [95]
Sore1 retrotransposon | Sobo satDNA of potatoes | [96]
CR1 retrotransposons | HinfI satDNA of chicken | [97]
Ty3/gypsy-like ogre elements | PisTR-A satDNA of pea | [31]
MITE-like elements | BIV160 satDNA of bivalves | [77]
CR1-C retrotransposons | Cen2, 3, 4, 7 and Cen11 satDNAs of chicken | [98]
CRM1 and CRM4 retrotransposons | CRM1TR satDNA of maize | [87]
Helitron elements | CTRs satDNA of Drosophila | [88]
LINE-1 elements | PROsat of Phodopus roborovskii | [26]
Alu elements | A-rich primates’ microsatellites | [99]
SINE elements BARE-1, WIS2-1A and PREM1 | microsatellites of barley | [100]
LINE-1 elements | A-rich mammalian microsatellites | [101]
Alu elements | (GAA)n human microsatellite | [102]
MITE elements | GTCY(n) microsatellites of insects | [86]
Alu elements | pλg3 human minisatellite | [103]
MaLR retrotransposon | Ms6-hm mouse minisatellites | [104]
SINE B1 elements | (GGCAGA)n mouse minisatellite | [105]
Alu elements | (CGGGAGGC)n human minisatellite | [106]
Alu elements | Minisatellites of human | [107]
Gmr9/Gm ogre retrotransposons | Gmr9-associated minisatellites of soybean | [108]
LINE-1 elements | TRIM5 gene with a cyclophilin A domain | [109]
SINE: Short interspersed nuclear element, LTR: Long terminal repeats retrotransposons, LINE: Long
interspersed nuclear element, MITE: Miniature inverted repeat transposable elements, satDNA:
Satellite DNA.
The evidence for TE conversion into new non-coding repetitive sequences or genes is
reinforced by the fact that, in humans and mice (the first fully sequenced mammalian genomes), it was
estimated that the repetitive DNA derived from TEs comprises from 40% to almost half of these genomes
[110,111]. These values could be considerably underestimated, since substantial amounts of older
sequences are not detected because of their high divergence from the consensus sequences
used for their detection. Ahmed and Liang (2012) [107], for example, considered that the ability of
TEs to contribute to genome expansion is due not only to retrotransposition (increasing their copy
number in the genomes), but also to the generation of tandem repeats.
The exact mechanisms underlying the origin/expansion of tandem repeats from TEs are not yet
completely known, but probably more than one mechanism is involved. A tandem repeat sequence
arises after amplification events followed by subsequent molecular mechanisms. It is widely accepted
that the first repetitions of microsatellites may have originated by chance, and then expanded by
slipped-strand mispairing, as proposed by Levinson and Gutman (1987) [112]. However, some
studies have suggested that TEs contain one or more sites predisposed to the formation of
microsatellites. The poly(A) tract at the 3’ end of mammalian non-LTR retrotransposons (autonomous
LINEs/nonautonomous SINEs) provides a site susceptible to reverse transcription errors, which
could lead to the genesis of A-rich microsatellites [113]. Microsatellites located at
the 5’ end and in internal regions of retroelements have also been described in the literature [100].
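The net effect of slipped-strand mispairing on a repeat tract can be sketched in a few lines of Python. This is a deliberately simplified model and not the mechanism as described by Levinson and Gutman: the hypothetical function `slip` only records the outcome of one slippage event (a gain or loss of whole repeat units at a given position), and its name and parameters are assumptions of this example.

```python
def slip(seq: str, start: int, unit_len: int, delta: int) -> str:
    """Return seq after one slippage event at the repeat unit beginning at `start`.

    delta > 0 inserts `delta` extra copies of the unit (expansion);
    delta < 0 removes `-delta` copies (contraction).
    """
    unit = seq[start:start + unit_len]
    if delta >= 0:
        return seq[:start] + unit * delta + seq[start:]
    removed = -delta * unit_len
    if seq[start:start + removed] != unit * (-delta):
        raise ValueError("cannot delete more copies than the tract contains")
    return seq[:start] + seq[start + removed:]


# A chance duplication (CA)2 expands to (CA)5 after three +1 slippage events.
tract = "GGCACATT"
for _ in range(3):
    tract = slip(tract, start=2, unit_len=2, delta=1)
print(tract)  # GGCACACACACATT
```

In this toy model the poly(A) tract of a retrotransposon would simply be a tract with `unit_len=1`; the sketch does not attempt to model reverse transcription errors.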
Regarding minisatellites, several reports point to their origin from a variety of TE families and
subfamilies [103–106,108], namely from nonautonomous non-LTR retrotransposons such as Alu and B1
SINE elements [105,106] or from LTR retrotransposons [103,104,108]. According to Haber and Louis
(1998) [114], the origin (initial event) of the first repetitions of some minisatellites appears to have
been mediated by replication slippage or unequal crossing-over, involving very short repeats (5–10
bp) that flank a motif which will be amplified as the repeat unit of these minisatellites (Figure 2).
The fact that repetitive elements, such as Alu elements, commonly have short direct repeats in
their sequences makes them very prone to giving rise to minisatellites by this mechanism [106].
Subsequently, the amplification of the duplicated motifs into a minisatellite array could occur
by additional replication slippage events, gene conversion, or unequal crossing-over between the
longer homologous regions [114]. This mechanism seems plausible to explain the origin of
minisatellites; however, it cannot explain the origin of satDNAs with larger repeat units, due to
the distance over which the initial event must have occurred (replication slippage or unequal
crossing-over involving flanking short repeats). Nevertheless, a very similar mechanism was
proposed to explain the origin of some satDNAs, such as maize centromeric repeats, with most of their
monomers presenting more than 700 bp [87].
Figure 2. Initial event in the origin of the first minisatellite repetitions. Origin of a duplication by replication
slippage or unequal crossing-over between short flanking repeats, followed by a subsequent expansion into a
minisatellite.
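The initial event summarized in Figure 2 can be mimicked on a string, as in the Python sketch below. It assumes a perfect pair of flanking direct repeats and models only the product of the event (the motif plus one flanking repeat becomes duplicated), not the slippage or crossing-over process itself; the function name and the toy locus are hypothetical.

```python
def duplicate_between_direct_repeats(seq: str, repeat: str) -> str:
    """Duplicate the segment (flanking repeat + motif) bounded by two copies of `repeat`."""
    first = seq.find(repeat)
    second = seq.find(repeat, first + len(repeat)) if first != -1 else -1
    if second == -1:
        raise ValueError("sequence does not contain two copies of the flanking repeat")
    segment = seq[first:second]  # first flanking repeat followed by the motif
    return seq[:second] + segment + seq[second:]


# Toy locus: a 6-bp motif (GGGCCC) flanked by a 5-bp direct repeat (ACGTA).
locus = "TT" + "ACGTA" + "GGGCCC" + "ACGTA" + "AA"
once = duplicate_between_direct_repeats(locus, "ACGTA")
twice = duplicate_between_direct_repeats(once, "ACGTA")
print(once)
print(twice)
```

Each further call adds one more repeat unit, which loosely corresponds to the subsequent amplification of the duplicated motif into an array.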
Other interesting theories on the origin of satDNAs from TEs have been suggested. Wong and
Choo (2004) [74] proposed the “first steps” hypothesis for the origin of satDNA repetitions, based on
the duplication of part of a TE sequence by unequal crossing-over between homologous TE elements,
which could be on the same or on different chromosomes (Figure 3). Once a tandem repetition of full
or partial TEs is generated in a genome, the expansion of these novel repeat units can slowly occur
over time. Mutational changes, followed by successive rounds of crossing-over homogenization
(concerted evolution of tandem repeats), can explain the divergence observed between the emergent
satDNA and the original TE, which share only conserved parts of their sequences [74]. This mechanism
is recurrently used to explain the origin of satDNAs. An example is the work of Dias et al. (2015) [88],
suggesting the emergence of a satDNA from central tandem repeats of a helitron (DINE-TR1) in
Drosophila species.
Figure 3. Initial steps for the origin of satDNA repeats from parts of a TE. The duplication of part of a TE sequence
occurs by unequal crossing-over between homologous dispersed repeats present in chromosomes A and B (ChrA
and ChrB). The expansion of these novel repeat units can occur over time and result in a satDNA array of
copies.
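The first step depicted in Figure 3 can be illustrated with a short Python sketch of unequal crossing-over between two copies of the same TE. The function name, the toy sequences, and the breakpoints are hypothetical, and the model ignores the homology search that real recombination requires: it simply joins the left arm of one chromosome to the right arm of the other at the chosen positions.

```python
from typing import Tuple


def unequal_crossover(chr_a: str, chr_b: str,
                      break_a: int, break_b: int) -> Tuple[str, str]:
    """Exchange chromosome arms at two (possibly misaligned) breakpoints."""
    product_1 = chr_a[:break_a] + chr_b[break_b:]
    product_2 = chr_b[:break_b] + chr_a[break_a:]
    return product_1, product_2


# The same toy TE (uppercase) sits in different flanking DNA (lowercase) on ChrA and ChrB.
te = "ACGTTGCA"
chr_a = "tttt" + te + "tttt"
chr_b = "cccc" + te + "cccc"

# Misaligned pairing: ChrA breaks after TE position 6, ChrB after TE position 2.
duplication, deletion = unequal_crossover(chr_a, chr_b, break_a=10, break_b=6)
print(duplication)  # ttttACGTTGGTTGCAcccc  (internal segment GTTG is now in tandem)
print(deletion)     # ccccACCAtttt          (the reciprocal product lost GTTG)
```

One product carries a tandem duplication of part of the TE, the starting point for a possible array, while the reciprocal product carries the corresponding deletion.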
DNA transposons, or specifically their transposase activity, have also been implicated in the
birth of sequence duplications. Kapitonov and Jurka (1999) [84] proposed that the breaks induced by
transposases during transposition (endonucleolytic transposase activity) could favour recombination
processes that repair the double-strand breaks. This event may originate the first repetitions of
a tandem repeat, which could afterwards be amplified into a large array of copies.
Beyond the role suggested in the origin of tandem repeats, TEs have also been implicated in their
relocation/distribution throughout the genome [31,54]. It is logical to believe that, when tandem
repeats are included within a mobile element sequence (for example, when the tandem repeats
originated by duplication of part of a TE sequence) and the TE maintains its competence for
mobilization, the transposition mechanism can easily disperse the tandem repeats throughout the
genome (Figure 4A). This hypothesis is commonly accepted for short tandem repeats such as micro and
minisatellites [115–117], which present short arrays of copies compared to satDNAs [39]. In fact, a
considerable part of the micro and minisatellites in eukaryotic genomes are embedded within mobile
elements [115,117], which points to an important role of TEs in their genomic distribution, explaining
their common dispersed chromosomal location. Moreover, TEs are present in the pericentromeric regions
of a wide range of species [27,70,74], regions that are also mainly built of satDNAs, which certainly
facilitates the dispersion of these highly repeated tandem sequences by retrotransposition.
Regarding LINE-1 retrotransposons, several reports suggest a location of these elements in the
pericentromeric regions of the chromosomes of different mammalian species [27,66,70,118–120], pointing
to an intermingling of these retrotransposons with satDNAs [66,118]. This complex organization
pattern of repetitive sequences in the pericentromeric regions may favour the dispersion of
satDNAs to other genomic locations by LINE-1 retrotransposition, since these elements frequently
allow for the transduction of flanking non-LINE-1 DNA to new genomic locations (Figure 4B). This
transduction is a consequence of an incorrect LINE-1 retrotransposition process [121–123]. Sometimes,
during retrotransposition, the TE sequence, along with its adjacent DNA, is copied and subsequently
integrated into other genomic locations. This results in the duplication and genomic dispersion of
the TE flanking sequences [124], for example, satDNA monomers.
Figure 4. Dispersion of tandem repeats by transposition. (A) Origin of tandem repeats from a part of a
transposable element (TE) and its dispersion by transposition. The first duplications of a tandem repeat
originated from a part of a TE. As these repeats are included in the TE sequence, they could then be dispersed by
transposition. Afterwards, these repetitions may be amplified and homogenized in an array of copies. (B) Transduction
of tandem repeats flanking a retrotransposon and its consequent dispersion throughout the genome by
retrotransposition. The retrotransposon is shown as a green block and the tandem repeats as pink blocks.
Over evolutionary time, the tandem repeats that were moved to new chromosomal locations could be
amplified and homogenized, originating arrays of copies in these locations.
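The transduction scenario of Figure 4B can be sketched as a copy-and-paste operation on a string. The Python function below is a hypothetical illustration: its name, its parameters (including `readthrough`, the number of flanking bases copied together with the element), and the toy genome are assumptions of this example, and it does not model the RNA intermediate or target-site selection.

```python
def retrotranspose_with_transduction(genome: str, te_start: int, te_end: int,
                                     readthrough: int, target: int) -> str:
    """Copy the TE plus `readthrough` bases of 3' flanking DNA and insert the copy at `target`.

    The source locus is left intact (copy-and-paste); `target` is an index in the original genome.
    """
    copied = genome[te_start:te_end + readthrough]
    return genome[:target] + copied + genome[target:]


# Toy genome: a TE (uppercase) followed by two satDNA monomers ("gat") and unique DNA.
genome = "cccc" + "TTGACCTA" + "gatgat" + "aaaaaaaa"
new_genome = retrotranspose_with_transduction(
    genome, te_start=4, te_end=12, readthrough=6, target=22
)
print(new_genome)  # ccccTTGACCTAgatgataaaaTTGACCTAgatgataaaa
```

With `readthrough=0` only the element itself is copied; any positive value carries flanking monomers to the new location, where they could later be amplified.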
Furthermore, we can speculate that DNA transposons can also allow the transduction of
tandem repeat sequences [61], in a process similar to the one recognized in bacteria for the
transfer of genes (e.g., antibiotic resistance genes) within and between bacterial genomes
[125,126]. Sequences flanked by two “cut-and-paste” transposons can probably be mobilized in a
genome when the transposases use the Terminal Inverted Repeats (TIRs) of the two different transposons
to induce the breaks for mobilization (Figure 5A). If the TIRs of each DNA transposon are used
exclusively, only these elements will be mobilized. Interestingly, as mentioned previously, some
similarity exists between the CENP-B box motifs and the transposase recognition sites of DNA
transposons [79], which may also lead to the recognition of CENP-B boxes as break sites for
transposition [127]. Therefore, the common presence of the CENP-B box motif in different satDNA
families [127–130] may be involved in the mobilization of satDNA copies during transposition [127].
Not all copies (monomers) of a satDNA described as presenting CENP-B boxes have these sequence
motifs [131]. Thus, several monomers flanked by a DNA transposon and a CENP-B box could be mobilized
at the same time to another location, and subsequently amplified by different recombinational
mechanisms (Figure 5B). This capacity of DNA transposons to relocate sequences flanked by them, or
specifically by their TIRs, is indeed already used in medicine for gene therapy [132].
Figure 5. Dispersion of tandem repeats by “cut-and-paste” transposons. (A) Mobilization of sequences flanked
by two “cut-and-paste” transposons. The breaks for the mobilization induced by transposases occur at the
terminal inverted repeats (TIRs) of the two transposons. Yellow boxes: tandem repeat monomers, blue boxes:
TIRs, violet boxes: transposase genes, grey boxes: remaining sequences of chromosomes A and B. Chr:
chromosome. (B) Mobilization of sequences flanked by a “cut-and-paste” transposon and a CENP-B box. The
breaks for the mobilization occur at the TIRs of a transposon and at a CENP-B box. Yellow boxes: tandem repeat
monomers, blue boxes: TIRs, violet boxes: transposase genes, orange box: CENP-B box, grey boxes: remaining
sequences of the chromosomes.
3.2. Repetitive Sequences in the Origin of Coding Sequences
Transposable elements and satDNAs could also be involved in the evolution of genes and, most
interestingly, in the origin of new genes or gene variants. The noticeable ability of TEs to produce
genetic mutations when integrating at new genomic sites was recognized more than 50 years ago
[20,133]. Nevertheless, although most of these insertions are either neutral or deleterious to their
host, their insertion into new locations may also be advantageous, promoting gene evolution and the
encoding of more efficient protein variants. One of the most publicized discoveries on this
subject is the resistance to HIV-1 (Human Immunodeficiency Virus) infection in owl monkeys, which
present an altered TRIM5 gene with a cyclophilin A domain acquired by LINE-1 retrotransposition
[109]. The binding of this cyclophilin domain to the HIV-1 viral capsid leads to a disruption of the
infection process [134]. However, these primates are permissive to other immunodeficiency viruses,
such as the simian immunodeficiency virus (SIV) [109].
Recently, some works have shed light on questions about the de novo origin of protein-coding
genes (or variants) from non-coding DNA [75,135–137], such as satDNAs. It is believed that the origin
of completely novel genes from non-coding DNA is an evolutionary process comprising two major
steps. In the first step, the non-coding DNA sequences are transcribed and then acquire translatable
open reading frames [135] (Figure 6). Some works have already reported open reading frames in the
monomers of satDNAs [138,139], an important step that might have allowed these sequences to
evolve into coding sequences.
Figure 6. DNA remodelling process. Evolution of a satellite DNA sequence from a transposable element and its
subsequent conversion into a coding sequence. The reverse direction of the process has not yet been demonstrated.
??: it is so far unknown whether DNA sequences can also evolve in the opposite direction.
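As an illustration of how open reading frames might be looked for in satDNA, the Python sketch below searches the forward strand of a tandem array of monomers for the longest ATG-to-stop ORF. It is a minimal sketch and not the method used in the cited works: the function names and the toy monomer are hypothetical, only the standard stop codons (TAA, TAG, TGA) are considered, and the reverse strand is ignored. Scanning several concatenated monomers rather than a single one is a choice of this example, made because an ORF in an array can span the junction between adjacent copies.

```python
from typing import Optional, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(seq: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the longest ATG...stop ORF on the forward strand, or None."""
    best = None
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if best is None or (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None  # look for the next ORF in this frame
    return best


def longest_orf_in_array(monomer: str, copies: int = 3) -> Optional[Tuple[int, int]]:
    """Search a head-to-tail array so that ORFs crossing monomer junctions are found."""
    return longest_orf(monomer * copies)


monomer = "GCCTAGCATG"                 # toy 10-bp monomer
print(longest_orf(monomer))            # None: the ATG has no in-frame stop in one copy
print(longest_orf_in_array(monomer))   # (7, 16): the ORF ATGGCCTAG spans a junction
```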
4. Concluding Remarks
Pioneering studies on the repetitive fraction of eukaryotic genomes led to the classification of
repetitive sequences into two major groups according to the organization of their copies within the
genome: tandem repeats (such as satDNAs) and dispersed repeats (TEs). Because of that, these sequences
have mostly been investigated separately, without considering the important evolutionary
relationships that exist between them. However, more precise genomic and bioinformatic analyses
have now shown that these sequences do not have such a strict genomic organization. Some satDNAs
may present dispersed isolated monomers in a genome [47], while TEs may have a kind of tandem
organization of their copies [69]. Moreover, it has become evident that a repetitive element can often
be remodelled into a different sequence, either a repetitive non-coding sequence or even a coding
sequence. This suggests that the repetitive DNA elements in eukaryotic genomes are frequently
remodelled, changing their organization and function. Therefore, an update of the classification of
repetitive sequences is now necessary. Here, we have proposed a new classification for these sequences,
based not on the genomic organization of their copies but on other sequence features, namely the
presence or absence of retrotransposon/transposon-specific domains in their copies. Thus, we
propose two new groups for the classification of repetitive sequences: repeats presenting
retrotransposon/transposon-specific domains and repeats not presenting
retrotransposon/transposon-specific domains. This new classification reflects more clearly
the evolutionary relationship between these sequences, and encourages the joint study of
sequences that were previously considered very distinct. Their joint study is indeed very important
for a better understanding of their function in genomes, showing that the evolutionary relationship
between these sequences and the way that they can convert into each other are closely associated with
the evolution of the genomes themselves.
Accordingly, we believe that future combined studies of TEs and tandem repeats,
namely concerning chromosomal location and molecular similarity, will increase our knowledge
about the evolution of eukaryotic genomes. The combined study of related repetitive sequences can
help us to understand the reasons for some evolutionary trajectories of sequences, and how the
resulting genome plasticity makes eukaryotic species better adapted to
environmental conditions.
Author Contributions: Conceptualization, A.P., R.F. and A.V.d.S. and Y.Y.; writing—original draft preparation,
A.P., R.F. and A.V.d.S.; writing—review and editing, A.P., R.F. and A.V.d.S.
Funding: This work was financially supported by a Newfelpro Post-doctoral grant (No. 82) from the Republic of
Croatia, co-financed through the Marie Curie FP7-PEOPLE-2011-COFUND program.
Acknowledgments: We would like to acknowledge Raul Guizzo for carefully reviewing the manuscript.
Conflicts of Interest: The authors declare no conflict of interest. The funders had no role in the design of the
study; in the collection, analyses, or interpretation of data and in the writing of the manuscript.
References
1. Biscotti, M.A.; Olmo, E.; Heslop-Harrison, J.S. Repetitive DNA in eukaryotic genomes. Chromosome Res.
2015, 23, 415–420.
2. Pucci, M.B.; Nogaroto, V.; Moreira-Filho, O.; Vicari, M.R. Dispersion of transposable elements and
multigene families: Microstructural variation in Characidium (Characiformes: Crenuchidae) genomes.
Genet. Mol. Biol. 2018, 41, 585–592.
3. Venner, S.; Feschotte, C.; Biémont, C. Dynamics of transposable elements: Towards a community ecology
of the genome. Trends Genet. 2009, 25, 317–323.
4. Devos, K.M. Grass genome organization and evolution. Curr. Opin. Plant Biol. 2010, 13, 139–145.
5. Sun, C.; Shepard, D.B.; Chong, R.A.; Arriaza, J.L.; Hall, K.; Castoe, T.A.; Feschotte, C.; Pollock, D.D.;
Mueller, R.L. LTR retrotransposons contribute to genomic gigantism in plethodontid salamanders. Genome
Biol. Evol. 2011, 4, 168–183.
6. Lee, S.; Kim, N. Transposable elements and genome size variations in plants. Genomics Inform. 2014, 12, 87–
97.
7. Deniz, Ö.; Frost, J.M.; Branco, M.R. Regulation of transposable elements by DNA modifications. Nat. Rev.
Genet. 2019, doi:10.1038/s41576-019-0117-3.
8. Gregory, T.R. Animal genome size database. Nucleic Acids Res. 2007, 35, D332–D338.
9. Shapiro, J.A.; von Sternberg, R. Why repetitive DNA is essential to genome function. Biol. Rev. Camb. Philos.
Soc. 2005, 80, 227–250.
10. Ferree, P.M.; Prasad, S. How can satellite DNA divergence cause reproductive Isolation? Let us count the
chromosomal ways. Genet. Res. Int. 2012, 2012, 430136.
11. Plohl, M.; Meštrović, N.; Mravinac, B. Satellite DNA Evolution. In Repetitive DNA; Garrido-Ramos, M.A.,
Ed.; Karger Genome Dyn: Basel, Switzerland, 2012; pp. 126–152.
12. Plohl, M.; Meštrović, N.; Mravinac, B. Centromere identity from the DNA point of view. Chromosoma 2014,
123, 313–325.
13. Garrido-Ramos, M.A. Satellite DNA: An evolving topic. Genes 2017, 8, 230.
14. Wichman, H.A.; Payne, C.T.; Ryder, O.A.; Hamilton, M.J.; Maltbie, M.; Baker, R.J. Genomic distribution of
heterochromatic sequences in equids: Implications to rapid chromosomal evolution. J. Hered. 1991, 82, 369–
377.
15. Garagna, S.; Pérez-Zapata, A.; Zuccotti, M.; Mascheretti, S.; Marziliano, N.; Redi, C.A.; Aguilera, M.;
Capanna, E. Genome composition in Venezuelan spiny-rats of the genus Proechimys (Rodentia,
Echimyidae). I. Genome size, C-heterochromatin and repetitive DNAs in situ hybridization patterns.
Cytogenet. Cell Genet. 1997, 78, 36–43.
16. Feschotte, C.; Pritham, E.J. DNA transposons and the evolution of eukaryotic genomes. Annu. Rev. Genet.
2007, 41, 331–368.
17. Vourc’h, C.; Biamonti, G. Long non-coding RNAs. Progress in Molecular and Subcellular Biology. In
Transcription of Satellite DNAs in Mammals; Ugarković, D., Ed.; Springer: Berlin/Heidelberg, Germany, 2011;
pp. 95–118.
18. Zhu, Q.; Pao, G.M.; Huynh, A.M.; Suh, H.; Tonnu, N.; Nederlof, P.M.; Gage, F.H.; Verma, I.M. BRCA1
tumour suppression occurs via heterochromatin-mediated silencing. Nature 2011, 477, 179–184.
19. Hall, L.E.; Mitchell, S.E.; O’Neill, R.J. Pericentric and centromeric transcription: A perfect balance required.
Chromosome Res. 2012, 20, 535–546.
20. Rebollo, R.; Romanish, M.T.; Mager, D.L. Transposable elements: An abundant and natural source of
regulatory sequences for host genes. Annu. Rev. Genet. 2012, 46, 21–42.
21. Enukashvily, N.I.; Ponomartsev, N.V. Mammalian satellite DNA: A speaking dumb. Adv. Protein Chem.
Struct. Biol. 2013, 90, 31–65.
22. Belyayev, A. Bursts of transposable elements as an evolutionary driving force. J. Evol. Biol. 2014, 27, 2573–
2584.
23. Cridland, J.M.; Thornton, K.R.; Long, A.D. Gene expression variation in Drosophila melanogaster due to rare
transposable element insertion alleles of large effect. Genetics 2014, 199, 85–93.
24. Friedli, M.; Trono, D. The developmental control of transposable elements and the evolution of higher
species. Annu. Rev. Cell Dev. Biol. 2015, 31, 429–451.
25. Kim, Y.J.; Han, K. Endogenous retrovirus-mediated genomic variations in chimpanzees. Mob. Genet.
Elements 2015, 4, 1–4.
26. Paço, A.; Adega, F.; Chaves, R. Line-1 retrotransposons: From “parasite” sequences to functional elements.
J. Appl. Genet. 2015, 56, 133–145.
27. Paço, A.; Adega, F.; Meštrović, N.; Plohl, M.; Chaves, R. The puzzling character of repetitive DNA in
Phodopus genomes (Cricetidae, Rodentia). Chromosome Res. 2015, 23, 427–440.
28. Feliciello, I.; Akrap, I.; Ugarković, D. Satellite DNA modulates gene expression in the beetle Tribolium
castaneum after heat stress. PLoS Genet. 2015, 11, e1005547.
29. Song, J.; Dong, F.; Lilly, J.W.; Stupar, R.M.; Jiang, J. Instability of bacterial artificial chromosome (BAC)
clones containing tandemly repeated DNA sequences. Genome 2001, 44, 463–469.
30. Alkan, C.; Kidd, J.M.; Marques-Bonet, T.; Aksay, G.; Antonacci, F.; Hormozdiari, F.; Kitzman, J.O.; Baker,
C.; Malig, M.; Mutlu, O.; et al. Personalized copy number and segmental duplication maps using next-
generation sequencing. Nat. Genet. 2009, 41, 1061–1067.
31. Macas, J.; Koblížková, A.; Navrátilová, A.; Neumann, P. Hypervariable 3′ UTR region of plant
LTR-retrotransposons as a source of novel satellite repeats. Gene 2009, 448, 198–206.
32. Kass, D.H.; Batzer, M.A. Genome Organization/Human; ELS: Princeton, NJ, USA, 2001; pp. 1–8.
33. Richard, G.F.; Kerrest, A.; Dujon, B. Comparative genomics and molecular dynamics of DNA repeats in
eukaryotes. Microbiol. Mol. Biol. Rev. 2008, 72, 686–727.
34. Petrov, D.A. Evolution of genome size: New approaches to an old problem. Trends Genet. 2001, 17, 23–28.
35. Boulesteix, M.; Weiss, M.; Biémont, C. Differences in the genome size between closely related species: The
Drosophila melanogaster species subgroup. Mol. Biol. Evol. 2006, 23, 162–167.
36. Pritham, E.J. Transposable elements and factors influencing their success in eukaryotes. J. Hered. 2009, 100,
648–655.
37. Wheeler, T.J.; Clements, J.; Eddy, S.R.; Hubley, R.; Jones, T.A.; Jurka, J.; Smit, A.F.A.; Finn, R.D. Dfam: A
database of repetitive DNA based on profile hidden Markov models. Nucleic Acids Res. 2013, 41, D70–D82.
38. Hubley, R.; Finn, R.D.; Clements, J.; Eddy, S.R.; Jones, T.A.; Bao, W.; Smit, A.F.; Wheeler, T.J. The Dfam
database of repetitive DNA families. Nucleic Acids Res. 2016, 44, D81–D89.
39. Slamovits, C.H.; Rossi, M.S. Satellite DNA: Agent of chromosomal evolution in mammals. A review. J.
Neotrop. Mamm. 2002, 9, 297–308.
40. Jurka, J.; Kapitonov, V.V.; Kohany, O.; Jurka, M.V. Repetitive sequences in complex genomes: Structure
and evolution. Annu. Rev. Genomics Hum. Genet. 2007, 8, 241–259.
41. Kapitonov, V.V.; Jurka, J. A universal classification of eukaryotic transposable elements implemented in
Repbase. Nat. Rev. Genet. 2008, 9, 411–412.
42. Kapitonov, V.V.; Tempel, S.; Jurka, J. Simple and fast classification of non-LTR retrotransposons based on
phylogeny of their RT domains protein sequence. Gene 2009, 448, 207–213.
43. Strachan, T.; Read, A.P. Human Molecular Genetics 3; Garland Science: London, UK, 2004.
44. Ruiz-Ruano, F.J.; López-León, M.D.; Cabrero, J.; Camacho, J.P. High-throughput analysis of the satellitome
illuminates satellite DNA evolution. Sci. Rep. 2016, 6, 28333.
45. Schlötterer, C.; Harr, B. Microsatellite Instability; ELS: Princeton, NJ, USA, 2001; pp. 1–4.
46. Subirana, J.A.; Messeguer, X. Evolution of tandem repeat satellite sequences in two closely related
Caenorhabditis species. Diminution of Satellites in Hermaphrodites. Genes 2017, 8, 351.
47. Louzada, S.; Vieira-da-Silva, A.; Mendes-da-Silva, A.; Kubickova, S.; Rubes, J.; Adega, F.; Chaves, R. A
novel satellite DNA sequence in the Peromyscus genome (PMSat): Evolution via copy number fluctuation.
Mol. Phylogenet. Evol. 2015, 92, 193–203.
48. Paço, A.; Adega, F.; Meštrović, N.; Plohl, M.; Chaves, R. Evolutionary story of a satellite DNA from
Phodopus sungorus (Rodentia, Cricetidae). Genome Biol. Evol. 2014, 6, 2944–2955.
49. Frommer, M.; Prosser, J.; Tkachuk, D.; Reisner, A.H.; Vincent, P.C. Simple repeated sequences in human
satellite DNA. Nucleic Acids Res. 1982, 10, 547–563.
50. Modi, W.S. Rapid, localized amplification of a unique satellite DNA family in the rodent Microtus
chrotorrhinus. Chromosoma 1993, 102, 484–490.
51. Schmidt, T.; Heslop-Harrison, J.S. Genomes, genes and junk: The largescale organization of plant
chromosomes. Trends Plant. Sci. 1998, 3, 195–199.
52. Henikoff, S.; Ahmad, K.; Malik, H.S. The centromere paradox: Stable inheritance with rapidly evolving
DNA. Science 2001, 293, 1098–1102.
53. Li, W.H. Molecular Evolution; Sinauer: Sunderland, MA, USA, 1997.
54. Palomeque, T.; Lorite, P. Satellite DNA in insects: A review. Heredity 2008, 100, 564–573.
55. Brajković, J.; Feliciello, I.; Bruvo-Madarić, B.; Ugarković, D. Satellite DNA-like elements associated with
genes within euchromatin of the beetle Tribolium castaneum. G3 (Bethesda) 2012, 2, 931–941.
56. Vittorazzi, S.E.; Lourenço, L.B.; Recco-Pimentel, S.M. Long-time evolution and highly dynamic satellite
DNA in leptodactylid and hylodid frogs. BMC Genet. 2014, 15, 111.
57. Escudeiro, A.; Adega, F.; Robinson, T.J.; Heslop-Harrison, J.S.; Chaves, R. Conservation, divergence, and
functions of centromeric satellite DNA families in the Bovidae. Genome Biol. Evol. 2019, 11, 1152–1165.
58. Kronmiller, B.A.; Wise, R.P. TEnest: Automated chronological annotation and visualization of nested plant
transposable elements. Plant Physiol. 2008, 146, 45–59.
59. Goerner-Potvin, P.; Bourque, G. Computational tools to unmask transposable elements. Nat. Rev. Genet.
2018, 19, 688–704.
60. Bourque, G.; Burns, K.H.; Gehring, M.; Gorbunova, V.; Seluanov, A.; Hammell, M.; Imbeault, M.; Izsvák,
Z.; Levin, H.L.; Macfarlan, T.S.; et al. Ten things you should know about transposable elements. Genome
Biol. 2018, 19, 199.
61. Arkhipova, I.R.; Yushenova, I.A. Giant transposons in eukaryotes: Is bigger better? Genome Biol. Evol. 2019,
11, 906–918.
62. Grahn, R.A.; Rinehart, T.A.; Cantrell, M.A.; Wichman, H.A. Extinction of LINE-1 activity coincident with a
major mammalian radiation in rodents. Cytogenet. Genome Res. 2005, 110, 407–415.
63. Konkel, M.K.; Batzer, M.A. A mobile threat to genome stability: The impact of non-LTR retrotransposons
upon the human genome. Semin. Cancer Biol. 2010, 20, 211–221.
64. Soares, M.L.; Edwards, C.A.; Dearden, F.L.; Ferrón, S.R.; Curran, S.; Corish, J.A.; Rancourt, R.C.; Allen, S.E.;
Charalambous, M.; Ferguson-Smith, M.A.; et al. Targeted deletion of a 170-kb cluster of LINE-1 repeats and
implications for regional control. Genome Res. 2018, doi:10.1101/gr.221366.117.
65. Dobigny, G.; Ozouf-Costaz, C.; Bonillo, C.; Volobouev, V. Viability of X-autosome translocations in
mammals: An epigenomic hypothesis from a rodent case-study. Chromosoma 2004, 113, 34–41.
66. Marchal, J.A.; Acosta, M.J.; Bullejos, M.; Puerma, E.; Díaz de la Guardia, R.; Sánchez, A. Distribution of L1-
retroposons on giant sex chromosomes of Microtus cabrerae (Arvicolidae, Rodentia): Functional and
evolutionary implications. Chromosome Res. 2006, 14, 177–186.
67. Giordano, J.; Ge, Y.; Gelfand, Y.; Abrusán, G.; Benson, G.; Warburton, P.E. Evolutionary history of
mammalian transposons determined by genome-wide defragmentation. PLoS Comput. Biol. 2007, 3, e137.
68. Rebuzzini, P.; Castiglia, R.; Nergadze, S.G.; Mitsainas, G.; Munclinger, P.; Zuccotti, M.; Capanna, E.; Redi,
C.A.; Garagna, S. Quantitative variation of LINE-1 sequences in five species and three subspecies of the
subgenus Mus and in five Robertsonian races of Mus musculus domesticus. Chromosome Res. 2009, 17, 65–76.
69. Dhillon, B.; Gill, N.; Hamelin, R.C.; Goodwin, S.B. The landscape of transposable elements in the finished
genome of the fungal wheat pathogen Mycosphaerella graminicola. BMC Genomics 2014, 15, 1132.
70. Vieira-da-Silva, A.; Adega, F.; Guedes-Pinto, H.; Chaves, R. LINE-1 distribution in six rodent genomes
follow a species-specific pattern. J. Genet. 2016, 95, 21–33.
71. Slijepcevic, P. Telomeres and mechanisms of Robertsonian fusion. Chromosoma 1998, 107, 136–140.
72. Bolzán, A.D.; Bianchi, M.S. Telomeres, interstitial telomeric repeat sequences, and chromosomal
aberrations. Mutat. Res. 2006, 612, 189–214.
73. Paço, A.; Chaves, R.; Vieira-da-Silva, A.; Adega, F. The involvement of repetitive sequences in the
remodelling of karyotypes: The Phodopus genomes (Rodentia, Cricetidae). Micron 2013, 46, 27–34.
74. Wong, L.H.; Choo, K.H.A. Evolutionary dynamics of transposable elements at the centromere. Trends
Genet. 2004, 20, 611–616.
75. Xie, C.; Zhang, Y.E.; Chen, J.Y.; Liu, C.J.; Zhou, W.Z.; Li, Y.; Zhang, M.; Zhang, R.; Wei, L.; Li, C.Y.
Hominoid-specific de novo protein-coding genes originating from long non-coding RNAs. PLoS Genet.
2012, 8, e1002942.
76. Kapitonov, V.V.; Holmquist, G.P.; Jurka, J. L1 repeat is a basic unit of heterochromatin satellites in
cetaceans. Mol. Biol. Evol. 1998, 15, 611–612.
77. Plohl, M.; Petrović, V.; Luchetti, A.; Ricci, A.; Šatović, E.; Passamonti, M.; Mantovani, B. Long-term
conservation vs. high sequence divergence: The case of an extraordinarily old satellite DNA in bivalve
molluscs. Heredity 2010, 104, 543–551.
78. Meštrović, N.; Mravinac, B.; Pavlek, M.; Vojvoda-Zeljko, T.; Šatović, E.; Plohl, M. Structural and functional
liaisons between transposable elements and satellite DNAs. Chromosome Res. 2015, 23, 583–596.
79. Kipling, D.; Warburton, P.E. Centromeres, CENP-B and Tigger too. Trends Genet. 1997, 13, 141–145.
80. Smith, G.P. Evolution of repeated DNA sequences by unequal crossover. Science 1976, 191, 528–535.
81. Rossi, M.S.; Pesce, C.G.; Reig, O.A.; Kornblihtt, A.R.; Zorzópulos, J. Retroviral-like features in the repetitive
unit of the major satellite DNA from the South American rodents of the genus Ctenomys. DNA Seq. 1993, 3,
379–381.
82. Batistoni, R.; Pesole, G.; Marracci, S.; Nardi, I. A tandemly repeated DNA family originated from SINE-
related elements in the european Plethodontid Salamanders (Amphibia, Urodela). J. Mol. Evol. 1995, 40, 608–
615.
83. Heikkinen, E.; Launonen, V.; Muller, E.; Bachmann, L. The pvB370 BamHI satellite DNA family of the
Drosophila virilis group and its evolutionary relation to mobile dispersed genetic pDv elements. J. Mol. Evol.
1995, 41, 604–614.
84. Kapitonov, V.V.; Jurka, J. Molecular paleontology of transposable elements from Arabidopsis thaliana.
Genetica 1999, 107, 27–37.
85. López-Flores, I.; de la Herrán, R.; Garrido-Ramos, M.A.; Boudry, P.; Ruiz-Rejón, C.; Ruiz-Rejón, M. The
molecular phylogeny of oysters based on a satellite DNA related to transposons. Gene 2004, 339, 181–188.
86. Coates, B.S.; Kroemer, J.A.; Sumerford, D.V.; Hellmich, R.L. A novel class of miniature inverted repeat
transposable elements (MITEs) that contain hitchhiking (GTCY)n microsatellites. Insect Mol. Biol. 2011, 20,
15–27.
87. Sharma, A.; Wolfgruber, T.K.; Presting, G.G. Tandem repeats derived from centromeric retrotransposons.
BMC Genomics 2013, 14, 1–11.
88. Dias, G.B.; Heringer, P.; Svartman, M.; Kuhn, G.C.S. Helitrons shaping the genomic architecture of
Drosophila: Enrichment of DINE-TR1 in α- and β-heterochromatin, satellite DNA emergence, and piRNA
expression. Chromosome Res. 2015, 23, 597–613.
89. Pasero, P.; Sjakste, N.; Blettry, C.; Got, C.; Marilley, M. Long-range organization and sequence-directed
curvature of Xenopus laevis satellite 1 DNA. Nucleic Acids Res. 1993, 21, 4703–4710.
90. Agudo, M.; Losada, A.; Abad, J.P.; Pimpinelli, S.; Ripoll, P.; Villasante, A. Centromeres from telomeres?
The centromeric region of the Y chromosome of Drosophila melanogaster contains a tandem array of
telomeric HeT-A- and TART-related sequences. Nucleic Acids Res. 1999, 27, 3318–3324.
91. Langdon, T.; Seago, C.; Jones, R.N.; Ougham, H.; Thomas, H.; Forster, J.W.; Jenkins, G. De novo evolution
of satellite DNA on the rye B chromosome. Genetics 2000, 154, 869–884.
92. Miller, W.J.; Nagel, A.; Bachmann, J.; Bachmann, L. Evolutionary dynamics of the SGM transposon family
in the Drosophila obscura species group. Mol. Biol. Evol. 2000, 17, 1597–1609.
93. Cheng, Z.J.; Murata, M. A centromeric tandem repeat family originating from a part of Ty3/gypsy-
retroelement in wheat and its relatives. Genetics 2003, 164, 665–672.
94. Hikosaka, A.; Kawahara, A. Lineage-specific tandem repeats riding on a transposable element of MITE in
Xenopus evolution: A new mechanism for creating simple sequence repeats. J. Mol. Evol. 2004, 59, 738–746.
95. López-Flores, I.; Garrido-Ramos, M.A. The repetitive DNA content of eukaryotic genomes. In Repetitive
DNA; Garrido-Ramos, M.A., Ed.; Karger Genome Dyn: Basel, Switzerland, 2012; pp. 1–28.
96. Tek, A.L.; Song, J.; Macas, J.; Jiang, J. Sobo, a recently amplified satellite repeat of potato, and its
implications for the origin of tandemly repeated sequences. Genetics 2005, 170, 1231–1238.
97. Li, J.; Leung, F.C. A CR1 element is embedded in a novel tandem repeat (HinfI repeat) within the chicken
genome. Genome 2006, 49, 97–103.
98. Shang, W.H.; Hori, T.; Toyoda, A.; Kato, J.; Popendorf, K.; Sakakibara, Y.; Fujiyama, A.; Fukagawa, T.
Chickens possess centromeres with both extended tandem repeats and short non-tandem-repetitive
sequences. Genome Res. 2010, 20, 1219–1228.
99. Arcot, S.S.; Wang, Z.; Weber, J.L.; Deininger, P.L.; Batzer, M.A. Alu repeats: A source for the genesis of
primate microsatellites. Genomics 1995, 29, 136–144.
100. Ramsay, L.; Macaulay, M.; Cardle, L.; Morgante, M.; degli Ivanissevich, S.; Maestri, E.; Powell, W.; Waugh,
R. Intimate association of microsatellite repeats with retrotransposons and other dispersed repetitive
elements in barley. Plant J. 1999, 17, 415–425.
101. Duffy, A.J.; Coltman, D.W.; Wright, J.M. Microsatellites at a common site in the second ORF of L1 elements
in mammalian genomes. Mamm. Genome 1996, 7, 386–387.
102. Clark, R.M.; Dalgliesh, G.L.; Endres, D.; Gomez, M.; Taylor, J.; Bidichandani, S.I. Expansion of GAA triplet
repeats in the human genome: Unique origin of the FRDA mutation at the center of an Alu. Genomics 2004,
83, 373–383.
103. Armour, J.A.L.; Wong, Z.; Wilson, V.; Royle, N.J.; Jeffreys, A.J. Sequences flanking the repeat arrays of
human minisatellites: Association with tandem and dispersed repeat elements. Nucleic Acids Res. 1989, 17,
4925–4935.
104. Kelly, R.G. Similar origins of two mouse minisatellites within transposon-like LTRs. Genomics 1994, 24, 509–
515.
105. Bois, P.; Williamson, J.; Brown, J.; Dubrova, Y.E.; Jeffreys, A.J. A novel unstable mouse VNTR family
expanded from SINE B1 elements. Genomics 1998, 49, 122–128.
106. Jurka, J.; Gentles, A.J. Origin and diversification of minisatellites derived from human Alu sequences. Gene
2006, 365, 21–26.
107. Ahmed, M.; Liang, P. Transposable elements are a significant contributor to tandem repeats in the human
genome. Comp. Funct. Genomics 2012, 2012, 947089.
108. Mogil, L.S.; Slowikowski, K.; Laten, H.M. Computational and experimental analyses of retrotransposon-
associated minisatellite DNAs in the soybean genome. BMC Bioinform. 2012, 13 (Suppl. 2), S13.
109. Sayah, D.M.; Sokolskaja, E.; Berthoux, L.; Luban, J. Cyclophilin A retrotransposition into TRIM5 explains
owl monkey resistance to HIV-1. Nature 2004, 430, 569–573.
110. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human
genome. Nature 2001, 409, 860–921.
111. Mouse Genome Sequencing Consortium; Waterston, R.H.; Lindblad-Toh, K.; Birney, E.; Rogers, J.; Abril, J.F.;
Agarwal, P.; Agarwala, R.; Ainscough, R.; Alexandersson, M.; et al. Initial sequencing and comparative
analysis of the mouse genome. Nature 2002, 420, 520–562.
112. Levinson, G.; Gutman, G.A. Slipped-strand mispairing: A major mechanism for DNA sequence evolution.
Mol. Biol. Evol. 1987, 4, 203–221.
113. Buschiazzo, E.; Gemmell, N.J. The rise, fall and renaissance of microsatellites in eukaryotic genomes.
BioEssays 2006, 28, 1040–1050.
114. Haber, J.E.; Louis, E.J. Minisatellite origins in yeast and humans. Genomics 1998, 48, 132–135.
115. Inukai, T. Role of transposable elements in the propagation of minisatellites in the rice genome. Mol. Genet.
Genomics 2004, 271, 220–227.
116. Van’t Hof, A.E.; Brakefield, P.M.; Saccheri, I.J.; Zwaan, B.J. Evolutionary dynamics of multilocus
microsatellite arrangements in the genome of the butterfly Bicyclus anynana, with implications for other
Lepidoptera. Heredity 2007, 98, 320–328.
117. Smýkal, P.; Kalendar, R.; Ford, R.; Macas, J.; Griga, M. Evolutionary conserved lineage of Angela-family
retrotransposons as a genome-wide microsatellite repeat dispersal agent. Heredity 2009, 103, 157–167.
118. Mayorov, V.I.; Adkison, L.R.; Vorobyeva, N.V.; Khrapov, E.A.; Kholodhov, N.G.; Rogozin, I.B.; Nesterova,
T.B.; Protopopov, A.I.; Sablina, O.V.; Graphodatsky, A.S.; et al. Organization and chromosomal localization
of a B1-like containing repeat of Microtus subarvalis. Mamm. Genome 1996, 7, 593–597.
119. Waters, P.D.; Dobigny, G.; Pardini, A.T.; Robinson, T.J. LINE-1 distribution in Afrotheria and Xenarthra:
Implications for understanding the evolution of LINE-1 in eutherian genomes. Chromosoma 2004, 113, 137–
144.
120. Acosta, M.J.; Marchal, J.A.; Fernández-Espartero, C.H.; Bullejos, M.; Sánchez, A. Retroelements (LINEs and
SINEs) in vole genomes: Differential distribution in the constitutive heterochromatin. Chromosome Res.
2008, 16, 949–959.
121. Goodier, J.L.; Ostertag, E.M.; Du, K.; Kazazian, H.H., Jr. A novel active L1 retrotransposon subfamily in the
mouse. Genome Res. 2001, 11, 1677–1685.
122. Babushok, D.V.; Ohshima, K.; Ostertag, E.M.; Chen, X.; Wang, Y.; Mandal, P.K.; Okada, N.; Abrams, C.S.;
Kazazian, H.H., Jr. A novel testis ubiquitin-binding protein gene arose by exon shuffling in hominoids.
Genome Res. 2007, 17, 1129–1138.
123. Cordaux, R.; Batzer, M.A. The impact of retrotransposons on human genome evolution. Nat. Rev. Genet.
2009, 10, 691–703.
124. Ewing, A.D. Transposable element detection from whole genome sequence data. Mob. DNA 2015, 6, 24.
125. Ochman, H.; Lawrence, J.G.; Groisman, E.A. Lateral gene transfer and the nature of bacterial innovation.
Nature 2000, 405, 299–304.
126. Sun, W.; Gu, J.; Wang, X.; Qian, X.; Tuo, X. Impacts of biochar on the environmental risk of antibiotic
resistance genes and mobile genetic elements during anaerobic digestion of cattle farm wastewater.
Bioresour. Technol. 2018, 256, 342–349.
127. Meštrović, N.; Pavlek, M.; Car, A.; Castagnone-Sereno, P.; Abad, P.; Plohl, M. Conserved DNA motifs,
including the CENP-B Box-like, are possible promoters of satellite DNA array rearrangements in
nematodes. PLoS ONE 2013, 8, e67328.
128. Canapa, A.; Barucca, M.; Cerioni, P.N.; Olmo, E. A satellite DNA containing CENP-B box-like motifs is
present in the Antarctic scallop Adamussium colbecki. Gene 2000, 247, 175–180.
129. Lorite, P.; Carrillo, J.A.; Tinaut, A.; Palomeque, T. Evolutionary dynamics of satellite DNA in species of the
genus Formica (Hymenoptera, Formicidae). Gene 2004, 332, 159–168.
130. Mravinac, B.; Ugarković, Đ.; Franjević, D.; Plohl, M. Long inversely oriented subunits form a complex
monomer of Tribolium brevicornis satellite DNA. J. Mol. Evol. 2005, 60, 513–525.
131. Masumoto, H.; Masukata, H.; Muro, Y.; Nozaki, N.; Okazaki, T. A human centromere antigen (CENP-B)
interacts with a short specific sequence in alphoid DNA, a human centromere satellite. J. Cell Biol. 1989,
109, 1963–1973.
132. Rajabpour, F.V.; Raoofian, R.; Habibi, L.; Akrami, S.M.; Tabrizi, M. Novel trends in genetics: Transposable
elements and their application in medicine. Arch. Iran. Med. 2014, 17, 702–712.
133. Dubin, M.J.; Mittelsten Scheid, O.; Becker, C. Transposons: A blessing curse. Curr. Opin. Plant Biol. 2018, 42,
23–29.
134. Frausto, S.D.; Lee, E.; Tang, H. Cyclophilins as modulators of viral replication. Viruses 2013, 5, 1684–1701.
135. Wu, D.D.; Irwin, D.M.; Zhang, Y.P. De novo origin of human protein-coding genes. PLoS Genet. 2011, 7,
e1002379.
136. Murphy, D.N.; McLysaght, A. De novo origin of protein-coding genes in murine rodents. PLoS ONE 2012,
7, e48650.
137. Wu, B.; Knudson, A. Tracing the de novo origin of protein-coding genes in yeast. MBio 2018, 9, e01024-18.
138. Schmidt, T.; Metzlaff, M. Cloning and characterization of a Beta vulgaris satellite DNA family. Gene 1991,
101, 247–250.
139. Passamonti, M.; Mantovani, B.; Scali, V. Characterization of a highly repeated DNA family in Tapetinae
Species (Mollusca Bivalvia: Veneridae). Zool. Sci. 1998, 15, 599–605.
© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access
article distributed under the terms and conditions of the Creative Commons Attribution
(CC BY) license (http://creativecommons.org/licenses/by/4.0/).
... Zhong et al. (2016) explained that K a and K s values are considered to be important indicators for studying the selection pressure or intensity of protein-coding genes and for estimating the approximate date/s of repetitive events. These result concluded that tandem duplication typically copies one gene each time and help in the evolutionarily successful duplication events are most likely to target genes at the end of a pathway, or genes representing flexible steps, such as those involved in environmental responses (Paço et al. 2019). However, segmental duplication allows for multiple genes to be copied each time, which permits the retention, evolution, and divergence of redundant networks (Lallemand et al. 2020;Sharp et al. 2005). ...
... These results suggested that the SBT gene family in banana experienced strong purifying selective pressure. Tandemly duplicated genes are located next to the original copy or are separated by several unrelated genes(Paço et al. 2019). ...
Article
Subtilisin-like serine proteases (SBT), a serine proteolytic enzymes play an important role in plant growth function and during different stresses responses. The systematic analysis of the SBT gene family in Musa acuminate (MaSBT) has been done and their responses to abiotic stresses in banana variety cv. G-9 were also analyzed. Total of 67 MaSBT genes were identified and based on phylogeny these were grouped into five districted subgroups. Cis-acting element analysis indicated that almost all MaSBT promoters contain regulatory elementary related to growth and development, hormonal regulation, and stress responses. The gene structure and domain analysis showed a maximum of seventeen exons and four functional domains in MaSBT. The 42 orthologous genes, 07 MaSBT paralogous genes were also identified through synteny analysis. The Ka/Ks study indicated that four MaSBT paralogous gene pairs were tandemly duplicated, while the other three were segmental duplications. Further, the expression pattern via RNA-seq data revealed that MaSBTs exhibited differential expression specifically in response to the abiotic stress of low nitrogen. and also during the flowering time. The MaSBT-1.7 gene was found involved in the response to salt stress and flowering. These findings establish a cornerstone for future research on banana's salt stress mechanism. The study offers valuable insights into SBT encoding genes, shedding light on their roles in growth, development, and abiotic stress responses.
... Several models aim to explain generation of tandem repeats from TEs (Grabundzija et al. 2016;Hikosaka and Kawahara 2004;McGurk and Barbash 2018;Xiong et al. 2016,). TEs not only serve as origin for satDNA but were proposed to be facilitators/drivers of their dispersal (Cohen et al. 2010;Grabundzija et al. 2016;Hofstatter et al. 2022;Kuhn et al. 2021;Paço et al. 2019;Tunjić-Cvitanić et al. 2021;Zattera and Bruschi 2022). Whether tandem repeats are derived from TEs by tandemization of their parts or TEs capture parts of satDNA arrays and continue propagating them is probably situation-and genome-dependent. ...
Article
Full-text available
Research on bivalves is fast-growing, including genome-wide analyses and genome sequencing. Several characteristics qualify oysters as a valuable model to explore repetitive DNA sequences and their genome organization. Here we characterize the satellitomes of five species in the family Ostreidae (Crassostrea angulata, C. virginica, C. hongkongensis, C. ariakensis, Ostrea edulis), revealing a substantial number of satellite DNAs (satDNAs) per genome (ranging between 33 and 61) and peculiarities in the composition of their satellitomes. Numerous satDNAs were either associated to or derived from transposable elements, displaying a scarcity of transposable element-unrelated satDNAs in these genomes. Due to the non-conventional satellitome constitution and dominance of Helitron-associated satDNAs, comparative satellitomics demanded more in-depth analyses than standardly employed. Comparative analyses (including C. gigas, the first bivalve species with a defined satellitome) revealed that 13 satDNAs occur in all six oyster genomes, with Cg170/HindIII satDNA being the most abundant in all of them. Evaluating the “satDNA library model” highlighted the necessity to adjust this term when studying tandem repeat evolution in organisms with such satellitomes. When repetitive sequences with potential variation in the organizational form and repeat-type affiliation are examined across related species, the introduction of the terms “TE library” and “repetitive DNA library” becomes essential. Supplementary Information The online version contains supplementary material available at 10.1007/s42995-024-00218-0.
... Tandem repeat are core repeating units of 1 to 200 bases repeated several times in tandem and are widely present in eukaryotic and some prokaryotic genomes [18,19]. A total of 19 tandem repeats were detected in the T. foenum-graecum mt genome, with length distributions ranging from 5 to 57, and 13 tandem repeats had a match rate of > 97%, as shown in Table 4. ...
Article
Full-text available
Background Trigonella foenum-graecum L. is a Leguminosae plant, and the stems, leaves, and seeds of this plant are rich in chemical components that are of high research value. The chloroplast (cp) genome of T. foenum-graecum has been reported, but the mitochondrial (mt) genome remains unexplored. Results In this study, we used second- and third-generation sequencing methods, which have the dual advantage of combining high accuracy and longer read length. The results showed that the mt genome of T. foenum-graecum was 345,604 bp in length and 45.28% in GC content. There were 59 genes, including: 33 protein-coding genes (PCGs), 21 tRNA genes, 4 rRNA genes and 1 pseudo gene. Among them, 11 genes contained introns. The mt genome codons of T. foenum-graecum had a significant A/T preference. A total of 202 dispersed repetitive sequences, 96 simple repetitive sequences (SSRs) and 19 tandem repetitive sequences were detected. Nucleotide diversity (Pi) analysis counted the variation in each gene, with atp6 being the most notable. Both synteny and phylogenetic analyses showed close genetic relationship among Trifolium pratense , Trifolium meduseum , Trifolium grandiflorum , Trifolium aureum , Medicago truncatula and T. foenum-graecum . Notably, in the phylogenetic tree, Medicago truncatula demonstrated the highest level of genetic relatedness to T. foenum-graecum , with a strong support value of 100%. The interspecies non-synonymous substitutions (Ka)/synonymous substitutions (Ks) results showed that 23 PCGs had Ka/Ks < 1, indicating that these genes would continue to evolve under purifying selection pressure. In addition, setting the similarity at 70%, 23 homologous sequences were found in the mt genome of T. foenum-graecum. Conclusions This study explores the mt genome sequence information of T. foenum-graecum and complements our knowledge of the phylogenetic diversity of Leguminosae plants.
... We analyzed a limited number of published SINE-satellite examples and found them inconclusive. Please refer to specific reviews for information about other TE origins of satellites [6,7]. ...
Article
Background The genomes of many eukaryotes contain DNA repeats in the form of both tandem and interspersed elements with distinct structure, evolutionary histories, and mechanisms of emergence and amplification. Although there is considerable knowledge regarding their diversity, there is little evidence directly linking these two types. Results Different tandem repeats derived from portions of short interspersed elements (SINEs) belonging to different families were identified in 56 genomes of squamate reptiles. All loci of SINE-derived satellites (sSats) were thoroughly analyzed. Snake sSats exhibited high similarity in both structure and copy number, while other taxa may have highly diverse (geckos), rare (Darevskia lizards), or missing sSats (agamid lizards). Similar to most satellites associated with heterochromatin, sSats are likely linked to subtelomeric chromosomal regions. Conclusions Discovered tandem repeats derived from SINEs exhibit satellite-like properties, although they have not amplified to the same degree as typical satellites. The autonomous emergence of distinct sSats from diverse SINE families in numerous squamate species suggests a nonrandom process of satellite genesis originating from repetitive SINEs.
... The classification of tandem repeats into three classes, namely microsatellite, minisatellite, and satellite, is based on the length of their repeating pattern [4]. Although there is no unanimous agreement, the majority of researchers take patterns of sizes 1 to 7 (inclusive) for microsatellites. ...
Preprint
Tandem repeats (TRs) are subsequences of DNA or any genomic sequence composed of many consecutive repeats of a pattern in the same direction. TRs form about three percent of human DNA. Tandem repeats are extremely unstable and highly vulnerable to mutations. Mutated TRs can cause several diseases, such as neurodegeneration and ovarian insufficiency. The tandem repeats are divided into various classes, and consequently, computer specialists have developed many software packages, each of which is usually able to detect a class of consecutive repeats. The boundaries of these classes of tandem repeats are blurred, and it is often necessary to use more than one software tool to detect all TRs of a range of pattern lengths. Besides, if a TR with a specific pattern length is of interest, the complete software has to be run, and the output has to be manually searched. In this research, a single software package is developed to discover TRs of all repeating pattern lengths. Users are allowed to specify the range of pattern lengths of interest as input to the algorithm. The Multi-head Reader Arm (MRA) Algorithm is based on the idea of a multi-head reader arm moving on a given genomic sequence from beginning to end. Each pair of heads is composed of the base head and one of the other heads and it is responsible for finding all TRs of a specific pattern length. The innovative MRA has the ability to discover both exact and inexact TRs of all pattern lengths. Compared to the existing state-of-the-art TR detection software, we have demonstrated that MRA is superior with respect to simplicity, accuracy, computational time, and space requirement.
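The idea of comparing a base position with a second position a fixed pattern length ahead can be illustrated with a minimal Python sketch. This is not the MRA algorithm itself: it finds exact repeats only, and the function names, the default thresholds, and the label used for longer patterns are assumptions made for illustration. The 1 to 7 bound for microsatellites follows the convention quoted in the excerpt above.

```python
def is_primitive(unit):
    """True if `unit` is not itself a repetition of a shorter string."""
    # A string is a power of a shorter string exactly when it occurs
    # inside its own doubling with the first and last characters removed.
    return unit not in (unit + unit)[1:-1]


def find_exact_tandem_repeats(seq, min_period=1, max_period=7, min_copies=2):
    """Return (start, period, copies, unit) for maximal exact tandem repeats.

    Hypothetical helper: for each pattern length p, position j is compared
    with position j + p, and runs of consecutive matches are collected.
    """
    hits = []
    n = len(seq)
    for p in range(min_period, max_period + 1):
        j = 0
        while j + p < n:
            if seq[j] != seq[j + p]:
                j += 1
                continue
            start = j
            # Extend the run while the two compared positions keep agreeing.
            while j + p < n and seq[j] == seq[j + p]:
                j += 1
            run = j - start
            copies = (run + p) // p  # complete copies of the unit
            unit = seq[start:start + p]
            if copies >= min_copies and is_primitive(unit):
                hits.append((start, p, copies, unit))
    return hits


def classify_by_pattern_length(period):
    """Label a repeat by pattern length (1 to 7 treated as microsatellite)."""
    if 1 <= period <= 7:
        return "microsatellite"
    return "minisatellite or satellite"
```

For example, `find_exact_tandem_repeats("GGACACACACTT", min_copies=3)` returns `[(2, 2, 4, "AC")]`: four copies of the unit `AC` starting at index 2. Restricting `min_period` and `max_period` mirrors the option, described in the abstract, of searching only a chosen range of pattern lengths.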
... The total length of the scattered repetitive sequences was 47506 bp, accounting for 13.75% of the total length of the mt genome. The length of each repeat sequence and the number of repeat types are detailed in Table 3. widely found in eukaryotes and some prokaryotes [22]. A total of 19 tandem repeats were detected in the T. foenum-graecum mt genome, with length distributions ranging from 5-57, and 13 tandem repeats had a match rate of > 97%, as shown in Table 5. ...
Preprint
Background Trigonella foenum-graecum L. (T. foenum-graecum) is a Leguminosae plant, and the stems, leaves, and seeds of this plant are rich in chemical components that are of high research value. The chloroplast (cp) genome of T. foenum-graecum has been reported, but the mitochondrial (mt) genome remains unexplored. Results In this paper, we use second- and third-generation sequencing methods, which have the dual advantage of combining high accuracy and longer read length. The T. foenum-graecum mitochondrial genome was assembled and other analyses such as annotation of the assembled sequences were performed. The results showed that the mitochondrial genome of T. foenum-graecum was 345,604 bp in length and 45.28% in GC content. There are 59 genes, including: 33 protein-coding genes (PCGs), 21 tRNA genes, 4 rRNA genes and 1 pseudo gene. Among them, 11 genes contained introns. The codons of the mitochondrial genome of T. foenum-graecum showed a significant A/T preference. A total of 202 dispersed repetitive sequences, 96 simple repetitive sequences (SSRs) and 19 tandem repetitive sequences were detected. Nucleotide polymorphism analysis counted the variation in each gene, with atp6 being the most notable. Both synteny and phylogenetic analyses showed that T. foenum-graecum was similar to Trifolium pratense, Trifolium meduseum, Trifolium grandiflorum, Trifolium aureum, Medicago truncatula, which are five species of Leguminosae with high similarity. Among them, the highest similarity with Medicago truncatula was 100%. The interspecies non-synonymous substitutions (Ka)/synonymous substitutions (Ks) results showed that 23 protein-coding genes had Ka/Ks < 1, indicating that these genes would continue to evolve under purifying selection pressure. In addition, 23 homologous sequences were detected in the mitochondrial genome of T. foenum-graecum, and tRNAs were more conserved than PCGs during gene migration.
Conclusions This paper explores the mitochondrial genome sequence information of T. foenum-graecum and advances the phylogenetic diversity of Leguminosae plants.
Article
Lithocarpus litseifolius (Hance) Chun (L. litseifolius 1837) is an evergreen tree of the Fagaceae. It is commonly known as sweet tea and is a natural sweetener with high levels of dihydrochalcone. In addition, L. litseifolius is a precious medicinal material, and its phlorizin has a unique role in the treatment of diabetes. This investigation aimed to assemble and scrutinize the entire mitochondrial (mt) genome of L. litseifolius. The circular mt genome of L. litseifolius spans 573,177 bp and has a GC content of 45.61%. The mt genome of L. litseifolius comprises 61 genes, of which 21 are tRNA genes, 3 are rRNA genes, 36 are protein-coding genes (PCGs), and 1 is a pseudogene. Tetramer repeats made up 32.57% of all identified simple repeat sequences (SSRs), making them the most abundant type of SSR. 35 PCGs with a combined length of 32,208 bp were predicted to include a total of 461 RNA editing sites in the L. litseifolius mt genome. Besides, nine homologous genes between the chloroplast and mt genomes of L. litseifolius were identified. Furthermore, this study demonstrated that while plant mt genome sizes vary considerably, the GC content of these genomes has remained largely constant. The most conservative genes are atp6, rps1, ccmC, rpl2, nad4, nad7, and trnY-GTA. The phylogenetic analysis confirmed that L. litseifolius was genetically more closely related to Quercus variabilis. This study establishes the groundwork for investigations on the systematic evolution, genetic variability, and breeding of L. litseifolius.
Preprint
Background Neotropical annual killifish are able to survive in seasonal ponds due to their ability to undergo embryonic diapauses in the dry season and grow, reproduce and die in the span of a few months during the rainy season. The genus group Austrolebias is endemic to the South American basins and shows remarkable speciation and genetic plasticity. Austrolebias charrua co-exists with another annual killifish, Cynopoecilus melanotaenia, from which it diverged about 25 million years ago. Despite their similar life histories, both species show important differences in genome size. It is of interest to explore the genomic structure of these species as a basis for understanding their evolution and unique adaptations. Results We have sequenced the genomes of A. charrua and C. melanotaenia and have determined that they show important structural differences between them. While A. charrua has undergone an evolutionarily recent and massive genome expansion, with a size (3 Gb) that triples that of most characterized teleosts, C. melanotaenia has retained a genome size of 1 Gb. The expansion of the genome in A. charrua has occurred due to amplification of repetitive elements, most recently of the LINE class of elements. We explore and characterize in detail the contribution to genome expansion of repetitive elements at the level of superfamilies, as well as analyze the relationship between these elements and coding genes in Austrolebias charrua. We also examine the selection pressures on gene sequences and identify functions that are under positive or purifying selection, and compare these data with those derived from other species. Conclusions Our study adds a crucial element to the understanding of annual fish evolution and life history. We show that the genetic variability and plasticity in A. charrua is accompanied by a recent genome-wide expansion with an important contribution of repetitive elements.
By comparing these findings with data from other species, we show that Austrolebias has undergone bursts of repetitive element expansion, with specific superfamilies of retrotransposons and DNA transposons being the most prevalent and recent. In addition, we characterize genes that are potentially implicated in adaptive traits because of their interaction with mobile elements or because they display evidence of positive selection. These genes are candidates for functional studies aimed at unraveling the genetic basis for annualism in this group of teleosts.
Preprint
Lithocarpus litseifolius (Hance) Chun (L. litseifolius 1837) is an evergreen tree of Fagaceae, commonly known as sweet tea. L. litseifolius is a natural sweetener with high levels of dihydrochalcone. In addition, L. litseifolius is a precious medicinal material, and its phlorizin has a unique role in the treatment of diabetes. This investigation aimed to assemble and scrutinize the entire mitochondrial (mt) genome of L. litseifolius. The circular mt genome of L. litseifolius spans 573,177 bp and has a GC content of 45.61%. The mt genome of L. litseifolius comprises 61 genes, of which 21 are tRNA genes, 3 are rRNA genes, 36 are protein-coding genes (PCGs), and 1 is a pseudogene. Tetramer repeats made up 32.57% of all identified simple repeat sequences (SSRs), making them the most abundant type of SSR. 35 PCGs with a combined length of 32,208 bp were predicted to include a total of 461 RNA editing sites in the L. litseifolius mt genome. Besides, nine homologous genes between the chloroplast and mt genomes of L. litseifolius were identified. Furthermore, our findings demonstrated that while plant mt genome sizes vary considerably, the GC content of these genomes has remained largely constant. Seven genes were found to be the most conserved: atp6, rps1, ccmC, rpl2, nad4, nad7, and trnY-GTA. The phylogenetic analysis confirmed that L. litseifolius clustered most closely with Quercus variabilis. This study establishes the groundwork for investigations on the systematic evolution, genetic variability, and breeding of L. litseifolius.
Article
Background: Repetitive sequences constitute the major portion of genomic DNA in most organisms and are responsible for variation in DNA structure and function. These sequences also have the potential to adopt various noncanonical DNA structures. Methods: Using a swift, manual approach, mirror repeats were identified within the complete engrailed homeobox-1 gene (en-1) of X. tropicalis. Another tool, Non-B DNA motif search, was also deployed for comparative analysis. Results: A total of 166 mirror repeats were identified within the complete en-1 gene of X. tropicalis. Similar sequences were also searched for in the genomes of different organisms such as Xenopus laevis, Caenorhabditis elegans, and Drosophila melanogaster. Conclusion: To the best of our knowledge, this is a novel identification of mirror repeats in the engrailed-1 gene of X. tropicalis. A few of these sequences may adopt various noncanonical B-DNA forms and are potent sites for mutation and recombination events.
Article
Repetitive satellite DNA (satDNA) sequences are abundant in eukaryote genomes, with a structural and functional role in centromeric function. We analyzed the nucleotide sequence and chromosomal location of the five known cattle (Bos taurus) satDNA families in seven species from the tribe Tragelaphini (Bovinae subfamily). One of the families (SAT1.723) was present at the chromosomes’ centromeres of the Tragelaphini species, as well as in two more distantly related bovid species, Ovis aries and Capra hircus. Analysis of the interaction of SAT1.723 with centromeric proteins revealed that this satDNA sequence is involved in the centromeric activity in all the species analyzed and that it has been preserved for at least 15–20 Myr across Bovidae species. The satDNA sequence similarity among the analyzed species reflected different stages of homogeneity/heterogeneity, revealing the evolutionary history of each satDNA family. The SAT1.723 monomer-flanking regions showed the presence of transposable elements, explaining the extensive shuffling of this satDNA between different genomic regions.
Article
Homeobox (HOX) transcription factors, encoded by a subset of homeodomain superfamily genes, play pivotal roles in many aspects of cellular physiology, embryonic development, and tissue homeostasis. Findings over the past decade have revealed that mutations in HOX genes can lead to increased cancer predisposition, and HOX genes might mediate the effect of many other cancer susceptibility factors by recognizing or executing altered genetic information. Remarkably, several lines of evidence highlight the interplays between HOX transcription factors and cancer risk loci discovered by genome-wide association studies, thereby gaining molecular and biological insight into cancer etiology. In addition, deregulated HOX gene expression impacts various aspects of cancer progression, including tumor angiogenesis, cell autophagy, proliferation, apoptosis, tumor cell migration, and metabolism. In this review, we will discuss the fundamental roles of HOX genes in cancer susceptibility and progression, highlighting multiple molecular mechanisms of HOX involved gene misregulation, as well as their potential implications in clinical practice.
Article
Transposable elements (TEs) are ubiquitous in both prokaryotes and eukaryotes, and the dynamic character of their interaction with host genomes brings about numerous evolutionary innovations and shapes genome structure and function in a multitude of ways. In traditional classification systems, TEs are often being depicted in simplistic ways, based primarily on the key enzymes required for transposition, such as transposases/recombinases and reverse transcriptases. Recent progress in whole-genome sequencing and long-read assembly, combined with expansion of the familiar range of model organisms, resulted in identification of unprecedentedly long transposable units spanning dozens or even hundreds of kilobases, initially in prokaryotic and more recently in eukaryotic systems. Here, we focus on such oversized eukaryotic TEs, including retrotransposons and DNA transposons, outline their complex and often combinatorial nature and closely intertwined relationship with viruses, and discuss their potential for participating in transfer of long stretches of DNA in eukaryotes.
Article
HOX and TALE genes encode homeodomain (HD)-containing transcription factors that act in concert in different tissues to coordinate cell fates and morphogenesis throughout embryonic development. These two evolutionary conserved families contain several members that form different types of protein complexes on DNA. Mutations affecting the expression of HOX or TALE genes have been reported in a number of cancers, but whether and how the two gene families could be perturbed together has never been explored systematically. As a consequence, the putative collaborative role between HOX and TALE members for promoting or inhibiting oncogenesis remains to be established in most cancer contexts. Here, we address this issue by considering HOX and TALE expression profiling in normal and cancer adult tissues, using normalized RNA-sequencing expression data deriving from The Cancer Genome Atlas (TCGA) and Genotype-Tissue Expression (GTEx) research projects. Information was extracted from 28 cancer types originating from 21 different tissues, constituting a unique comparative analysis of HOX and TALE expression profiles between normal and cancer contexts in human. We present the general and specific rules that could be deduced from this large-scale comparative analysis. Overall this work provides a precious annotated support to better understand the role of specific HOX/TALE combinatorial codes in human cancers.
Article
Background Chemotherapy is the primary established systemic treatment for patients with breast cancer, especially those with the triple-negative subtype. Simultaneously, the resistance of triple-negative breast cancer (TNBC) to chemotherapy remains a major clinical problem. Our previous study demonstrated that the expression levels of PTN and its receptor PTPRZ1 were upregulated in recurrent TNBC tissue after chemotherapy, and this increase was closely related to poor prognosis in those patients. However, the mechanism and function of chemotherapy-driven increases in PTN/PTPRZ1 expression are still unclear. Methods We compared the expression of PTN and PTPRZ1 between normal breast and cancer tissues as well as before and after chemotherapy in cancer tissue using the microarray analysis data from the GEPIA database and GEO database. The role of chemotherapy-driven increases in PTN/PTPRZ1 expression was examined with a CCK-8 assay, colony formation efficiency assay and apoptosis analysis with TNBC cells. The potential upstream pathways involved in the chemotherapy-driven increases in PTN/PTPRZ1 expression in TNBC cells were explored using microarray analysis, and the downstream mechanism was dissected with siRNA. Results We demonstrated that the expression of PTN and PTPRZ1 was upregulated by chemotherapy, and this change in expression decreased chemosensitivity by promoting tumour proliferation and inhibiting apoptosis. CDKN1A was the critical switch that regulated the expression of PTN/PTPRZ1 in TNBC cells receiving chemotherapy. We further demonstrated that the mechanism of chemoresistance by chemotherapy-driven increases in the CDKN1A/PTN/PTPRZ1 axis depended on the NF-κB pathway. Conclusions Our studies indicated that chemotherapy-driven increases in the CDKN1A/PTN/PTPRZ1 axis play a critical role in chemoresistance, which suggests a novel strategy to enhance chemosensitivity in breast cancer cells, especially in those of the triple-negative subtype.
Article
Transposable elements (TEs) are major components of eukaryotic genomes. However, the extent of their impact on genome evolution, function, and disease remains a matter of intense interrogation. The rise of genomics and large-scale functional assays has shed new light on the multi-faceted activities of TEs and implies that they should no longer be marginalized. Here, we introduce the fundamental properties of TEs and their complex interactions with their cellular environment, which are crucial to understanding their impact and manifold consequences for organismal biology. While we draw examples primarily from mammalian systems, the core concepts outlined here are relevant to a broad range of organisms.
Article
Background Some researchers have reported that pleiotrophin (PTN) is associated with the development and metastasis of various tumors and that it is a poor prognostic factor for tumor patients. However, the results of other studies are inconsistent with these reports. A meta-analysis is therefore needed to reach a definite conclusion. Methods The published studies relevant to PTN were searched in the databases including PubMed, Embase and Web of Science until March 20, 2018. A meta-analysis was conducted to evaluate the role of PTN in clinicopathological characteristics and overall survival (OS) of cancer patients. Results Our meta-analysis indicated that the high expression of PTN was remarkably associated with advanced TNM stage (OR = 2.79, 95%CI: 1.92–4.06, P<0.00001) and poor OS (HR = 1.77, 95%CI: 1.41–2.22, P<0.00001) in tumor patients. The expression of PTN was not associated with tumor size (OR = 1.12, 95% CI: 0.55–2.26, P = 0.76), lymph node metastasis (LNM) (OR = 1.95, 95%CI: 0.62–6.12, P = 0.25), distant metastasis (DM) (OR = 2.78, 95%CI: 0.72–10.74, P = 0.14) and histological grade (OR = 1.95, 95%CI: 0.98–3.87, P = 0.06). Conclusion The high expression of PTN is significantly associated with advanced TNM stage and poor OS in tumor patients. PTN can serve as a promising biomarker to predict unfavorable survival outcomes, and it may be a potential target for tumor treatment.
Article
The originally published article contained an error in Figure 2a: for the left side of the figure part (showing piRNA-directed DNA methylation of mouse transposable elements), DNMT3A/B should have been DNMT3C. The article has now been corrected online.
Article
Osteosarcoma is the most common malignant bone tumor in children and adolescents. Aberrant expression of HOXA5 results in various diseases, including cancers. However, the specific function and molecular mechanism of HOXA5 in osteosarcoma is not fully understood. In the present study, we focused on HOXA5 in U2OS and MG63 cells in vitro. We observed lower expression of HOXA5 in U2OS, MG63, and SaOS2 human osteosarcoma cells, compared with hFOB1.19 human osteoblastic cells. HOXA5 overexpression in U2OS and MG63 cells markedly reduced cell survival and proliferation and elevated cell apoptosis and caspase-3 activity. HOXA5 also activated the p38α MAPK pathway by increasing p53. Treating U2OS and MG63 cells with the p53 inhibitor α-pifithrin or the p38α MAPK inhibitor SB203580 led to higher cell survival and proliferation and lower cell apoptosis, compared with the pcDNA3.1-HOXA5 group. In conclusion, our study showed that the p53 and p38α MAPK signal axis facilitated HOXA5's role in inhibiting growth and stimulating apoptosis of osteosarcoma cells.
Article
An ancient gene cluster controls the formation of repetitive body parts in a sea anemone