
The Y‐STR landscape of coastal southeastern Han: Forensic characteristics, haplotype analyses, mutation rates, and population genetics

The Y-STR landscape of Coastal Southeastern Han (CSEH) living in southeastern China (including Guangdong, Fujian, and Zhejiang provinces) is still unclear. We investigated 62 Y-STR markers in a reasonably large sample of 1,021 unrelated males and 1,027 DNA-confirmed father-son pairs to broaden the genetic background of CSEH. In total, 85 null alleles, 121 off-ladder alleles, and 95 copy number variants were observed, and 1,012 distinct haplotypes were determined, with overall HD and DC values of 0.999974 and 0.9912. We observed 369 mutations in 76,099 meiotic transfers, and the average estimated Y-STR mutation rate was 4.85 × 10⁻³ (95% CI, 4.4 × 10⁻³–5.4 × 10⁻³). Spearman correlation analyses indicated that GD values (R² = 0.6548) and average allele sizes (R² = 0.5989) are positively correlated with Y-STR mutation rates. Our RM Y-STR set, comprising 8 candidate RM Y-STRs of which DYS534, DYS630, and DYS713 are new candidates in CSEH, distinguished 18.52% of father-son pairs. This study also clarified the population structure of CSEH, which is relatively isolated within the population-mixed South China. The strategy of using SM Y-STRs for familial searching and RM Y-STRs for regional individual identification could be applicable given sufficient knowledge of the Y-STR mutability of different populations.
Electrophoresis 2021, 0, 1–16
Haoliang Fan1
Ying Zeng1
Weiwei Wu3
Hong Liu2
Quyi Xu2
Weian Du1
Honglei Hao3
Changhui Liu2
Wenyan Ren3
Weibin Wu1
Ling Chen1
Chao Liu1,2
1School of Forensic Medicine,
Southern Medical University,
Guangzhou, P. R. China
2Guangzhou Forensic Science
Institute, Guangzhou,
P. R. China
3Zhejiang Key Laboratory of
Forensic Science and
Technology, Institute of Forensic
Science of Zhejiang Provincial
Public Security Bureau,
Hangzhou, P. R. China
Received February 4, 2021
Revised April 16, 2021
Accepted May 15, 2021
Research Paper
Keywords: Capillary electrophoresis / Forensic landscape / Mutation rate / Population genetics / Y-STR
DOI 10.1002/elps.202100037
Additional supporting information may be found online in the Supporting Information section at the end of the article.
Correspondence: Dr. Ling Chen, School of Forensic Medicine, Southern Medical University, Guangzhou 510515, P. R. China.
E-mail: lingpzy@163.com
Dr. Chao Liu, School of Forensic Medicine, Southern Medical University, Guangzhou 510515, P. R. China.
E-mail: liuchaogzf@163.com
Abbreviations: AMOVA, analysis of molecular variance; CI, confidence interval; CNAS, China National Accreditation Service for Conformity Assessment; CSEH, Coastal Southeastern Han; DC, discrimination capacity; FJH, Fujian Han; FM Y-STRs, fast mutating Y-STRs; FUH, fraction of unique haplotypes; GD, gene diversity; GDH, Guangdong Han; G/L, gains:losses mutations; HD, haplotype diversity; ILAC, International Laboratory Accreditation Cooperation; ISFG, International Society of Forensic Genetics; iTOL, Interactive Tree of Life; LM, length of repetitive motif for each Y-STR; Lo, each Y-STR location in Y chromosome; μ, mutation rate; MDS, multidimensional scaling plot; MEGA, Molecular Evolutionary Genetics Analysis; MM Y-STRs, moderately mutating Y-STRs; MP, match probability; MSY, male-specific region of the Y chromosome; N, numbers of father-son pairs; p, short arm; q, long arm; R, R Project for Statistical Computing; RM Y-STRs, rapidly mutating Y-STRs; RST, pairwise genetic distances; S/M, single-repeat:multi-repeat mutations; SM Y-STRs, slowly mutating Y-STRs; SWGDAM, Scientific Working Group on DNA Analysis Methods; XDG, X degenerate class; XTR, X transposed region; YHRD, Y-Chromosome STR Haplotype Reference Database; Y-STRs, Y chromosome short tandem repeats; ZJH, Zhejiang Han
H. Fan, Y. Zeng and W. Wu contributed equally to this work.
Color online: See article online to view Figs. 1–3 in color.
© 2021 Wiley-VCH GmbH www.electrophoresis-journal.com
1 Introduction
Male specificity, haploidy, and escape from crossing over are the properties of the human Y chromosome that make it an unusual component of the genome, and the genetic variations of the Y chromosome have become an increasingly vital part of studies in forensic genetics [1], genealogy [2], human evolution [3], archaeology [4], population history [5,6], and male medical genetics [6,7]. The human Y chromosome (Fig. 1A) includes the pseudoautosomal regions, which are two segments of sequence homology at the tips of the short and long arms, and the male-specific region of the Y chromosome (MSY), which escapes crossing over. Within the MSY, Y chromosome short tandem repeats (Y-STRs), one kind of genetic variation of the Y chromosome, are widely used in forensic genetics. Commonly, Y-STRs are used for solving sexual assault cases in which identifying the DNA from the male perpetrator's sperm cells requires overcoming a huge excess of female background, so that it is practically impossible to apply autosomal STRs to identify the male perpetrator even after using differential lysis to enrich sperm DNA [1,8–10]. In addition, the Y-STR haplotype obtained from DNA mixtures at crime scenes can determine the paternal lineage to which the male crime scene trace donor belongs [1]. However, because one or more mutations may exist, familial searching, forensic genealogy, and surname prediction based only on forensic Y-STR evidence, without other unambiguous evidence, should be treated with the utmost caution, especially in judicial applications.
Figure 1. Compositions of the human Y chromosome and information on the 62 Y-STR markers.
A. Of the MSY, approximately half is a variably sized block of heterochromatin, and the remaining 23 Mb of euchromatin is composed of three major sequence classes: X degenerate (XDG) class, X transposed region (XTR), and ampliconic regions (intrachromosomal repeats of high sequence similarity); B. The locations on the Y chromosome and the different groups of the 62 Y-STRs genotyped in CSEH. The set of 13 RM Y-STRs was highlighted by Ballantyne et al. [14] in Europe. In addition, we obtained three new candidate RM Y-STRs in CSEH (DYS534, DYS630, and DYS713).
In general, forensic DNA analysis seeks individual identification [11]. Male relative differentiation using Y-chromosome markers has been achieved with rapidly mutating Y-STRs, termed RM Y-STRs [12,13], which have mutation rates (μ) > 10⁻². A large empirical Y-STR mutation rate study of 186 bioinformatically identified Y-STRs in 2000 DNA-confirmed European father–son pairs identified a set of 13 RM Y-STR markers (DYF387S1, DYF399S1, DYF403S1, DYF404S1, DYS449, DYS518, DYS526, DYS547, DYS570, DYS576, DYS612, DYS626, and DYS627), which are capable of differentiating nearly 70% of father–son pairs, 56% of brothers, and 67% of cousins in 103 pairs from 80 male pedigrees [14]. Subsequently, the possibility of differentiating closely related and unrelated males with the set of 13 RM Y-STR markers, which allows Y-STR analysis to approach the level of individual identification, was demonstrated by many confirmatory studies [15–21]. However, the identified set of 13 RM Y-STR markers has limitations for male relative differentiation, particularly regarding close patrilineal relatives and the different mutation rates of the same Y-STR marker in different populations, which limit its applications in forensic genetics and genetic genealogy [22]. Hence, there are two ways to strengthen the capacity of the RM Y-STR system for differentiating related or unrelated males: (a) find novel RM Y-STR markers to expand the pool of RM Y-STRs [11], and (b) validate and evaluate currently known Y-STRs as new candidate RM Y-STR markers in different populations for regional individual identification. Given the different mutation rates of Y-STR markers, the strategy of using slowly mutating Y-STRs (SM Y-STRs, μ < 10⁻³) [13] for familial searching and RM Y-STRs for individual identification becomes highly feasible with the establishment of forensic Y-STR databases and the accumulation of Y-STR profiles.
Coastal southeast China, including Guangdong, Fujian, and Zhejiang provinces, has a population of 213 million, which accounted for 15.9% of the total Chinese population according to the 2010 national census, and the overwhelming majority of its people are Han Chinese. Previous studies have reported the mutation rates of Y-STRs in Zhejiang Han [23] and Guangdong Han [24,25] at the provincial level; however, no study has addressed coastal southeast China as a whole. At the same time, considering the different divisions of provincial administrative regions in different periods of Chinese history and the increasingly frequent small-scale interprovincial migration in southeast China, a comprehensive and systematic study of Coastal Southeastern Han (CSEH) is needed to clarify the population structure and determine the mutation rates of more genetic markers for accurate familial searching and regional individual identification based on Y-STRs with different mutation rates.
Consequently, to address three main issues in CSEH, namely (a) the limited knowledge of the mutability of currently known Y-STRs, (b) the lack of well-matched RM Y-STRs for forensic Y chromosome applications, and (c) the unclear population structure, we investigated 62 frequently used Y-STR markers (comprising 74 Y-STR loci) in 1027 DNA-confirmed father-son pairs. This is a relatively large-scale investigation, with the largest number of Y-STRs studied in China so far, to determine mutation rates and characteristics and to explore the molecular factors influencing Y-STR mutability. In addition, we characterized the forensic features and system efficiencies of CSEH based on different sets with distinct numbers of Y-STRs, and conducted population genetic analyses to clarify the population structure. Finally, the paternal Y-STR landscape of CSEH has been clarified by these analyses, and the candidate RM Y-STRs should be more suitable for CSEH and have potential forensic applications for regional male relative differentiation.
2 Materials and methods
2.1 Sample preparation
Buccal swab samples were collected from two-generation pedigrees in the coastal region of southeastern China, comprising 1021 unrelated fathers who each had one or two sons; details are presented in Supplementary Table S1. A total of 1027 DNA-confirmed father-son pairs (2048 individuals in all) were recruited, and the inclusion criteria were: (1) the parents and grandparents of each volunteer (especially of the fathers in father–son pairs) were Han Chinese and had lived in the coastal provinces (Guangdong, Fujian, and Zhejiang) with no divorce or cross-province migration; (2) three generations of the volunteers' spouses had lived in the same provinces and were all Han Chinese, as confirmed by the volunteers' self-declared statements; (3) the paternity of the father-son pairs had been confirmed using autosomal STRs (GlobalFiler™ PCR Amplification Kit, Thermo Fisher Scientific, Waltham, Massachusetts, USA) with paternity index values greater than 10 000. All participants signed the informed consent form (for underage sons, the form was signed by their fathers in accordance with Chinese laws and regulations), and the study was approved by the Biomedical Ethical Committee of the Southern Medical University (No. 2020-001).
2.2 Y-STR genotyping
Genomic DNA extraction was performed using the Freedom EVO® 150–8 system (Tecan Corporation, Switzerland). As shown in Supporting Information Table S2, we used two multiplex amplification systems in the present study: the AGCU Y-LM Kit (AGCU ScienTech Incorporation, Wuxi, Jiangsu, China) and the HomyGene RM Y32 Kit (HomyGene, Foshan, Guangdong, China). In total, 62 Y-STR markers (comprising 74 Y-STR loci), consisting of 54 single-copy and 8 multi-copy Y-STR markers, together with one Y-InDel (rs199815934), were genotyped in 1027 father–son pairs (2048 individuals in total). Amplifications for both systems were performed on a GeneAmp PCR System 9700 Thermal Cycler (Thermo Fisher Scientific, Waltham, Massachusetts, USA) according to the manufacturers' protocols. Amplified products were electrophoresed on a 3500XL Genetic Analyzer (Thermo Fisher Scientific, Waltham, Massachusetts, USA), and data were analyzed using GeneMapper® ID-X Software v1.6 (Thermo Fisher Scientific, Waltham, Massachusetts, USA).
2.3 Quality control
We strictly followed the recommendations of the Chinese National Standards, the Scientific Working Group on DNA Analysis Methods (SWGDAM) [26], and the DNA Commission of the International Society of Forensic Genetics (ISFG) on the analysis of Y-STRs [27,28]. Control DNA 9948 was used as the positive control and sdH2O as the negative control in each batch of PCR amplification and electrophoresis. In addition, the laboratory has been accredited in accordance with ISO/IEC 17025:2005 and by the China National Accreditation Service for Conformity Assessment (CNAS), which is also approved by the International Laboratory Accreditation Cooperation (ILAC).
The haplotype data of the 1021 male individuals from the three southeastern coastal provinces of China in the present study have been submitted to the YHRD database and received the accession numbers YA004709 (Guangdong Han, n = 601), YA004710 (Zhejiang Han, n = 340), and YA004711 (Fujian Han, n = 80). Three Y-STRs, DYS458, DYS576, and Y-GATA-A10, which were included in both kits, were used to evaluate the precision and accuracy of this study. Y-STR profiles with variants (null alleles, off-ladder alleles, or copy number variants) in any sample, or with at least one mutation event in a father–son pair, were re-amplified and re-genotyped for confirmation with the Promega PowerPlex® Y23 System (Promega Corporation, Fitchburg, Wisconsin, USA), the AGCU Y37 Kit (AGCU ScienTech Incorporation, Wuxi, Jiangsu, China), the AGCU Y SUPP PLUS Kit (AGCU ScienTech Incorporation, Wuxi, Jiangsu, China), Geno-ID Y41 Human Typing (Guangzhou Koalson Intelligent BioRobotics, Guangzhou, Guangdong, China), or independent amplifications with single primer pairs.
2.4 Statistical analysis
Haplotype and allele frequencies were calculated with Arlequin version 3.5 [29]. The forensic parameters gene diversity (GD), haplotype diversity (HD), discrimination capacity (DC), and match probability (MP) were calculated as in our previous studies [30–32]. GD was calculated as

$$\mathrm{GD} = \frac{n}{n-1}\left(1 - \sum_i P_i^2\right),$$

where $n$ is the number of alleles at each Y-STR locus and $P_i$ is the frequency of the i-th allele. The same formula was used to estimate HD, with $n$ and $P_i$ being the total number of haplotypes and the frequency of the i-th haplotype, respectively. DC was determined as the ratio between the number of different haplotypes and the sample size. MP was defined as

$$\mathrm{MP} = \sum_i P_i^2,$$

where $P_i$ is the frequency of the i-th haplotype.
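The two formulas above can be illustrated with a short Python sketch. It is not the authors' Arlequin workflow; it assumes that n is the number of sampled observations (alleles at one locus, or haplotypes) and that frequencies are simple counts divided by n. The function name and the toy allele list are hypothetical.

```python
from collections import Counter


def diversity_and_match_probability(observations):
    """Return (diversity, match probability) for a list of observed
    alleles (GD) or haplotypes (HD).

    Assumption: n is the number of sampled observations and P_i is the
    relative frequency of the i-th distinct allele or haplotype.
    """
    n = len(observations)
    if n < 2:
        raise ValueError("at least two observations are required")
    counts = Counter(observations)
    sum_sq = sum((c / n) ** 2 for c in counts.values())  # sum of P_i^2 (= MP)
    diversity = n / (n - 1) * (1 - sum_sq)               # n/(n-1) * (1 - sum P_i^2)
    return diversity, sum_sq


# Toy example (invented alleles, not data from this study)
gd, mp = diversity_and_match_probability(["14", "14", "15", "16"])
print(f"diversity = {gd:.4f}, match probability = {mp:.4f}")
```

Passing whole haplotypes (for example as tuples of alleles) instead of single alleles gives HD and MP in the same way.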
Mutation rates were estimated directly as the accumulated number of mutations divided by the total number of meiotic transmissions. For the multi-copy markers (DYS385a/b, DYF387S1a/b, DYS459a/b, DYS527a/b, DYF404S1a/b, DYF399S1a/b/c, DYF403S1a1/a2/a3, DYS464a/b/c/d), the number of allelic transfers was counted as the number of meioses multiplied by the copy number. Because two fragments of different lengths are amplified with the same primers, DYS389 was separated into DYS389I and DYS389II, the PCR product of DYS389I being a subset of that of DYS389II. Owing to this locus-specific structure, only one mutation was counted instead of two when one-step slippages were observed in both DYS389I and DYS389II. The 95% confidence intervals (CIs) were estimated from the binomial probability distribution using the webpage http://statpages.info/confint.html. Spearman correlation analyses were carried out in R (R Project for Statistical Computing, version 3.5.3; https://www.r-project.org/), and all comparison tests (Mann–Whitney U test, Chi-square test, and Fisher's exact test) were conducted in SPSS 22.0 [33].
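The rate and interval calculation described above can be sketched in Python. This is a minimal sketch that assumes the interval is the exact (Clopper–Pearson) binomial interval; the text only states that the binomial distribution was used through the cited webpage. All function names are illustrative, and the bounds are found by simple bisection rather than a statistics library.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with terms computed in log space."""
    total = 0.0
    for i in range(k + 1):
        log_term = (math.lgamma(n + 1) - math.lgamma(i + 1)
                    - math.lgamma(n - i + 1)
                    + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_term)
    return min(total, 1.0)


def _bisect(f, target, increasing):
    """Solve f(p) = target on (0, 1) for a monotone function f."""
    lo, hi = 0.0, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if (f(mid) < target) == increasing:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def mutation_rate_ci(mutations, meioses, alpha=0.05):
    """Return (rate, lower, upper) with an exact two-sided binomial interval."""
    rate = mutations / meioses
    if mutations == 0:
        lower = 0.0
    else:
        # p at which P(X >= mutations) equals alpha / 2
        lower = _bisect(lambda p: 1.0 - binom_cdf(mutations - 1, meioses, p),
                        alpha / 2, increasing=True)
    if mutations == meioses:
        upper = 1.0
    else:
        # p at which P(X <= mutations) equals alpha / 2
        upper = _bisect(lambda p: binom_cdf(mutations, meioses, p),
                        alpha / 2, increasing=False)
    return rate, lower, upper


# Counts taken from the text: 19 mutations in 1029 transfers (DYS526b)
rate, lower, upper = mutation_rate_ci(19, 1029)
print(f"{rate * 1e3:.2f} x 10^-3 (95% CI {lower * 1e3:.1f}-{upper * 1e3:.1f} x 10^-3)")
```

For a multi-copy marker, the `meioses` argument would be the number of father–son pairs multiplied by the copy number, following the counting rule stated in the paragraph above.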
Genetic relationships between different populations were quantified by means of RST [34,35]. Population pairwise genetic distances (RST) and the corresponding p values were estimated by analysis of molecular variance (AMOVA) using the online tool available at YHRD (https://yhrd.org/). In the calculation of RST with the online YHRD tool, haplotypes presenting null, intermediate, duplicated, or triplicated alleles were removed, and the number of repeats in DYS389I was subtracted from that of DYS389II. Genetic similarities and differences were further visualized by a multidimensional scaling (MDS) plot using R. Additionally, phylogenetic relationships among the different populations were depicted as a Neighbor-Joining phylogenetic tree [37] in Molecular Evolutionary Genetics Analysis 7.0 (MEGA 7.0) software [36], based on the genetic distance matrices, and visualized with the Interactive Tree of Life v5 (iTOL) [38].
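The haplotype filtering applied before the RST calculation can be sketched as follows. This is only an illustration of the stated rules, not the YHRD tool itself. The data representation is an assumption: each haplotype is a dictionary from marker name to an allele string, with "0" or an empty string for a null allele, a decimal such as "49.2" for an intermediate allele, and comma-separated values such as "15,16" for duplicated or triplicated alleles at a single-copy marker.

```python
def prepare_for_rst(haplotype):
    """Return a cleaned copy of one haplotype, or None if it is excluded.

    Exclusion rules (from the text): null, intermediate, duplicated or
    triplicated alleles. The string encoding of those cases is hypothetical.
    """
    cleaned = {}
    for marker, allele in haplotype.items():
        if allele in ("", "0"):   # null allele
            return None
        if "," in allele:         # duplicated or triplicated allele
            return None
        if "." in allele:         # intermediate (partial-repeat) allele
            return None
        cleaned[marker] = int(allele)
    # DYS389II contains the DYS389I repeats, so subtract them
    if "DYS389I" in cleaned and "DYS389II" in cleaned:
        cleaned["DYS389II"] -= cleaned["DYS389I"]
    return cleaned


# Toy haplotypes (invented values)
sample = [
    {"DYS19": "15", "DYS389I": "12", "DYS389II": "28"},
    {"DYS19": "15,16", "DYS389I": "12", "DYS389II": "28"},
]
usable = [h for h in map(prepare_for_rst, sample) if h is not None]
print(usable)
```

The sketch assumes that only single-copy markers are passed in; multi-copy markers, which normally carry several alleles, would need separate handling.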
3 Results
3.1 Allele frequencies and gene diversity
Allele frequencies of each Y-STR marker in the CSEH population (1021 unrelated male individuals), which consists of 601 Guangdong Han (GDH), 80 Fujian Han (FJH), and 340 Zhejiang Han (ZJH), are listed in Supporting Information Table S3. A total of 1663 alleles were observed at the 62 Y-STR markers, and the corresponding allele frequencies ranged from 0.0010 to 0.9490 (DYS645). The number of alleles at each locus ranged from 3 for DYS645 to 358 for DYF399S1a/b/c in CSEH. The allele frequencies of GDH, FJH, and ZJH are detailed in Supporting Information Tables S4–S6. They ranged from 0.0020 to 0.9450 in GDH, from 0.0130 to 0.9120 in FJH, and from 0.0030 to 0.9650 in ZJH, with the maximum values all observed at DYS645. The total numbers of alleles in GDH, FJH, and ZJH were 1384, 600, and 1034, and the number of alleles per marker ranged from 3 (DYS645) to 281 (DYF403S1a1/a2/a3) in GDH, from 2 (DYS645) to 66 (DYF399S1a/b/c) in FJH, and from 2 (DYS645) to 204 (DYF399S1a/b/c) in ZJH, respectively.
The average GD of the 62 Y-STRs in CSEH was 0.7269, ranging from 0.0973 (DYS645) to 0.9952 (DYF399S1a/b/c), and the highest GD values of single-copy and multi-copy Y-STRs were 0.9005 (DYF403S1b) and 0.9952 (DYF399S1a/b/c), respectively. Except for DYS459a/b (0.6378), the GD values of all multi-copy Y-STR markers were higher than 0.9 (Supporting Information Table S7). In addition, the average GD values of GDH, FJH, and ZJH were 0.7276, 0.7139, and 0.7238. The minimum GD in each population was observed at DYS645, while the maximum GD values were observed at DYF403S1a1/a2/a3 (0.9956) in GDH and at DYF399S1a/b/c in both FJH (0.9943) and ZJH (0.9957).
3.2 Variant alleles
Overall, as summarized in Supporting Information Tables S8–S10, we observed 85 null alleles, 121 off-ladder alleles, and 95 copy number variants in the 1021 unrelated males of CSEH, which were found at 10, 22, and 24 Y-STR markers, respectively. Of the 85 null alleles, 19 occurred at DYF403S1b, and four other Y-STRs each showed more than 10 null alleles (DYF399S1a/b/c, DYS526a, DYS526b, and DYF403S1a1/a2/a3). A total of 121 off-ladder alleles with 58 different genotypes were detected at 22 Y-STRs, while the 95 copy number variants comprised 24 duplications, 39 triplications, 27 quadruplications, and 5 quintuplications, observed at 24 Y-STR markers. All variant alleles, which were confirmed by re-amplification and re-genotyping in this study, should be interpreted with caution to exclude DNA mixtures in forensic casework; they may be caused by one-step slippage replication [39,40], gene conversion [41], or non-allelic homologous recombination [42].
The variant alleles of GDH, FJH, and ZJH are presented separately in Supporting Information Tables S8–S10: (a) null alleles: we observed 62, 11, and 12 null alleles at 10, 5, and 7 different Y-STRs in GDH, FJH, and ZJH, respectively; (b) off-ladder alleles: 75 off-ladder alleles with 35 different genotypes were found at 17 Y-STRs in GDH; 10 off-ladder alleles with nine different genotypes were found at 6 Y-STRs in FJH, with only allele 49.2 at DYF403S1b occurring twice; and 36 off-ladder alleles with 24 different genotypes were observed at 14 Y-STRs in ZJH; (c) copy number variants: 55 copy number variants with 52 different genotypes were detected at 21 Y-STRs in GDH, seven copy number variants, all with unique genotypes, were found at four Y-STR markers in FJH, and 33 copy number variants with 30 distinct genotypes were observed at 11 Y-STRs in ZJH.

Table 1. Forensic parameters evaluated for different sets in CSEH (n = 1021)

| Number of observed haplotypes | Set 1 | Set 2 | Set 3 | Set 4 | Set 5 | Set 6 | Set 7 | Set 8 |
|---|---|---|---|---|---|---|---|---|
| 1 | 686 | 926 | 991 | 997 | 997 | 997 | 1003 | 1003 |
| 2 | 75 | 31 | 15 | 12 | 12 | 12 | 9 | 9 |
| 3 | 14 | 7 | | | | | | |
| 4 | 9 | 1 | | | | | | |
| 5 | 7 | | | | | | | |
| 6 | 3 | | | | | | | |
| 8 | | 1 | | | | | | |
| 10 | 2 | | | | | | | |
| 13 | 1 | | | | | | | |
| 21 | 1 | | | | | | | |
| Number of Y-STR loci | 9 | 17 | 25 | 27 | 29 | 45 | 32 | 74 |
| Number of haplotypes | 798 | 966 | 1006 | 1009 | 1009 | 1009 | 1012 | 1012 |
| Shared haplotypes | 54 | 20 | 2 | 1 | 1 | 1 | 1 | 1 |
| FUH | 0.8596 | 0.9586 | 0.9851 | 0.9881 | 0.9881 | 0.9881 | 0.9911 | 0.9911 |
| HD | 0.997561 | 0.999832 | 0.999955 | 0.999965 | 0.999965 | 0.999965 | 0.999974 | 0.999974 |
| DC | 0.7816 | 0.9461 | 0.9853 | 0.9882 | 0.9882 | 0.9882 | 0.9912 | 0.9912 |
| MP | 3.69 × 10⁻³ | 1.20 × 10⁻³ | 1.04 × 10⁻³ | 1.03 × 10⁻³ | 1.03 × 10⁻³ | 1.03 × 10⁻³ | 1.02 × 10⁻³ | 1.02 × 10⁻³ |

FUH, Fraction of unique haplotypes; HD, Haplotype diversity; DC, Discrimination capacity; MP, Match probability. Set 1: YHRD Core Loci; Set 2: AmpFLSTR® Yfiler®; Set 3: Promega PowerPlex® Y23; Set 4: AmpFLSTR® Yfiler® Plus; Set 5: YHRD Max Loci; Set 6: AGCU Y-LM Kit; Set 7: HomyGene RM Y32 Kit; Set 8: Set 6 plus Set 7. The detailed information of each set is presented in Supplementary Figure S1.
3.3 Haplotype diversity
According to the various combinations of Y-STRs recommended by YHRD and available in commercial kits, we established eight different sets, containing 9–74 Y-STR loci (detailed in Supporting Information Figure S1), to compare the system efficiencies of the sets in CSEH. The numbers of different and shared haplotypes and the forensic parameters for the different sets in CSEH are presented in Table 1. A total of 1012 distinct haplotypes were determined for the 74 Y-STR loci in CSEH, of which 1003 (99.11%) were unique and 9 (0.89%) occurred twice. The overall HD for the 74 Y-STR loci was 0.999974, with a DC value of 0.9912. The haplotype data and haplotype frequencies of CSEH are shown in Supporting Information Table S11. One common haplotype (H001, Fujian_Han077/Zhejiang_Han016) was shared by two individuals from FJH and ZJH, while for Set 3 (Promega PowerPlex® Y23), another shared haplotype was observed in two individuals from GDH (Guangdong_Han343) and FJH (Fujian_Han059). When evaluated with Set 2 (AmpFLSTR® Yfiler®), we obtained 20 shared haplotypes in all, of which 17 were shared between two different populations and three were shared among all three populations (GDH, FJH, and ZJH); the latter were carried by eight (GDH185_343_402_467_575/FJH034_059/ZJH090), four (GDH197_379/FJH039/ZJH054), and three individuals (GDH147/FJH007/ZJH335), respectively.
Furthermore, we assessed the performance of the eight sets in the three populations; the numbers of haplotypes and forensic parameters are listed in Supporting Information Table S12. Based on the results for the eight sets of Y-STR combinations (Table 1 and Supporting Information Table S12), overall, as the number of analyzed Y-STR loci increased, more unique haplotypes were obtained, and HD and DC also increased, whereas the number of shared haplotypes and the MP values decreased. Interestingly, once the number reached 25–27 Y-STR loci, further increases yielded only limited improvement in system efficiency in the populations of this study.
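As a worked check of how DC and FUH follow from the haplotype counts, the short Python sketch below recomputes both for Set 1 and Set 8 from the numbers of haplotypes observed once, twice, and so on, as read from Table 1. DC is the number of distinct haplotypes divided by the sample size, as defined in the methods. The text gives no formula for FUH; taking it as the number of unique haplotypes divided by the number of distinct haplotypes is an assumption that happens to reproduce the tabulated values. Variable and function names are illustrative.

```python
# times observed -> number of distinct haplotypes observed that many times
SET_1 = {1: 686, 2: 75, 3: 14, 4: 9, 5: 7, 6: 3, 10: 2, 13: 1, 21: 1}
SET_8 = {1: 1003, 2: 9}


def summarize(spectrum):
    """Summarize a haplotype count spectrum."""
    sample_size = sum(times * n for times, n in spectrum.items())
    n_haplotypes = sum(spectrum.values())
    unique = spectrum.get(1, 0)
    return {
        "sample_size": sample_size,
        "haplotypes": n_haplotypes,
        "DC": n_haplotypes / sample_size,   # distinct haplotypes / sample size
        "FUH": unique / n_haplotypes,       # assumed definition, see lead-in
    }


for name, spectrum in (("Set 1", SET_1), ("Set 8", SET_8)):
    s = summarize(spectrum)
    print(name, s["sample_size"], s["haplotypes"],
          round(s["DC"], 4), round(s["FUH"], 4))
```

Both spectra sum to the 1021 sampled males, which is a useful consistency check when transcribing counts of this kind.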
3.4 Estimated mutation rates of Y-STRs
In this study, a set of 62 tri-, tetra-, penta-, and hexanucleotide Y-STR markers was analyzed by multiplex fluorescence-based fragment length analysis in 1027 CSEH father-son pairs (2048 individuals in total). A total of 76 099 meiotic transfers were observed directly, with 369 mutations, confirmed by re-genotyping, at 49 (79.03%) of the 62 Y-STR markers. The estimated mutation rates ranged from 0 (13 single-copy Y-STRs, details in Table 2) to 19.95 × 10⁻³ (DYF399S1a/b/c), with an average estimated mutation rate of 4.85 × 10⁻³ (95% CI, 4.4 × 10⁻³–5.4 × 10⁻³) in CSEH. Based on the mutation rate thresholds recommended by Ballantyne et al. [14], Y-STRs can be divided into four groups for various applications: (a) rapidly mutating (RM) Y-STRs (μ > 1 × 10⁻²); (b) fast mutating (FM) Y-STRs (5 × 10⁻³–1 × 10⁻²); (c) moderately mutating (MM) Y-STRs (1 × 10⁻³–5 × 10⁻³); (d) slowly mutating (SM) Y-STRs (μ < 10⁻³). As shown in Table 2, the assessed Y-STRs in CSEH consisted of 8 RM Y-STRs, 8 FM Y-STRs, 28 MM Y-STRs, and 18 SM Y-STRs, which are illustrated in Fig. 1B.
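The four-group classification can be expressed as a small Python function. This is a sketch of the thresholds quoted above; how a rate that falls exactly on a boundary is assigned is not stated in the text, so the inclusive lower bounds used here are an assumption. The example counts are taken from Table 2.

```python
def classify_y_str(mu):
    """Assign a Y-STR to a mutability group from its mutation rate mu."""
    if mu > 1e-2:
        return "RM"   # rapidly mutating
    if mu >= 5e-3:
        return "FM"   # fast mutating (boundary handling is an assumption)
    if mu >= 1e-3:
        return "MM"   # moderately mutating (boundary handling is an assumption)
    return "SM"       # slowly mutating


# (mutations, meiotic transfers) as listed in Table 2
examples = {
    "DYS526b": (19, 1029),
    "DYS449": (9, 1028),
    "DYS19": (4, 1028),
    "DYS389I": (1, 1027),
    "DYS388": (0, 1027),
}
for marker, (mutations, meioses) in examples.items():
    print(marker, classify_y_str(mutations / meioses))
```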
Moreover, the estimated mutation rates and mutation summaries of the 62 Y-STRs in GDH, FJH, and ZJH are shown in Supporting Information Tables S13–S15. The average estimated mutation rates were 4.65 × 10⁻³ (95% CI, 4.0 × 10⁻³–5.3 × 10⁻³), 5.66 × 10⁻³ (95% CI, 3.9 × 10⁻³–7.9 × 10⁻³), and 5.01 × 10⁻³ (95% CI, 4.2 × 10⁻³–6.0 × 10⁻³) for GDH, FJH, and ZJH, respectively. There was no statistically significant difference among the average estimated mutation rates of the three populations (χ² = 1.333, p = 0.514). The Y-STR markers with the highest estimated mutation rates in GDH, FJH, and ZJH were DYF399S1a/b/c (16.52 × 10⁻³; 95% CI, 11.2 × 10⁻³–23.5 × 10⁻³), DYS526b (4.94 × 10⁻²; 95% CI, 1.4 × 10⁻²–12.2 × 10⁻²), and DYF399S1a/b/c (23.90 × 10⁻³; 95% CI, 15.5 × 10⁻³–35.1 × 10⁻³), respectively. No mutation event was observed at the double-copy Y-STRs (DYS385a/b, DYS527a/b, DYS459a/b, DYF387S1a/b, and DYF404S1a/b) in FJH. In addition, among the 62 Y-STRs, there were 5 RM Y-STRs, 13 FM Y-STRs, 25 MM Y-STRs, and 19 SM Y-STRs in GDH; 18 RM Y-STRs, 1 MM Y-STR, and 43 SM Y-STRs in FJH; and 9 RM Y-STRs, 13 FM Y-STRs, 16 MM Y-STRs, and 24 SM Y-STRs in ZJH.
3.5 Characteristic of mutations
Of the 369 mutation events observed in CSEH (Table 2), the overwhelming majority (359, 97.29%) were single-repeat changes, and only 10 were multi-repeat events, of which seven were double-step changes (1.90%) and three were triple-step changes (0.81%). The ratio of single-repeat to multi-repeat mutations (S/M ratio) was 35.90:1, which was statistically significant (Z = −7.905, p < 0.001). Furthermore, the S/M ratios of each population, summarized in Table 3, were also statistically significant: 33.50:1 in GDH (Z = −7.188, p < 0.001), 33.00:1 in FJH (Z = −4.427, p < 0.001), and 41.67:1 in ZJH (Z = −6.558, p < 0.001). However, the proportions of single-repeat and multi-repeat mutation events did not differ significantly among the three populations (χ² = 0.100, p = 0.951).
In addition, a slight excess of repeat gains (201, 54.47%) over repeat losses (168, 45.53%) was observed among the 369 mutations in CSEH, giving an overall ratio of repeat gains to losses (G/L ratio) of 1.20:1, although the difference was not statistically significant (Z = −1.748, p = 0.080). The detailed G/L ratios of the three populations are shown in Table 3. The G/L ratio of GDH was 1.62:1, which was statistically significant (Z = −2.297, p = 0.022), whereas the G/L ratios of FJH and ZJH were not (Z = −1.425, p = 0.154; Z = −0.507, p = 0.612). Moreover, the distribution of gains and losses differed significantly among GDH, FJH, and ZJH (χ² = 13.113, p = 0.001).
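The S/M and G/L summary figures are simple ratios and percentage shares of the mutation counts, which the following Python sketch recomputes from the CSEH totals quoted in the text (359 single-repeat and 10 multi-repeat events; 201 gains and 168 losses). It covers only the descriptive arithmetic, not the significance tests, and the helper name is hypothetical.

```python
def ratio_and_share(a, b):
    """Return (a:b ratio, share of a in a + b as a percentage)."""
    return a / b, 100 * a / (a + b)


sm_ratio, single_share = ratio_and_share(359, 10)   # single vs multi-repeat
gl_ratio, gain_share = ratio_and_share(201, 168)    # gains vs losses

print(f"S/M = {sm_ratio:.2f}:1, single-repeat share = {single_share:.2f}%")
print(f"G/L = {gl_ratio:.2f}:1, gain share = {gain_share:.2f}%")
```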
When the markers were grouped by copy number, we found that the average estimated mutation rate of single-copy markers (4.09 × 10⁻³; 95% CI, 3.6 × 10⁻³–4.7 × 10⁻³) in CSEH was significantly lower than that of multi-copy markers (6.89 × 10⁻³; 95% CI, 5.8 × 10⁻³–8.1 × 10⁻³) (χ² = 24.107, p < 0.001). As shown in Table 3, the average estimated mutation rates of double-copy, triple-copy, and tetra-copy markers in CSEH were 3.39 × 10⁻³ (95% CI, 2.4 × 10⁻³–4.7 × 10⁻³), 15.83 × 10⁻³ (95% CI, 12.9 × 10⁻³–19.3 × 10⁻³), and 2.19 × 10⁻³ (95% CI, 1.0 × 10⁻³–4.2 × 10⁻³), respectively. The average estimated mutation rates of Y-STRs with different copy numbers differed significantly in CSEH (χ² = 170.913, p < 0.001). The average estimated mutation rates of single-copy and multi-copy Y-STR markers differed significantly in GDH (χ² = 11.715, p = 0.001) and ZJH (χ² = 11.789, p = 0.001), but not in FJH (χ² = 1.137, p = 0.286). Moreover, the estimated Y-STR mutation rates by copy number in GDH, FJH, and ZJH are shown in Table 3; within each population, the estimated mutation rates of Y-STRs with distinct copy numbers differed significantly (GDH, χ² = 63.951, p < 0.001; FJH, Fisher's exact test, p < 0.001; ZJH, χ² = 87.004, p < 0.001). However, there was no significant difference for
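Table 2. Estimated mutation rates and mutation summaries of 62 Y-STRs in CSEH (N = 1027)

| Y-STR | Meiosis number | Mutation count | Single-step mutation count | Double-step mutation count | Triple-step mutation count | Number of gains | Number of losses | Mutation rate (×10⁻³) | Binomial 95% CI (×10⁻³) |
|---|---|---|---|---|---|---|---|---|---|
| DYF403S1b | 1029 | 5 | 5 | 0 | 0 | 2 | 3 | 4.86 | 1.6–11.3 |
| DYS19 | 1028 | 4 | 4 | 0 | 0 | 3 | 1 | 3.89 | 1.1–9.9 |
| DYS388 | 1027 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00 | 0–3.6 |
| DYS389I | 1027 | 1 | 1 | 0 | 0 | 0 | 1 | 0.97 | 0–5.4 |
| DYS389II | 1027 | 7 | 7 | 0 | 0 | 4 | 3 | 6.82 | 2.7–14.0 |
| DYS390 | 1027 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00 | 0–3.6 |
| DYS391 | 1027 | 3 | 3 | 0 | 0 | 2 | 1 | 2.92 | 0.6–8.5 |
| DYS392 | 1027 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00 | 0–3.6 |
| DYS393 | 1027 | 2 | 2 | 0 | 0 | 2 | 0 | 1.95 | 0.2–7.0 |
| DYS437 | 1027 | 3 | 3 | 0 | 0 | 3 | 0 | 2.92 | 0.6–8.5 |
| DYS438 | 1027 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00 | 0–3.6 |
| DYS439 | 1029 | 4 | 4 | 0 | 0 | 4 | 0 | 3.89 | 1.1–9.9 |
| DYS443 | 1028 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00 | 0–3.6 |
| DYS444 | 1027 | 4 | 4 | 0 | 0 | 2 | 2 | 3.89 | 1.1–9.9 |
| DYS446 | 1027 | 1 | 1 | 0 | 0 | 0 | 1 | 0.97 | 0–5.4 |
| DYS447 | 1028 | 3 | 1 | 1 | 1 | 1 | 2 | 2.92 | 0.6–8.5 |
| DYS448 | 1027 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00 | 0–3.6 |
| DYS449 | 1028 | 9 | 9 | 0 | 0 | 6 | 3 | 8.75 | 4.0–16.6 |
| DYS456 | 1027 | 4 | 4 | 0 | 0 | 1 | 3 | 3.89 | 1.1–9.9 |
| DYS458 | 1028 | 5 | 5 | 0 | 0 | 2 | 3 | 4.86 | 1.6–11.3 |
| DYS460 | 1027 | 2 | 2 | 0 | 0 | 2 | 0 | 1.95 | 0.2–7.0 |
| DYS481 | 1027 | 5 | 5 | 0 | 0 | 3 | 2 | 4.87 | 1.6–11.3 |
| DYS508 | 1027 | 2 | 2 | 0 | 0 | 1 | 1 | 1.95 | 0.2–7.0 |
| DYS510 | 1027 | 5 | 4 | 1 | 0 | 3 | 2 | 4.87 | 1.6–11.3 |
| DYS516 | 1027 | 4 | 4 | 0 | 0 | 2 | 2 | 3.89 | 1.1–9.9 |
| DYS518 | 1027 | 9 | 9 | 0 | 0 | 3 | 6 | 8.76 | 4.0–16.6 |
| DYS520 | 1028 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00 | 0–3.6 |
| DYS522 | 1027 | 4 | 4 | 0 | 0 | 2 | 2 | 3.89 | 1.1–9.9 |
| DYS526a | 1028 | 2 | 2 | 0 | 0 | 1 | 1 | 1.95 | 0.2–7.0 |
| DYS526b | 1029 | 19 | 19 | 0 | 0 | 9 | 10 | 18.46 | 11.2–28.7 |
| DYS531 | 1027 | 1 | 1 | 0 | 0 | 1 | 0 | 0.97 | 0–5.4 |
| DYS533 | 1027 | 2 | 2 | 0 | 0 | 1 | 1 | 1.95 | 0.2–7.0 |
| DYS534 | 1028 | 11 | 11 | 0 | 0 | 8 | 3 | 10.70 | 5.4–19.1 |
| DYS547 | 1027 | 8 | 8 | 0 | 0 | 1 | 7 | 7.79 | 3.4–15.3 |
| DYS549 | 1028 | 5 | 5 | 0 | 0 | 3 | 2 | 4.86 | 1.6–11.3 |
| DYS552 | 1027 | 5 | 5 | 0 | 0 | 4 | 1 | 4.87 | 1.6–11.3 |
| DYS557 | 1029 | 4 | 4 | 0 | 0 | 4 | 0 | 3.89 | 1.1–9.9 |
| DYS570 | 1027 | 7 | 7 | 0 | 0 | 5 | 2 | 6.82 | 2.7–14.0 |
| DYS576 | 1028 | 11 | 11 | 0 | 0 | 9 | 2 | 10.70 | 5.4–19.1 |
| DYS587 | 1027 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00 | 0–3.6 |
| DYS593 | 1027 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00 | 0–3.6 |
| DYS596 | 1027 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00 | 0–3.6 |
| DYS612 | 1029 | 8 | 8 | 0 | 0 | 5 | 3 | 7.77 | 3.4–15.3 |
| DYS617 | 1027 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00 | 0–3.6 |
| DYS622 | 1027 | 4 | 4 | 0 | 0 | 2 | 2 | 3.89 | 1.1–9.9 |
| DYS626 | 1029 | 10 | 9 | 1 | 0 | 6 | 4 | 9.72 | 4.7–17.8 |
| DYS627 | 1028 | 14 | 13 | 1 | 0 | 6 | 8 | 13.62 | 7.5–22.7 |
| DYS630 | 1032 | 13 | 13 | 0 | 0 | 5 | 8 | 12.60 | 6.7–21.4 |
| DYS635 | 1027 | 2 | 2 | 0 | 0 | 1 | 1 | 1.95 | 0.2–7.0 |
| DYS643 | 1027 | 2 | 2 | 0 | 0 | 2 | 0 | 1.95 | 0.2–7.0 |
| DYS645 | 1027 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00 | 0–3.6 |
| DYS713 | 1028 | 12 | 11 | 0 | 1 | 7 | 5 | 11.67 | 6.0–20.3 |
| Y_GATA_A10 | 1027 | 1 | 1 | 0 | 0 | 1 | 0 | 0.97 | 0–5.4 |
| Y_GATA_H4 | 1027 | 0 | 0 | 0 | 0 | 0 | 0 | 0.00 | 0–3.6 |
| DYS385a/b | 2061 | 6 | 6 | 0 | 0 | 3 | 3 | 2.91 | 1.1–6.3 |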
DYS527a/b 2059 9 8 1 0 7 2 4.37 2.0–8.3
DYS459a/b 2054 1 1 0 0 1 0 0.49 0–2.7
DYF387S1a/b 2070 7 7 0 0 4 3 3.38 1.4–7.0
DYF404S1a/b 2066 12 12 0 0 10 2 5.81 3.0–10.1
DYF399S1a/b/c 3107 62 60 2 0 23 39 19.95 15.3–25.5
DYF403S1a1/a2/a3 3083 36 36 0 0 20 16 11.68 8.2–16.1
DYS464a/b/c/d 4112 9 8 0 1 4 5 2.19 1.0–4.2
Total 76 099 369 359 7 3 201 168 4.85 4.4–5.4
N, numbers of father-son pairs.
Table 3. Average estimated mutation rates and characteristics of mutations in GDH (N = 601), FJH (N = 81), ZJH (N = 345), and CSEH (N = 1027)
Parameter | GDH | FJH | ZJH | CSEH
Total meiosis number | 44 531 | 6002 | 25 566 | 76 099
Mutation events | 207 | 34 | 128 | 369
S/M ratio | 33.50:1 | 33.00:1 | 41.67:1 | 35.90:1
G/L ratio | 1.62:1 | 0.48:1 | 0.94:1 | 1.20:1
Average estimated mutation rates (×10–3):
62 Y-STRs | 4.65 | 5.66 | 5.01 | 4.85
single-copy Y-STRs | 3.97 | 5.03 | 4.08 | 4.09
double-copy Y-STRs | 3.98 | 0.00 | 3.18 | 3.39
triple-copy Y-STRs | 13.26 | 22.54 | 18.73 | 15.83
tetra-copy Y-STRs | 2.49 | 3.08 | 1.45 | 2.19
N, numbers of father-son pairs.
the estimated mutation rates of double-copy Y-STRs (χ2=
3.428, p=0.330), single-copy (χ2=1.061, p=0.787), triple-
copy (χ2=4.024, p=0.259), and tetra-copy (Fisher’s exact
test, p=0.748) Y-STRs among CSEH, GDH, FJH, and ZJH.
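The single-copy versus multi-copy comparison in CSEH can be sketched as a 2 × 2 Pearson chi-square test on mutated and non-mutated meioses. The counts below are summed from Table 2, taking the eight a/b, a/b/c, and a/b/c/d entries as the multi-copy markers. The function name is hypothetical and the test is run without continuity correction, which is an assumption; the statistic it yields need not match the published value exactly, because the exact SPSS settings used by the authors are not stated here.

```python
from math import erfc, sqrt


def pearson_chi2_2x2(mut_a, meioses_a, mut_b, meioses_b):
    """Pearson chi-square statistic (no continuity correction) for a 2x2 table
    of mutated vs. non-mutated meioses in two marker groups."""
    observed = [
        [mut_a, meioses_a - mut_a],
        [mut_b, meioses_b - mut_b],
    ]
    total = meioses_a + meioses_b
    row_totals = [meioses_a, meioses_b]
    col_totals = [mut_a + mut_b, total - (mut_a + mut_b)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    return chi2


# Counts summed from Table 2 (multi-copy = the eight multi-copy entries).
MULTI_MUT, MULTI_MEIOSES = 142, 20612
SINGLE_MUT, SINGLE_MEIOSES = 369 - MULTI_MUT, 76099 - MULTI_MEIOSES

rate_single = SINGLE_MUT / SINGLE_MEIOSES
rate_multi = MULTI_MUT / MULTI_MEIOSES
chi2 = pearson_chi2_2x2(SINGLE_MUT, SINGLE_MEIOSES, MULTI_MUT, MULTI_MEIOSES)
# For one degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
p_value = erfc(sqrt(chi2 / 2))

print(f"single-copy rate: {rate_single * 1e3:.2f} x 10^-3")
print(f"multi-copy rate:  {rate_multi * 1e3:.2f} x 10^-3")
print(f"chi-square = {chi2:.3f}, p = {p_value:.2e}")
```

The pooled rates recovered this way agree with the 4.09 × 10–3 and 6.89 × 10–3 reported in the text, which suggests the grouping of markers assumed here matches the one used in the paper.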
3.6 Relevant factors with the estimated mutation
rates of Y-STRs
To explore the correlations between the estimated mutation rates
of Y-STR markers and potential influencing factors in CSEH, we
gathered the chromosomal locations (Supporting Information Table
S2), lengths of the repetitive motif (Supporting Information
Table S2), average allele sizes (Supporting Information Table
S2), and GD values (Supporting Information Table S7) of the 62
Y-STR markers and conducted Spearman correlation analyses. The
overall results for the set of 62 Y-STRs in CSEH are visualized
in Fig. 2A. The estimated mutation rates of Y-STRs correlated
positively with GD values (Supporting Information Figure S2,
R2 = 0.6548, p = 7.788 × 10–9) and average allele sizes
(Supporting Information Figure S3, R2 = 0.5989, p = 2.723 ×
10–7).
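A Spearman correlation of the kind described above can be sketched as a Pearson correlation computed on ranks, with tied values given their average rank. In the example, the mutation rates are taken from Table 2, but the GD values are hypothetical placeholders (the real ones are in Supporting Information Table S7), and all function names are illustrative. The sketch returns the plain rank coefficient; how it relates to the R2 values reported in the text is not stated in this section.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Mutation rates (x 10^-3) of DYS388, DYS389I, DYS19, DYS449, DYS526b (Table 2).
rates = [0.00, 0.97, 3.89, 8.75, 18.46]
# Hypothetical GD values, for illustration only.
toy_gd = [0.30, 0.55, 0.50, 0.80, 0.85]

rho = spearman_rho(rates, toy_gd)
print(f"Spearman rho = {rho:.2f}")
```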
The examined 62 Y-STR markers were classified into five groups
(Yp11.2, Yq11.21, Yq11.221, Yq11.222, and Yq11.223) according to
their positions on the Y chromosome (details in Fig. 1B and
Supporting Information Table S16). The average estimated mutation
rates of these regions ranged from 3.16 × 10–3 (Yq11.221, 95% CI,
2.4 × 10–3–4.1 × 10–3) to 6.24 × 10–3 (Yp11.2, 95% CI, 5.3 ×
10–3–7.4 × 10–3), and the S/M ratios varied from 17.00:1 to
69.50:1. The average estimated mutation rates of the five regions
were statistically different in CSEH (χ2 = 23.774, p < 0.001). In
addition, the average mutation rates of the short (p) and long
(q) arms were 6.24 × 10–3 (95% CI, 5.3 × 10–3–7.4 × 10–3) and
4.35 × 10–3 (95% CI, 3.7 × 10–3–4.9 × 10–3), a statistically
significant difference (χ2 = 8.911, p = 0.003).
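The p values quoted with the chi-square statistics above can be checked from the statistics themselves. The sketch below is a minimal upper-tail function that handles only one degree of freedom or an even number of degrees of freedom, where closed forms exist. The degrees of freedom are assumptions (1 for the comparison of two arms, 4 for the comparison of five regions), and the function name is hypothetical.

```python
from math import erfc, exp, sqrt


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate for df = 1 or even df."""
    if df == 1:
        return erfc(sqrt(x / 2))
    if df > 0 and df % 2 == 0:
        half = x / 2
        term, total = 1.0, 1.0
        # Closed form for even df: exp(-x/2) * sum_{i < df/2} (x/2)^i / i!
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return exp(-half) * total
    raise ValueError("this sketch handles only df = 1 or even df")


# Short arm vs. long arm, assuming 1 degree of freedom.
p_arms = chi2_sf(8.911, 1)
# Five chromosomal regions, assuming 4 degrees of freedom.
p_regions = chi2_sf(23.774, 4)

print(f"arms:    p = {p_arms:.4f}")
print(f"regions: p = {p_regions:.2e}")
```

Under these assumptions both p values fall below the conventional 0.05 threshold, so both comparisons indicate a statistically significant difference.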
Following the classification method of Ge et al. [43], we divided
the allele sizes of each Y-STR marker into quartiles according to
its allelic ladder (Fig. 2B): short (0–25%), moderate-short
(26–50%), moderate-long (51–75%), and long (76–100%) alleles.
Detailed information for the groups categorized by allele size is
given in Supplementary Table S17. The mutation rate of long
alleles (17.98 × 10–3, 95% CI, 13.8 × 10–3–23 × 10–3) was
markedly higher than those of moderate-long (4.92 × 10–3, 95% CI,
4.2 × 10–3–5.7 × 10–3), moderate-short (3.38 × 10–3, 95% CI,
2.8 × 10–3–4.1 × 10–3), and short (4.36 × 10–3, 95% CI, 2.6 ×
10–3–6.8 × 10–3) alleles, indicating that long alleles were more
prone to mutate than short and moderate alleles in CSEH. Most
mutation events (289/369 = 78.32%) occurred in moderate alleles
(moderate-short and moderate-long combined). The G/L ratio of the
long allele group was 0.39:1, while the S/M ratio of the short
allele group was only 0.12:1. In addition, as Fig. 2B shows,
repeat losses occurred more frequently in longer alleles.
Figure 2. Correlations between the mutation rates of Y-STRs and associated factors.
A. Spearman correlations between the estimated mutation rates of 62 Y-STRs and relevant factors in CSEH (AAS, average allele size for
each Y-STR; GD, genetic diversity value for each Y-STR; Lo, location of each Y-STR on the Y chromosome; LM, length of repetitive motif for each
Y-STR); B. Detailed information of four different groups categorized by allele sizes in CSEH (mutation counts, mutation rates, proportions
of gains and losses, and G/L ratios).
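The quartile grouping of allele sizes described in Section 3.6 can be sketched as follows. This is one plausible reading only: the allele is placed by its relative position between the smallest and largest ladder alleles, the handling of boundaries and off-ladder alleles is an assumption, and the ladder range in the example is hypothetical.

```python
def size_group(allele, ladder_min, ladder_max):
    """Assign an allele (in repeat units) to a quartile group according to its
    relative position within the allelic ladder range of its marker."""
    if not ladder_min <= allele <= ladder_max:
        raise ValueError("allele lies outside the ladder range")
    span = ladder_max - ladder_min
    position = (allele - ladder_min) / span if span else 0.0
    if position <= 0.25:
        return "short"
    if position <= 0.50:
        return "moderate-short"
    if position <= 0.75:
        return "moderate-long"
    return "long"


# Hypothetical marker whose ladder spans 10 to 20 repeats.
for allele in (11, 14, 17, 20):
    print(allele, size_group(allele, 10, 20))
```

Mutation counts and meioses can then be tallied per group to obtain group-specific rates of the kind shown in Fig. 2B.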
3.7 Candidate RM Y-STR set in CSEH
Using a Y-STR mutation rate above 10–2 as the criterion, we
identified a candidate set of eight RM Y-STRs in CSEH (Fig. 1B):
DYS526b, DYS534, DYS576, DYS627, DYS630, DYS713,
DYF403S1a1/a2/a3, and DYF399S1a/b/c. DYF403S1a1/a2/a3 and
DYF399S1a/b/c have three copies each, and the remaining six
Y-STRs are single-copy markers. Three of them (DYS534, DYS630,
and DYS713) are new candidate RM Y-STRs in CSEH.
The average estimated mutation rate of our candidate RM Y-STR set
was 14.40 × 10–3 (95% CI, 12.4 × 10–3–16.7 × 10–3), ranging from
10.70 × 10–3 (95% CI, 5.4 × 10–3–19.1 × 10–3) at DYS534 and
DYS576 to 19.95 × 10–3 (95% CI, 15.3 × 10–3–25.5 × 10–3) at
DYF399S1a/b/c. A total of 178 mutation events were observed, with
an S/M ratio of 43.5:1; of the multi-step mutations, three were
double-step and only one was triple-step. Moreover, we observed
87 repeat gains and 91 repeat losses, giving an almost balanced
G/L ratio (0.96:1). Our candidate RM Y-STR set, the 13 RM Y-STR
set reported by Ballantyne et al. [14], and Set 8 (Supporting
Information Figure S1 and Table S12) were compared (Supporting
Information Table S18); the performance and system effectiveness
of the three sets were roughly the same, whether in CSEH as a
whole or separately in GDH, FJH, and ZJH. In addition, we used
our candidate RM Y-STR set and the 13 RM Y-STR set to evaluate
the ability to differentiate male relatives. The probabilities of
observing at least one mutation with the eight RM Y-STR set and
the 13 RM Y-STR set in 1015 DNA-confirmed father-son pairs
(including only one father-one son pairs in CSEH) were 0.1852
(95% CI, 0.1618–0.2105) and 0.2099 (95% CI, 0.1852–0.2362),
respectively. This ability (distinguishing 18.52–20.99% of CSEH
father–son pairs) was much lower than that of the 13 RM Y-STR set
in Europe (70% of father–son pairs) because of the limited Y-STR
markers and the different mutation rates in different populations
[14,23,43–60]. This indicates that the 13 RM Y-STR set is not
suitable for every population, and that new RM Y-STRs need to be
found and evaluated for different populations.
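The pooled mutation rate of the candidate RM Y-STR set and its binomial confidence interval can be reproduced approximately from the Table 2 counts. The sketch below assumes that the "binomial 95% CI" is the exact (Clopper–Pearson) interval, which the text does not state explicitly; it uses only the standard library and finds the limits by bisection. The function names are illustrative, and 188 is the pair count implied by 0.1852 × 1015.

```python
from math import exp, lgamma, log


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with each term computed in log space."""
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 1.0 if k >= n else 0.0
    total = 0.0
    for i in range(k + 1):
        log_term = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                    + i * log(p) + (n - i) * log(1 - p))
        total += exp(log_term)
    return min(total, 1.0)


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided binomial confidence interval for k successes in n trials."""
    def solve(cdf_at, target):
        # cdf_at(p) decreases as p grows, so bisect for cdf_at(p) == target.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if cdf_at(mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else solve(lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else solve(lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# Eight candidate RM Y-STRs: mutation counts and meioses summed from Table 2.
RM_MUTATIONS = 19 + 11 + 11 + 14 + 13 + 12 + 36 + 62
RM_MEIOSES = 1029 + 1028 + 1028 + 1028 + 1032 + 1028 + 3083 + 3107

rm_rate = RM_MUTATIONS / RM_MEIOSES
rm_ci = clopper_pearson(RM_MUTATIONS, RM_MEIOSES)
# A marker with no observed mutation in 1027 meioses (e.g. DYS388).
zero_ci = clopper_pearson(0, 1027)
# Father-son pairs with at least one mutation at the eight RM Y-STRs.
pair_ci = clopper_pearson(188, 1015)

print(f"RM set rate: {rm_rate * 1e3:.2f} x 10^-3, "
      f"CI {rm_ci[0] * 1e3:.1f}-{rm_ci[1] * 1e3:.1f} x 10^-3")
print(f"0/1027: CI 0-{zero_ci[1] * 1e3:.1f} x 10^-3")
print(f"188/1015: CI {pair_ci[0]:.4f}-{pair_ci[1]:.4f}")
```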
3.8 Population genetics
The Coastal Southeastern Han (CSEH), who speak varieties of Han
Chinese including Huetseu Dialect for ZJH, Min Chinese for FJH,
and Hakka Chinese and Cantonese for GDH, live in southeastern
China (Fig. 3C), which borders the East China Sea, the Taiwan
Strait, and the South China Sea. To assess the degree of
differentiation of CSEH, we collected 50 778 individuals from 166
worldwide populations belonging to nine main language families
[21,24,30–32,61–116]. Based on the frequencies of the 27 Yfiler
Plus markers, a Manhattan distance-based MDS plot (Fig. 3D) showed
that CSEH clustered with populations of southern China, in
accordance with geographical and historical records [117–119].
The MDS plot comparing CSEH with 85 Chinese populations (Fig. 3E)
led to the same conclusion. The pairwise RST values and
corresponding p values between CSEH and 11 other Chinese Han
populations [32,81,82,91,92,120–126] from different
administrative divisions (Fig. 3C), based on 27 Yfiler Plus
haplotypes, are listed in Supplementary Table S20. After
Bonferroni's correction of p values (p < 0.00005), CSEH differed
significantly from Guangxi Han (RST = 0.0434, p < 0.00005),
Guizhou Han (RST = 0.0083, p < 0.00005), Hainan Han (RST =
0.0067, p < 0.00005), and Shanghai Han (RST = 0.0079, p <
0.00005), while there were no significant differences between
CSEH and the other southern and eastern Han Chinese populations.
Furthermore, the phylogenetic relationships were illustrated with
a Neighbor-Joining phylogenetic tree. As shown in Fig. 3F, Hainan
Han was relatively separated from the other southern Han Chinese,
in accordance with our previous study [30]. Populations from
southern and eastern China clustered together according to their
geographical locations, and CSEH was located among them in a
relatively isolated position. In general, CSEH is a southern Han
Chinese population, yet it is relatively distant from other
southern and eastern Han Chinese.
Figure 3. Geographical location of CSEH and population genetics analyses between CSEH and various numbers of worldwide populations.
A. Locations of worldwide populations for population genetics; B. Detailed information of Chinese populations for population genetics;
C. Geographical location of CSEH; D. MDS plot between CSEH and 166 populations (50 778 individuals in total) all over the world; E. MDS
plot between CSEH and 85 Chinese populations based on the frequencies of 27 Yfiler Plus Y-STRs; F. Phylogenetic tree constructed by
Neighbor-Joining method.
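The Manhattan distance that underlies the MDS plots in Section 3.8 can be sketched for allele-frequency profiles as the sum of absolute frequency differences over all loci and alleles. The nested-dictionary layout, the function names, and the two toy populations are all hypothetical; whether the authors normalized the distance (for example by the number of loci) is not stated, and the MDS step itself is omitted.

```python
def manhattan_distance(freq_a, freq_b):
    """Sum of absolute allele-frequency differences over all loci and alleles.

    Each argument maps locus name -> {allele: frequency}; alleles missing from
    one population are treated as having frequency 0 there.
    """
    total = 0.0
    for locus in set(freq_a) | set(freq_b):
        alleles_a = freq_a.get(locus, {})
        alleles_b = freq_b.get(locus, {})
        for allele in set(alleles_a) | set(alleles_b):
            total += abs(alleles_a.get(allele, 0.0) - alleles_b.get(allele, 0.0))
    return total


def distance_matrix(populations):
    """Symmetric matrix of pairwise Manhattan distances, in input order."""
    names = list(populations)
    return [[manhattan_distance(populations[a], populations[b]) for b in names]
            for a in names]


# Hypothetical allele frequencies for two populations at two loci.
toy_populations = {
    "pop_x": {"DYS19": {14: 0.2, 15: 0.5, 16: 0.3}, "DYS391": {10: 0.7, 11: 0.3}},
    "pop_y": {"DYS19": {14: 0.1, 15: 0.6, 16: 0.3},
              "DYS391": {10: 0.6, 11: 0.3, 12: 0.1}},
}

matrix = distance_matrix(toy_populations)
print(f"distance(pop_x, pop_y) = {matrix[0][1]:.2f}")
```

A matrix built this way for all populations would be the input to a multidimensional scaling routine.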
4 Discussion
In the present study, the paternal Y-STR landscape of CSEH was
depicted in terms of forensic characteristics, haplotype
analyses, mutation rates, and population genetics. We screened 62
tri-, tetra-, penta-, and hexanucleotide Y-STR markers for
haplotypes and mutations in a reasonably large sample of up to
1021 unrelated males and 1027 DNA-confirmed father-son pairs per
marker using two multiplex fluorescence-based kits. This study
therefore considerably increases current knowledge of the CSEH
population and its Y-STR mutation rates.
The GD values of most multi-copy Y-STR markers in CSEH were
greater than 0.9; the exception was DYS459a/b, with a GD of
0.6378, in accordance with Shaanxi Han (0.6410, n = 105) [127]
and Caucasians (0.6667, n = 148) [128]. Generally, multi-copy
Y-STRs are more diverse than single-copy Y-STRs in forensic
genetics. From the haplotype analyses of CSEH, GDH, FJH, and ZJH,
we found that as the number of analyzed Y-STR loci increased,
more unique haplotypes were obtained and HD and DC increased,
whereas the MP values decreased, in line with the results of our
previous study [30]. At the same time, the number of shared
haplotypes decreased as the number of Y-STRs increased. Some
shared haplotypes are nevertheless inevitable in regional
populations, such as at the provincial level, owing to
small-scale migrations and population expansions, which will
become more frequent with the rapid development of economies and
transportation. Therefore, selecting more suitable Y-STRs for
region-wide applications and combining Y-STRs with other genetic
markers, such as Y-SNPs, are necessary in forensics [90].
A total of 76 099 meiotic transfers were directly observed across
all 62 markers in 1027 CSEH father-son pairs. The resulting
average Y-STR mutation rate, 4.85 × 10–3 (95% CI, 4.4 ×
10–3–5.4 × 10–3), corresponded closely to those of the Southern
Chinese Han population studied by Wang et al. [24] (4.8 × 10–3,
95% CI, 2.3 × 10–3–8.3 × 10–3) and the Eastern Chinese Han
population reported by Wu et al. [23] (4.1 × 10–3, 95% CI, 3.6 ×
10–3–4.7 × 10–3). The GD values (R2 = 0.6548, p = 7.788 × 10–9)
and the average allele sizes (R2 = 0.5989, p = 2.723 × 10–7)
correlated positively with the estimated mutation rates of the 62
Y-STRs in CSEH. However, the Spearman correlations showed no
apparent relationship between mutation rates and either Y-STR
location or the length of the repetitive motif in CSEH. The
average estimated mutation rates of the five regions (Fig. 1B and
Supporting Information Table S16) were statistically different in
CSEH (χ2 = 23.774, p < 0.001), and the mutation rates increased
significantly with distance of the Y-STR location from the
centromere (from Yq11.21 to Yq11.223), in accordance with a
previous study [129]. In addition, the long allele group was more
prone to mutate than the short and moderate allele groups, and
repeat losses occurred more frequently in longer alleles in CSEH,
consistent with a previous study [43].
Our candidate RM Y-STR set, the 13 RM Y-STR set reported by
Ballantyne et al. [14], and Set 8 (Supporting Information Figure
S1 and Table S12) were compared; the performance and system
effectiveness of the three sets were roughly the same, whether in
CSEH as a whole or separately in GDH, FJH, and ZJH. In addition,
we used our candidate RM Y-STR set and the 13 RM Y-STR set to
evaluate the ability to differentiate male relatives. The
probabilities of observing at least one mutation with the 8 RM
Y-STR set and the 13 RM Y-STR set in 1015 DNA-confirmed
father-son pairs (including only one father-one son pairs in
CSEH) were 0.1852 (95% CI, 0.1618–0.2105) and 0.2099 (95% CI,
0.1852–0.2362), respectively. This ability (distinguishing
18.52–20.99% of CSEH father–son pairs) was much lower than that
of the 13 RM Y-STR set in Europeans (70% of father–son pairs)
because of the limited Y-STR markers and the different mutation
rates in different populations (Supporting Information Table
S19). This indicates that the 13 RM Y-STR set is not suitable for
every population, and that new RM Y-STRs need to be found and
evaluated for different populations.
From the comparisons of Y-STR mutation rates in different
populations [14,23,43–60], the mutation rate of DYF399S1a/b was
21.19 × 10–3 in CSEH, compared with 77.3 × 10–3 for Europeans,
70.1 × 10–3 for Pakistanis, 62.9 × 10–3 for Hebei Han, 60.0 ×
10–3 for south and east Turks, 45.9 × 10–3 for Beijing Han,
40.6 × 10–3 for Guangxi Han, and 38.5 × 10–3 for Sichuan Yi, that
is, about 1.8 to 3.6 times the CSEH rate. The same pattern
occurred at other Y-STRs, for example, DYS518 and DYF387S1. The
same Y-STR has different mutation rates in different populations,
and even in the same population from different regions. Thus,
clarifying the background Y-STR mutation rates in different
populations or regions is of great benefit for forensic
applications. We identified eight candidate RM Y-STRs, including
three new ones (DYS534, DYS630, and DYS713) which, as far as we
know, are reported here for the first time in CSEH or any other
Chinese population. In addition, compared with the European-based
set of 13 RM Y-STRs [14], our candidate RM Y-STR set had
inadequate capability to differentiate between male relatives
(CSEH father–son pairs). Even though RM Y-STRs have potential
applications in male relative differentiation [22], the limited
number of RM Y-STR candidates in CSEH restricts their use in
forensic genetics and genetic genealogy, particularly regarding
close patrilineal relatives and the different mutation rates of
the same Y-STR marker in different populations. Ralf et al. [11]
applied a newly developed in silico search approach to the
Y-chromosome reference sequence and highlighted 12 novel RM
Y-STRs in Europe. This is one efficient way to strengthen the
capacity of an RM Y-STR system to differentiate related or
unrelated males. Another relatively simple and cost-effective way
is to validate and select currently known Y-STRs as RM Y-STR
markers adapted to different populations for regional individual
identification. Both methods make the strategy (SM Y-STRs for
familial searching and RM Y-STRs for individual identification)
possible to apply in practice.
Three waves of large-scale migration took place in the past two
millennia of Chinese history, during the Western Jin Dynasty (AD
265–316), the Tang Dynasty (AD 618–907), and the Southern Song
Dynasty (AD 1127–1279), together with many smaller southward
migrations as Han people moved continuously southward because of
warfare, drought, and famine in the north. These movements
accelerated the expansion of Han people and Han culture into
southern China, and gene flow between northern Hans, southern
Hans, and southern natives contributed to the admixture that
shaped the genetic profiles of the extant populations
[117–119,130–132]. The population genetic results above indicated
that CSEH is a southern Han Chinese population with a relatively
isolated position among southern Han Chinese; however, the role
of limited gene flow from southern and eastern populations in
shaping the gene pool of CSEH cannot be ignored. From the genetic
and geographical perspectives, it is necessary to clarify the
distribution and background of CSEH living in population-mixed
South China and to gain sufficient knowledge of Y-STR mutability
for forensic applications, especially for using SM Y-STRs for
familial searching and RM Y-STRs for individual identification
regionally.
5 Concluding remarks
In conclusion, we investigated 62 Y-STR markers (74 Y-STR loci in
total) in a reasonably large sample of up to 1021 unrelated males
and 1027 DNA-confirmed father-son pairs from 1021 families in
CSEH. The allele frequencies ranged from 0.0010 to 0.9490, and
the GD values varied from 0.0973 to 0.9952 with an average of
0.7269. We found 85 null alleles, 121 off-ladder alleles, and 95
copy number variants, detected at 10, 22, and 24 Y-STRs,
respectively. A total of 1012 distinct haplotypes were
determined, with overall HD and DC values of 0.999974 and 0.9912.
We directly observed 369 mutations in 76 099 meiotic transfers,
giving an S/M ratio of 35.90:1 (359 single-step, 7 double-step,
and 3 triple-step changes) and an average estimated Y-STR
mutation rate of 4.85 × 10–3 (95% CI, 4.4 × 10–3–5.4 × 10–3)
across all 62 markers in CSEH. A slight excess of repeat gains
(201, 54.47%) over repeat losses (168, 45.53%) was observed among
the 369 mutations in CSEH, for an overall G/L ratio of 1.20:1,
while the G/L ratios differed significantly among GDH, FJH, and
ZJH (χ2 = 13.113, p = 0.001). The average estimated mutation rate
of single-copy markers (4.09 × 10–3, 95% CI, 3.6 × 10–3–4.7 ×
10–3) was significantly lower than that of multi-copy markers
(6.89 × 10–3, 95% CI, 5.8 × 10–3–8.1 × 10–3) in CSEH (χ2 =
24.107, p < 0.001). In CSEH, GD values (R2 = 0.6548, p = 7.788 ×
10–9) and average allele sizes (R2 = 0.5989, p = 2.723 × 10–7)
correlated positively with mutation rates. In addition, the
average estimated mutation rates of the five chromosomal regions
were statistically different in CSEH (χ2 = 23.774, p < 0.001). We
identified eight candidate RM Y-STRs in CSEH (DYS526b, DYS576,
DYS627, DYF403S1a1/a2/a3, DYF399S1a/b/c, and three new RM Y-STRs,
DYS534, DYS630, and DYS713), with which 18.52% of father–son
pairs could be distinguished by one or more mutations. CSEH is a
southern Han Chinese population with a relatively isolated
position in population-mixed South China. It is necessary to
clarify the distribution and background of CSEH and to gain
sufficient knowledge of Y-STR mutability for forensic
applications, especially for using SM Y-STRs for familial
searching and RM Y-STRs for individual identification regionally.
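The summary ratios quoted in the concluding remarks follow directly from the reported counts, as the short sketch below shows. The variable names are illustrative. The S/M ratio is taken as single-step over multi-step mutations, the G/L ratio as gains over losses, and DC as distinct haplotypes over individuals typed; the last is the usual definition of discrimination capacity and is assumed here rather than quoted from the text.

```python
# Counts reported for CSEH in the concluding remarks.
single_step, double_step, triple_step = 359, 7, 3
gains, losses = 201, 168
meioses = 76099
distinct_haplotypes, unrelated_males = 1012, 1021

mutations = single_step + double_step + triple_step

# Single-step to multi-step ratio (S/M).
sm_ratio = single_step / (double_step + triple_step)
# Repeat gains to repeat losses ratio (G/L) and the share of gains.
gl_ratio = gains / losses
gain_share = gains / mutations
# Pooled mutation rate per meiosis.
mutation_rate = mutations / meioses
# Discrimination capacity, assuming DC = distinct haplotypes / individuals.
dc = distinct_haplotypes / unrelated_males

print(f"S/M = {sm_ratio:.2f}:1, G/L = {gl_ratio:.2f}:1")
print(f"gains = {gain_share:.2%}, rate = {mutation_rate * 1e3:.2f} x 10^-3")
print(f"DC = {dc:.4f}")
```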
Highlights
1. A set of 62 tri-, tetra-, penta-, and hexanucleotide Y-STR
markers was analyzed by multiplex fluorescence-based fragment
length analysis in 1027 DNA-confirmed father-son pairs of CSEH.
2. The allele frequencies ranged from 0.0010 to 0.9490, and
the GD values varied from 0.0973 to 0.9952 with an average of
0.7269. In addition, 1012 distinct haplotypes were determined,
with overall HD and DC values of 0.999974 and 0.9912.
3. A total of 369 mutations were observed in 76 099 meiotic
transfers, giving S/M and G/L ratios of 35.90:1 and 1.20:1,
and the average estimated Y-STR mutation rate was 4.85 × 10–3
(95% CI, 4.4 × 10–3–5.4 × 10–3) in CSEH.
4. CSEH, a southern Han Chinese population, has a relatively
isolated position in population-mixed South China.
5. The strategy, SM Y-STRs for familial searching and RM
Y-STRs for individual identification regionally, could be
applicable and meaningful based on enough knowledge
of the Y-STR mutability of different populations.
First, we would like to thank all donors for this study. In
addition, we would especially like to thank Qiqian Xie for
statistical analyses (all mean comparison tests by SPSS),
Zhonghao Yu for data preparation (Supporting Information Tables
S3–S7 and Figures 3A and B), and Cheng Xiao for confirmation
experiments on variant alleles. This study was supported by the
Science and Technology Program of Guangzhou, China (grant number:
2019030014), and the Science and Technology Planning Project of
Guangdong Province, China (2013B021500010).
The authors have declared no conflict of interest.
Data availability statement
The data that support the findings of this study are available
in the supplementary material of this article.
Ethics
All participants signed the informed consent form (for underage
sons, the form was signed by their fathers in accordance with
Chinese laws and regulations). The study was approved by the
Biomedical Ethical Committee of the Southern Medical University
(No. 2020-001).
6 References
[1] Kayser, M., Hum. Genet. 2017, 136, 621–635.
[2] Kayser, M., Vermeulen, M., Knoblauch, H., Schuster, H.,
Krawczak, M., Roewer, L., Forensic Sci. Int. Genet. 2007,
1, 125–128.
[3] Underhill, P. A., Kivisild, T., Ann. Rev. Genet. 2007, 41,
539–564.
[4] Calafell, F., Larmuseau, M. H. D., Hum. Genet. 2017, 136,
559–573.
[5] Kayser, M., Brauer, S., Weiss, G., Underhill, P. A.,
Roewer, L., Schiefenhovel, W., Stoneking, M., Curr.
Biol. 2000, 10, 1237–1246.
[6] Jobling, M. A., Tyler-Smith, C., Nat. Rev. Genet. 2017,
18, 485–497.
[7] Hughes, J. F., Page, D. C., Nat. Genet. 2016, 48, 588–
589.
[8] Parson, W., Niederstatter, H., Brandstatter, A., Berger,
B., Int. J. Legal Med. 2003, 117, 109–114.
[9] Prinz, M., Boll, K., Baum, H., Shaler, B., Forensic Sci. Int.
1997, 85, 209–218.
[10] Roewer, L., Forensic Sci. Med. Pathol. 2009, 5, 77–84.
[11] Ralf, A., Lubach, D., Kousouri, N., Winkler, C., Schulz,
I., Roewer, L., Purps, J., Lessig, R., Krajewski, P.,
Ploski, R., Dobosz, T., Henke, L., Henke, J., Larmuseau,
M. H. D., Kayser, M., Hum. Mutat. 2020, 41, 1680–
1696.
[12] Ballantyne, K. N., Keerl, V., Wollstein, A., Choi, Y., Zu-
niga, S. B., Ralf, A., Vermeulen, M., de Knijff, P., Kayser,
M., Forensic Sci. Int. Genet. 2012, 6, 208–218.
[13] Ballantyne, K. N., Kayser, M., Forensic Sci. Rev. 2012,
24, 63–78.
[14] Ballantyne, K. N., Goedbloed, M., Fang, R., Schaap,
O., Lao, O., Wollstein, A., Choi, Y., van Duijn, K., Ver-
meulen, M., Brauer, S., Decorte, R., Poetsch, M., von
Wurmb-Schwark, N., de Knijff, P., Labuda, D., Vezina, H.,
Knoblauch, H., Lessig, R., Roewer, L., Ploski, R., Dobosz,
T., Henke, L., Henke, J., Furtado, M. R., Kayser, M., Am.
J. Hum. Genet. 2010, 87, 341–353.
[15] Ballantyne, K. N., Ralf, A., Aboukhalid, R., Achakzai,
N. M., Anjos, M. J., Ayub, Q., Balazic, J., Ballantyne,
J., Ballard, D. J., Berger, B., Bobillo, C., Bouabdellah,
M., Burri, H., Capal, T., Caratti, S., Cardenas, J., Car-
tault, F., Carvalho, E. F., Carvalho, M., Cheng, B., Coble,
M. D., Comas, D., Corach, D., D’Amato, M. E., Davi-
son, S., de Knijff, P., De Ungria, M. C., Decorte, R.,
Dobosz, T., Dupuy, B. M., Elmrghni, S., Gliwinski, M.,
Gomes, S. C., Grol, L., Haas, C., Hanson, E., Henke,
J., Henke, L., Herrera-Rodriguez, F., Hill, C. R., Holm-
lund, G., Honda, K., Immel, U. D., Inokuchi, S., Jobling,
M. A., Kaddura, M., Kim, J. S., Kim, S. H., Kim, W.,
King, T. E., Klausriegler, E., Kling, D., Kovacevic, L., Ko-
vatsi, L., Krajewski, P., Kravchenko, S., Larmuseau, M.
H., Lee, E. Y., Lessig, R., Livshits, L. A., Marjanovic,
D., Minarik, M., Mizuno, N., Moreira, H., Morling, N.,
Mukherjee, M., Munier, P., Nagaraju, J., Neuhuber, F.,
Nie, S., Nilasitsataporn, P., Nishi, T., Oh, H. H., Olofsson,
J., Onofri, V., Palo, J. U., Pamjav, H., Parson, W., Pet-
lach, M., Phillips, C., Ploski, R., Prasad, S. P., Primorac,
D., Purnomo, G. A., Purps, J., Rangel-Villalobos, H., Re-
bala, K., Rerkamnuaychoke, B., Gonzalez, D. R., Robino,
C., Roewer, L., Rosa, A., Sajantila, A., Sala, A., Salvador,
J. M., Sanz, P., Schmitt, C., Sharma, A. K., Silva, D. A.,
Shin, K. J., Sijen, T., Sirker, M., Sivakova, D., Skaro, V.,
Solano-Matamoros, C., Souto, L., Stenzl, V., Sudoyo, H.,
Syndercombe-Court, D., Tagliabracci, A., Taylor, D., Till-
mar, A., Tsybovsky, I. S., Tyler-Smith, C., van der Gaag,
K. J., Vanek, D., Volgyi, A., Ward, D., Willemse, P., Yap,
E. P., Yong, R. Y., Pajnic, I. Z., Kayser, M., Hum. Mutat.
2014, 35, 1021–1032.
[16] Adnan, A., Ralf, A., Rakha, A., Kousouri, N., Kayser, M.,
Forensic Sci. Int. Genet. 2016, 25, 45–51.
[17] Boattini, A., Sarno, S., Bini, C., Pesci, V., Barbieri, C., De
Fanti, S., Quagliariello, A., Pagani, L., Ayub, Q., Ferri,
G., Pettener, D., Luiselli, D., Pelotti, S., PLoS One 2016,
11, e0165678.
[18] Boattini, A., Sarno, S., Mazzarisi, A. M., Viroli, C., De
Fanti, S., Bini, C., Larmuseau, M. H. D., Pelotti, S.,
Luiselli, D., Sci. Rep. 2019, 9, 9032.
[19] Niederstatter, H., Berger, B., Kayser, M., Parson, W.,
Forensic Sci. Int. Genet. 2016, 24, 180–193.
[20] Turrina, S., Caratti, S., Ferrian, M., De Leo, D., Transfusion
2016, 56, 533–538.
[21] Westen, A. A., Kraaijenbrink, T., Clarisse, L., Grol, L. J.,
Willemse, P., Zuniga, S. B., Robles de Medina, E. A.,
Schouten, R., van der Gaag, K. J., Weiler, N. E., Kal, A.
J., Kayser, M., Sijen, T., de Knijff, P., Forensic Sci. Int.
Genet. 2015, 14, 174–181.
[22] Roewer, L., WIREs Forensic Sci. 2019, 1, e1336.
[23] Wu, W., Ren, W., Hao, H., Nan, H., He, X., Liu, Q., Lu, D.,
Int. J. Legal Med. 2018, 132, 1317–1319.
[24] Wang, Y., Zhang, Y. J., Zhang, C. C., Li, R., Yang, Y., Ou,
X. L., Tong, D. Y., Sun, H. Y., Forensic Sci. Int. Genet.
2016, 21,59.
[25] Weng, W., Liu, H., Liu, C., Li, S., Liu, C., Wang, H., Nan
fang yi ke da xue xue bao (J. South. Med. Univ.) 2013,
33, 412–415.
[26] Scientic Working Group on DNA Analysis Methods
(SWGDAM), Revised Validation Guidelines,SWGDAM
2004.
[27] Gusmão, L., Butler, J. M., Carracedo, A., Gill, P., Kayser,
M., Mayr, W. R., Morling, N., Prinz, M., Roewer, L.,
Tyler-Smith, C., Schneider, P. M., International Society
of Forensic Genetics, Int. J. Legal Med. 2006, 120, 191–
200.
[28] Roewer, L., Andersen, M. M., Ballantyne, J., Butler, J.
M., Caliebe, A., Corach, D., D’Amato, M. E., Gusmão, L.,
Hou, Y., de Knijff, P., Parson, W., Prinz, M., Schneider, P.
M., Taylor, D., Vennemann, M., Willuweit, S., Forensic
Sci. Int. Genet. 2020, 48, 102308.
[29] Excofer, L., Lischer, H. E., Mol. Ecol. Resour. 2 010, 10 ,
564–567.
[30] Fan, H., Wang, X., Chen, H., Long, R., Liang, A., Li, W.,
Chen, J., Wang, W., Qu, Y., Song, T., Zhang, P., Deng, J.,
Forensic Sci. Int. Genet. 2018, 37, e6–e11.
[31] Fan, H., Wang, X., Chen, H., Zhang, X., Huang, P., Long,
R., Liang, A., Song, T., Deng, J., Forensic Sci. Int. Genet.
2018, 34, e20–e22.
[32] Fan, H., Zhang, X., Wang, X., Ren, Z., Li, W., Long, R.,
Liang, A., Chen, J., Song, T., Qu, Y., Deng, J., Forensic
Sci. Int. Genet. 2018, 33, e9–e10.
[33] Hansen, J., Am. Stat. 2000, 59,113.
[34] Excofer, L., Smouse, P. E., Genetics 1994, 136, 343–
359.
[35] Excofer, L., Smouse, P. E., Quattro, J. M., Genetics
1992, 131, 479–491.
[36] Kumar, S., Stecher, G., Tamura, K., Mol. Biol. Evol. 2016,
33, 1870–1874.
[37] Saitou, N., Nei, M., Mol. Biol. Evol. 1987, 4, 406–425.
[38] Letunic, I., Bork, P., Nucleic Acids Res. 2019, 47, W256–
W259.
[39] Rolf, B., Wiegand, P., Brinkmann, B., Forensic Sci. Int.
2002, 126, 200–202.
[40] Huel, R. L., Basic, L., Madacki-Todorovic, K., Smajlovic,
L., Eminovic, I., Berbic, I., Milos, A., Parsons, T. J., Croa-
tian Med. J. 2007, 48, 494–502.
[41] Shi, W., Massaia, A., Louzada, S., Banerjee, R., Hallast,
P., Chen, Y., Bergstrom, A., Gu, Y., Leonard, S., Quail,
M. A., Ayub, Q., Yang, F., Tyler-Smith, C., Xue, Y., Hum.
Genet. 2018, 137, 73–83.
[42] Jobling, M. A., Cytogenet. Genome Res. 2008, 123, 253–
262.
[43] Ge, J., Budowle, B., Aranda, X. G., Planz, J. V., Eisen-
berg, A. J., Chakraborty, R., Forensic Sci. Int. Genet.
2009, 3, 179–184.
[44] Ay, M., Serin, A., Sevay, H., Gurkan, C., Canan, H., Ann.
Hum. Biol. 2018, 45, 506–515.
[45] Chen Shuiqin, L. Q., Rui, F., Yang, Z., Lin, Z., Yong,
F., Minghao, S., Lei, Z., Gao, W., Jianying, Y., Chin. J.
Forensic Med. 2019, 34, 454–458.
[46] Claerhout, S., Vandenbosch, M., Nivelle, K., Gruyters,
L., Peeters, A., Larmuseau, M. H. D., Decorte, R., Foren-
sic Sci. Int. Genet. 2018, 34,110.
[47] Goedbloed, M., Vermeulen, M., Fang, R. N., Lembring,
M., Wollstein, A., Ballantyne, K., Lao, O., Brauer, S.,
Kruger, C., Roewer, L., Lessig, R., Ploski, R., Dobosz, T.,
Henke, L., Henke, J., Furtado, M. R., Kayser, M., Int. J.
Legal Med. 2009, 123, 471–482.
[48] Hohoff, C., Dewa, K., Sibbing, U., Hoppe, K., Forster, P.,
Brinkmann, B., Int. J. Legal Med. 2007, 121, 359–363.
[49] Lin, H., Ye, Q., Tang, P., Mo, T., Yu, X., Tang, J., Legal
Med. 2020, 42, 101643.
[50] Liu Yaju, G. L., Shi, M., Li, Z., Chin. J. Forensic Med.
2016, 31, 22–26.
[51] Oh, Y. N., Lee, H. Y., Lee, E. Y., Kim, E. H., Yang, W.
I., Shin, K. J., Forensic Sci. Int. Genet. 2015, 15, 64–
68.
[52] Petrovic, V., Kecmanovic, M., Keckarevic Markovic, M.,
Keckarevic, D., Forensic Sci. Int. Genet. 2019, 39, e5–
e9.
[53] Wang, Q., Jin, B., An, G., Zhong, Q., Chen, M., Luo, X.,
Li, Z., Jiang, Y., Liang, W., Zhang, L., Int. J. Legal Med.
2019, 133, 45–50.
[54] Wang Xinjie, X. X., Lei, H., Jinye, W., Forensic Sci.
Technol. 2016, 41, 424–428.
[55] Weng, W., Liu, H., Li, S., Ge, J., Wang, H., Liu, C., Int. J.
Legal Med. 2013, 127, 369–372.
[56] Wu, W., Wang, H., Hao, H., Ren, W., Su, Y., Lv, D., Chin.
J. Forensic Med. 2015, 30, 256–259.
[57] Wu, W. W., Su, Y. J., Mei, X. L., Lu, D. J., Zhou, X., Hao,
H. L., Ren, W. Y., Liu, B., Fa yi xue za zhi 2018, 34,
411–416.
[58] Yuan, L., Chen, W., Zhao, D., Li, Y., Hao, S., Liu, Y., Lu,
D., Int. J. Legal Med. 2019, 133, 59–63.
[59] Zhang, W., Xiao, C., Yu, J., Wei, T., Liao, F., Wei, W.,
Huang, D., Int. J. Legal Med. 2017, 131, 345–350.
[60] Zhu Chuanhong, Z. X., Yi, S., Daixin, H., Qingqing, H.,
Fanye, X., Jing, S., Ping, N., Yan, L., Forensic Sci.
Technol. 2014, 1, 16–22.
[61] D’Atanasio, E., Iacovacci, G., Pistillo, R., Bonito, M.,
Dugoujon, J. M., Moral, P., El-Chennawi, F., Melhaoui,
M., Baali, A., Cherkaoui, M., Sellitto, D., Trombetta, B.,
Berti, A., Cruciani, F., Forensic Sci. Int. Genet. 2019, 38,
185–194.
[62] Iacovacci, G., D’Atanasio, E., Marini, O., Coppa, A.,
Sellitto, D., Trombetta, B., Berti, A., Cruciani, F., Forensic
Sci. Int. Genet. 2017, 27, 123–131.
[63] Shonhai, M., Nhiwatiwa, T., Nangammbi, T., Mazando,
S., Legal Med. 2020, 43, 101660.
[64] Al-Snan, N. R., Messaoudi, S. A., Khubrani, Y. M.,
Wetton, J. H., Jobling, M. A., Bakhiet, M., Mol. Genet.
Genomics 2020, 295, 1315–1324.
[65] Zhabagin, M., Sarkytbayeva, A., Tazhigulova, I.,
Yerezhepov, D., Li, S., Akilzhanov, R., Yeralinov, A.,
Sabitov, Z., Akilzhanova, A., Int. J. Legal Med. 2019,
133, 1029–1032.
[66] Khubrani, Y. M., Wetton, J. H., Jobling, M. A., Forensic
Sci. Int. Genet. 2018, 33, 98–105.
[67] Tariq Zeyad, A. A., Alghafri, R., Iratni, R., Forensic Sci.
Int. Rep. 2020, 2, 100057.
[68] Lacerenza, D., Aneli, S., Di Gaetano, C., Critelli, R.,
Piazza, A., Matullo, G., Culigioni, C., Robledo, R., Robino,
C., Calo, C., Forensic Sci. Int. Genet. 2017, 27, 172–174.
[69] Jankova, R., Seidel, M., Videtic Paska, A., Willuweit, S.,
Roewer, L., Forensic Sci. Int. Genet. 2019, 42, 165–170.
[70] Spolnicka, M., Dabrowska, J., Szablowska-Gnap, E.,
Paleczka, A., Jablonska, M., Zbiec-Piekarska, R., Pieta,
A., Boron, M., Konarzewska, M., Kostrzewa, G., Ploski,
R., Rogalla, U., Wozniak, M., Grzybowski, T., Forensic
Sci. Int. Genet. 2017, 28, e22–e25.
[71] Zgonjanin, D., Alghafri, R., Antov, M., Stojiljkovic, G.,
Petkovic, S., Vukovic, R., Draskovic, D., Forensic Sci. Int.
Genet. 2017, 31, e48–e49.
[72] Garcia, O., Yurrebaso, I., Mancisidor, I. D., Lopez, S.,
Alonso, S., Gusmão, L., Forensic Sci. Int. Genet. 2016,
20, e10–e12.
[73] Pickrahn, I., Muller, E., Zahrer, W., Dunkelmann, B.,
Cemper-Kiesslich, J., Kreindl, G., Neuhuber, F.,
Forensic Sci. Int. Genet. 2016, 21, 90–94.
[74] Henry, J., Dao, H., Scandrett, L., Taylor, D., Forensic Sci.
Int. Genet. 2019, 41, e23–e25.
[75] Stange, V. S., Silva Dos Reis, R., Mariano Garcia de
Souza Rodrigues, F., Lima Lugon, M., Mayumi Vieira,
C., de Paula, F., de Vargas Wolfgramm Dos Santos, E.,
Madeira Alvares da Silva-Conforti, A., Drumond Louro,
I., Gusmão, L., Forensic Sci. Int. Genet. 2019, 41, e20–
e22.
[76] Jannuzzi, J., Ribeiro, J., Alho, C., de Oliveira Lazaro, E.
A. G., Cicarelli, R., Simoes Dutra Correa, H., Ferreira,
S., Fridman, C., Gomes, V., Loiola, S., da Mota, M. F.,
Ribeiro-Dos-Santos, A., de Souza, C. A., de Sousa
Azulay, R. S., Carvalho, E. F., Gusmão, L., Forensic Sci. Int.
Genet. 2020, 44, 102163.
[77] Luo, Y., Wu, Y., Qian, E., Wang, Q., Wang, Q., Zhang, H.,
Wang, X., Zhang, H., Yang, M., Ji, J., Ren, Z., Zhang, Y.,
Tang, J., Huang, J., PLoS One 2019, 14, e0224601.
[78] Wang, C. Z., Su, M. J., Li, Y., Chen, L., Jin, X., Wen, S. Q.,
Tan, J., Shi, M. S., Li, H., Forensic Sci. Int. Genet. 2019,
40, e252–e255.
[79] Feng, R., Zhao, Y., Chen, S., Li, Q., Fu, Y., Zhao, L., Zhou,
Y., Zhang, L., Mei, X., Shi, M., Yin, J., Int. J. Legal Med.
2020, 134, 981–983.
[80] Liu, Y., Wang, C., Zhou, W., Li, X. B., Shi, M., Bai, R., Ma,
S., Forensic Sci. Int. Genet. 2019, 40, e264–e267.
[81] Wang, C. Z., Zhang, J. S., Li, X. B., Bai, R. F., Shi, M. S.,
Wang, C. C., Int. J. Legal Med. 2020, 134, 2063–2065.
[82] Lang, M., Liu, H., Song, F., Qiao, X., Ye, Y., Ren, H., Li, J.,
Huang, J., Xie, M., Chen, S., Song, M., Zhang, Y., Qian,
X., Yuan, T., Wang, Z., Liu, Y., Wang, M., Liu, Y., Liu, J.,
Hou, Y., Forensic Sci. Int. Genet. 2019, 42, e13–e20.
[83] Tao, R., Jin, M., Ji, G., Zhang, J., Zhang, J., Yang, Z.,
Chen, C., Zhang, S., Li, C., Forensic Sci. Int. Genet. 2019,
40, e268–e270.
[84] Zhang, J., Wang, J., Liu, Y., Shi, M., Bai, R., Ma, S.,
Forensic Sci. Int. Genet. 2017, 31, e54–e56.
[85] Zhang, J., Tao, R., Zhong, J., Sun, D., Qiao, L., Shan, S.,
Yang, Z., Zhang, J., Zhang, S., Li, C., Forensic Sci. Int.
Genet. 2019, 39, e26–e28.
[86] Wang, L., Chen, F., Kang, B., Zheng, H., Zhao, Y., Li, L.,
Zeng, Z., Forensic Sci. Int. Genet. 2016, 22, e25–e27.
[87] Bai, R., Liu, Y., Zhang, J., Shi, M., Dong, H., Ma, S., Bai,
R. F., Shi, M., Int. J. Legal Med. 2016, 130, 1191–1194.
[88] Wang, X., Li, Y., Fan, H., BMC Public Health 2019, 19,
1524.
[89] Tao, R., Wang, S., Zhang, J., Zhang, J., Yang, Z., Zhang,
S., Li, C., Forensic Sci. Int. Genet. 2019, 39, e10–e13.
[90] Yin, C., Su, K., He, Z., Zhai, D., Guo, K., Chen, X., Jin, L.,
Li, S., Genes 2020, 11, 743.
[91] Zhou, Y., Shao, C., Li, L., Zhang, Y., Liu, B., Yang, Q.,
Tang, Q., Li, S., Xie, J., Forensic Sci. Int. Genet. 2018,
32, e1–e4.
[92] Wang, M., Wang, Z., Zhang, Y., He, G., Liu, J., Hou, Y.,
Forensic Sci. Int. Genet. 2017, 31, e17–e23.
[93] Liu, Y., Wen, S., Guo, L., Bai, R., Shi, M., Li, X., Forensic
Sci. Int. Genet. 2018, 35, e7–e9.
[94] Tang, J., Yang, M., Wang, X., Wang, Q., Wang, Q.,
Zhang, H., Qian, E., Zhang, H., Ji, J., Ren, Z., Wu, Y.,
Huang, J., Ann. Hum. Biol. 2020, 47, 541–548.
[95] He, G., Wang, Z., Su, Y., Zou, X., Wang, M., Chen, X.,
Gao, B., Liu, J., Wang, S., Hou, Y., Sci. Rep. 2019, 9,
7739.
[96] Cao, S., Bai, P., Zhu, W., Chen, D., Wang, H., Jin, B.,
Zhang, L., Liang, W., Forensic Sci. Int. Genet. 2018, 34,
e18–e19.
[97] Song, F., Xie, M., Xie, B., Wang, S., Liao, M., Luo, H., Int.
J. Legal Med. 2020, 134, 513–516.
[98] Liu, Y., Jin, X., Guo, Y., Zhang, X., Zhu, W., Zhang, W.,
Mei, T., Mol. Genet. Genomic Med. 2020, 8, e1338.
[99] Fan, G. Y., An, Y. R., Peng, C. X., Deng, J. L., Pan, L. P.,
Ye, Y., Int. J. Legal Med. 2019, 133, 795–797.
[100] Dezhi, C., Meili, L., Yingjian, H., Yiping, H., Yu, T., Weibo,
L., Int. J. Legal Med. 2020.
[101] Guo, F., Li, J., Chen, K., Tang, R., Zhou, L., Forensic Sci.
Int. Genet. 2017, 27, 182–183.
[102] Ip, S. C. Y., Lin, S. W., Lam, T. T., Forensic Sci. Int. Genet.
2019, 38, e14–e15.
[103] Wang, Y., Li, S., Dang, Z., Kong, X., Zhang, Y., Ma, L.,
Wang, D., Zhang, H., Li, C., Cui, W., Legal Med. 2019,
36, 110–112.
[104] Xie, M., Song, F., Li, J., Lang, M., Luo, H., Wang, Z., Wu,
J., Li, C., Tian, C., Wang, W., Ma, H., Song, Z., Fan, Y.,
Hou, Y., Forensic Sci. Int. Genet. 2019, 41, 11–18.
[105] Liu, Y., Guo, L., Yue, J., Li, J., Shi, M., Li, X., Lab. Med.
Clin. 2020, 5, 577–581.
[106] Lin, H., Tang, P., Ye, Q., Yu, X., Mo, T., Tang, J., Forensic
Sci. Technol. 2020, 45, 420–446.
[107] Jin, H., Wang, K., Yan, K., Biol. Chem. Eng. 2020, 6, 84–
87.
[108] Liu, Y., Zang, J., Shi, S., Liu, H., Guo, L., Zhang, Y., Li,
X., Forensic Sci. Technol. 2014, 4, 18–20.
[109] Meng, J., Wang, X., Huang, Y., Jiao, H., Guo, L., Zhang,
Q., Li, Q., J. Zhengzhou Univ. (Med. Sci.) 2018, 54, 452–
457.
[110] Wang, K., Bao, H., Zheng, B., Huang, J., Tang, P., Biol.
Chem. Eng. 2020, 6, 103–107.
[111] Zhang, F., Chen, Q., Zhang, X., Jin, L., Ding, M., J.
Forensic Med. 2020, 2, 263–267.
[112] Liu, Y., Mao, J., Zhu, C., Li, X., Shi, M., Basic Clin. Med.
2019, 39, 314–320.
[113] Lei, Q., Xiao, Y., Zhang, W., Zhai, D., Cheng, B., Zeng, F.,
J. Kunming Med. Univ. 2017, 38, 25–29.
[114] Zhao, K., Jin, X., Lin, D., Gong, W., Chin. J. Forensic
Med. 2019, 34, 449–453.
[115] Wu, Z., Chen, T. F., Zeng, Z. F., Zhang, Y. W., Tang, Z.,
Su, K. Y., Fan, C. Y., Li, S. L., Fa yi xue za zhi 2019, 35,
448–454.
[116] Yao, J., Wang, L., Gui, J., Xing, J., Xuan, J., Wang, B., J.
Forensic Med. 2017, 33, 666–668.
[117] Ge, J., Wu, S., Chao, S., Zhongguo yimin shi (The
Migration History of China), Fujian People’s Publishing
House, Fuzhou, China 1997.
[118] Feng, X., The Pattern of Diversity in Unity of the
Chinese Nation, Central University for Nationalities Press,
Beijing 1999.
[119] Wen, B., Li, H., Lu, D., Song, X., Zhang, F., He, Y., Li,
F., Gao, Y., Mao, X., Zhang, L., Qian, J., Tan, J., Jin, J.,
Huang, W., Deka, R., Su, B., Chakraborty, R., Jin, L.,
Nature 2004, 431, 302–305.
[120] Jiang, W., Gong, Z., Rong, H., Guan, H., Zhang, T., Zhao,
Y., Fu, X., Zha, L., Jin, C., Ding, Y., Int. J. Legal Med. 2017,
131, 115–117.
[121] Li, L., Yu, G., Li, S., Jin, L., Yan, S., Forensic Sci. Int.
Genet. 2016, 20, 101–102.
[122] Shu, L., Li, L., Yu, G., Yu, B., Liu, Y., Li, S., Jin, L., Yan,
S., Forensic Sci. Int. Genet. 2015, 19, 250–251.
[123] Sun, H., Su, K., Fan, C., Long, F., Liu, Y., Sun, J., Mo, X.,
Ge, Y., Zhang, L., Zhai, L., Li, W., Yin, C., Li, S., Forensic
Sci. Int. Genet. 2019, 38, e8–e10.
[124] Wang, H., Ba, H., Yang, C., Zhang, J., Tai, Y., PLoS One
2017, 12, e0180921.
[125] Zhang, S., Tian, H., Wang, Z., Zhao, S., Hu, Z., Li, C., Ji,
C., Forensic Sci. Int. Genet. 2014, 13, 112–120.
[126] Zhou, H., Ren, Z., Zhang, H., Wang, J., Huang, J.,
Forensic Sci. Int. Genet. 2016, 25, e6–e7.
[127] Shen, C., Li, S., Forensic Sci. Int. 2005, 154, 81–84.
[128] Redd, A. J., Agellon, A. B., Kearney, V. A., Contreras, V.
A., Karafet, T., Park, H., de Knijff, P., Butler, J. M.,
Hammer, M. F., Forensic Sci. Int. 2002, 130, 97–111.
[129] Gopinath, S., Zhong, C., Nguyen, V., Ge, J., Lagace, R.
E., Short, M. L., Mulero, J. J., Forensic Sci. Int. Genet.
2016, 24, 164–175.
[130] Zhang, T. Y., MingShi (The History of Ming Dynasty,
Volume Three Hundred and Four), Zhonghua Book
Company, Beijing 1974.
[131] Yingming, L., The history of Southeast Asia (in
Chinese), People’s Publishing House, Beijing 2010.
[132] Wu Yuqin, Q. S., World History: Ancient History, Higher
Education Press, Beijing 2011.
... Allele frequencies and GD values at each locus from 2 548 father samples are listed in Table S2A (GD=0.1051), which were also reported in Southern Han Chinese [30] and Southeastern Han Chinese [31]. ...
... from Hunan Han Chinese [42] and that at 62 Y-STRs (97.29%) from Southeastern Han Chinese [31], and there was no statistically significant difference between the percentage of one-step and multi-step mutations (Table 4), which is consistent with previous studies [30,31,44]. ...
Article
A total of 2 548 unrelated healthy father–son pairs from a Northern Han Chinese population were genotyped at 41 Y chromosomal short tandem repeat (Y-STRs) including DYS19, DYS388, DYS389I, DYS389II, DYS390, DYS391, DYS392, DYS393, DYS437, DYS438, DYS439, DYS444, DYS447, DYS448, DYS449, DYS456, DYS458, DYS460, DYS481, DYS518, DYS522, DYS549, DYS533, DYS557, DYS570, DYS576, DYS593, DYS596, DYS627, DYS635, DYS643, DYS645, Y-GATA-H4, DYF387S1a/b, DYF404S1a/b, DYS385a/b, and DYS527a/b. In 2 548 father samples, 2 387 unique haplotypes were detected with the haplotype diversity and discrimination capacity values of 0.999 956 608 and 0.96 741 007. The average gene diversity (GD) value was 0.6934 with a range from 0.1051 at DYS645 to 0.9657 at DYS385a/b. When comparing alleles at 24 overlapped Y-STRs between the ForenSeq™ deoxyribonucleic acid (DNA) Signature Prep Kit on the MiSeq FGx® Forensic Genomics System and the Goldeneye® DNA ID Y Plus Kit on the Applied Biosystems™ 3730 DNA Analyzer from 308 father samples in mutational pairs, 258 alleles were detected by massively parallel sequencing (MPS) typing including 156 length-based alleles that could be obtained by capillary electrophoresis (CE) typing, 95 repeat region (RR) variant alleles and seven flanking region variant alleles. Hereof, we found 16 novel RR variant alleles and firstly identified two SNPs (rs2016239814 at DYS19 and rs2089968964 at DYS448) and one 4-bp deletion (rs2053269960 at DYS439) that had been validated by the Database of Short Genetic Variation. Sanger sequencing or MPS was employed to confirm 356 mutations from 104 468 allele transfers generated from CE, where 96.63% resulted in one-step mutations, 2.25% in two-step, and 1.12% in multi-step, and the overall ratio of repeat gains versus losses was balanced (173 gains vs. 183 losses). In 308 father–son pairs, 268 pairs occurred mutations at a single locus, 33 pairs at two loci, six pairs at three loci, and one pair at four loci. 
The average Y-STR mutation rate at 41 Y-STRs was ~3.4 × 10⁻³ (95% confidence interval: 3.1 × 10⁻³–3.8 × 10⁻³). The mutation rates at DYS576 and DYS627 were higher than 1 × 10⁻² in Northern Han Chinese, whilst the mutation rates at DYF387S1a/b, DYF404S1a/b, DYS449, DYS518, and DYS570 were lower than initially defined. In this study, the classical molecular factors (a longer STR region, a more complex motif, and an older father) were confirmed to increase Y-STR mutation rates, but the length of the repeat unit did not conform to the convention. Lastly, the interactive graphical and installable StatsY was developed to help forensic scientists automatically calculate allele and haplotype frequencies, forensic parameters, and mutation rates at Y-STRs. Key points: 308 of 2 548 father–son pairs from Northern Han Chinese showed at least one mutation across 41 Y-STRs. Sanger sequencing or MPS was employed to confirm those mutations generated from CE. A longer STR region, a more complex motif, and an older father increased Y-STR mutation rates. StatsY was developed to calculate allele and haplotype frequencies, forensic parameters, and mutation rates at Y-STRs.
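As a rough illustration of how the average rate in this abstract follows from its counts, the sketch below divides the 356 confirmed mutations by the 104 468 allele transfers and attaches a 95% interval. The Wilson score interval is an assumption made here for simplicity, since the abstract does not say which interval method the authors used, and the function name `mutation_rate_ci` is hypothetical.

```python
import math

def mutation_rate_ci(mutations: int, transfers: int, z: float = 1.96):
    """Return (rate, lower, upper) using a Wilson score interval.

    The Wilson interval is an illustrative choice, not necessarily
    the method used in the cited study.
    """
    if transfers <= 0 or not 0 <= mutations <= transfers:
        raise ValueError("need transfers > 0 and 0 <= mutations <= transfers")
    p = mutations / transfers
    denom = 1 + z**2 / transfers
    centre = (p + z**2 / (2 * transfers)) / denom
    half = z * math.sqrt(
        p * (1 - p) / transfers + z**2 / (4 * transfers**2)
    ) / denom
    return p, centre - half, centre + half

# Counts taken from the abstract: 356 mutations in 104 468 allele transfers.
rate, lower, upper = mutation_rate_ci(356, 104_468)
print(f"rate = {rate:.2e}, 95% CI = {lower:.2e} to {upper:.2e}")
```

With these inputs the point estimate rounds to 3.4 × 10⁻³, and the Wilson bounds fall close to the reported 3.1 × 10⁻³–3.8 × 10⁻³. An exact (Clopper-Pearson) interval would give slightly different bounds.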
... Meta-analyses take results from different studies addressing the same question, and combine them to determine if results are similar or dissimilar (heterogeneity, see below), as well as pooling results to potentially give a more complete picture of any effects that a given treatment may have (11). For the present meta-analysis, literature examining Y-STR mutation rates from the last five years (between 2018 and 2022) (12–30) was analysed with MedCalc® software (31), with presented rates of mutation in the units of number of mutations/father-son pair. Table 1 details the meta-analysis output for the present meta-analysis. ...
... p=0.6677 (13)(14)(15)(17)(18)(19)21,22,25,26) Green shaded cells in the heterogeneity column indicate heterogeneity above low levels. Green shaded cells in the significance level column indicate significant differences between studies at p=0.05 ...
Preprint
Purpose of review: A 17-plex Y-STR (Y-Short Tandem Repeat) system is routinely used in the identification of historical military remains, but it is not providing the discrimination needed to resolve all cases. This review aims at performing a meta-analysis on commonly used Y-STRs utilising data from publications published between 2018-2022 to discern if any differences between them exist. Alternative markers that may be incorporated into the historical remains identification strategy will also be discussed to resolve open identification cases. Recent findings: Literature published between 2018 and 2022 were used in the meta-analysis to determine average mutation rates of Y-STRs. Rates were compared between markers of the 17-plex Y-STR system (also known as Yfiler™) and other commonly used panels. Point estimates from the present meta-analysis were compared to that of two meta-analysis papers prepared in the last ten years. Depending on the selected level of significance, differences exist between mutation rates (incidence rates) of the different meta-analyses. Summary: Y-STR markers are a great source of genetic information that can be used to aid in the identification of historical military remains. Y-SNPs (Y-Single Nucleotide Polymorphisms) may also provide additional information as to the identity of a set of remains, ultimately leading to the closure of open identification cases.
... This would further facilitate the deconvolution of DNA mixtures. [45] Conclusion: Based on studies on the genetically diverse South African population, we may be much closer to understanding the population's genetic distributions and dynamics. Furthermore, kits must be individualized to accommodate different populations as some haplotypes and loci are not present in specific population groups. ...
Article
The South African population consists of four ethnic groups, i.e., Blacks, Coloreds, Indians, and Whites, and is considered the most diverse conglomeration of humans. In addition to autosomal short tandem repeat (STR) variation, an important tool to study population diversity is Y-chromosome (Y)-STR analysis. Y-STRs aid in forensic investigations and provide essential data about paternal lineage origins. Y-STR kits consisting of an array of stable and rapidly mutating markers offer crucial information on a given population's genetic and haplotype diversity. This review discusses the development of Y-STR kits over the years and highlights some prominent Y-STR studies conducted on the South African population. The earliest Y-STR kit developed was the Y-PLEX™6, with the most recent being the UniQTyper™ Y-10 Multiplex. The South African population studies show varying data, with the 'minimal haplotype' having low discrimination capacity among the ethnic groups and the UniQTyper™ Y-10 showing high genetic diversity among the ethnic groups of the country. There is a dearth of Y-STR studies on the South African population. With the advent of new Y-STR kits with increased discriminatory markers, additional studies are required to represent the South African population in the Y-STR databases. Considering the diversity of the South African population, establishment of a local/regional population database would be beneficial. In addition, data on the origins and prevalence of mutations and silent alleles should be obtained from STR datasets generated during kinship investigations (specifically, parentage tests) so that detailed information about the frequencies of mutations, silent alleles, and uniparental disomy in the South African population at Y STR loci can be estimated. © 2022 Journal of Cancer Research and Practice | Published by Wolters Kluwer - Medknow.
Article
A six-color fluorescent multiplex amplification system for 31 Y-chromosomal short tandem repeats (Y-STRs) (DYS19, DYS390, DYS391, DYF399S1, DYF404S1, DYS439, DYS444, DYS449, DYS452, DYS456, DYS458, DYS460, DYS481, DYS508, DYS513, DYS516, DYS518, DYS543, DYS547, DYS549, DYS552, DYS557, DYS570, DYS576, DYS612, DYS622, DYS626, DYS627, DYS630, DYS635, and Y-GATA-A10) was developed for investigating the mutation rates of 31 highly mutated Y-STR genes in the Han population of northern China. The mutation rates of the 31 highly mutated Y-STRs were calculated using the father–son pair study method after typing 526 Northern Han father–son pairs with this system. Statistically, 148 Y-STR mutations were found, with mutation rates ranging from 0 (95% confidence interval [CI] 0 to 9.0 × 10⁻³, DYS622) to 7.0 × 10⁻² (95% CI 5.1 × 10⁻² to 9.7 × 10⁻², DYF399S1). Out of these, 126 father–son pairs were successfully identified, with a distinction rate of 24.0% (95% CI 20.4%–27.9%). The ability of the 31 highly mutated Y-STRs to distinguish closely related males from the same paternal lineage in the Northern Han population is extremely valuable for criminal investigations and other purposes.
Article
Massively parallel sequencing (MPS) has emerged as a promising technology for targeting multiple genetic loci simultaneously in forensic genetics. Here, a novel 193-plex panel was designed to target 28 A-STRs, 41 Y-STRs, 21 X-STRs, 3 sex-identified loci, and 100 A-SNPs by employing a single-end 400 bp sequencing strategy on the MGISEQ-2000™ platform. In the present study, a series of validations and sequencing of 1642 population samples were performed to evaluate the overall performance of the MPS-based panel and its practicality in forensic application according to the SWGDAM guidelines. In general, the 193-plex markers in our panel showed good performance in terms of species specificity, stability, and repeatability. Compared to commercial kits, this panel achieved 100% concordance for standard gDNA and 99.87% concordance for 14,560 population genotypes. Moreover, this panel detected 100% of the loci from 0.5 ng of DNA template and all unique alleles at a 1:4 DNA mixture ratio (0.2 ng minor contributor), and the applicability of the proposed approach for tracing and degrading DNA was further supported by case samples. In addition, several forensic parameters of STRs and SNPs were calculated in a population study. High CPE and CPD values greater than 0.9999999 were clearly demonstrated and these results could be useful references for the application of this panel in individual identification and paternity testing. Overall, this 193-plex MPS panel has been shown to be a reliable, repeatable, robust, inexpensive, and powerful tool sufficient for forensic practice.
Article
Y-chromosomal short tandem repeats (Y-STRs) are widely used in forensic, genealogical, and population genetics. With the recent increase in the number of rapidly mutating (RM) Y-STRs, an unprecedented level of male differentiation can be achieved, widening and improving the applications of Y-STRs in various fields, including forensics. The growing complexity of Y-STR data increases the need for automated data analyses, but dedicated software tools are scarce. To address this, we present the Male Pedigree Toolbox (MPT), a software tool for the automated analysis of Y-STR data in the context of patrilineal genealogical relationships. The MPT can estimate mutation rates and male relative differentiation rates from input Y-STR pedigree data. It can aid in determining ancestral haplotypes within a pedigree and visualize the genetic variation within pedigrees in all branches of family trees. Additionally, it can provide probabilistic classifications using machine learning, helping to establish or prove the structure of the pedigree and the level of relatedness between males, even for closely related individuals with highly similar haplotypes. The tool is flexible and easy to use and can be adjusted to any set of Y-STR markers by modifying the intuitive input file formats. We introduce the MPT software tool v1.0 and make it publicly available with the goal of encouraging and supporting forensic, genealogical, and other geneticists in utilizing the full potential of Y-STRs for both research purposes and practical applications, including criminal casework.
Article
Rapidly mutating Y-STRs (RM Y-STRs) harbor great potential to distinguish male relatives and achieve male identification. However, forensic applications were greatly limited by the small number of the initially identified 14 RM Y-STRs. Recently, with the emergence of 12 novel RM Y-STRs, an integrated panel named RMplex was introduced, which contains all 26 RM Y-STRs and four fast mutating Y-STRs (FM Y-STRs). To obtain the first data on the mutation rates and father-son differentiation rates of the 30 newly proposed Y-STRs in Chinese populations, we performed an empirical mutation study on 307 DNA-confirmed Chinese paternal pairs. Previously reported mutation rates for 14 RM Y-STRs in Chinese and European populations were pooled and merged with our data. The highest meiosis number for the two groups reached 4771 and 2687, respectively. Five loci showed significant differences between the populations (DYS570, DYS399S1, DYS547, DYS612, and DYF403S1b). For the new panel covering 30 Y-STR loci, our results show extensive differences in the mutation rates between the two populations, as well. 10 RM Y-STR loci showed relatively low mutation rates (10⁻³–10⁻² per meiosis) and 2 FM Y-STR loci had rapid mutation rates (> 10⁻² per meiosis) in the Chinese population. Several-fold differences in mutation rates were found in nine Y-STR loci between the Chinese and reference populations, with two loci having significantly higher mutation rates and one locus with a significantly lower mutation rate in the Chinese population (P < 0.05). Eighteen RM Y-STRs (> 10⁻² per meiosis), 8 FM Y-STR loci (5 × 10⁻³–10⁻² per meiosis), 3 moderately mutating Y-STRs (MM Y-STRs, 10⁻³–5 × 10⁻³ per meiosis), and one locus with no observed mutation events were identified in the Chinese population. 40.06% of the Chinese paternity pairs were discriminated with RMplex while only 20.84% with the initial 14 RM Y-STRs, indicating that RMplex is beneficial for distinguishing paternally related males. 
Future studies on populations of different genetic backgrounds are necessary to obtain comprehensive estimates of mutation rates at these new loci.
Article
Y‐chromosome, as a gender‐determined biological marker, is inherited only between fathers and sons. The Y‐chromosome short tandem repeats (Y‐STRs) play an essential role in paternity lineage tracing as well as sexual assault cases. The Microreader Group Y Direct ID System as a six‐dye multiplex amplification kit, including 53 Y‐STR and one Y‐Indel locus, would improve performance and aid in obtaining more information through a greater number of loci with high polymorphism. In the present study, to verify the accuracy and efficiency of the kit, developmental validation was conducted by investigating sensitivity, species specificity, PCR inhibition, male–male and male–female mixtures, and reproducibility. The kit was tested using 311 male samples from Han and Qiang populations in Sichuan Province. The results showed that this kit had fairly high power for forensic discrimination (Han: haplotype diversity [HD] = 1, Qiang: HD = 0.999944). Additionally, 44 confirmed father–son pairs were also genotyped, among which 69 distinct haplotypes could be obtained. These father–son pairs cannot be distinguished by commonly used Y‐STR panels, indicating that adding these extra Y‐STRs to a single panel can achieve better discrimination performance. Collectively, the Microreader Group Y Direct ID System is robust and informative for forensic applications.
Article
Y chromosome short tandem repeat polymorphisms (Y-STRs) are important in many areas of human genetics. Y chromosomal STRs, being normally utilized in the field of forensics, exhibit low haplotype diversity in consanguineous populations and fail to discriminate among male relatives from the same pedigree. Rapidly mutating Y-STRs (RM Y-STRs) have received much attention in the past decade. These 13 RM Y-STRs have high mutation rates (>10⁻²) and have considerably higher haplotype diversity and discrimination capacity than conventionally used Y-STRs, showing remarkable power when it comes to differentiation in paternal lineages in endogamous populations. Previously, we analyzed two to four generations of 99 pedigrees with 1568 pairs of men covering one to six meioses from all over Pakistan and 216 male relatives from 18 deep-rooted endogamous Sindhi pedigrees covering one to seven meioses. Here, we present 861 pairs of men from 62 endogamous pedigrees covering one to six meioses from the Punjabi population of Punjab, Pakistan. Mutations were frequently observed at DYF399 and DYF403, while no mutation was observed at DYS526a/b. The rate of differentiation ranged from 29.70% (first meiosis) to 80.95% (fifth meiosis), while overall (first to sixth meiosis) differentiation was 59.46%. Combining previously published data with newly generated data, the overall differentiation rate was 38.79% based on 5176 pairs of men related by 1–20 meioses, while Yfiler differentiation was 9.24% based on 3864 pairs. Using father–son pair data from the present and previous studies, we also provide updated RM Y-STR mutation rates.
Article
A total of 1225 unrelated Han males from Henan province were analyzed with the prototype Yfiler® Plus kit (Life Technologies, Thermo Fisher Scientific, Waltham, MA, USA). The calculated gene diversity (GD) values ranged from 0.3855 at DYS391 to 0.9673 at DYS385a/b. The discriminatory capacity (DC) was 86.94% with 1065 observed haplotypes using the 17 Yfiler loci; with the addition of 10 Y-STRs in the Yfiler® Plus system, the DC increased to 98.94% with 1212 observed haplotypes. Among the newly incorporated Y-STRs, DYS576, DYF387S1, DYS518, DYS627, and DYS449 were major contributors to enhancing discrimination. In the analysis of molecular variance, the Henan Han population clustered with populations of Asian origin and showed significant differences from other reference populations. In this study, the additional Y-STR markers in the Yfiler® Plus kit provided substantially stronger discriminatory power in the Henan Han population.
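The two discrimination capacities quoted in this abstract can be reproduced from its haplotype counts, assuming DC is defined in the usual way as the number of distinct haplotypes divided by the number of samples. The helper name `discrimination_capacity` is hypothetical and the sketch is only a check on the arithmetic.

```python
def discrimination_capacity(distinct_haplotypes: int, samples: int) -> float:
    """DC = distinct haplotypes / samples (assumed standard definition)."""
    if samples <= 0 or not 0 <= distinct_haplotypes <= samples:
        raise ValueError("need samples > 0 and 0 <= distinct_haplotypes <= samples")
    return distinct_haplotypes / samples

# Counts taken from the abstract: 1225 males, 1065 and 1212 haplotypes.
n_samples = 1225
dc_yfiler = discrimination_capacity(1065, n_samples)  # 17 Yfiler loci
dc_plus = discrimination_capacity(1212, n_samples)    # 27 Yfiler Plus loci

print(f"Yfiler (17 loci):      {dc_yfiler:.2%}")
print(f"Yfiler Plus (27 loci): {dc_plus:.2%}")
```

Under that definition, the extra 10 markers resolve 147 additional haplotypes in the same 1225 samples, which accounts for the rise in DC.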
Article
Y-chromosomal short tandem repeats (Y-STRs) have been widely used in forensic analysis and population genetics. With low to moderate mutation rates, conventional Y-STR panels, including commercially available Y-STR kits, enable the identification of male pedigrees but typically fail to differentiate related male individuals. The introduction of rapidly mutating Y-chromosomal short tandem repeats (RM Y-STRs) with higher mutation rates (μ > 10⁻²) has been demonstrated to increase the discrimination capacity of unrelated men and the differentiation rate of related men compared with standard Y-STRs. To date, several studies have been performed worldwide. Here, 260 father-son pairs from Chinese Yi population were investigated, and 18.8% of them were differentiated with the 13 RM Y-STR markers, which was close to the theoretical estimate of 19.5% based on the mutation rates of these markers. Among the 57 mutations observed, repeat gains were more common than repeat losses (1.48:1), and one-step mutations were more common than two-step mutations (27.5:1). Locus-specific mutation rates ranged from < 3.85 × 10⁻³ (95% CI 0.00–1.41 × 10⁻²) to 3.85 × 10⁻² (95% CI 1.86 × 10⁻²–6.96 × 10⁻²), with an average mutation rate of 1.46 × 10⁻² (95% CI 1.11 × 10⁻²–1.89 × 10⁻²). Furthermore, we combined the father-son pair data from the present study with the data from the previous studies, generating an overall mutation rate of 1.70 × 10⁻². The high differentiation rate obtained in the present study indicates the suitability of RM Y-STRs to distinguish paternal lineages in Chinese Yi population.
Article
Y chromosomal short tandem repeats (Y-STRs) have been widely harnessed for forensic applications, such as pedigree source searching from public security databases and male identification from male-female mixed samples. For various populations, databases composed of Y-STR haplotypes have been built to provide investigative leads for solving difficult or cold cases. Recently, the supplementary application of Y chromosomal haplogroup-determining single-nucleotide polymorphisms (SNPs) for forensic purposes has been hotly debated. This study provides Y-STR haplotypes for 27 markers typed by the Yfiler™ Plus kit and Y-SNP haplogroups defined by 24 loci within the Y-SNP Pedigree Tagging System for Shandong Han (n = 305) and Yunnan Han (n = 565) populations. The genetic backgrounds of these two populations were explicitly characterized by the analysis of molecular variance (AMOVA) and multi-dimensional scaling (MDS) plots based on 27 Y-STRs. Then, population comparisons were conducted by observing Y-SNP allelic frequencies and Y-SNP haplogroup distributions, estimating forensic parameters, and depicting distribution spectra of Y-STR alleles in sub-haplogroups. The Y-STR variants, including null alleles, intermediate alleles, and copy number variations (CNVs), were co-listed, and a strong correlation between Y-STR allele variants ("DYS518~.2" alleles) and the Y-SNP haplogroup QR-M45 was observed. A network was reconstructed to illustrate the evolutionary pathway and to identify the ancestral mutation event. Also, a phylogenetic tree on the individual level was constructed to observe the relevance of the Y-STR haplotypes to the Y-SNP haplogroups. This study provides evidence that basic genetic backgrounds, revealed by both Y-STR and Y-SNP loci, are useful for uncovering detailed population differences and, more importantly, demonstrates the contributing role of Y-SNPs in population differentiation and male pedigree discrimination.
Article
We have determined the distribution of Y-chromosomal haplotypes and predicted haplogroups in the ethnically diverse Kingdom of Bahrain, a small archipelago in the Arabian Gulf. Paternal population structure within Bahrain was investigated using the 27 Y-STRs (short tandem repeats) in the Yfiler Plus kit to generate haplotypes from 562 unrelated Bahraini males, sub-divided into four geographical regions—Northern, Capital, Southern and Muharraq. Yfiler Plus provided a significant improvement over the 17-locus Yfiler kit in discrimination capacity (from 77% to 87.5% overall), but discrimination capacity differed widely between regions from 98.4% in Muharraq to 75.2% in the Northern region, an unusually low value possibly resulting from recent rapid population expansion. Clusters of closely related male lineages were seen, with only 79.4% of donors displaying unique haplotypes and 59% of instances of shared haplotypes occurring within, rather than between, regions. Haplogroup prediction indicated diverse origins of the population with a predominance of haplogroups J2 and J1, both typical of the Arabian Peninsula, but also haplogroups such as B2 and E1b1a likely originating in Africa, and H, L and R2 likely indicative of migration from South Asia. Haplogroup frequencies differed significantly between regions, with J2 significantly more common in the Northern region compared with the Southern, possibly due to differential settlement by Baharna and Arabs. Our study shows that paternal lineage population structure can exist even over small geographical scales, and that highly discriminating genetic tools are required where rapid expansions have occurred within tightly bounded populations.
Article
Background: Y‐chromosomal short tandem repeats (Y‐STRs) have proven to be serviceable markers for some paternity cases in the last few years. Methods: We presented the gene diversity, haplotypic diversity, and forensic statistical parameters of 340 unrelated Uighur males from Kashi region based on the 27 Y‐STRs. Genomic DNA was extracted from bloodstain samples using the Chelex‐100 method and amplified by Yfiler® Plus PCR Amplification kit. Results: Gene diversity values on the 27 Y‐STRs ranged from 0.4749 (at DYS437 locus) to 0.9416 (at DYS385a,b loci). According to forensic parameters of the 27 Y‐STR loci, 295 distinct haplotypes were acquired, 258 of which were unique. The haplotypic diversities and discrimination capacities at Yfiler Plus 27 loci, Yfiler 17 loci, extended 11 loci, and minimal 9 loci were 0.9990 and 0.8676; 0.9961 and 0.6912; 0.9952 and 0.5941; and 0.9919 and 0.5676, respectively. A multidimensional scaling plot and a neighbor‐joining tree between the studied Uighur group and 17 reference populations were constructed, and the results indicated that the Kashi Uighur group had closer genetic relationships with Uighur groups living in different regions. Conclusion: To sum up, the present study may provide valuable population data and background information of Kashi Uighur group.
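The discrimination capacity reported above follows directly from the counts in the abstract (295 distinct haplotypes among 340 males). The sketch below is illustrative only: it assumes that discrimination capacity is the number of distinct haplotypes divided by the sample size, and that haplotype diversity is computed with the commonly used Nei estimator, n/(n − 1) × (1 − Σp²). The abstract does not state which formulas were used, and the function names and the toy sample are hypothetical.

```python
from collections import Counter


def discrimination_capacity(haplotypes):
    """Distinct haplotypes divided by the number of sampled individuals."""
    return len(set(haplotypes)) / len(haplotypes)


def haplotype_diversity(haplotypes):
    """Nei's estimator (assumed here): n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


# Hypothetical toy sample: each haplotype is a tuple of allele calls.
# Five males, one haplotype shared by two of them.
sample = [
    ("14", "13", "29"),
    ("14", "13", "29"),
    ("15", "12", "30"),
    ("16", "13", "28"),
    ("14", "14", "31"),
]

toy_dc = discrimination_capacity(sample)  # 4 distinct / 5 sampled
toy_hd = haplotype_diversity(sample)

# Counts taken from the abstract: 295 distinct haplotypes in 340 males.
reported_dc = round(295 / 340, 4)

print(toy_dc, toy_hd, reported_dc)
```

Under these assumptions, the ratio 295/340 rounds to the 0.8676 quoted for the 27-locus set. Haplotype diversity cannot be recomputed from the abstract alone, because it depends on how the shared haplotypes are distributed among individuals.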
Article
The Hui group is the second largest ethnic minority and one of the most widespread ethnic groups in China. However, the genetic architecture of the Hui population remains largely unexplored, particularly with respect to the male-specific region of the Y chromosome. Here, we studied nine Hui populations (Xinjiang, Qinghai, Gansu, Ningxia, Shaanxi, Henan, Shandong, Sichuan, Yunnan) using 157 Y-chromosome single nucleotide polymorphisms (Y-SNPs) and 27 short tandem repeats (Y-STRs) to unravel their genetic substructure and forensic characteristics. A total of 650 unrelated male samples from the Hui populations were genotyped by SNaPshot®, a single base extension (SBE) assay. In total, 95 terminal haplogroups and high haplotype diversity (0.9999) were observed in Hui populations. Frequency heat map matrices, genetic distance (FST) and network analysis within Hui populations indicated that these nine Hui populations can be divided into three groups: Hui populations from the northwest (NWH), Hui populations from Sichuan and Shandong (SSH), and Hui populations from Yunnan (YNH). Our results suggested that different databases should be used for different Hui samples in forensic cases. Comparisons with other populations using different population genetic analyses revealed that the Hui populations had close relationships with East Asian populations, especially the Chinese Han population. Overall, the high-resolution panel with Y-SNPs and Y-STRs gives new and complete insight into Hui populations, which can be used to interpret the genetic substructure of Hui populations and affect the utility of forensic databases.
Article
Background: Y-chromosomal short tandem repeats (Y-STRs) are widely used in paternity identification, pedigree investigation and the study of human population genetic history. Aim: To investigate the Y-STR polymorphisms in a typical Miao population and explore the genetic differentiation between the Miao population and reference groups. Subjects and methods: We genotyped 36 Y-STRs in 455 unrelated Miao individuals from Guizhou province, and analysed genetic differentiation between the Miao population and 67 reference groups. Results: A total of 369 alleles were obtained, and the allele frequencies ranged from 0.0022 to 0.9802. In addition, the haplotype diversity, random match probability and discrimination capacity values were 0.99997, 0.0022 and 0.9934, respectively. Moreover, the genetic relationships between Guizhou Miao and 67 ethnic populations showed that the population stratification was almost consistent with geographic distribution and language family. Conclusions: The 36 Y-STR loci in this study have good polymorphism distributions in the Guizhou Miao population, and therefore would be a useful tool in forensic identification, male parentage testing and even pedigree investigation.
Article
Short tandem repeat polymorphisms on the male‐specific part of the human Y‐chromosome (Y‐STRs) are valuable tools in many areas of human genetics. Although their paternal inheritance and moderate mutation rate (~10⁻³ mutations per marker per meiosis) allow detecting paternal relationships, they typically fail to separate male relatives. Previously, we identified 13 Y‐STR markers with untypically high mutation rates (>10⁻²), termed rapidly mutating (RM) Y‐STRs, and showed that they improved male relative differentiation over standard Y‐STRs. By applying a newly developed in silico search approach to the Y‐chromosome reference sequence, we identified 27 novel RM Y‐STR candidates. Genotyping them in 1,616 DNA‐confirmed father‐son pairs for mutation rate estimation empirically highlighted 12 novel RM Y‐STRs. Their capacity to differentiate males related by 1, 2, and 3 meioses was 27%, 47%, and 61%, respectively, while for all 25 currently known RM Y‐STRs it was 44%, 69%, and 83%. Of the 647 Y‐STR mutations observed in total, almost all were single repeat changes; repeat gains and losses were well balanced, and allele length and fathers’ age were positively correlated with mutation rate. We expect these new RM Y‐STRs, together with the previously known ones, to significantly improve male relative differentiation in future human genetic applications.
Article
Forensic genetic laboratories perform a large amount of STR analyses of the Y chromosome, in particular to analyze the male part of complex DNA mixtures. However, the statistical interpretation of evidence retrieved from Y-STR haplotypes is challenging. Due to the uni-parental inheritance mode, Y-STR loci are connected to each other and thus haplotypes show patterns of relationship on the familial and population level. This precludes the treatment of Y-STR loci as independently inherited variables and the application of the product rule. Instead, the dependency structure of Y-STRs needs to be included in the haplotype frequency estimation process affecting also the current paradigm of a random match probability that is in the autosomal case approximated by the population frequency assuming unrelatedness of sampled individuals. Information on the degree of paternal relatedness in the suspect population as well as on the familial network is however needed to interpret Y-chromosomal results in the best possible way. The previous recommendations of the DNA commission of the ISFG on the use of Y-STRs in forensic analysis published more than a decade ago [1] cover the interpretation issue only marginally. The current recommendations address a number of topics (frequency estimators, databases, metapopulations, LR formulation, triage, rapidly mutating Y-STRs) with relevance for the Y-STR statistics and recommend a decision-based procedure, which takes into account legal requirements as well as availability of population data and statistical methods.