
A Predictive Model for the Risk of Infertility in Women using Supervised Learning Algorithms – A Comparative Analysis

Infertility is a worldwide problem affecting couples of reproductive age, and it causes considerable social, emotional and psychological stress between couples, among families, within the individual concerned and in society at large. This study performs a comparative analysis of machine learning classifiers in order to identify the most effective predictive model for the risk of infertility in women. Historical records describing the risk factors of infertility alongside the respective risk of infertility for Nigerian women were collected from Obafemi Awolowo University Teaching Hospital Complex (OAUTHC), Ile-Ife, located in south-western Nigeria. The predictive model was formulated using naïve Bayes’, decision trees and the multi-layer perceptron, and was simulated using the Waikato Environment for Knowledge Analysis (WEKA). The results of the study showed that decision trees and the multi-layer perceptron outperformed the naïve Bayes’ classifier. The decision trees algorithm identified the variables relevant to predicting infertility and used them to construct a decision tree, from which a set of rules explaining the relationship between the risk factors and the risk of infertility was derived. These rules can be applied to any patient’s record containing the values of the risk factors in order to predict the risk of infertility.
DEVELOPMENT OF A PREDICTIVE MODEL FOR THE RISK OF
INFERTILITY IN WOMEN USING SUPERVISED MACHINE
LEARNING ALGORITHMS
(A Comparative Analysis)
Jeremiah Ademola Balogun, Peter Adebayo Idowu, Olusola Thomas Babawale
Department of Computer Science and Engineering, Obafemi Awolowo University, Ile-Ife, Nigeria
jeremiahbalogun@gmail.com, paidowu@oauife.edu.ng
ABSTRACT
Infertility is a worldwide problem, affecting 8% to 15% of couples of reproductive age.
Infertility has caused considerable social, emotional and psychological stress between
couples, among families, within the individual concerned and in society at large. This study
is aimed at determining the machine learning classifier capable of developing the most
effective predictive model for the risk of infertility in women. Historical records describing
the risk factors of infertility alongside the respective risk of infertility for some women were
collected from one of the teaching hospitals in western Nigeria. The predictive model was
formulated using three supervised machine learning algorithms/classifiers. The formulated
model was simulated using the Waikato Environment for Knowledge Analysis (WEKA)
environment. The results of the performance evaluation of the machine learning algorithms
showed that C4.5 decision trees and the multi-layer perceptron both had an accuracy of 74.4%,
outperforming the naïve Bayes’ classifier. The decision trees algorithm also identified and
used the variables relevant to predicting infertility to construct a tree; this tree was
converted to a set of rules. These rules, whenever applied to any patient’s record of infertility
risk factors, can be used to predict the risk of infertility. The predictive model developed can
be integrated into existing health information systems, which can be used by gynecologists to
monitor and predict patients’ risk of infertility in real time.
Keywords: prediction model, infertility in women, multi-layer perceptron, decision trees,
naïve Bayes’.
1.0 Introduction
While there is no universal definition of
infertility, a couple is generally considered
clinically infertile when pregnancy has not
occurred after at least twelve months of
regular sexual activity without the use of
contraceptives [1]. Primary infertility is
defined as childlessness and secondary
infertility as the inability to have an
additional live birth for a parous woman.
Although women's infertility is of greater
research consideration, health care
attention and social blame, male
conditions cause or contribute to around
half of all cases of infertility [2].
According to the World Health Organization,
infertility is defined as one year of
frequent, unprotected intercourse during
which pregnancy has not occurred [3]. In
another definition, infertility is the
inability of a sexually active woman who
is not practicing contraception to have a
live birth [4].
Early exposures (e.g. in utero or in
childhood) could permanently reprogram
men and women for fecundity or biologic
capacity (e.g. gynecologic and urologic
health or gravid health during pregnancy)
and fertility outcomes (e.g. multiple births
or gestational age at delivery), which could
affect later adult-onset diseases [5]. Thus,
infertility could have public health
implications beyond simply the inability to
have children. Infertility can be attributed
to any abnormality in the female or male
reproductive system [3]. The etiology is
distributed fairly equally between the
male and female partners, with factors that
include ovarian dysfunction and tubal
factors, amongst others. A smaller
percentage of cases are attributed to
endometriosis, uterine or cervical factors,
or other causes. In approximately one
fourth of couples, the cause is uncertain
and is referred to as unexplained
infertility, while the etiology is
multifactorial for some couples [6].
In general, an infertility evaluation is
initiated after 12 months of unprotected
intercourse during which pregnancy has
not been achieved. Earlier investigation
may be considered when historical
factors, such as previous pelvic
inflammatory disease or amenorrhea
suggest infertility, although physicians
should be aware that earlier evaluation
may lead to unnecessary testing and
treatment in some cases. Evaluation also
may be initiated earlier when the female
partner is older than 35 years, because
fertility rates decrease and spontaneous
miscarriage and chromosomal abnormality
rates increase with advancing maternal age
[7]. Partners should be evaluated together
and separately, because each person may
want to reveal information about which
their partner is unaware, such as previous
pregnancy or sexually transmitted disease.
The risk factors for infertility can be
classified into genital, endocrinal,
developmental and general factors. Pelvic
inflammatory disease (PID) due to
sexually transmitted diseases, unsafe
abortion or puerperal infection is the main
cause of tubal infertility and results mainly
from chlamydial infection. Polycystic
ovarian syndrome (PCOS) is thought to be
the commonest cause of anovulatory
infertility [8]. Several lifestyle factors may
affect reproduction, including habits of
diet, clothing, exercise, and the use of
alcohol, tobacco and recreational drugs.
Exposure to textile dyes, lead, mercury
and cadmium, volatile organic solvents
and pesticides has also been associated
with infertility [9]. Estimates of the
proportion of infertility cases attributable
to male or female specific factors in
developed countries were derived in the
1980s by the WHO: 8% of infertility cases
were attributable to male factors, 37% to
female factors, 35% to both the male and
female, and 5% to an unknown cause (the
remaining 15% became pregnant) [10].
Prediction involves using some variables
or fields in the data set to predict unknown
or future values of other variables of
interest. Description, on the other hand,
focuses on finding patterns describing the
data that can be interpreted by humans.
Machine learning plays an important role
in disease prediction by identifying the
patterns that exist between the risk factors
and, in this case, the likelihood of
infertility in women. This will improve the
level of decision support offered to the
expert gynecologist during the course of
diagnosis.
This study presents a comparative analysis
of three (3) supervised machine learning
algorithms used to develop predictive
models for the likelihood of infertility in
women in order to propose the most
effective and efficient model. Where
possible, the variables that are relevant to
predicting the likelihood of infertility in
women, alongside their underlying
relationships, are also proposed.
2.0 Related Work
There are different types of diseases whose
likelihood or survival has been predicted
using data mining techniques, namely
hepatitis and other liver disorders, breast
cancer, thyroid disease, diabetes,
HIV/AIDS and tuberculosis. For the
purpose of this research, related work on
the application of machine learning in the
area of fertility was reviewed. Related
work was found to be limited to In-Vitro
Fertilization (IVF) studies, sperm motility
and the likelihood of pregnancy. Some of
the related work reviewed is as follows.
Girija et al. [11] developed a predictive
model for the classification of women
health disease (fibroid) using decision
trees algorithm. Data were collected from
three (3) classes of people (no fibroid,
mild condition and severe condition) and
eight (8) features were used in developing
the proposed predictive model. The results
showed that the C4.5 decision trees
algorithm implemented as J48 on WEKA
was able to select two (2) important
features as predictive for fibroid in
women; the features selected were: the age
of the patient and signs of heavy bleeding.
The evaluation of the performance of the
predictive model was observed to give a
value of 56%.
Durairaj et al. [12] applied artificial neural
networks for In Vitro Fertilization (IVF)
data analysis and prediction with the aim
of detecting the success rate of IVF. Data
collected from patients (couples)
containing information about the
endometriosis, tubal factors and follicles in
the ovaries, body mass index, sperm
concentration, duration of infertility,
embryos transferred and the physiological
factors such as stress levels were used to
develop the predictive model needed for
the prediction of the success rate of IVF.
The results showed that the prediction
model developed using the identified
variables had a correlation coefficient (r)
of 0.498 with an accuracy of 75%.
Girela et al. [13] applied artificial
intelligence using machine learning
algorithms to predict semen characteristics
resulting from environmental factors, life
habits and health status in order to develop
a decision support system that can help in
the study of male fertility potential.
Semen samples collected from 123 young,
healthy volunteers were analyzed and
information regarding their life habits and
health status was collected using a
questionnaire. Sperm concentration and
percentage of motile sperm were related to
socio-demographic data, environmental
factors, health status, and life habits in
order to determine the predictive accuracy
of the multi-layer perceptron network
model developed. The results showed that
the most important semen parameter is the
sperm concentration with an accuracy of
90%, sensitivity of 95.45% and specificity
of 50%.
Uyar et al. [14] developed a predictive
model for the outcome of implantation in
an In Vitro Fertilization (IVF) setting
using machine learning methods. The
paper was aimed at predicting the outcome
of implantation of an individual embryo in
an IVF cycle in order to provide decision
support on the number of embryos
transferred. Electronic health records of
2453 embryos transferred at day 2 or day 3
after intracytoplasmic sperm injection
(ICSI) were used. Each embryo was
identified using eighteen (18) clinical
features and a class label (indicating
positive and negative implantation
outcomes). Naïve Bayes’ classifier was
used to train the predictive model using
66.7% of the data for training and the rest
for testing over 10 runs, and the evaluation
of the performance showed a value of
80.4% for accuracy, 63.7% for sensitivity
(true positive (TP) rate) and 17.6% for the
false positive (FP) rate (1 − specificity).
Idowu et al. [15] developed a predictive
model for the likelihood of infertility in
Nigerian women using the multi-layer
perceptron (MLP) architecture of artificial
neural network using three sets of clinical
variables: personal profiles, medical and
surgical history and gynecological history.
Using a filter-based feature selection
algorithm (consistency subset evaluator),
six (6) relevant features were selected out
of the fourteen (14) identified variables.
The performance of the predictive model
was compared using all fourteen (14)
variables and the selected six (6) relevant
variables. Three different training methods
were used: full dataset, percentage split
(60% for training and 40% for testing) and
10-fold cross validation. The
results of the comparison showed that
using the full training set over-fitted the
predictive model developed while the
performance of the predictive model
developed was shown to improve using the
reduced feature set compared to using the
whole 14 features. The accuracy of the
developed model was observed to be
74.36% before and after feature selection
using the 10-fold cross validation method
but improved from 69.23% (before feature
selection) to 76.92% (after feature
selection).
3.0 Methods
3.1 Data Collection
For the purpose of this study, it was
necessary to identify and collect the data
needed for identifying infertility in women
from gynecologists at the university
teaching hospital. The variables identified
include: age of menarche, age of marriage,
family history of infertility, menstrual
cycle, diabetes mellitus, hypertension,
thyroid disease, pelvi-abdominal
operation, endometriosis, fibroid disease,
polycystic ovary, genital infection,
previous termination of pregnancy,
sexually transmitted infection (STI) and
the likelihood of infertility (identified
using the labels Likely, Unlikely and
Probably) (Table 1). Data were collected
from a total of 39 patients; the variables in
the dataset are described as follows:
a. Age of Menarche: is the
identification of the age of the
patient at first menstruation; it is
recorded as a nominal value which
gives the age category in years,
identified as less than or equal to
15 years or greater than 15 years.
b. Age of marriage: is the
identification of the patient’s age at
marriage; it is recorded as a
nominal value identified as less
than or equal to 30 years or greater
than 30 years.
c. Menstrual cycle: is the
identification of the regularity of
the patient’s menstrual cycle; it is a
nominal value identified as
Regular or Irregular.
d. Family history of Infertility: is the
identification of an existing history
of infertility in the family; it is a
nominal value identified as either
Yes or No.
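Table 1: Identified variables for determining infertility
S/N | Class of Risk                | Risk Factors/Considered Parameters (Points) | Labels (Points)
----|------------------------------|---------------------------------------------|--------------------
1.  | Personal Profiles            | Age of Menarche                             | ≤15 yrs or >15 yrs
2.  | Personal Profiles            | Age of Marriage                             | ≤30 yrs or >30 yrs
3.  | Personal Profiles            | Family History of infertility               | Yes or No
4.  | Personal Profiles            | Menstrual cycle                             | Regular or Irregular
5.  | Medical and Surgical history | Diabetes Mellitus                           | Yes or No
6.  | Medical and Surgical history | Hypertension                                | Yes or No
7.  | Medical and Surgical history | Thyroid                                     | Yes or No
8.  | Medical and Surgical history | Pelvi-abdominal operation had               | Yes or No
9.  | Gynecological history        | Endometriosis                               | No or Yes
10. | Gynecological history        | Fibroid                                     | No or Yes
11. | Gynecological history        | Polycystic Ovary                            | No or Yes
12. | Gynecological history        | Genital Infection                           | No or Yes
13. | Gynecological history        | Sexually transmitted Infection (STI)        | No or Yes
14. | Gynecological history        | Previous termination of pregnancy           | No or Yes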
e. Diabetes Mellitus: is the
identification of the existence of
diabetes disease in the patient; it is
a nominal value identified as either
Yes or No.
f. Hypertension: is the identification
of whether the patient has, or has
previously had, hypertension; it is a
nominal value identified as either
Yes or No.
g. Thyroid Disease: is the
identification of the existence of
thyroid disease in the patient; it is a
nominal value identified as either
Yes or No.
h. Pelvi-abdominal operation had: is
the identification of whether the
patient has had a pelvi-abdominal
operation; it is a nominal value
identified as either Yes or No.
i. Endometriosis: is the identification
of the existence of Endometriosis
in the patient; it is a nominal value
identified as either Yes or No.
j. Fibroid disease: is the
identification of the existence of
fibroid disease in the patient; it is a
nominal value identified as either
Yes or No.
k. Polycystic ovary: is the
identification of the patient having
a polycystic ovary; it is a nominal
value identified as either Yes or No.
l. Genital infection: is the
identification of a genital infection
in the patient; it is a nominal value
identified as either Yes or No.
m. Previous termination of pregnancy:
is the identification of the patient
having a previous termination of
pregnancy; it is a nominal value
identified as either Yes or No.
3.2 Data Preprocessing
Following the collection of data, records
for 39 patients with their respective
attributes (14 infertility risk indicators)
alongside the likelihood of infertility were
identified. In addition, data cleaning for
noise removal (errors, misspellings etc.)
and a check for missing data were
performed on the information collected
from the health records. Following this
process, all data cells describing the
attributes (fields) of each patient were
found to be filled. No missing data were
found in the repository and all
misspellings were corrected.
In order for the dataset collected to be fit
for the simulation environment, it was
converted to a more compatible data
storage format. This makes the dataset fit
for all the necessary machine learning
operations performed by the simulation
environment. Important to the study is the
ability of the machine learning techniques
to identify the combination of features
most likely to improve the prediction of
the likelihood of infertility.
The dataset was converted to the
attribute-relation file format (.arff)
required by the Waikato Environment for
Knowledge Analysis (WEKA), a
lightweight Java application with a number
of supervised and unsupervised machine
learning tools. This format allows for the
formal identification of the relation (file)
name and of the attribute names and their
labels, alongside the data rows in which
each attribute is expressed using its
respective label. Figure 1 shows the .arff
file used for the formal representation of
the dataset of the 39 patients.
Figure 1: arff file containing identified
attributes
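As a rough illustration of the .arff layout described above, the following Python sketch builds a header and two data rows for a subset of the variables. The relation name, the attribute identifiers and the two sample rows are hypothetical and are not taken from the study's file.

```python
# Hypothetical attribute names and labels modelled on Table 1; the
# identifiers and the sample rows below are illustrative assumptions.
ATTRIBUTES = [
    ("age_of_menarche", ["<=15", ">15"]),
    ("age_of_marriage", ["<=30", ">30"]),
    ("family_history", ["Yes", "No"]),
    ("menstrual_cycle", ["Regular", "Irregular"]),
    ("infertility", ["Likely", "Unlikely", "Probably"]),
]


def to_arff(relation, attributes, rows):
    """Return the text of an ARFF file whose attributes are all nominal."""
    lines = [f"@relation {relation}", ""]
    for name, labels in attributes:
        quoted = ",".join(f"'{label}'" for label in labels)
        lines.append(f"@attribute {name} {{{quoted}}}")
    lines += ["", "@data"]
    for row in rows:
        # Reject values that are not declared labels of their attribute.
        for value, (name, labels) in zip(row, attributes):
            if value not in labels:
                raise ValueError(f"{value!r} is not a label of {name}")
        lines.append(",".join(f"'{value}'" for value in row))
    return "\n".join(lines) + "\n"


rows = [
    ["<=15", ">30", "Yes", "Irregular", "Likely"],
    [">15", "<=30", "No", "Regular", "Unlikely"],
]
arff_text = to_arff("infertility_risk", ATTRIBUTES, rows)
print(arff_text)
```

The last attribute plays the role of the class; WEKA treats the final attribute as the class by default in many of its tools, but this can be changed when loading the file.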
3.3 Model Formulation
Systems that construct classifiers are one
of the commonly used tools in data
mining. Such systems take as input a
collection of cases, each belonging to one
of a small number of classes and described
by its values for a fixed set of attributes,
and output a classifier that can accurately
predict the class to which a new case
belongs. Supervised machine learning
algorithms make it possible to assign a set
of records (infertility risk indicators) to
one of the target classes, the risk of
infertility (Unlikely, Likely or Probably).
Many supervised machine learning
algorithms are black-box models, so it is
not always possible to give an exact
description of the mathematical
relationship between the independent
variables (input variables) and the target
variable (output variable, the risk of
infertility). Cost functions are used by
supervised machine learning algorithms to
estimate the error in prediction during the
training of data for model development.
Gradient descent and other related
algorithms are used to reduce the error by
estimating the cost function parameters.
3.3.1 Naïve Bayes’ Classifier
Naïve Bayes’ classifier is a probabilistic
model based on Bayes’ theorem. It is a
statistical classifier and one of the
frequently used methods for supervised
learning. It provides an efficient way of
handling any number of attributes or
classes and is based purely on probability
theory. Bayesian classification provides
practical learning algorithms in which prior
knowledge and observed data can be
combined.
Let X be a data sample containing the
attribute values Xi, where each Xi is one of
the infertility risk factors. Let Hj be the
hypothesis that X belongs to class j of C,
which contains the likely, probable and
unlikely cases. Classification requires the
determination of the following:
• P(Hj|X) - posteriori probability:
the probability that the hypothesis,
Hj (unlikely, probable or likely)
holds given the observed data
sample X;
• P(Hj) - prior probability: the initial
probability of the class, j;
• P(Xi): the probability that the
sample data is observed for each
attribute, i; and
• P(Xi|Hj) - likelihood: the
probability of observing the
sample’s attribute value, Xi, given
that the hypothesis Hj holds in the
training data.
The posteriori probability of a
hypothesis Hj (unlikely, likely or probable),
P(Hj|Xi), follows Bayes’ theorem:
$$P(H_j \mid X_i) = \frac{P(X_i \mid H_j)\, P(H_j)}{P(X_i)} \tag{1}$$
The infertility risk output class is thus:
$$\hat{H} = \arg\max_{H_j \in H} \; P(H_j) \prod_{i=1}^{n} P(X_i \mid H_j) \tag{2}$$
where $X = \{X_1, X_2, \ldots, X_n\}$ is the set of risk factors for infertility likelihood of each patient
and $H = \{H_1, H_2, H_3\}$ is the target class set.
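The classification rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the WEKA implementation used in the study: the two-attribute toy records are invented, and Laplace (add-one) smoothing is an added assumption so that unseen attribute values do not force a zero probability.

```python
from collections import Counter, defaultdict


def train_naive_bayes(records, labels):
    """Count classes and, per (attribute index, class), attribute values."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)
    for record, label in zip(records, labels):
        for i, value in enumerate(record):
            value_counts[(i, label)][value] += 1
    return class_counts, value_counts


def predict(record, class_counts, value_counts, n_values=2):
    """Return the class maximising P(H_j) * prod_i P(X_i | H_j)."""
    total = sum(class_counts.values())
    scores = {}
    for label, count in class_counts.items():
        score = count / total  # prior P(H_j)
        for i, value in enumerate(record):
            seen = value_counts[(i, label)][value]
            # Laplace smoothing (assumption): n_values labels per attribute.
            score *= (seen + 1) / (count + n_values)
        scores[label] = score
    return max(scores, key=scores.get), scores


# Invented toy data: (menstrual cycle, polycystic ovary) -> class.
records = [("Irregular", "Yes"), ("Irregular", "Yes"), ("Regular", "No"),
           ("Regular", "No"), ("Irregular", "No")]
labels = ["Likely", "Likely", "Unlikely", "Unlikely", "Likely"]

class_counts, value_counts = train_naive_bayes(records, labels)
predicted, scores = predict(("Irregular", "Yes"), class_counts, value_counts)
print(predicted, scores)
```

The denominator P(Xi) of Bayes’ theorem is omitted because it is the same for every class and does not change which class attains the maximum.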
3.3.2 Decision Trees Algorithm
A decision tree has the following parts: a
root node is the starting point of the tree;
branches connect nodes, showing the flow
from question to answer. Nodes that have
child nodes are called interior nodes. Leaf
or terminal nodes are nodes that do not
have child nodes and represent a possible
value of the target variable given the
variables represented by the path from the
root. Rules are induced by following each
path from the root node through the
branches to a leaf [16].
The splitting attributes and their split
values are chosen based on Gini impurity
(eqn. 3) and Gini gain (eqn. 4), as
expressed by Chaurasia et al. [16]:
$$i(t) = 1 - \sum_{i=1}^{m} p(i \mid t)^{2} \tag{3}$$
$$\Delta i(s, t) = i(t) - p_L\, i(t_L) - p_R\, i(t_R) \tag{4}$$
where $p(i \mid t)$ is the probability of getting i
in node t, and the target variable takes
values in {1, 2, 3, …, m}. For a split s of
node t, $p_L$ is the proportion of cases in
node t sent to the left child node $t_L$ and
$p_R$ is the proportion of cases in t sent to
the right child node $t_R$. If the target
variable is continuous, the split criterion is
used with the Least Squares Deviation
(LSD) as the impurity measure. If there is
no Gini gain or the preset stopping rules
are satisfied, the splitting process stops.
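A small worked example may make the Gini impurity and Gini gain concrete. The sketch below uses the standard two-child form of both quantities; the class counts in the example nodes are invented for illustration.

```python
from collections import Counter


def gini_impurity(labels):
    """1 - sum_i p(i|t)^2 for the class labels held in a node."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_gain(parent, left, right):
    """Impurity of the parent minus the weighted impurity of its children."""
    n = len(parent)
    p_left, p_right = len(left) / n, len(right) / n
    return (gini_impurity(parent)
            - p_left * gini_impurity(left)
            - p_right * gini_impurity(right))


# Invented node: 4 Likely and 4 Unlikely cases split into two children.
parent = ["Likely"] * 4 + ["Unlikely"] * 4
left = ["Likely"] * 3 + ["Unlikely"]
right = ["Likely"] + ["Unlikely"] * 3

print(gini_impurity(parent))          # impurity of the parent node
print(gini_gain(parent, left, right))  # gain achieved by the split
```

Here the parent impurity is 0.5, each child has impurity 0.375, and the gain is 0.5 - 0.5(0.375) - 0.5(0.375) = 0.125.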
Given a set S of cases, C4.5 first grows an
initial tree using the divide-and-conquer
algorithm as follows:
• If all the cases in S belong to the
same class or S is small, the tree is
a leaf labeled with the most
frequent class in S.
• Otherwise, choose a test based on a
single attribute with two or more
outcomes. Make this test the root
of the tree with one branch for each
outcome of the test, partition S into
corresponding subsets S1, S2, ...
according to the outcome for each
case, and apply the same procedure
recursively to each subset.
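The divide-and-conquer procedure can be sketched as a short recursive function. This is a simplified illustration rather than C4.5 itself: the test is chosen by counting the cases a majority vote would misclassify after the split (a stand-in for the gain-based criteria), there is no pruning, and the attribute names and cases are invented.

```python
from collections import Counter


def majority(cases):
    """Most frequent class among (record, label) cases."""
    return Counter(label for _, label in cases).most_common(1)[0][0]


def partition(cases, attribute):
    """Group cases into subsets S1, S2, ... by their value of the attribute."""
    subsets = {}
    for record, label in cases:
        subsets.setdefault(record[attribute], []).append((record, label))
    return subsets


def split_errors(cases, attribute):
    """Cases misclassified if each subset predicted its majority class."""
    errors = 0
    for subset in partition(cases, attribute).values():
        top = Counter(label for _, label in subset).most_common(1)[0][1]
        errors += len(subset) - top
    return errors


def grow_tree(cases, attributes, min_size=2):
    """Return a class label (leaf) or {attribute: {value: subtree}}."""
    classes = {label for _, label in cases}
    if len(classes) == 1 or len(cases) < min_size or not attributes:
        return majority(cases)
    best = min(attributes, key=lambda a: split_errors(cases, a))
    remaining = [a for a in attributes if a != best]
    return {best: {value: grow_tree(subset, remaining, min_size)
                   for value, subset in partition(cases, best).items()}}


# Invented cases with two hypothetical attributes.
cases = [
    ({"cycle": "Irregular", "pcos": "Yes"}, "Likely"),
    ({"cycle": "Irregular", "pcos": "No"}, "Likely"),
    ({"cycle": "Regular", "pcos": "Yes"}, "Likely"),
    ({"cycle": "Regular", "pcos": "No"}, "Unlikely"),
]
tree = grow_tree(cases, ["cycle", "pcos"])
print(tree)
```

Each path from the root of the returned dictionary to a leaf reads as one rule, for example "cycle is Regular and pcos is No, therefore Unlikely".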
ID3 (Iterative Dichotomiser 3), developed
by Ross Quinlan [17], is a classification
tree algorithm based on the concept of
information entropy. This provides a
method to measure the number of bits of
information each attribute can provide; the
attribute that yields the most information
gain becomes the most important attribute
and is placed at the top of the tree. This
procedure is repeated until all the
instances in a node are in the same
category.
In this study, there are three outcomes,
namely Likely, Unlikely and Probably, in
the root node T of the target variable. Let
u1, u2 and u3 denote the number of likely,
unlikely and probable records,
respectively, and let u = u1 + u2 + u3. The
initial information entropy is given by
equation 5 as:
$$I(u_1, u_2, u_3) = -\sum_{k=1}^{3} \frac{u_k}{u} \log_2 \frac{u_k}{u} \tag{5}$$
If attribute X (a risk indicator of infertility)
with values {x1, x2} is chosen to be the
split predictor and partitions the initial node
into {T1, T2, T3, …, TN}, let u1j, u2j and u3j
denote the number of likely, unlikely and
probable records in the child node j. The
expected information entropy, EI(X), and
the information gain, G(X), are given by:
$$EI(X) = \sum_{j=1}^{N} \frac{u_{1j} + u_{2j} + u_{3j}}{u}\, I(u_{1j}, u_{2j}, u_{3j}) \tag{6}$$
$$G(X) = I(u_1, u_2, u_3) - EI(X) \tag{7}$$
In 1993, Ross Quinlan made several
improvements to ID3 and extended it to
C4.5 [17]. Unlike ID3, which deals with
discrete attributes, C4.5 handles both
continuous and discrete attributes by
creating a threshold to split the attribute
into two groups: those above the threshold
and those that are up to and including the
threshold. C4.5 also deals with records
that have unknown attribute values. The
C4.5 algorithm uses normalized
information gain, or gain ratio, as a
modified splitting criterion: the information
gain divided by the information due to the
split of a node on the basis of the value of
a specific attribute. The reason for this
modification is that the information gain
tends to favor attributes that have a large
number of values.
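The entropy, information gain and gain ratio described above can be sketched as follows. The class counts 19, 3 and 17 are the Likely, Probably and Unlikely counts reported for the 39 patients in this study; the small attribute example is invented, and the handling of a zero split information is an assumption of this sketch.

```python
import math
from collections import Counter


def entropy(labels):
    """Information entropy, in bits, of a non-empty list of labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_and_ratio(values, labels):
    """Information gain and gain ratio of splitting labels by values."""
    n = len(labels)
    subsets = {}
    for value, label in zip(values, labels):
        subsets.setdefault(value, []).append(label)
    expected = sum(len(s) / n * entropy(s) for s in subsets.values())
    gain = entropy(labels) - expected
    split_info = entropy(values)  # information due to the split itself
    ratio = gain / split_info if split_info > 0 else 0.0
    return gain, ratio


# Root entropy for the class distribution of the 39 patients.
root = ["Likely"] * 19 + ["Probably"] * 3 + ["Unlikely"] * 17
root_entropy = entropy(root)
print(round(root_entropy, 3))

# Invented attribute that separates two classes perfectly.
gain, ratio = gain_and_ratio(["a", "a", "b", "b"], ["L", "L", "U", "U"])
print(gain, ratio)
```

Dividing by the split information penalises attributes that fragment the data into many small subsets, which is the bias of plain information gain that the gain ratio is meant to correct.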
3.3.3 Multi-layer Perceptron Architecture
The multi-layer perceptron (MLP) is a
natural extension of the single-layer
perceptron network in the class of
artificial neural networks used in artificial
intelligence. It is characterized by a
forward flow of a set of inputs passing
through subsequent hidden and
computational layers composed of
perceptron neurons using the feed-forward
algorithm (Figure 3). The use of MLPs is
justified by the fact that they are able to
predict and detect more complicated
patterns in data. This is because the
multi-layer perceptron is trained using an
additional algorithm called the
back-propagation algorithm.
Figure 3: Structure of the multi-layer
perceptron architecture
The back-propagation algorithm used in
this study to train the network consists of
two steps:
• Step 1 - Forward pass: the inputs
are passed through the network
layer by layer and an output is
produced. During this step, the
synaptic weights are fixed; and
• Step 2 - Backward pass: the output
from step 1 is compared to the target,
producing an error signal that is
propagated backwards. The aim of
this step is to reduce the error in a
statistical sense by adjusting the
synaptic weights according to a
defined scheme.
The multi-layer perceptron has the
following characteristics:
• All neurons within the network
feature a nonlinear activation
function that is differentiable
everywhere;
• The network has one or more
hidden layers made up of neurons
that are removed from direct contact
with input and output. These neurons
calculate a signal expressed as a
nonlinear function of their input with
synaptic weights and an estimate of
the gradient vector; and
• There is a high degree of
interconnectivity within the network.
The mathematical model of the multi-layer
perceptron in Figure 3 is as follows:
i. The Input Layer
In this part of the multi-layer perceptron
(MLP), the input values, Xn (factors
responsible for infertility in women), are
entered into the MLP system, where n is
the number of attributes (n = 14 in this
study). The weights, Wi, of each input, Xi,
produce a summation, Uk, which is added
to a bias variable, X0 (which takes a value
of 0 or 1), to give Vk. Vk is sent to the
hidden layer, k, for the activation function,
φ, to take effect. The summation Uk is
expressed as follows:
$$U_k = \sum_{i=1}^{n} W_i X_i$$
where $X = \{X_1, X_2, \ldots, X_n\}$ is the patient’s record containing the factors considered
predictive of infertility in women
and n = 14 attributes (input variables, xn).
ii. The Hidden Layer
At this part of the MLP, the summation of
the input variables is sent to the activation
function, which is fired through all the
hidden layers (for the purpose of this
study, 20 layers were used). The activation
function used is the sigmoid function,
expressed as:
$$\varphi(V_k) = \frac{1}{1 + e^{-a V_k}}$$
where $a$ is a shape parameter of the
sigmoid function
and $V_k = U_k + X_0$.
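The computation carried out by one neuron, the weighted sum Uk, the addition of the bias X0 to give Vk, and the sigmoid activation φ, can be sketched as below. The weights and inputs are invented, and a two-input neuron is used instead of the 14 inputs of the study to keep the example short.

```python
import math


def sigmoid(v, a=1.0):
    """Sigmoid activation with shape parameter a."""
    return 1.0 / (1.0 + math.exp(-a * v))


def neuron_output(inputs, weights, bias=1.0, a=1.0):
    """phi(V_k), where V_k = U_k + bias and U_k = sum_i W_i * X_i."""
    u = sum(w * x for w, x in zip(weights, inputs))
    v = u + bias
    return sigmoid(v, a)


# Invented example: nominal inputs encoded as 1 (Yes) and 0 (No).
output = neuron_output(inputs=[1, 0], weights=[-1.0, 2.0], bias=1.0)
print(output)
```

With these values Uk = -1 and Vk = 0, so the neuron outputs 0.5, the midpoint of the sigmoid. A larger shape parameter makes the curve steeper around that midpoint.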
iii. The Output Layer
At this point, the value of the output
(infertility status) is determined with the
error rate as low as possible. The
back-propagation algorithm is applied to
reduce the error rate of the model via
gradient descent by adjusting the values of
the synaptic weights before the neuron
fires the next set of inputs. At iteration m
(the mth row in the training set, of which
there are 39 in this case), the error for the
neurons in the output layer is calculated in
order to determine the error in
computation. The error is calculated thus:
$$E = \frac{1}{m} \sum_{i=1}^{m} \left( y_{pi} - y_{ai} \right)^{2}$$
where ypi and yai are the predicted and actual output for patient i
and m is the total number of patient records (m = 39).
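To illustrate how gradient descent reduces the error between predicted and actual outputs, the sketch below performs one batch update on a single sigmoid neuron. It is a deliberate simplification of back-propagation through a full MLP: the mean squared error form, the learning rate and the toy records are all assumptions made for the example.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def predict_all(weights, bias, records):
    """Forward pass of one sigmoid neuron over every record."""
    return [sigmoid(sum(w * x for w, x in zip(weights, r)) + bias)
            for r in records]


def mean_squared_error(predicted, actual):
    m = len(actual)
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / m


def gradient_step(weights, bias, records, targets, rate=0.5):
    """One batch gradient-descent update of the weights and the bias."""
    m = len(records)
    grad_w = [0.0] * len(weights)
    grad_b = 0.0
    for record, out, target in zip(records,
                                   predict_all(weights, bias, records),
                                   targets):
        # Chain rule: dE/dv = 2 (out - target) * out (1 - out) / m
        delta = 2.0 * (out - target) * out * (1.0 - out) / m
        for i, x in enumerate(record):
            grad_w[i] += delta * x
        grad_b += delta
    new_weights = [w - rate * g for w, g in zip(weights, grad_w)]
    return new_weights, bias - rate * grad_b


# Invented toy data: the target simply equals the first input.
records = [[1, 0], [0, 1], [1, 1], [0, 0]]
targets = [1, 0, 1, 0]
weights, bias = [0.0, 0.0], 0.0

before = mean_squared_error(predict_all(weights, bias, records), targets)
weights, bias = gradient_step(weights, bias, records, targets)
after = mean_squared_error(predict_all(weights, bias, records), targets)
print(before, after)
```

Repeating the update over many iterations continues to lower the error on this toy data; in a full MLP the same chain-rule step is applied layer by layer from the output back to the input.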
3.4 Performance Evaluation
Following the development of the
predictive model using all the proposed
methods, the performance of the model
was evaluated using the confusion matrix
to determine the values of the performance
metrics chosen for this study. A confusion
matrix contains information about the
actual and predicted classifications made
by a classification system, and the
performance of the system is commonly
evaluated using the data in the matrix
(Figure 4). In this study, the likely cases
are the positive cases while the probable
and unlikely cases are the negative cases.
Correctly classified cases are placed in the
true cells (positive and negative) while
incorrect classifications are placed in the
false cells (positive and negative). This
gives definitions (i) to (iv) below:
i. True positives (TP) are positive
cases correctly classified as
positive;
ii. False positives (FP) are negative
cases incorrectly classified as
positive;
iii. True negatives (TN) are negative
cases correctly classified as
negative; and
iv. False negatives (FN) are positive
cases incorrectly classified as
negative.
In order to capture the performance of the
algorithms used to classify infertility risk,
the results of the classification are plotted
on a confusion matrix (Figure 4). A
confusion matrix is a square which shows
the actual classification along the vertical
and the predicted classification along the
horizontal. All correct classifications, the
True Positives (TP) and True Negatives
(TN), lie along the diagonal from the
north-west corner to the south-east corner,
while the other cells hold the False
Positives (FP) and False Negatives (FN).
Figure 4: Diagram of a Confusion
Matrix
From a confusion matrix, different
measures of the performance of a
prediction model can be determined using
the values of the true positives/negatives
and false positives/negatives. As before,
the positive cases are the Likely cases of
infertility while the negative cases are the
Probably and Unlikely cases.
a. True Positive rate (TP
rate/Recall): proportion of
positive cases correctly classified
$$TP\ rate = \frac{TP}{TP + FN}$$
b. False Positive rate (FP rate/False
alarm): proportion of negative
cases incorrectly classified as
positives
$$FP\ rate = \frac{FP}{FP + TN}$$
c. Precision: proportion of predicted
positive cases that were correct
$$Precision = \frac{TP}{TP + FP}$$
d. Accuracy: proportion of the total
predictions that were correct.
$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}$$
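The four measures can be computed directly from lists of actual and predicted classes, as in the sketch below. Following the convention of this study, Likely is treated as the positive class and every other label as negative; the six example predictions are invented and are not results from the study.

```python
def binary_counts(actual, predicted, positive="Likely"):
    """Return (TP, FP, TN, FN), treating every other class as negative."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def metrics(tp, fp, tn, fn):
    """TP rate, FP rate, precision and accuracy from the four counts.

    Assumes each denominator is non-zero.
    """
    return {
        "tp_rate": tp / (tp + fn),
        "fp_rate": fp / (fp + tn),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }


# Invented example predictions.
actual = ["Likely", "Likely", "Likely", "Unlikely", "Unlikely", "Probably"]
predicted = ["Likely", "Likely", "Unlikely", "Likely", "Unlikely", "Unlikely"]

counts = binary_counts(actual, predicted)
result = metrics(*counts)
print(counts, result)
```

In this example a Probably case predicted as Unlikely still counts as a true negative, which shows that collapsing three classes into positive and negative can hide confusion between the two negative classes.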
4.0 Results
4.1 Data Description
The data containing information about the
attributes and the respective infertility
status for 39 patients is described in
Table 2, alongside the distribution of the
data shown in Figure 5. Out of the 39
patients, 19 were likely infertile, 3 were
probably infertile and 17 were unlikely
infertile. The most frequent values were:
23 had an age of menarche less than or
equal to 15 years, 23 had thyroid disease,
22 had no family history of infertility, 22
had no previous terminated pregnancy, 21
had an irregular menstrual cycle, 21 had
diabetes mellitus, 21 had hypertension, 21
had polycystic ovary and 21 had no genital
infection.
The least frequent values were: 16 had an
age of menarche more than 15 years, 16
had no thyroid disease, 17 had a family
history of infertility, 17 had a previously
terminated pregnancy, 18 had a regular
menstrual cycle, 18 had no diabetes
mellitus, 18 had no hypertension, 18 had
no polycystic ovary and 18 had genital
infection.
Table 2: Description of the identified variables
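| Variable Type | Attribute | Label | Number of Patients |
|---|---|---|---|
| INPUT | Age of Menarche | <=15 years | 23 |
| | | >15 years | 16 |
| | Age of Marriage | <=30 years | 20 |
| | | >30 years | 19 |
| | Family History of Infertility | No | 22 |
| | | Yes | 17 |
| | Menstrual Cycle | Irregular | 21 |
| | | Regular | 18 |
| | Diabetes Mellitus | No | 18 |
| | | Yes | 21 |
| | Hypertension | No | 18 |
| | | Yes | 21 |
| | Thyroid Disease | No | 16 |
| | | Yes | 23 |
| | Pelvi-Abdominal Operation | No | 20 |
| | | Yes | 19 |
| | Endometriosis | No | 19 |
| | | Yes | 20 |
| | Fibroid | No | 20 |
| | | Yes | 19 |
| | Polycystic Ovary | No | 18 |
| | | Yes | 21 |
| | Genital Infection | No | 21 |
| | | Yes | 18 |
| | Previous Terminated Pregnancy | No | 22 |
| | | Yes | 17 |
| OUTPUT | Infertility Status | Likely | 19 |
| | | Probably | 3 |
| | | Unlikely | 17 |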
4.2 Simulation Results
Three supervised machine learning algorithms were used to formulate the predictive model for the likelihood of infertility; each was trained on the dataset containing the risk factor records of 39 patients. The prediction models were simulated using the Waikato Environment for Knowledge Analysis (WEKA). The C4.5 decision trees algorithm was implemented using the J48 classifier available in the trees class, the naïve Bayes’ algorithm using the naïve Bayes’ classifier available in the Bayes class, and the multi-layer perceptron using the multi-layer perceptron classifier available in the functions class of the WEKA classification tools. The models were trained using the 10-fold cross validation method, which splits the dataset into 10 subsets; 9 subsets are used for training while the remaining one is used for testing, and the process is repeated until each of the 10 subsets has taken its turn as the test set.
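The fold rotation described above can be illustrated with a minimal Python sketch. The helper `k_fold_indices` is a hypothetical name, and the sketch splits the records in their given order; WEKA's own procedure may additionally shuffle and stratify the records, which is not reproduced here.

```python
def k_fold_indices(n_records, k=10):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_records))
    # Spread the remainder over the first folds so fold sizes differ by at most one.
    base, extra = divmod(n_records, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# 39 records, as in the dataset used in this study.
folds = list(k_fold_indices(39, k=10))
fold_sizes = [len(test) for _, test in folds]
print(fold_sizes)
```

With 39 records, each record appears in exactly one test fold and in the training set of the other nine folds.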
4.2.1 Results of the naïve Bayes’
classifier
Training the naïve Bayes’ classifier via the 10-fold cross validation method produced 28 (71.79%) correct classifications and 11 (28.21%) incorrect classifications, an accuracy of 71.8% (Figure 5).
Figure 5: Simulation results for naïve Bayes’ classifier
The confusion matrix shows that out of 19 likely cases, 15 were correctly classified while 1 was misclassified as probable and 3 as unlikely. Out of 3 probable cases, none was correctly classified; 1 was misclassified as likely and 2 as unlikely. Out of 17 unlikely cases, 13 were correctly classified while 3 were misclassified as likely and 1 as probable (Figure 6, left). Figure 7 shows a graphical plot of the correct and incorrect classifications; correct classifications are crosses while incorrect classifications are boxes.
Figure 6: Confusion matrix of each machine learning algorithm results
Figure 7: Graphical plot of simulation results for naïve Bayes’
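As an illustration, the naïve Bayes’ counts reported above can be arranged as a 3×3 confusion matrix and reduced to the two-class view defined earlier (likely as positive, probable and unlikely as negative). The names `overall_accuracy` and `collapse_to_binary` are hypothetical helpers; the row and column order (likely, probable, unlikely) is an assumption made for this sketch.

```python
# Rows are actual classes, columns are predicted classes, in the order
# (likely, probable, unlikely); counts are those reported for naive Bayes.
NAIVE_BAYES = [
    [15, 1, 3],
    [1, 0, 2],
    [3, 1, 13],
]

def overall_accuracy(matrix):
    """Share of records on the diagonal, i.e. assigned to their actual class."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

def collapse_to_binary(matrix, positive=0):
    """Return (tp, fn, fp, tn) treating one class as positive and the rest as negative."""
    tp = matrix[positive][positive]
    fn = sum(matrix[positive]) - tp
    fp = sum(row[positive] for row in matrix) - tp
    tn = sum(sum(row) for row in matrix) - tp - fn - fp
    return tp, fn, fp, tn

accuracy = overall_accuracy(NAIVE_BAYES)          # 28 / 39
tp, fn, fp, tn = collapse_to_binary(NAIVE_BAYES)  # likely vs. the rest
print(round(accuracy * 100, 2), (tp, fn, fp, tn))
```

Note that the three-class accuracy (28 of 39) counts a probable case classified as unlikely as an error, whereas the two-class view treats both as negative.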
4.2.2 Results of the C4.5 decision trees
classifier
Training the C4.5 decision trees classifier via the 10-fold cross validation method produced 29 (74.36%) correct classifications and 10 (25.64%) incorrect classifications, an accuracy of 74.4% (Figure 8). The confusion matrix shows that out of 19 likely cases, 15 were correctly classified while 1 was misclassified as probable and 3 as unlikely. Out of 3 probable cases, none was correctly classified; 2 were misclassified as likely and 1 as unlikely. Out of 17 unlikely cases, 14 were correctly classified while 3 were misclassified as likely (Figure 6, middle). Figure 9 shows a graphical plot of the correct and incorrect classifications; correct classifications are crosses while incorrect classifications are boxes.
Figure 8: Simulation results for C4.5
decision trees classifier
Figure 9: Graphical plot of simulation
results for C4.5 decision trees
Every decision trees algorithm produces a hierarchical tree with an attribute at each node, from the parent node through the child nodes down to the leaves, which hold the target class. The tree can be converted to a rule by following the path from the parent node at the top through the child nodes until a leaf is reached, where the classification is defined. Figure 10 shows the decision tree constructed during the model development; a number of variables were identified as being relevant for infertility likelihood prediction. The size of the tree is 6 and the number of leaves is 5. The variables identified are:
i. Previous termination of pregnancy
ii. Menstrual Cycle
iii. Age of Menarche and
iv. Genital Infection
Figure 10: Graphical plot of the decision tree for infertility likelihood
Using the decision tree in Figure 10, the
following rule can be used to predict the
likelihood of infertility in women given
the values of the four identified risk
factors. The rule can be read as follows:
IF Previous Termination of Pregnancy = “Yes” THEN infertility likelihood = “Likely”
ELSE IF Previous Termination of Pregnancy = “No” THEN
    IF Menstrual Cycle = “Regular” THEN infertility likelihood = “Unlikely”
    ELSE IF Menstrual Cycle = “Irregular” THEN
        IF Age of Menarche = “>15 years” THEN infertility likelihood = “Probable”
        ELSE IF Age of Menarche = “<=15 years” THEN
            IF Genital Infection = “Yes” THEN infertility likelihood = “Likely”
            ELSE IF Genital Infection = “No” THEN infertility likelihood = “Unlikely”
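The rule above can be transcribed directly into a small function. This is a sketch only: the function name and argument names are hypothetical, the arguments are assumed to carry the labels listed in Table 2, and no validation of unexpected labels is attempted.

```python
def predict_infertility_likelihood(previous_termination, menstrual_cycle,
                                   age_of_menarche, genital_infection):
    """Apply the rule read from the decision tree to one patient's record."""
    if previous_termination == "Yes":
        return "Likely"
    if menstrual_cycle == "Regular":
        return "Unlikely"
    # Menstrual cycle is irregular from here on.
    if age_of_menarche == ">15 years":
        return "Probable"
    # Age of menarche is <=15 years from here on.
    return "Likely" if genital_infection == "Yes" else "Unlikely"

# Example record: no previous termination, irregular cycle,
# menarche at or before 15 years, genital infection present.
result = predict_infertility_likelihood("No", "Irregular", "<=15 years", "Yes")
print(result)
```

Because the tree tests the attributes in a fixed order, the later arguments are ignored whenever an earlier test already reaches a leaf.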
4.2.3 Results of the Multi-Layer
Perceptron (MLP) classifier
Training the multi-layer perceptron classifier via the 10-fold cross validation method produced 29 (74.36%) correct classifications and 10 (25.64%) incorrect classifications, an accuracy of 74.4% (Figure 11). The confusion matrix shows that out of 19 likely cases, 16 were correctly classified while 2 were misclassified as probable and 1 as unlikely. Out of 3 probable cases, none was correctly classified; 1 was misclassified as likely and 2 as unlikely. Out of 17 unlikely cases, 13 were correctly classified while 1 was misclassified as likely and 3 as probable (Figure 6, right). Figure 12 shows a graphical plot of the correct and incorrect classifications; correct classifications are crosses while incorrect classifications are boxes.
Figure 11: Simulation results for the multi-layer perceptron classifier
Figure 12: Graphical plot of simulation results for the multi-layer perceptron
4.3 Discussions
Table 3 gives a summary of the simulation results by presenting the average value of each performance metric evaluated for the machine learning techniques used. The true positive rate (recall/sensitivity), false positive rate (false alarm/1-specificity), precision, accuracy and the area under the receiver operating characteristics (ROC) curve were used. From the table, the decision trees and the MLP algorithms showed the highest accuracy, each predicting 29 out of the 39 records correctly. The true positive rate was also highest for the decision trees and the MLP algorithms, with an equal value of 0.744; as an average over the three classes, this implies that 74.4% of the cases were assigned to their actual class. The MLP showed the lowest false positive rate, with a value of 0.119, which implies that on average 11.9% of the cases not belonging to a class were misclassified into it. The MLP also had the highest precision, with a value of 0.787, which implies that on average 78.7% of the classifications made for a class actually belonged to that class. The decision trees algorithm had the lowest area under the ROC curve (a graph of the TP rate against the FP rate), with a value of 0.722. The area under the curve indicates the level of confidence that can be placed in the predictions of the machine learning algorithm; the higher the value, the better the model distinguishes between the classes.
Table 3: Summary of simulation results
| Algorithm | Accuracy (%) | TP rate (recall) | FP rate (false alarm) | Precision | Area under ROC Curve (AUC) |
|---|---|---|---|---|---|
| Naïve Bayes’ | 71.795 | 0.718 | 0.201 | 0.699 | 0.855 |
| Decision Trees | 74.359 | 0.744 | 0.203 | 0.704 | 0.722 |
| Multi-layer Perceptron | 74.359 | 0.744 | 0.119 | 0.787 | 0.862 |
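The TP rate, FP rate and precision values in Table 3 are consistent with class-size-weighted averages of the per-class values computed from the confusion matrices reported in Sections 4.2.1 to 4.2.3. The sketch below recomputes them under that assumption; the weighting scheme is inferred from the numbers rather than stated in the text, the function name `weighted_metrics` is hypothetical, and the AUC is not reproduced because it requires the classifiers' scores.

```python
# Confusion matrices as reported in Sections 4.2.1 to 4.2.3; rows are actual
# classes and columns predicted classes, ordered (likely, probable, unlikely).
MATRICES = {
    "naive_bayes":    [[15, 1, 3], [1, 0, 2], [3, 1, 13]],
    "decision_trees": [[15, 1, 3], [2, 0, 1], [3, 0, 14]],
    "mlp":            [[16, 2, 1], [1, 0, 2], [1, 3, 13]],
}

def weighted_metrics(matrix):
    """Class-size-weighted averages of per-class TP rate, FP rate and precision."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    tp_rate = fp_rate = precision = 0.0
    for c in range(n):
        actual = sum(matrix[c])                          # records truly in class c
        predicted = sum(matrix[r][c] for r in range(n))  # records assigned to class c
        tp = matrix[c][c]
        fp = predicted - tp
        weight = actual / total
        tp_rate += weight * (tp / actual)
        fp_rate += weight * (fp / (total - actual))
        # Assumed convention: a class that is never predicted contributes zero precision.
        precision += weight * (tp / predicted if predicted else 0.0)
    return round(tp_rate, 3), round(fp_rate, 3), round(precision, 3)

summary = {name: weighted_metrics(m) for name, m in MATRICES.items()}
for name, values in summary.items():
    print(name, values)
```

Under this weighting the averaged TP rate equals the overall accuracy, which is why the two columns of Table 3 agree for every algorithm.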
From the simulation results, it can be inferred that the most effective supervised machine learning algorithm is the multi-layer perceptron (MLP), owing to its high accuracy, TP rate and precision together with the lowest FP rate. The variables identified and the rule deduced from them using the decision trees algorithm can also be used to support decisions made by gynecologists concerning infertility likelihood in women.
5.0 Conclusions
In this paper, the development of a predictive model for determining the likelihood of infertility in Nigerian women was proposed using a dataset collected from patients at Obafemi Awolowo University Teaching Hospital Complex (OAUTHC), Ile-Ife, Osun State, Nigeria. Gynecologists identified 14 variables as necessary for predicting infertility in women, and a dataset containing these 14 attributes for 39 patients, alongside their respective infertility status (likely, unlikely and probably), was provided.
After data collection and pre-processing, three supervised machine learning algorithms were used to develop the predictive model for the likelihood of infertility in women using the historical dataset, from which the training and testing data were drawn. The 10-fold cross validation method was used to train the models, and their performance was evaluated.
The multi-layer perceptron proved to be an effective algorithm for predicting infertility in women given the attributes identified, but it is believed that higher accuracy could be attained by increasing the number of records used and by identifying other relevant attributes which could help predict infertility in women. Rule induction algorithms such as decision trees can also be used to describe the relationship between the selected attributes and the likelihood of infertility in women.