
Crop yield prediction using machine learning: A systematic literature review

Abstract

Machine learning is an important decision support tool for crop yield prediction, including supporting decisions on what crops to grow and what to do during the growing season of the crops. Several machine learning algorithms have been applied to support crop yield prediction research. In this study, we performed a Systematic Literature Review (SLR) to extract and synthesize the algorithms and features that have been used in crop yield prediction studies. Based on our search criteria, we retrieved 567 relevant studies from six electronic databases, of which we selected 50 studies for further analysis using inclusion and exclusion criteria. We investigated these selected studies carefully, analyzed the methods and features used, and provided suggestions for further research. According to our analysis, the most used features are temperature, rainfall, and soil type, and the most applied algorithm in these models is Artificial Neural Networks. Following this analysis of 50 machine learning-based papers, we performed an additional search in electronic databases to identify deep learning-based studies, found 30 deep learning-based papers, and extracted the applied deep learning algorithms. According to this additional analysis, Convolutional Neural Networks (CNN) is the most widely used deep learning algorithm in these studies, and the other widely used deep learning algorithms are Long Short-Term Memory (LSTM) and Deep Neural Networks (DNN).
Contents lists available at ScienceDirect
Computers and Electronics in Agriculture
journal homepage: www.elsevier.com/locate/compag
Thomas van Klompenburg (a), Ayalew Kassahun (a), Cagatay Catal (b)
(a) Information Technology Group, Wageningen University & Research, Wageningen, the Netherlands
(b) Department of Computer Engineering, Bahcesehir University, Istanbul, Turkey
ARTICLE INFO
Keywords:
Crop yield prediction
Decision support system
Systematic literature review
Machine learning
Deep learning
1. Introduction
Machine learning (ML) approaches are used in many fields, ranging
from evaluating the behavior of customers in supermarkets (Ayodele,
2010) to predicting customers’ phone use (Witten et al., 2016).
Machine learning has also been used in agriculture for several years
(McQueen et al., 1995). Crop yield prediction is one of the challenging
problems in precision agriculture, and many models have been
proposed and validated so far. This problem requires the use of several
datasets since crop yield depends on many different factors such as
climate, weather, soil, use of fertilizer, and seed variety (Xu et al.,
2019). This indicates that crop yield prediction is not a trivial task;
instead, it consists of several complicated steps. Nowadays, crop yield
prediction models can estimate the actual yield reasonably well, but
better performance in yield prediction is still desirable (Filippi et al.,
2019a).
Machine learning, which is a branch of Artificial Intelligence (AI)
focusing on learning, is a practical approach that can provide better
yield prediction based on several features. ML can determine patterns
and correlations and discover knowledge from datasets. The models
need to be trained using datasets in which the outcomes are
represented based on past experience. The predictive model is built
using several features, and the parameters of the model are determined
using historical data during the training phase. In the testing phase,
the part of the historical data that was not used for training is used
to evaluate performance.
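The training and testing phases described above can be illustrated with a simple hold-out split. The following Python sketch is illustrative only: the rainfall and yield values, the `split_holdout` helper, and the one-feature least-squares model are assumptions made for this example and are not taken from any of the reviewed studies.

```python
import random

def split_holdout(records, test_fraction=0.25, seed=0):
    """Shuffle records and hold out a fraction that is never used for training."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_line(train):
    """Ordinary least squares for yield = a * rainfall + b (a single feature)."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in train)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    a = sxy / sxx
    return a, mean_y - a * mean_x

# Hypothetical historical data: (seasonal rainfall in mm, yield in t/ha).
history = [(300, 2.1), (350, 2.6), (400, 3.0), (450, 3.4),
           (500, 3.9), (550, 4.3), (600, 4.8), (650, 5.1)]

# Training phase: parameters are estimated from the training part only.
train, test = split_holdout(history)
a, b = fit_line(train)

# Testing phase: performance is measured on data the model has not seen.
errors = [abs((a * x + b) - y) for x, y in test]
mean_abs_error = sum(errors) / len(errors)
```

Keeping the test records out of `fit_line` is the point of the sketch: the error is then an estimate of how the model behaves on unseen seasons rather than on the data it was fitted to.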
An ML model can be descriptive or predictive, depending on the
research problem and research questions. While descriptive models are
used to gain knowledge from the collected data and explain what has
happened, predictive models are used to make predictions about the
future (Alpaydin, 2010). ML studies face different challenges when
aiming to build a high-performance predictive model. It is crucial to
select the right algorithms to solve the problem at hand, and in
addition, the algorithms and the underlying platforms need to be
capable of handling the volume of data.
To get an overview of what has been done on the application of ML
in crop yield prediction, we performed a systematic literature review
(SLR). An SLR shows the potential gaps in research on a particular
problem area and guides both practitioners and researchers who wish
to do a new research study on that problem area. By following a
methodology in an SLR, all relevant studies are accessed from
electronic databases, synthesized, and presented to respond to the
research questions defined in the study. An SLR study leads to new
perspectives and helps new researchers in the field to understand the
state of the art.
https://doi.org/10.1016/j.compag.2020.105709
Received 29 January 2020; Received in revised form 21 July 2020; Accepted 9 August 2020
Corresponding author.
E-mail address: cagatay.catal@eng.bau.edu.tr (C. Catal).
Computers and Electronics in Agriculture 177 (2020) 105709
Available online 18 August 2020
0168-1699/ © 2020 Elsevier B.V. All rights reserved.
An SLR study is expected to be replicable, which means that all the
steps taken need to be explained clearly, and the results should be
transparent for other researchers. The critical factors for a successful
SLR study are objectivity and transparency (Kitchenham et al., 2007).
As its name indicates, an SLR needs to be systematic and cover all the
literature published so far. This study presents all the available
literature published so far on the application of machine learning to
the crop yield prediction problem. In this study, we present our
empirical results and responses to the research questions defined as
part of this review article.
The remainder of this paper is organized as follows: Section 2
explains the background. Section 3 discusses the methodology. Section 4
presents the results of the SLR. Section 5 explains the deep
learning-based crop yield prediction research. Section 6 presents the
discussion, and Section 7 concludes this paper.
2. Related work
Crop yield prediction is an essential task for decision-makers at
national and regional levels (e.g., the EU level) for rapid
decision-making. An accurate crop yield prediction model can help
farmers to decide on what to grow and when to grow it. There are
different approaches to crop yield prediction. This review article
investigates what has been done on the use of machine learning in crop
yield prediction in the literature.
During our analysis of the retrieved publications, one of the exclusion
criteria was that the publication is a survey or traditional review paper.
Those excluded publications are, in fact, related work and are discussed
in this section. Chlingaryan and Sukkarieh performed a review study on
nitrogen status estimation using machine learning (Chlingaryan et al.,
2018). The paper concludes that rapid developments in sensing
technologies and ML techniques will result in cost-effective solutions in
the agricultural sector. Elavarasan et al. performed a survey of
publications on machine learning models associated with crop yield
prediction based on climatic parameters. The paper advises looking more
broadly to find more parameters that account for crop yield (Elavarasan
et al., 2018). Liakos et al. (2018) published a review paper on the
application of machine learning in the agricultural sector. The analysis
was performed with publications focusing on crop management, livestock
management, water management, and soil management. Li, Lecourt, and
Bishop performed a review study on determining the ripeness of fruits to
decide the optimal harvest time and yield prediction (Li et al., 2018).
Mayuri and Priya addressed the challenges and methodologies that are
encountered in the field of image processing and machine learning in the
agricultural sector, especially in the detection of diseases (Mayuri and
Priya, xxxx). Somvanshi and Mishra presented several machine learning
approaches and their application in plant biology (Somvanshi and Mishra,
2015). Gandhi and Armstrong published a review paper on the application
of data mining in the agricultural sector in general, dealing with
decision making. They concluded that further research needs to be done
to see how data mining could be implemented on complex agricultural
datasets (Gandhi and Armstrong, 2016). Beulah performed a survey on the
various data mining techniques that are used for crop yield prediction
and concluded that the crop yield prediction problem could be solved by
employing data mining techniques (Beulah, 2019).
According to our survey of review articles, the most significant of
which are presented in this section, this paper is the first SLR that
focuses on the application of machine learning to the crop yield
prediction problem. The existing survey studies did not systematically
review the literature, and most of them reviewed studies on a specific
aspect of crop yield prediction. In addition, we present 30 deep
learning-based studies in this article and discuss which deep learning
algorithms have been used in these studies.
3. Methodology
3.1. Review protocol
Before conducting the systematic review, a review protocol was
defined. The review was done using the well-known review guidelines
provided by Kitchenham et al. (2007). First, the research questions
were defined. Once the research questions were ready, databases were
used to select the relevant studies. The databases that were used in this
study are Science Direct, Scopus, Web of Science, Springer Link, Wiley,
and Google Scholar. After the selection of relevant studies, they were
filtered and assessed using a set of exclusion and quality criteria. All the
relevant data from the selected studies were extracted, and eventually,
the extracted data were synthesized in response to the research
questions. The approach we followed can be split up into three parts:
plan review, conduct review, and report review.
The first stage is planning the review. In this stage, research questions
are identified, a protocol is developed, and eventually, the protocol is va-
lidated to see if the approach is feasible. In addition to the research ques-
tions, publication venues, initial search strings, and publication selection
criteria are also defined. When all of this information is defined, the pro-
tocol is revised one more time to see if it represents a proper review pro-
tocol. In Fig. 1, the internal steps of the Plan Review stage are represented.
The second stage is conducting the review, which is represented in
Fig. 2. When conducting the review, the publications were selected by
going through all the databases. The data were then extracted, which
means that information on the authors, year of publication, type of
publication, and further information relevant to the research questions
was stored. After all the necessary data were extracted correctly, the
data were synthesized in order to provide an overview of the relevant
papers published so far.
In the final stage, reporting the review, the review was concluded by
documenting the results and addressing the research questions, as
shown in Fig. 3.
3.2. Research questions
This SLR aims to get insight into what studies have been published
in the domain of ML and crop yield prediction. To get insight, studies
have been analyzed from several dimensions. For this SLR study, the
following four research questions (RQs) have been defined.
RQ1 - Which machine learning algorithms have been used in the
literature for crop yield prediction?
RQ2 - Which features have been used in the literature for crop yield
prediction using machine learning?
RQ3 - Which evaluation parameters and evaluation approaches have
been used in the literature for crop yield prediction?
RQ4 - What are the challenges in the field of crop yield prediction
using machine learning?
3.3. Search strategy
The search was narrowed down to the basic concepts that are relevant
for the scope of this review. Machine learning has many application
fields, which means that there are a lot of published studies that are
probably not in the scope of this review article. The basic search was
done by an automated search. The starting input for the search was
“machine learning” AND “yield prediction”. Articles were retrieved, and
abstracts were read to find the synonyms of the keywords. The search
was performed in six databases. The search input “machine learning”
AND “yield prediction” was used to get a broad view of the studies.
After the exclusion criteria were applied and all the results were
processed, a more complex search string was built in order to avoid
missing relevant studies. This final search string is as follows:
((“machine learning” OR “artificial intelligence”) AND “data mining”
AND (“yield prediction” OR “yield forecasting” OR “yield estimation”)).
After executing this search string, 567 studies were retrieved.
Fig. 1. Details of the Plan Review Step.
A specific description of the search strings per database is provided
as follows:
Science Direct: The search string is [“machine learning” AND “yield
prediction”] (Title, abstract, keywords) and [((“machine learning” OR
“artificial intelligence”) AND “data mining” AND (“yield prediction”
OR “yield forecasting” OR “yield estimation”))](Title, abstract, key-
words).
Scopus: The search string is [“machine learning” AND “yield pre-
diction”](Title, abstract, keywords) and [((“machine learning” OR
“artificial intelligence”) AND “data mining” AND (“yield prediction”
OR “yield forecasting” OR “yield estimation”))] (Title, abstract, key-
words).
Web of Science: The search string is [“machine learning” AND
“yield prediction”] (title, abstract, author keywords, and Keywords
Plus).
Springer Link: The search string is [“machine learning” AND “yield
prediction”](anywhere) and [((“machine learning” OR “artificial in-
telligence”) AND “data mining” AND (“yield prediction” OR “yield
forecasting” OR “yield estimation”))] (anywhere).
Wiley: The search string is [“machine learning” AND “yield pre-
diction”] (anywhere).
Google Scholar: The search string is [“machine learning” AND
“yield prediction”] (anywhere) and [((“machine learning” OR “artificial
intelligence”) AND “data mining” AND (“yield prediction” OR “yield
forecasting” OR “yield estimation”))] (anywhere).
For Web of Science and Wiley, the search string [((“machine
learning” OR “artificial intelligence”) AND “data mining” AND (“yield
prediction” OR “yield forecasting” OR “yield estimation”))] did not
result in any publications.
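The Boolean structure of the two search strings can be made explicit by assembling them from their term groups. The sketch below is a minimal illustration in Python; the helper names `quote`, `any_of`, and `all_of` are hypothetical, straight quotes stand in for the typographic quotes used in the text, and the field restrictions of the individual databases (title, abstract, keywords, or anywhere) are not modeled.

```python
def quote(term):
    """Wrap a phrase in double quotes so it is searched as an exact phrase."""
    return f'"{term}"'

def any_of(terms):
    """Join alternative phrases with OR inside parentheses."""
    return "(" + " OR ".join(quote(t) for t in terms) + ")"

def all_of(parts):
    """Join already formatted parts with AND inside parentheses."""
    return "(" + " AND ".join(parts) + ")"

# Initial, broad search input.
initial_query = " AND ".join([quote("machine learning"), quote("yield prediction")])

# Final, more complex search string with synonyms.
final_query = all_of([
    any_of(["machine learning", "artificial intelligence"]),
    quote("data mining"),
    any_of(["yield prediction", "yield forecasting", "yield estimation"]),
])
```

Writing the string this way shows that “data mining” is a mandatory term in the final query, which may help explain why that query returned no publications in some databases.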
3.4. Exclusion criteria
To exclude irrelevant studies, the studies were analyzed and graded
based on exclusion criteria to set the boundaries for the systematic
review. The exclusion criteria (EC) are as follows:
Exclusion criterion 1 – Publication is not related to the agricultural
sector and yield prediction combined with machine learning
Exclusion criterion 2 – Publication is not written in English
Exclusion criterion 3 – Publication is a duplicate or already
retrieved from another database
Exclusion criterion 4 – Full text of the publication is not available
Exclusion criterion 5 – Publication is a review/survey paper
Exclusion criterion 6 – Publication has been published before 2008
After the first three exclusion criteria were applied, only 77 studies
remained for further analysis. After applying all six exclusion
criteria, 50 studies were selected for further analysis. In Table 1, we
show the number of initially retrieved papers and the number of papers
after the exclusion criteria were applied. Fig. 4 shows the distribution
of selected publications based on the databases we searched. As shown in
Table 1, most of the papers were retrieved from the Google Scholar,
Scopus, and Springer Link databases.
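The screening step can be sketched as a filter that checks the six exclusion criteria in order and records which criterion removed each study. This is a hedged illustration in Python, not the procedure actually used by the authors: the `Study` fields, the title-based duplicate check, and the example records are all assumptions invented for the sketch.

```python
from dataclasses import dataclass

@dataclass
class Study:
    title: str
    year: int
    language: str
    on_topic: bool    # agriculture and yield prediction combined with ML
    full_text: bool   # full text is available
    is_review: bool   # review or survey paper

def first_failed_criterion(study, seen_titles):
    """Return the number of the first exclusion criterion that applies, or None."""
    if not study.on_topic:
        return 1
    if study.language != "English":
        return 2
    if study.title.lower() in seen_titles:
        return 3
    if not study.full_text:
        return 4
    if study.is_review:
        return 5
    if study.year < 2008:
        return 6
    return None

def screen(studies):
    """Split studies into selected ones and a per-criterion exclusion count."""
    selected, excluded, seen = [], {}, set()
    for study in studies:
        criterion = first_failed_criterion(study, seen)
        seen.add(study.title.lower())
        if criterion is None:
            selected.append(study)
        else:
            excluded[criterion] = excluded.get(criterion, 0) + 1
    return selected, excluded

# Hypothetical records, invented for illustration only.
candidates = [
    Study("Wheat yield with neural networks", 2012, "English", True, True, False),
    Study("Wheat yield with neural networks", 2012, "English", True, True, False),
    Study("A survey of yield models", 2017, "English", True, True, True),
    Study("Maize yield regression", 2005, "English", True, True, False),
    Study("Customer churn prediction", 2019, "English", False, True, False),
]
selected, excluded = screen(candidates)
```

Recording the first criterion that applies makes intermediate counts, such as the number of studies remaining after the first three criteria, easy to report.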
To answer the four research questions, data from the selected stu-
dies have been extracted and synthesized. The information retrieved
was focused on checking whether or not the studies meet the require-
ments stated in the exclusion criteria and on responding to the research
questions. The selected studies that passed the exclusion criteria are
presented in Appendix A. During the data synthesis, all the extracted
data have been combined and synthesized, and the research questions
were answered accordingly. The results are presented in Section 4.
4. Results
The selected publications are shown in Table 2. The table shows the
publication year, title, and algorithms used in these papers.
Fig. 4 shows the number of publications per year published in the
last ten years. This figure indicates that the number of papers on crop
yield prediction has recently been increasing.

Fig. 2. Details of the Conducting Review Step.
Fig. 3. Details of the Reporting Review Step.

Table 1
Distribution of papers based on the databases.
Database | # of initially retrieved papers | # of papers after exclusion criteria | Percentage of papers (%)
Science Direct | 17 | 4 | 8
Scopus | 68 | 11 | 22
Web of Science | 32 | 0 | 0
Springer Link | 132 | 10 | 20
Wiley | 20 | 1 | 2
Google Scholar | 298 | 24 | 48
Total | 567 | 50 | 100

Fig. 4. Distribution of the selected publications per year.

Table 2
Selected publications.
Retrieved from | Reference | Title | Algorithm used | Year
Scopus | Ruß et al. (2008) | Data Mining with Neural Networks for Wheat Yield Prediction | Neural networks | 2008
Science Direct | Everingham et al. (2009) | Ensemble data mining approaches to forecast regional sugarcane crop production | Forward stagewise algorithm | 2009
Springer Link | Ruß & Kruse (2010) | Regression Models for Spatial Data: An Example from Precision Agriculture | Clustering, random forest, support vector machine | 2010
Springer Link | Baral et al. (2011) | Yield Prediction Using Artificial Neural Networks | Neural networks | 2011
Springer Link | Črtomir et al. (2012) | Application of Neural Networks and Image Visualization for Early Forecast of Apple Yield | Neural networks | 2012
Google Scholar | Johnson (2013) | Crop yield forecasting on the Canadian Prairies by remotely sensed vegetation indices and machine learning methods | Multiple linear regression, neural networks | 2013
Google Scholar | Romero et al. (2013) | Using classification algorithms for predicting durum wheat yield in the province of Buenos Aires | K-nearest neighbor, decision tree | 2013
Google Scholar | Ananthara et al. (2013) | CRY - an improved crop yield prediction model using bee hive clustering approach for agricultural data sets | Clustering | 2013
Scopus | Shekoofa et al. (2014) | Determining the most important physiological and agronomic traits contributing to maize grain yield through machine learning algorithms: A new avenue in intelligent agriculture | Decision tree, clustering | 2014
Scopus | Gonzalez-Sanchez et al. (2014) | Predictive ability of machine learning methods for massive crop yield prediction | M5-prime regression tree, k-nearest neighbor, support vector machine | 2014
Scopus | Pantazi et al. (2014) | Application of supervised self-organizing models for wheat yield prediction | Neural networks | 2014
Google Scholar | Cakir et al. (2014) | Yield prediction of wheat in south-east region of Turkey by using artificial neural networks | Neural networks, multivariate polynomial regression | 2014
Google Scholar | Rahman & Haq (2014) | Machine learning facilitated rice prediction in Bangladesh | Decision tree, neural networks, linear regression | 2014
Scopus | Kunapuli et al. (2015) | Yield prediction for precision territorial management in maize using spectral data | Polynomial regression, logistic regression | 2015
Google Scholar | Matsumura et al. (2015) | Maize yield forecasting by linear regression and artificial neural networks in Jilin, China | Neural networks, multiple linear regression | 2015
Google Scholar | Ahamed et al. (2015) | Applying data mining techniques to predict annual yield of major crops and recommend planting different crops in different districts in Bangladesh | Linear regression, neural networks, clustering, k-nearest neighbor | 2015
Google Scholar | Paul et al. (2015) | Analysis of soil behavior and prediction of crop yield using data mining approach | Naïve Bayes, k-nearest neighbor | 2015
Science Direct | Pantazi et al. (2016) | Wheat yield prediction using machine learning and advanced sensing techniques | Neural networks | 2016
Scopus | Jeong et al. (2016) | Random forests for global and regional crop yield predictions | Random forest, linear regression | 2016
Wiley | Mola-Yudego et al. (2016) | Spatial yield estimates of fast-growing willow plantations for energy based on climatic variables in northern Europe | Gradient boosting tree | 2016
Google Scholar | Everingham et al. (2016) | Accurate prediction of sugarcane yield using a random forest algorithm | Random forest | 2016
Scopus | Gandhi et al. (2016) | Rice crop yield prediction in India using support vector machines | Support vector machine | 2016
Google Scholar | Bose et al. (2016) | Spiking neural networks for crop yield estimation based on spatiotemporal analysis of image time series | Neural networks | 2016
Google Scholar | Gandhi et al. (2016) | Rice crop yield prediction using artificial neural networks | Neural networks | 2016
Google Scholar | Gandhi and Armstrong (2016) | Applying data mining techniques to predict yield of rice in Humid Subtropical Climatic Zone of India | Decision tree, logistic regression, k-nearest neighbor | 2016
Google Scholar | Sujatha and Isakki (2016) | A study on crop yield forecasting using classification techniques | Naïve Bayes, J48, random forest, neural networks, decision tree, support vector machines (No experimental results reported) | 2016
Google Scholar | Ying-xue et al. (2017) | Support vector machine-based open crop model (SBOCM): Case of rice production in China | Support vector machine | 2017
Google Scholar | Cheng et al. (2017) | Early yield prediction using image analysis of apple fruit and tree canopy features with neural networks | Neural networks | 2017
Google Scholar | Bargoti and Underwood (2017) | Image segmentation for fruit detection and yield estimation in apple orchards | Neural networks | 2017
Google Scholar | Fernandes et al. (2017) | Sugarcane yield prediction in Brazil using NDVI time series and neural networks ensemble | Neural networks | 2017
Google Scholar | You et al. (2017) | Deep Gaussian process for crop yield prediction based on remote sensing data | Neural networks and gaussian process, neural networks | 2017
Springer Link | Osman et al. (2017) | Predicting Early Crop Production by Analysing Prior Environment Factors | Neural networks, linear regression | 2017
Google Scholar | Ali et al. (2017) | Modeling managed grassland biomass estimation by using multitemporal remote sensing data machine learning approach | ANFIS, neural networks, multiple linear regression | 2017
Science Direct | Kouadio et al. (2018) | Artificial intelligence approach for the prediction of Robusta coffee yield using soil fertility properties | Extreme learning machine, multiple linear regression, random forest | 2018
Springer Link | Goldstein et al. (2018) | Applying machine learning on sensor data for irrigation recommendations: revealing the agronomists tacit knowledge | Gradient boosting tree, linear regression | 2018
Scopus | Zhong et al. (2018) | Hierarchical modeling of seed variety yields and decision making for future planting plan | Random forest, linear regression | 2018
Scopus | Crane-Droesch (2018) | Machine learning methods for crop yield prediction and climate change impact assessment in agriculture | Neural networks | 2018
Scopus | Villanueva et al. (2018) | Bitter melon crop yield prediction using Machine Learning Algorithm | Neural networks | 2018
Google Scholar | Girish et al. (2018) | Crop Yield and Rainfall Prediction in Tumakuru District using Machine Learning | Support vector machine, linear regression, k-nearest neighbor | 2018
Google Scholar | Khanal et al. (2018) | Integration of high resolution remotely sensed data and machine learning techniques for spatial prediction of soil properties and corn yield | Neural networks, support vector machine, random forest | 2018
Google Scholar | Taherei Ghazvinei et al. (2018) | Sugarcane growth prediction based on meteorological parameters using extreme learning machine and artificial neural network | Neural networks | 2018
Springer Link | Ahmad et al. (2018) | Yield Forecasting of Spring Maize Using Remote Sensing and Crop Modeling in Faisalabad-Punjab Pakistan | Support vector machine, random forest, decision tree | 2018
(continued below)
There were no exclusion criteria based on the type of publication;
therefore, conference papers were also included. The pie chart in Fig. 5
shows the distribution of types of publications. The figure shows that
most of the articles we accessed are journal articles; conference papers
and book chapters constitute less than 25% of the total number of pa-
pers.
To address research question two (RQ2), features used in the ma-
chine learning algorithms applied in the papers were investigated and
summarized. All features we were able to extract are shown in Table 3.
As shown in Table 3, the most used features are related to tem-
perature, rainfall, and soil type. Crop yield is the dependent variable. To
get a better overview of the independent variables (features), the fea-
tures were grouped. The independent features can be grouped into soil
and crop information, humidity, nutrients, and field management. The
number of times these groups are used is presented in Table 4. As shown
in this table, the feature groups that are most used are related to the
soil, solar, and humidity information.
The feature group “soil information” consists of the following
variables: soil maps, soil type, pH value, cation exchange capacity, and
area of production. Whether or not soil maps were used and the
information content of the maps differ among the publications. In the
soil maps, general information about the nutrients in the soil, type of
the soil, and location can be found. Crop information refers to
information about the crop itself, such as weight, growth during the
growth process, variety of plants, and crop density. Other measurements
that indicate growth are also included in this group, for example, the
leaf area index. Humidity stands for the water in the field. The
features that fall under the humidity group include rainfall, humidity,
forecasted rainfall, and precipitation. Nutrients can be nutrients that
are already in the soil or nutrients that are applied. These features
measure the level of saturation. The measured nutrients are nitrogen,
magnesium, potassium, sulphur, zinc, boron, calcium, manganese, and
phosphorus. The field management group covers the decisions of farmers
to adjust their fields. These features are irrigation and fertilization,
and thus field management could also refer to the management of
nutrients. The solar information group contains features related to
radiation or temperature. These are gamma radiometric, temperature,
photoperiod, shortwave radiation, degree-days, and solar radiation. The
feature group labeled as ‘Other’ contains the features that cannot be
put in any of the groups mentioned above. Most of these features are
used only once or are calculated features (Measuring Vegetation (NDVI &
EVI), 2000). These features are used less and include features such as
wind speed, pressure, and images. The calculated features are MODIS
Enhanced Vegetation Index (MODIS-EVI), Normalized Difference Vegetation
Index (NDVI), and Enhanced Vegetation Index (EVI) (Filippi et al.,
2019a).

Table 2 (continued)
Retrieved from | Reference | Title | Algorithm used | Year
Springer Link | Shah et al. (2018) | Smart Farming System: Crop Yield Prediction Using Regression Techniques | Support vector machine, random forest, multivariate polynomial regression | 2018
Springer Link | Monga (2018) | Estimating Vineyard Grape Yield from Images | Neural networks | 2018
Google Scholar | Wang et al. (2018) | Deep transfer learning for crop yield prediction with remote sensing data | Neural networks | 2018
Science Direct | Xu et al. (2019) | Design of an integrated climatic assessment indicator (ICAI) for wheat production: A case study in Jiangsu Province, China | Random forest, support vector machine | 2019
Scopus | Filippi et al. (2019b) | An approach to forecast grain crop yield using multi-layered, multi-farm data sets and machine learning | Random forest | 2019
Google Scholar | Rao & Manasa (2019) | Artificial Neural Networks for Soil Quality and Crop Yield Prediction using Machine Learning | Neural networks | 2019
Springer Link | Ranjan & Parida (2019) | Paddy acreage mapping and yield prediction using sentinel-based optical and SAR data in Sahibganj district, Jharkhand (India) | Linear regression | 2019
Springer Link | Charoen-Ung & Mittrapiyanuruk (2019) | Sugarcane Yield Grade Prediction Using Random Forest with Forward Feature Selection and Hyper-parameter Tuning | Random forest | 2019

Fig. 5. Distribution of the type of 50 primary publications.
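The grouping of features described above can be expressed as a lookup table, which also shows how group usage counts could be derived. The Python sketch below is illustrative: the group membership follows the text of this section, but the three example studies and their feature lists are hypothetical, and the counts it produces are not the values reported in this review.

```python
from collections import Counter

# Group membership as described in the text; unlisted features fall under "other".
FEATURE_GROUPS = {
    "soil information": ["soil maps", "soil type", "ph value",
                         "cation exchange capacity", "area of production"],
    "crop information": ["weight", "growth", "variety of plants",
                         "crop density", "leaf area index"],
    "humidity": ["rainfall", "humidity", "forecasted rainfall", "precipitation"],
    "nutrients": ["nitrogen", "magnesium", "potassium", "sulphur", "zinc",
                  "boron", "calcium", "manganese", "phosphorus"],
    "field management": ["irrigation", "fertilization"],
    "solar information": ["gamma radiometric", "temperature", "photoperiod",
                          "shortwave radiation", "degree-days", "solar radiation"],
}
FEATURE_TO_GROUP = {feature: group
                    for group, features in FEATURE_GROUPS.items()
                    for feature in features}

def group_of(feature):
    """Map a feature name to its group, defaulting to 'other'."""
    return FEATURE_TO_GROUP.get(feature.lower(), "other")

def count_groups(studies):
    """Count in how many studies each feature group appears at least once."""
    counts = Counter()
    for features in studies:
        counts.update({group_of(f) for f in features})
    return counts

# Hypothetical feature lists for three studies (illustration only).
studies = [
    ["temperature", "rainfall", "soil type"],
    ["rainfall", "precipitation", "nitrogen"],
    ["NDVI", "temperature", "wind speed"],
]
group_counts = count_groups(studies)
```

Counting each group once per study, rather than once per feature, is a design choice of this sketch; the review does not state which convention was used.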
To represent all the features gathered through this SLR study, we
drew a feature map, depicted in Fig. 6, which shows the significant
features and sub-features.
To address the first research question (RQ1), machine learning
algorithms were investigated and summarized. The algorithms used more
than once are listed in Table 5. As shown in the table, Neural Networks
(NN) and Linear Regression are the two most used algorithms. Random
Forest (RF) and Support Vector Machines (SVM) are also widely used,
according to Table 5.
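The tally behind such a summary can be sketched in a few lines of Python. The example below uses only the first five rows of Table 2, so its counts are a partial illustration and not the totals reported in Table 5; the variable names are hypothetical.

```python
from collections import Counter

# First five rows of Table 2 (reference -> algorithms used).
table2_subset = {
    "Ruß et al. (2008)": ["neural networks"],
    "Everingham et al. (2009)": ["forward stagewise algorithm"],
    "Ruß & Kruse (2010)": ["clustering", "random forest", "support vector machine"],
    "Baral et al. (2011)": ["neural networks"],
    "Črtomir et al. (2012)": ["neural networks"],
}

# Count how many publications used each algorithm.
usage = Counter(algorithm
                for algorithms in table2_subset.values()
                for algorithm in algorithms)

# Keep only the algorithms used more than once.
used_more_than_once = {name: n for name, n in usage.items() if n > 1}
```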
To address research question three (RQ3), evaluation parameters
were identified. All the evaluation parameters that were used and the
number of times they were used are shown in Table 6. As the table
shows, Root Mean Square Error (RMSE) is the most used parameter in
the studies.
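For reference, RMSE is the square root of the mean of the squared differences between observed and predicted values. A minimal Python sketch follows; the yield values are hypothetical and serve only to show the calculation.

```python
import math

def rmse(observed, predicted):
    """Root Mean Square Error between two equally long sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical observed and predicted yields in t/ha.
observed = [3.0, 4.0, 5.0]
predicted = [2.0, 4.0, 7.0]
error = rmse(observed, predicted)  # squared errors 1, 0, 4 -> sqrt(5 / 3)
```

Because the errors are squared before averaging, a single large miss, such as the third prediction here, dominates the score, and the result is expressed in the same unit as the yield.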
Apart from the evaluation parameters, several validation approaches
were used as well. Most of the time, cross-validation was used.
The most used evaluation method was 10-fold cross-validation.
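In 10-fold cross-validation, the data are divided into ten parts, and each part serves once as the test set while the other nine are used for training. The Python sketch below illustrates the idea under simplifying assumptions: the folds are contiguous and unshuffled, the "model" is a baseline that predicts the mean of the training targets, and the yield values are hypothetical.

```python
import math

def k_fold_indices(n_samples, k=10):
    """Yield (train_idx, test_idx) pairs; each sample is tested exactly once."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples)
                     if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

def cross_validated_rmse(targets, k=10):
    """Average per-fold RMSE of a mean-of-training-targets baseline."""
    fold_scores = []
    for train_idx, test_idx in k_fold_indices(len(targets), k):
        prediction = sum(targets[i] for i in train_idx) / len(train_idx)
        mse = sum((targets[i] - prediction) ** 2 for i in test_idx) / len(test_idx)
        fold_scores.append(math.sqrt(mse))
    return sum(fold_scores) / len(fold_scores)

# Hypothetical yields in t/ha for 20 field-seasons.
yields = [2.1, 2.6, 3.0, 3.4, 3.9, 4.3, 4.8, 5.1, 3.3, 2.9,
          4.0, 4.4, 3.7, 2.4, 5.0, 3.1, 4.6, 2.8, 3.5, 4.2]
score = cross_validated_rmse(yields, k=10)
```

In practice the samples are usually shuffled before splitting; for yield time series, a split that respects time order may be more appropriate than random folds.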
To address research question four (RQ4), the publications were read
to see if they stated any problems or improvements for future models. In
several studies, insufficient availability of data (too little data) was
mentioned as a problem. The studies stated that their systems worked
for the limited data that they had at hand and indicated that data with
more variety should be used for further testing. This means data with
different climatic circumstances, different vegetation, and longer time
series of yield data. Another suggested improvement is that more data
sources should be integrated. Finally, the publications indicated that
the use of machine learning in farm management systems should be
explored. If the models work as intended, software applications must be
created that allow the farmer to make decisions based on the models.
5. Deep learning-based crop yield prediction
In the first part of our research (i.e., the Systematic Literature
Review), we observed that Artificial Neural Networks (ANN) is the most
used algorithm for crop yield prediction. Recently, deep learning,
which is a sub-branch of machine learning, has provided
state-of-the-art results in many different domains, such as face
recognition and image classification. Deep Neural Networks (DNN)
algorithms use concepts similar to those of ANN algorithms; however,
they can include different hidden layer types, such as convolutional
and pooling layers, and consist of many hidden layers instead of a
single hidden layer.
As such, in the second part of our research, we aimed to investigate
to what extent deep learning algorithms have been applied in crop yield
prediction. To broaden our analysis and reach recent applications of
deep learning algorithms in yield prediction, we designed a new search
criterion (i.e., “deep learning” AND “yield prediction”) and performed a
new search in the same electronic databases that were used during the
SLR study. This search yielded the 30 papers shown in Table 7. We
investigated these articles in detail and extracted and synthesized
the deep learning algorithms applied by the researchers.
Fig. 7 shows the yearly distribution of deep learning-based papers.
Although only half of 2020 had passed at the time of writing, the
number of papers from 2020 already equals the number of papers
published in 2019. This suggests that the number of papers is
increasing every year.
In Table 8, we show the distribution of deep learning-based papers
per database. Most of the papers were retrieved from Google Scholar,
and the second top database was Scopus. Science Direct and Springer
Link returned a similar number of deep learning-based papers.
In Table 9, we show the distribution of applied deep learning al-
gorithms in the identified papers list. The most applied deep learning
algorithm is Convolutional Neural Networks (CNN), and the other
widely used algorithms are Long-Short Term Memory (LSTM) and Deep
Neural Networks (DNN) algorithms. Since some papers applied more
than one deep learning algorithm, the total number of usages shown in
the second column is larger than the total number of papers.
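The percentages in Table 9 are relative to the total number of usages, not to the number of papers. The short sketch below recomputes them from the usage counts given in Table 9; the helper name `usage_percentages` is an assumption made for illustration.

```python
# Usage counts copied from Table 9 of this review.
usage_counts = {
    "CNN": 10,
    "LSTM": 7,
    "DNN": 7,
    "Hybrid": 4,
    "Autoencoder": 1,
    "Multi-Task Learning (MTL)": 1,
    "Deep Recurrent Q-Network (DQN)": 1,
    "3D CNN": 1,
    "Faster R-CNN": 1,
}
N_PAPERS = 30  # number of deep learning-based papers (Table 8)

def usage_percentages(counts):
    """Share of each algorithm in the total number of usages, in percent."""
    total = sum(counts.values())
    return {name: round(100.0 * n / total, 2) for name, n in counts.items()}

total_usages = sum(usage_counts.values())
percentages = usage_percentages(usage_counts)
print(total_usages, N_PAPERS, percentages["CNN"])
```

Because a paper that applies two algorithms contributes two usages, the total of 33 usages exceeds the 30 papers.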
These deep learning algorithms are briefly described as follows:
Deep Neural Networks (DNN): DNN algorithms are very similar to
traditional Artificial Neural Networks (ANN) algorithms except for
the number of hidden layers. In DNN networks, there are many
hidden layers that are mostly fully connected, as in the case of ANN
algorithms. However, other kinds of deep learning algorithms, such
as CNN, also use different types of layers, such as the
convolutional layer and the pooling layer.
Convolutional Neural Networks (CNN): Compared to a fully con-
nected network, CNN has fewer parameters to learn. There are three
types of layers in a CNN model, namely convolutional layers,
pooling layers, and fully-connected layers. Convolutional layers
consist of filters and feature maps. Filters are the neurons of the
layer, have weighted inputs, and create an output value (Brownlee,
2016). A feature map can be considered as the output of one filter.
Pooling layers are applied to down-sample the feature map of the
previous layers, generalize feature representations, and reduce the
Table 3
All features used.
Feature # of times used
Temperature 24
Soil type 17
Rainfall 17
Crop information 13
Soil maps 12
Humidity 11
pH-value 11
Solar radiation 10
Precipitation 9
Images 8
Area of production 8
Fertilization 7
NDVI 6
Cation exchange capacity 6
Nitrogen 6
Irrigation 5
Potassium 5
Wind speed 5
Zinc 3
Magnesium 3
Shortwave radiation 2
Sulphur 2
Boron 2
Calcium 2
Organic carbon 2
EVI 2
Phosphorus 2
Gamma radiometrics 1
MODIS-EVI 1
Forecasted rainfall 1
Photoperiod 1
Climate 1
Degree-days 1
Time 1
Pressure 1
Leaf area index 1
Manganese 1
Table 4
Grouped features.
Group # of times used
Soil information 54
Solar information 39
Humidity 38
Nutrients 28
Other 24
Crop information 14
Field management 12
overfitting (Brownlee, 2019). Fully-connected layers are mostly
used at the end of the network for predictions. The general pattern
for CNN models is that one or more convolutional layers are fol-
lowed by a pooling layer, and this structure is repeated several
times, and finally, fully connected layers are applied (Brownlee,
2016, 2019).
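The claim that a convolutional layer has fewer parameters than a fully connected layer, and the down-sampling role of pooling, can be illustrated with a small worked example. The layer sizes and function names below are hypothetical and chosen only for illustration.

```python
def dense_params(n_inputs, n_outputs):
    """Fully connected layer: one weight per input-output pair plus one bias per output."""
    return n_inputs * n_outputs + n_outputs

def conv2d_params(kernel_h, kernel_w, in_channels, n_filters):
    """Convolutional layer: each filter has kernel_h*kernel_w*in_channels weights plus one bias."""
    return (kernel_h * kernel_w * in_channels + 1) * n_filters

def max_pool_2x2(feature_map):
    """Down-sample a 2D feature map (even height and width) with 2x2 max pooling, stride 2."""
    pooled = []
    for r in range(0, len(feature_map), 2):
        row = []
        for c in range(0, len(feature_map[0]), 2):
            row.append(max(feature_map[r][c], feature_map[r][c + 1],
                           feature_map[r + 1][c], feature_map[r + 1][c + 1]))
        pooled.append(row)
    return pooled

# Hypothetical 64x64 RGB image patch processed by 32 units or 32 filters of size 3x3.
fc_count = dense_params(64 * 64 * 3, 32)    # 393,248 parameters
conv_count = conv2d_params(3, 3, 3, 32)     # 896 parameters

feature_map = [[1, 2, 3, 4],
               [5, 6, 7, 8],
               [9, 10, 11, 12],
               [13, 14, 15, 16]]
print(fc_count, conv_count, max_pool_2x2(feature_map))
```

The convolutional layer reuses the same small filter at every image position, which is why its parameter count does not grow with the image size.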
Long-Short Term Memory (LSTM): LSTM networks were designed
specifically for sequence prediction problems. There are several
LSTM architectures (Brownlee, 2017), namely vanilla LSTM, stacked
LSTM, CNN-LSTM, Encoder-Decoder LSTM, Bidirectional LSTM, and
Generative LSTM. There are several limitations of Multi-Layer
Fig. 6. Feature diagram.
Table 5
Most used machine learning algorithms.
Most used machine learning algorithms # of times used
Neural Networks 27
Linear Regression 14
Random Forest 12
Support Vector Machine 10
Gradient Boosting Tree 4
Table 6
All evaluation parameters used.
Key Evaluation parameter # of times used
RMSE Root mean square error 29
R² R-squared 19
MAE Mean absolute error 8
MSE Mean square error 5
MAPE Mean absolute percentage error 3
RSAE Reduced simple average ensemble 3
LCCC Lin’s concordance correlation coefficient 1
MFE Multi factored evaluation 1
SAE Simple average ensemble 1
rcv Reference change values 1
MCC Matthew’s correlation coefficient 1
Table 7
Deep learning-based publications.
Retrieved From Reference Title Deep Learning Algorithm(s) used Year
Science Direct Schwalbert et al. (2020) Satellite-based soybean yield forecast: Integrating machine learning and weather data for improving crop yield prediction in southern Brazil Long-Short Term Memory (LSTM) 2020
Science Direct Chu and Yu (2020) An end-to-end model for rice yield prediction using deep learning fusion The combination of Back-Propagation Neural Networks (BPNNs) and Independently Recurrent Neural Network (IndRNN) 2020
Science Direct Tedesco-Oliveira et al. (2020) Convolutional neural networks in predicting cotton yield from images of commercial fields Convolutional Neural Networks (CNN) 2020
Science Direct Nevavuori et al. (2019) Crop yield prediction with deep convolutional neural networks Convolutional Neural Networks (CNN) 2019
Science Direct Maimaitijiang et al. (2020) Soybean yield prediction from UAV using multimodal data fusion and deep learning Deep Neural Networks (DNN) 2020
Science Direct Yang et al. (2019) Deep convolutional neural networks for rice grain yield estimation at the ripening stage using UAV-based remotely sensed images Convolutional Neural Networks (CNN) 2019
Google Scholar Khaki and Wang (2019) Crop Yield Prediction Using Deep Neural Networks Deep Neural Networks (DNN) 2019
Google Scholar Rahnemoonfar and Sheppard (2017) Real-time yield estimation based on deep learning Convolutional Neural Networks (CNN) 2017
Google Scholar Chen et al. (2019) Strawberry Yield Prediction Based on a Deep Neural Network Using High-Resolution Aerial Orthoimages Faster Region-based Convolutional Neural Networks (Faster R-CNN) 2019
Google Scholar Sun et al. (2019) County-Level Soybean Yield Prediction Using Deep CNN-LSTM Model The combination of Convolutional Neural Networks and Long-Short Term Memory Networks (CNN-LSTM) 2019
Google Scholar Khaki et al. (2020) A CNN-RNN Framework for Crop Yield Prediction The combination of Convolutional Neural Networks and Recurrent Neural Networks (CNN-RNN) 2020
Google Scholar Terliksiz and Altılar (2019) Use Of Deep Neural Networks For Crop Yield Prediction: A Case Study Of Soybean Yield in Lauderdale County, Alabama, USA 3D Convolutional Neural Networks (3D CNN) 2019
Google Scholar Lee et al. (2019) A Self-Predictable Crop Yield Platform (SCYP) Based On Crop Diseases Using Deep Learning Convolutional Neural Networks (CNN) 2019
Google Scholar Elavarasan and Vincent (2020) Crop Yield Prediction Using Deep Reinforcement Learning Model for Sustainable Agrarian Applications Deep Recurrent Q-Network 2020
Google Scholar Wang et al. (2020) Winter Wheat Yield Prediction at County Level and Uncertainty Analysis in Main Wheat-Producing Regions of China with Deep Learning Approaches The combination of Convolutional Neural Networks and Long-Short Term Memory (CNN-LSTM) 2020
Google Scholar Wolanin et al. (2020) Estimating and understanding crop yields with explainable deep learning in the Indian Wheat Belt Convolutional Neural Networks (CNN) 2020
Springer Link Bhojani and Bhatt (2020) Wheat crop yield prediction using new activation functions in neural network Deep Neural Networks (DNN) 2020
Springer Link Fathi et al. (2019) Crop Yield Prediction Using Deep Learning in Mediterranean Region Deep Neural Networks (DNN) 2019
Springer Link Shidnal et al. (2019) Crop yield prediction: two-tiered machine learning model approach Convolutional Neural Networks (CNN) 2019
Springer Link Khaki and Wang (2019) Crop Yield Prediction Using Deep Neural Networks Deep Neural Networks (DNN) 2019
Springer Link Nguyen et al. (2019) Spatial-Temporal Multi-Task Learning for Within-Field Cotton Yield Prediction Spatial-Temporal Multi-Task Learning 2019
Springer Link De Alwis et al. (2019) Duo Attention with Deep Learning on Tomato Yield Prediction and Factor Interpretation Duo Attention Long-Short Term Memory 2019
Wiley Jiang et al. (2020) A deep learning approach to conflating heterogeneous geospatial data for corn yield estimation: A case study of the US Corn Belt at the county level Long-Short Term Memory (LSTM) 2020
Scopus Saravi et al. (2019) Quantitative model of irrigation effect on maize yield by deep neural network Deep Neural Networks (DNN) 2019
Scopus Zhang et al. (2020) Combining Optical, Fluorescence, Thermal Satellite, and Environmental Data to Predict County-Level Maize Yield in China Using Machine Learning Approaches Long-Short Term Memory (LSTM) 2020
Scopus Kang et al. (2020) Comparative assessment of environmental variables and machine learning algorithms for maize yield prediction in the US Midwest Long-Short Term Memory (LSTM) and Convolutional Neural Networks (CNN) 2020
Scopus Wang et al. (2020) Combining Multi-Source Data and Machine Learning Approaches to Predict Winter Wheat Yield in the Conterminous United States Deep Neural Networks (DNN) 2020
Scopus Ju et al. (2020) Machine learning approaches for crop yield prediction with MODIS and weather data Long-Short Term Memory (LSTM), Convolutional Neural Networks (CNN), Stacked-Sparse AutoEncoder (SSAE) 2020
Scopus Yalcin (2019) An Approximation for A Relative Crop Yield Estimate from Field Images Using Deep Learning Convolutional Neural Networks (CNN) 2019
Scopus Wang et al. (2018) Deep Transfer Learning for Crop Yield Prediction with Remote Sensing Data Long-Short Term Memory (LSTM) for Transfer Learning 2018
Perceptron (MLP) feedforward ANN algorithms, such as being
stateless, unaware of temporal structure, messy scaling, fixed-sized
inputs, and fixed-sized outputs (Brownlee, 2017). Compared to the
MLP network, LSTM can be considered as the addition of loops to
the network. LSTM is also a special type of Recurrent Neural
Network (RNN) algorithm. Because an LSTM has an internal state, is
aware of the temporal structure in the inputs, can model parallel
input series, and can process variable-length input to generate
variable-length output, it is very different from an MLP network. The
memory cell is the computational unit of the LSTM (Brownlee, 2017).
These cells consist of weights (i.e., input weights, output weights,
and internal state) and gates (i.e., forget gate, input gate, and
output gate).
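The gate structure described above can be sketched for a single scalar memory cell. The equations follow the commonly used LSTM formulation rather than anything specified in the reviewed papers, and the parameter names (`wf`, `uf`, `bf`, and so on) are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One time step of a scalar LSTM cell.

    p holds input weights (w*), recurrent weights (u*) and biases (b*)
    for the forget (f), input (i), output (o) gates and the candidate (g).
    """
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate value
    c = f * c_prev + i * g   # internal state: keep part of the old state, add new information
    h = o * math.tanh(c)     # output exposed to the next time step
    return h, c

def run_sequence(xs, p):
    """Process a sequence of any length, carrying the state between steps."""
    h, c = 0.0, 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, p)
    return h

# Hypothetical parameters; in a trained network these would be learned.
params = {"wf": 0.5, "uf": 0.1, "bf": 0.0,
          "wi": 0.4, "ui": 0.2, "bi": 0.0,
          "wo": 0.3, "uo": 0.1, "bo": 0.0,
          "wg": 0.6, "ug": 0.2, "bg": 0.0}
print(run_sequence([0.2, 0.5], params), run_sequence([0.2, 0.5, 0.1, 0.9, 0.3], params))
```

The same `params` are reused at every time step, which is what lets the cell handle inputs of different lengths.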
3D CNN: This network is a special type of CNN model in which the
kernels move through height, length, and depth. As such, it produces
3D activation maps. This type of model was developed to improve
the identification of moving objects, as in the case of security
cameras and medical scans. 3D convolutions are performed in the
convolutional layers of the CNN (Ji et al., 2012).
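A naive 3D convolution shows how a kernel that moves through all three dimensions yields a 3D activation map. This is a minimal sketch assuming stride 1 and no padding; the input sizes are hypothetical, and real implementations use optimized library routines instead of nested loops.

```python
def conv3d_valid(volume, kernel):
    """Slide a 3D kernel over a 3D volume (stride 1, no padding).

    As in most deep learning libraries, this is a cross-correlation:
    the kernel is not flipped.
    """
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    kd, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for d in range(depth - kd + 1):
        plane = []
        for r in range(height - kh + 1):
            row = []
            for c in range(width - kw + 1):
                total = 0.0
                for a in range(kd):
                    for b in range(kh):
                        for e in range(kw):
                            total += volume[d + a][r + b][c + e] * kernel[a][b][e]
                row.append(total)
            plane.append(row)
        out.append(plane)
    return out

# Hypothetical input: 4 time steps of a 5x5 image band, all ones.
volume = [[[1.0] * 5 for _ in range(5)] for _ in range(4)]
kernel = [[[1.0] * 3 for _ in range(3)] for _ in range(2)]  # 2x3x3 kernel
activation = conv3d_valid(volume, kernel)
print(len(activation), len(activation[0]), len(activation[0][0]))
```

For yield prediction, the depth axis could hold a time series of satellite images, so that one kernel captures spatial and temporal patterns together.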
Faster R-CNN: The Region-Based Convolutional Neural Network (R-
CNN) is a family of CNN models that were designed specifically for
object detection (Brownlee, 2019). There are four variations of R-
CNN, namely R-CNN, Fast R-CNN, Faster R-CNN, and Mask R-CNN.
In Faster R-CNN, a Region Proposal Network is added to interpret
features extracted from CNN (Ren et al., 2015).
Autoencoder: Autoencoders are unsupervised learning approaches
that consist of the following four main parts: encoder, bottleneck,
decoder, and reconstruction loss. The architecture of autoencoders
can be designed based on simple feedforward neural networks, CNN,
or LSTM networks (Baldi, 2012; Vincent et al., 2008).
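The four parts named above (encoder, bottleneck, decoder, and reconstruction loss) can be shown with a tiny linear autoencoder. The weights here are set by hand purely for illustration; in practice they would be learned by minimizing the reconstruction loss, and all names are assumptions.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]

def encode(x, w_enc):
    """Encoder: compress the input into the bottleneck representation."""
    return matvec(w_enc, x)

def decode(z, w_dec):
    """Decoder: reconstruct the input from the bottleneck representation."""
    return matvec(w_dec, z)

def reconstruction_loss(x, x_hat):
    """Mean squared error between the input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)

# Hypothetical hand-set weights: 4 inputs -> 2-unit bottleneck -> 4 outputs.
w_enc = [[0.5, 0.5, 0.0, 0.0],
         [0.0, 0.0, 0.5, 0.5]]
w_dec = [[1.0, 0.0],
         [1.0, 0.0],
         [0.0, 1.0],
         [0.0, 1.0]]

x = [2.0, 2.0, 6.0, 6.0]
z = encode(x, w_enc)        # bottleneck: pairwise averages
x_hat = decode(z, w_dec)
print(z, x_hat, reconstruction_loss(x, x_hat))
```

An input whose paired values are equal is reconstructed exactly, while an input such as `[1.0, 3.0, 5.0, 7.0]` loses information in the two-unit bottleneck and has a nonzero loss.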
Hybrid networks: It is possible to combine the power of different deep
learning algorithms. As such, researchers combine different
algorithms in different ways. Chu and Yu (2020) combined
Back-Propagation Neural Networks (BPNNs) and Independently Recurrent
Neural Network (IndRNN) and applied this model for crop yield
prediction. Sun et al. (2019) combined Convolutional Neural Net-
works and Long-Short Term Memory Networks (CNN-LSTM) for
soybean yield prediction. Khaki et al. (2020) combined Convolu-
tional Neural Networks and Recurrent Neural Networks (CNN-RNN)
for yield prediction. Wang et al. (2020) combined CNN and LSTM
(CNN-LSTM) networks for the wheat yield prediction problem.
Multi-Task Learning (MTL): In multi-task learning, we share re-
presentations between tasks to improve the performance of our
models developed for these tasks (Ruder, 2017). It has been applied
in many different domains, such as drug discovery, speech re-
cognition, and natural language processing. The aim is to improve
the performance of all the tasks involved instead of improving the
performance of a single task. Zhang and Yang (2017) reviewed
several multi-task learning approaches for supervised learning tasks
and also explained how to combine multi-task learning with other
learning categories, such as semi-supervised learning and re-
inforcement learning. They divided supervised MTL approaches into
the following categories: feature learning approach, low-rank ap-
proach, task clustering approach, task relation learning approach,
and decomposition approach.
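The idea of sharing representations between tasks can be sketched with one shared layer feeding two task-specific heads. This corresponds to only one simple form of multi-task learning (often called hard parameter sharing); the task names, weights, and function names are hypothetical.

```python
def relu(values):
    return [max(0.0, v) for v in values]

def shared_layer(x, w_shared):
    """Representation computed once and reused by every task."""
    return relu([sum(w * a for w, a in zip(row, x)) for row in w_shared])

def head(h, w_head):
    """Task-specific linear output on top of the shared representation."""
    return sum(w * a for w, a in zip(w_head, h))

def joint_loss(x, targets, w_shared, heads):
    """Sum of squared errors over all tasks; training would minimize this jointly."""
    h = shared_layer(x, w_shared)
    return sum((head(h, heads[task]) - targets[task]) ** 2 for task in targets)

# Hypothetical weights and two hypothetical tasks.
w_shared = [[1.0, 0.0],
            [0.0, 1.0],
            [-1.0, 0.0]]
heads = {"yield": [1.0, 1.0, 1.0], "quality": [2.0, 0.0, 5.0]}

x = [1.0, 2.0]
targets = {"yield": 3.5, "quality": 2.0}
print(joint_loss(x, targets, w_shared, heads))
```

Because the gradient of the joint loss flows into `w_shared` from every head, each task influences the representation used by the others.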
Deep Recurrent Q-Network (DQN): In reinforcement learning, agents
observe the environment and act based on some rules and the
available data. Agents receive rewards based on their actions (i.e.,
positive or negative rewards) and try to maximize this reward. The
environment and the agents interact with each other continuously.
The DQN algorithm was developed in 2015 by researchers at DeepMind,
which Google acquired in 2014. This algorithm, which combines the
power of reinforcement learning and deep neural networks, solved
several Atari games in 2015. The classical Q-learning algorithm was
enhanced with deep neural networks, and the experience replay
technique was integrated (Mnih et al., 2015). Elavarasan and
Vincent (2020) applied this algorithm for crop yield prediction.
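The two ingredients mentioned above, the Q-learning update and experience replay, can be sketched in a simplified tabular form. This is a deliberate simplification: a DQN replaces the table with a deep neural network, terminal states are not handled here, and the state and action names are hypothetical.

```python
import random
from collections import deque

def q_update(q, state, action, reward, next_state, actions, alpha=0.5, gamma=0.9):
    """Classical Q-learning update:
    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]

def replay_train(q, buffer, actions, batch_size, rng):
    """Experience replay: learn from a random sample of stored transitions."""
    batch = rng.sample(list(buffer), min(batch_size, len(buffer)))
    for state, action, reward, next_state in batch:
        q_update(q, state, action, reward, next_state, actions)

actions = ["a", "b"]
q_table = {}
replay_buffer = deque(maxlen=1000)            # oldest transitions are dropped first
replay_buffer.append(("s0", "a", 1.0, "s1"))  # (state, action, reward, next state)
replay_train(q_table, replay_buffer, actions, batch_size=4, rng=random.Random(0))
print(q_table)
```

Sampling past transitions at random, instead of learning only from the most recent one, reduces the correlation between consecutive updates.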
The number of papers that apply deep learning for crop yield pre-
diction is increasing. As such, we expect to see more research in this
direction.
6. Discussion
General discussion: Such research is susceptible to threats to
validity, which can concern external validity, construct validity,
and reliability (Šmite et al., 2010). External validity and construct
validity are addressed in this SLR study since the initial search
string was broad and the query returned a substantial number of
studies: 567 publications in total. The search string covered the
whole scope of the SLR. Reliability can be considered well addressed
since the process of the SLR has been described clearly and is
replicable. If this SLR is replicated, it could return slightly
different selected publications, but the differences would be a
result of different personal judgments. However, it is highly
unlikely that the overall findings would change.
Search-related discussion: There is a possibility that valuable
publications might have been missed. More synonyms could have
been used, and a broader search could have returned new studies.
However, the search string resulted in a high number of publications
Fig. 7. Yearly distribution of deep learning-based papers.
Table 8
Distribution of deep learning-based papers per database.
Database # of papers Percentage of papers (%)
Science Direct 6 20
Scopus 7 23.33
Web of Science 0 0
Springer Link 6 20
Wiley 1 3.33
Google Scholar 10 33.33
Total 30 100
Table 9
Distribution of deep learning algorithms.
Algorithms used # of usages Percentage (%)
CNN 10 30.30
LSTM 7 21.21
DNN 7 21.21
Hybrid 4 12.12
Autoencoder 1 3.03
Multi-Task Learning (MTL) 1 3.03
Deep Recurrent Q-Network (DQN) 1 3.03
3D CNN 1 3.03
Faster R-CNN 1 3.03
Total 33 100
indicating a broad enough search.
Analysis-related discussion: Another issue that could be a threat
to validity is the way the analysis was conducted. For example, not
all publications stated what kind of evaluation parameters were used,
and sometimes just a few examples of features were explained. Thus,
the information required to address the research questions could
sometimes not be found in the paper. As a result, the data used to
answer the research questions were derived from fewer than the 50
selected publications. To get more information about the publications,
the authors could potentially have been contacted, but this line of
action was not feasible within the context of this research and might
not have solved all the issues.
RQ1-related (algorithms) discussion: Linear Regression is the
second most used algorithm, according to Table 5. Linear Regression
is used as a benchmarking algorithm in most cases to check
whether the proposed algorithm is better than Linear Regression or
not. Therefore, although it appears in many articles, this does not
mean that it is the best-performing algorithm. Table 5 should be
interpreted carefully because “most used” does not mean
best-performing. In fact, Deep Learning (DL), which is a sub-branch
of Machine Learning, has been used for the crop yield prediction
problem recently and is believed to be very promising. In this study,
we also identified several deep learning-based studies. There are
several additional promising aspects of DL methods, such as auto-
matic feature extraction and superior performance. We expect that
more research will be conducted on the use of DL approaches in crop
yield prediction in the near future due to the superior performance
of DL algorithms in other problem domains.
Among the selected publications, both classifiers and clustering
algorithms are used. Since images are used for clustering in those
publications, they relate to machine vision rather than to ML on
numerical datasets. The use of clustering algorithms for this problem
can be investigated in more detail to find different research
perspectives.
RQ2-related (features) discussion: Groups were created for features
and algorithms to visualize the main features and algorithms. Due to
this decision, detailed information is lost, but clarity is
maintained. The most used features are soil type, rainfall, and
temperature. Apart from those features that are used in several
studies, there are also features that were used in specific studies.
Those features are gamma radiometrics, MODIS-EVI, forecasted rainfall,
humidity, photoperiod, pH-value, irrigation, leaf area index, NDVI,
EVI, and crop information. There are also studies that use different
nutrients as features, namely magnesium, potassium, sulphur, zinc,
nitrogen, boron, and calcium. The most used features are not always
the same kind of data. Temperature, for example, is measured as
average temperature, but related features such as maximum
temperature and minimum temperature are also used.
RQ3-related (evaluation parameters and approaches) discussion:
There are not many evaluation parameters reported in the selected
papers. RMSE was the most frequently used measure of model quality,
with 29 uses according to Table 6. Other evaluation parameters are
MSE, R², and MAE. Some parameters were used only in specific studies;
most of these resemble the previously mentioned parameters, with small
differences. These are MAPE, LCCC, MFE, SAE, rcv, RSAE, and MCC. Most
of the models reported high accuracy according to their evaluation
parameters, which suggests that the models made correct predictions.
As the evaluation approach, 10-fold cross-validation was preferred by
researchers.
RQ4-related (challenges) discussion: Challenges were reported
based on the explicit statements in the articles. However, there
might be additional challenges that were not stated in the identified
papers. The challenges mainly concern the improvement of a
working model. When more data is gathered for training and testing,
much more can be said about the precision of the model. Another
challenge is the implementation of the models in farm management
systems. Only when applications are built that farmers can use will
the models be useful for making decisions, including during the
growing season. When specific parameters for a specific location
are measured and added, predictions will have higher precision.
7. Conclusion
This study showed that the selected publications use a variety of
features, depending on the scope of the research and the availability of
data. Every paper investigates yield prediction with machine learning,
but the papers differ in the features used. The studies also differ in
scale, geographical location, and crop. The choice of features depends
on the availability of the dataset and the aim of the research. Studies
also stated that models with more features did not always provide the
best performance for yield prediction. To find the best-performing
model, models with more and fewer features should be tested. Many
algorithms have been used in different studies. The results show that
no specific conclusion can be drawn as to what the best model is, but
they clearly show that some machine learning models are used more than
others. The most used models are random forest, neural networks,
linear regression, and gradient boosting tree. Most of the studies used
a variety of machine learning models to test which model had the best
prediction.
Since Neural Networks is the most applied algorithm, we also aimed
to investigate to what extent deep learning algorithms were used for
crop yield prediction. After the identification of 30 papers that applied
deep learning, we extracted and synthesized the applied algorithms. We
observed that CNN, LSTM, and DNN algorithms are the most preferred
deep learning algorithms. However, other kinds of algorithms have
also been applied to this problem. We believe that this article will
pave the way for further research on crop yield prediction.
In our future work, we aim to build on the outcomes of this study
and focus on the development of a DL-based crop yield prediction
model.
Declaration of Competing Interest
The authors declare that they have no known competing financial
interests or personal relationships that could have appeared to influ-
ence the work reported in this paper.
Appendix A
In Table A1, the features used per publication are shown. A ‘1’ in a box means that the specific feature was used.
In Table A2, the evaluation parameters used per publication are presented.
Table A1
Features used per selected publication.
Paper, Soil type, Gamma radiometrics, Soil maps, MODIS-EVI, Rainfall, Forecasted rainfall, Precipitation, Temperature, Humidity, Photoperiod, Fertilization, Climate, pH-value, Irrigation, Cation exchange capacity, Magnesium, Potassium, Area of production, Wind speed
Filippi et al., 2019 1 1 1 1 1 1
Jeong et al., 2016 1 1 1 1 1 1 1
Zhong et al., 2018 1 1 1 1
Villanueva and Salenga, 2018
Crane-Droesch, 2018 1 1 1 1 1 1 1
Gonzalez-Sanchez et al., 2014 1 1 1 1 1
Xu et al., 2019 1 1 1
Pantazi et al., 2016 1 1 1
Kouadio et al., 2018 1 1 1 1 1
Kunapuli et al., 2015 1 1
Shekoofa et al., 2014 1 1
Pantazi et al., 2014 1 1 1 1
Goldstein et al., 2018 1 1 1
Mola-Yudego et al., 2016 1 1
Girish et al., 2018 1
Rao and Manasa, 2019 1 1 1 1 1 1
Khanal et al., 2018 1 1 1 1 1 1
Cheng et al., 2017
Everingham et al., 2009 1 1
Everingham et al., 2016 1 1
Bargoti and Underwood, 2017
Fernandes and Ebecken, 2017 1
Johnson et al., 2013
Matsumura et al., 2015 1 1 1
Taherei Ghazvinei, 2018 1 1 1 1 1
Romero et al., 2013
Su et al., 2017 1 1 1 1 1 1 1 1
You et al., 2017
Ahmad et al., 2018 1 1
Črtomir et al., 2012
Osman et al., 2017 1 1 1 1 1
Ranjan and Parida, 2019 1 1
Shah et al., 2018 1 1 1
Russ et al., 2008 1 1
Monga, 2018
Russ and Kruse, 2010 1 1 1
Baral et al., 2011 1 1 1
Ahamed et al., 2015 1 1 1 1 1 1
Ali et al., 2017 1 1
Cakir et al., 2014 1 1 1 1
Gandhi et al., 2016 1 1 1
Wang et al., 2018
Charoen-Ung and Mittrapiyanuruk, 2019 1 1 1 1
Ananthara et al., 2013 1 1 1 1
Bose et al., 2016 1
Gandhi et al., 2016 1 1 1
Gandhi and Armstrong, 2016 1 1 1 1
Paul et al., 2015 1 1
Rahman and Haq, 2014 1 1 1
Sujatha and Isakki, 2016 1 1
Table A1 (continued)
Paper, Shortwave radiation, Degree-days, Time, Solar radiation, Pressure, Sulphur, Zinc, Nitrogen, Boron, Calcium, Crop information, Leaf area index, Phosphorus, Manganese, Organic carbon, Images, NDVI, EVI
Filippi et al., 2019
Jeong et al., 2016
Zhong et al., 2018 1
Villanueva and Salenga, 2018
Crane-Droesch, 2018 1 1 1
Gonzalez-Sanchez et al., 2014 1
Xu et al., 2019 1
Pantazi et al., 2016
Kouadio et al., 2018 1 1 1 1 1
Kunapuli et al., 2015 1 1
Shekoofa et al., 2014 1
Pantazi et al., 2014 1 1 1
Goldstein et al., 2018 1 1
Mola-Yudego et al., 2016
Girish et al., 2018
Rao and Manasa, 2019 1 1 1 1 1 1
Khanal et al., 2018
Cheng et al., 2017 1
Everingham et al., 2009 1
Everingham et al., 2016 1 1
Bargoti and Underwood, 2017 1
Fernandes and Ebecken, 2017 1
Johnson et al., 2013 1 1
Matsumura et al., 2015
Taherei Ghazvinei, 2018 1
Romero et al., 2013 1
Su et al., 2017 1
You et al., 2017 1
Ahmad et al., 2018 1 1 1 1 1
Črtomir et al., 2012 1
Osman et al., 2017 1
Ranjan and Parida, 2019 1
Shah et al., 2018
Russ et al., 2008 1
Monga, 2018 1
Russ and Kruse, 2010
Baral et al., 2011
Ahamed et al., 2015 1
Ali et al., 2017 1 1 1
Cakir et al., 2014 1
Gandhi et al., 2016 1
Wang et al., 2018 1 1
Charoen-Ung and Mittrapiyanuruk, 2019
Ananthara et al., 2013 1 1
Bose et al., 2016 1 1
Gandhi et al., 2016 1
Gandhi and Armstrong, 2016
Paul et al., 2015 1 1 1
Rahman and Haq, 2014 1
Sujatha and Isakki, 2016 1
Table A2
Evaluation parameters used per publication.
Paper, Root mean square error, Lin’s concordance correlation coefficient, Mean square error, R-squared, Mean absolute error, Mean absolute percentage error, Multi factored evaluation, Simple average ensemble, Reference change values, Reduced simple average ensemble, Matthew’s correlation coefficient
Filippi et al., 2019 1 1 1
Jeong et al., 2016 1
Zhong et al., 2018 1 1
Villanueva and Salenga, 2018
Crane-Droesch, 2018 1
Gonzalez-Sanchez et al., 2014 1 1 1
Xu et al., 2019 1 1
Pantazi et al., 2016 1
Kouadio et al., 2018 1 1
Kunapuli et al., 2015 1
Shekoofa et al., 2014
Pantazi et al., 2014
Goldstein et al., 2018 1
Mola-Yudego et al., 2016 1 1
Girish et al., 2018
Rao and Manasa, 2019
Khanal et al., 2018 1 1
Cheng et al., 2017 1 1 1 1
Everingham et al., 2009 1 1 1
Everingham et al., 2016 1
Bargoti and Underwood, 2017 1
Fernandes and Ebecken, 2017 1 1
Johnson et al., 2013
Matsumura et al., 2015 1 1
Taherei Ghazvinei, 2018 1 1
Romero et al., 2013
Su et al., 2017 1
You et al., 2017 1
Ahmad et al., 2018 1 1 1 1
Črtomir et al., 2012 1
Osman et al., 2017 1
Ranjan and Parida, 2019
Shah et al., 2018 1 1 1
Russ et al., 2008 1 1
Monga, 2018 1 1
Russ and Kruse, 2010 1
Baral et al., 2011
Ahamed et al., 2015 1
Ali et al., 2017 1 1
Cakir et al., 2014 1
Gandhi et al., 2016 1 1 1 1
Wang et al., 2018 1 1
Charoen-Ung and Mittrapiyanuruk, 2019
Ananthara et al., 2013
Bose et al., 2016 1 1 1
Gandhi et al., 2016 1 1 1
Gandhi and Armstrong, 2016 1 1 1
Paul et al., 2015
Rahman and Haq, 2014 1
Sujatha and Isakki, 2016
Appendix B. Supplementary material
Supplementary data to this article can be found online at https://doi.org/10.1016/j.compag.2020.105709.
References
Ahamed, A.T.M.S., Mahmood, N.T., Hossain, N., Kabir, M.T., Das, K., Rahman, F.,
Rahman, R.M., 2015. Applying data mining techniques to predict annual yield of
major crops and recommend planting different crops in different districts in
Bangladesh. In: 2015 IEEE/ACIS 16th International Conference on Software
Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing,
SNPD 2015 - Proceedings, https://doi.org/10.1109/SNPD.2015.7176185.
Ahmad, I., Saeed, U., Fahad, M., Ullah, A., Habib-ur-Rahman, M., Ahmad, A., Judge, J.,
2018. Yield forecasting of spring maize using remote sensing and crop modeling in
Faisalabad-Punjab Pakistan. J. Indian Soc. Remote Sens. 46 (10), 1701–1711.
https://doi.org/10.1007/s12524-018-0825-8.
Ali, I., Cawkwell, F., Dwyer, E., Green, S., 2017. Modeling managed grassland biomass
estimation by using multitemporal remote sensing data—a machine learning approach.
IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 10 (7), 3254–3264.
https://doi.org/10.1109/JSTARS.2016.2561618.
Alpaydin, E., 2010. Introduction to Machine Learning, 2nd ed. Retrieved from
https://books.google.nl/books?hl=nl&lr=&id=TtrxCwAAQBAJ&oi=fnd&pg=PR7&dq=introduction+to+machine+learning&ots=T5ejQG_7pZ&sig=0xC_H0agN7mPhYW7oQsWiMVwRnQ#v=onepage&q=introduction to machine learning&f=false.
Ananthara, M.G., Arunkumar, T., Hemavathy, R., 2013. CRY-An improved crop yield
prediction model using bee hive clustering approach for agricultural data sets. In:
Proceedings of the 2013 International Conference on Pattern Recognition,
Informatics and Mobile Engineering, PRIME 2013, 473–478.
https://doi.org/10.1109/ICPRIME.2013.6496717.
Ayodele, T.O., 2010. Introduction to Machine Learning.
Baldi, P., 2012. Autoencoders, unsupervised learning, and deep architectures. In:
Proceedings of ICML workshop on unsupervised and transfer learning, pp. 37–49.
Baral, S., Kumar Tripathy, A., Bijayasingh, P., 2011. Yield Prediction Using Artificial
Neural Networks, pp. 315–317. https://doi.org/10.1007/978-3-642-19542-6_57.
Bargoti, S., Underwood, J.P., 2017. Image segmentation for fruit detection and yield
estimation in apple orchards. J. Field Rob. 34 (6), 1039–1060.
https://doi.org/10.1002/rob.21699.
Beulah, R., 2019. A survey on different data mining techniques for crop yield prediction.
Int. J. Comput. Sci. Eng. 7 (1), 738–744. https://doi.org/10.26438/ijcse/v7i1.738744.
Bhojani, S.H., Bhatt, N., 2020. Wheat crop yield prediction using new activation functions
in neural network. Neural Comput. Appl. 1–11.
Bose, P., Kasabov, N., Bruzzone, L., n.d. Spiking neural networks for crop yield estimation
based on spatiotemporal analysis of image time series. Ieeexplore.Ieee.Org. Retrieved
from https://ieeexplore.ieee.org/abstract/document/7524771/.
Brownlee, J., 2016. Deep learning with Python: develop deep learning models on Theano
and TensorFlow using Keras. Machine Learning Mastery.
Brownlee, J., 2017. Long Short-term Memory Networks with Python: Develop Sequence
Prediction Models with Deep Learning. Machine Learning Mastery.
Brownlee, J., 2019. Deep Learning for Computer Vision: Image Classification, Object
Detection, and Face Recognition in Python. Machine Learning Mastery.
Cakir, Y., Kirci, M., Gunes, E.O., 2014. Yield prediction of wheat in south-east region of
Turkey by using artificial neural networks. In: 2014 The 3rd International Conference
on Agro-Geoinformatics, Agro-Geoinformatics 2014.
https://doi.org/10.1109/Agro-Geoinformatics.2014.6910609.
Charoen-Ung, P., Mittrapiyanuruk, P., 2019. Sugarcane yield grade prediction using
random forest with forward feature selection and hyper-parameter tuning, pp. 33–42.
https://doi.org/10.1007/978-3-319-93692-5_4.
Chen, Y., Lee, W.S., Gan, H., Peres, N., Fraisse, C., Zhang, Y., He, Y., 2019. Strawberry
yield prediction based on a deep neural network using high-resolution aerial
orthoimages. Remote Sens. 11 (13), 1584.
Cheng, H., Damerow, L., Sun, Y., Blanke, M., 2017. Early yield prediction using image
analysis of apple fruit and tree canopy features with neural networks. J. Imag. 3 (1),
6. https://doi.org/10.3390/jimaging3010006.
Chlingaryan, A., Sukkarieh, S., Whelan, B., 2018. Machine learning approaches for crop
yield prediction and nitrogen status estimation in precision agriculture: a review.
Comput. Electron. Agric. 151, 61–69. https://doi.org/10.1016/j.compag.2018.05.012.
Chu, Z., Yu, J., 2020. An end-to-end model for rice yield prediction using deep learning
fusion. Comput. Electron. Agric. 174.
Crane-Droesch, A., 2018. Machine learning methods for crop yield prediction and climate
change impact assessment in agriculture. Environ. Res. Lett. 13 (11), 114003.
https://doi.org/10.1088/1748-9326/aae159.
Črtomir, R., Urška, C., Stanislav, T., Denis, S., Karmen, P., Pavlovič, M., Marjan, V., 2012.
Application of neural networks and image visualization for early forecast of apple
yield. Erwerbs-Obstbau 54 (2), 69–76. https://doi.org/10.1007/s10341-012-0162-y.
De Alwis, S., Zhang, Y., Na, M., Li, G., 2019. Duo attention with deep learning on tomato
yield prediction and factor interpretation. In: Pacific Rim International Conference on
Artificial Intelligence. Springer, Cham, pp. 704–715.
Elavarasan, D., Vincent, P.D., 2020. Crop yield prediction using deep reinforcement
learning model for sustainable agrarian applications. IEEE Access 8, 86886–86901.
Elavarasan, D., Vincent, D.R., Sharma, V., Zomaya, A.Y., Srinivasan, K., 2018. Forecasting
yield by integrating agrarian factors and machine learning models: a survey. Comput.
Electron. Agric. 155, 257–282. https://doi.org/10.1016/j.compag.2018.10.024.
Everingham, Y., Sexton, J., Skocaj, D., Inman-Bamber, G., 2016. Accurate prediction of
sugarcane yield using a random forest algorithm. Agron. Sustainable Dev. 36 (2).
https://doi.org/10.1007/s13593-016-0364-z.
Everingham, Y.L., Smyth, C.W., Inman-Bamber, N.G., 2009. Ensemble data mining
approaches to forecast regional sugarcane crop production. Agric. For. Meteorol. 149
(3–4), 689–696. https://doi.org/10.1016/J.AGRFORMET.2008.10.018.
Fathi, M.T., Ezziyyani, M., Ezziyyani, M., El Mamoune, S., 2019. Crop yield prediction using
deep learning in Mediterranean Region. In: International Conference on Advanced
Intelligent Systems for Sustainable Development. Springer, Cham, pp. 106–114.
Fernandes, J.L., Ebecken, N.F.F., Esquerdo, J.C.D.M., 2017. Sugarcane yield prediction in
Brazil using NDVI time series and neural networks ensemble. Int. J. Remote Sens. 38
(16), 4631–4644. https://doi.org/10.1080/01431161.2017.1325531.
Filippi, P., Jones, E.J., Wimalathunge, N.S., Somarathna, P.D.S.N., Pozza, L.E., Ugbaje,
S.U., Bishop, T.F.A., 2019a. An approach to forecast grain crop yield using
multi-layered, multi-farm data sets and machine learning. Precis. Agric. 1–15.
https://doi.org/10.1007/s11119-018-09628-4.
Filippi, P., Jones, E.J., Wimalathunge, N.S., Somarathna, P.D.S.N., Pozza, L.E., Ugbaje,
S.U., Bishop, T.F.A., 2019b. An approach to forecast grain crop yield using
multi-layered, multi-farm data sets and machine learning. Precis. Agric.
https://doi.org/10.1007/s11119-018-09628-4.
Gandhi, N., Armstrong, L., 2016. Applying data mining techniques to predict yield of rice
in humid subtropical climatic zone of India. In: Proceedings of the 10th INDIACom;
2016 3rd International Conference on Computing for Sustainable Global
Development, INDIACom 2016, 1901–1906. Retrieved from
https://ieeexplore.ieee.org/abstract/document/7724597/.
Gandhi, N., Armstrong, L.J., 2016b. A review of the application of data mining techniques
for decision making in agriculture. In: Proceedings of the 2016 2nd International
Conference on Contemporary Computing and Informatics,
https://doi.org/10.1109/IC3I.2016.7917925.
Gandhi, N., Petkar, O., Armstrong, L.J., Tripathy, A.K., 2016. Rice crop yield prediction in
India using support vector machines. In: 2016 13th International Joint Conference on
Computer Science and Software Engineering, JCSSE 2016.
https://doi.org/10.1109/JCSSE.2016.7748856.
Girish, L., Gangadhar, S., Bharath, T., Balaji, K., n.d. Crop Yield and Rainfall Prediction in
Tumakuru District using Machine Learning. Ijream.Org. Retrieved from
https://www.ijream.org/papers/NCTFRD2018015.pdf.
Goldstein, A., Fink, L., Meitin, A., Bohadana, S., Lutenberg, O., Ravid, G., 2018. Applying
machine learning on sensor data for irrigation recommendations: revealing the
agronomist’s tacit knowledge. Precis. Agric. 19 (3), 421–444.
https://doi.org/10.1007/s11119-017-9527-4.
Gonzalez-Sanchez, A., Frausto-Solis, J., Ojeda-Bustamante, W., 2014. Predictive ability of
machine learning methods for massive crop yield prediction. Spanish J. Agric. Res. 12
(2), 313–328. https://doi.org/10.5424/sjar/2014122-4439.
Jeong, J.H., Resop, J.P., Mueller, N.D., Fleisher, D.H., Yun, K., Butler, E.E., Kim, S.H.,
2016. Random forests for global and regional crop yield predictions. PLoS ONE 11
(6). https://doi.org/10.1371/journal.pone.0156571.
Ji, S., Xu, W., Yang, M., Yu, K., 2012. 3D convolutional neural networks for human action
recognition. IEEE Trans. Pattern Anal. Mach. Intell. 35 (1), 221–231.
Jiang, H., Hu, H., Zhong, R., Xu, J., Xu, J., Huang, J., Lin, T., 2020. A deep learning
approach to conflating heterogeneous geospatial data for corn yield estimation: a case
study of the US Corn Belt at the county level. Glob. Change Biol. 26 (3), 1754–1766.
Johnson, M.D., 2013. Crop Yield Forecasting on the Canadian Prairies by Satellite Data
and Machine Learning Methods. Master’s Thesis, University of British Columbia,
Atmospheric Science. Retrieved from
https://www.sciencedirect.com/science/article/pii/S0168192315007546.
Ju, S., Lim, H., Heo, J., 2020. Machine learning approaches for crop yield prediction with
MODIS and weather data. 40th Asian Conference on Remote Sensing: Progress of
Remote Sensing Technology for Smart Future, ACRS 2019.
Kang, Y., Ozdogan, M., Zhu, X., Ye, Z., Hain, C.R., Anderson, M.C., 2020. Comparative
assessment of environmental variables and machine learning algorithms for maize
yield prediction in the US Midwest. Environ. Res. Lett.
Khaki, S., Wang, L., 2019. Crop yield prediction using deep neural networks. Front. Plant
Sci. 10, 621.
Khaki, S., Wang, L., Archontoulis, S.V., 2020. A CNN-RNN framework for crop yield
prediction. Front. Plant Sci. 10, 1750.
Khanal, S., Fulton, J., Klopfenstein, A., Douridas, N., Shearer, S., 2018. Integration of high
resolution remotely sensed data and machine learning techniques for spatial
prediction of soil properties and corn yield. Comput. Electron. Agric. 153, 213–225.
https://doi.org/10.1016/J.COMPAG.2018.07.016.
Kitchenham, B., Charters, S., Budgen, D., Brereton, P., Turner, M., Linkman, S., Visaggio,
G., 2007. Guidelines for performing Systematic Literature Reviews in Software
Engineering. Retrieved from
https://userpages.uni-koblenz.de/~laemmel/esecourse/slides/slr.pdf.
Kouadio, L., Deo, R.C., Byrareddy, V., Adamowski, J.F., Mushtaq, S., Phuong Nguyen, V.,
2018. Artificial intelligence approach for the prediction of Robusta coffee yield using
soil fertility properties. Comput. Electron. Agric. 155, 324–338.
https://doi.org/10.1016/J.COMPAG.2018.10.014.
Kunapuli, S.S., Rueda-Ayala, V., Benavidez-Gutierrez, G., Cordova-Cruzatty, A., Cabrera,
A., Fernandez, C., Maiguashca, J., 2015. Yield prediction for precision territorial
management in maize using spectral data. In: Precision Agriculture 2015 - Papers
Presented at the 10th European Conference on Precision Agriculture, ECPA 2015 (pp.
199–206). Retrieved from
https://www.scopus.com/inward/record.uri?eid=2-s2.0-84947244569&partnerID=40&md5=241e9b9de12f2eb0fae3ed0ee2fd22c0.
Lee, S., Jeong, Y., Son, S., Lee, B., 2019. A self-predictable crop yield platform (SCYP)
based on crop diseases using deep learning. Sustainability 11 (13), 3637.
Li, B., Lecourt, J., Bishop, G., 2018. Advances in non-destructive early assessment of fruit
ripeness towards defining optimal time of harvest and yield prediction—a review.
Plants 7 (1). https://doi.org/10.3390/plants7010003.
Liakos, K.G., Busato, P., Moshou, D., Pearson, S., Bochtis, D., 2018. Machine learning in
agriculture: a review. Sensors (Switzerland) 18 (8).
https://doi.org/10.3390/s18082674.
Maimaitijiang, M., Sagan, V., Sidike, P., Hartling, S., Esposito, F., Fritschi, F.B., 2020.
Soybean yield prediction from UAV using multimodal data fusion and deep learning.
Remote Sens. Environ. 237.
Matsumura, K., Gaitan, C.F., Sugimoto, K., Cannon, A.J., Hsieh, W.W., 2015. Maize yield
forecasting by linear regression and artificial neural networks in Jilin, China. J. Agric.
Sci. 153 (3), 399–410. https://doi.org/10.1017/S0021859614000392.
Mayuri, P.K., Priya, V.C., n.d. Role of image processing and machine learning techniques
in disease recognition, diagnosis and yield prediction of crops: a review. Int. J. Adv.
Res. Comput. Sci., 9(2). https://doi.org/10.26483/ijarcs.v9i2.5793.
McQueen, R.J., Garner, S.R., Nevill-Manning, C.G., Witten, I.H., 1995. Applying machine
learning to agricultural data. Comput. Electron. Agric. 12 (4), 275–293.
https://doi.org/10.1016/0168-1699(95)98601-9.
Measuring Vegetation (NDVI & EVI), 2000. Retrieved from
https://earthobservatory.nasa.gov/features/MeasuringVegetation/measuring_vegetation_2.php.
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Petersen, S.,
2015. Human-level control through deep reinforcement learning. Nature 518 (7540),
529–533.
Mola-Yudego, B., Rahlf, J., Astrup, R., Dimitriou, I., 2016. Spatial yield estimates of
fast-growing willow plantations for energy based on climatic variables in northern
Europe. GCB Bioenergy 8 (6), 1093–1105. https://doi.org/10.1111/gcbb.12332.
Monga, T., 2018. Estimating vineyard grape yield from images, pp. 339–343.
https://doi.org/10.1007/978-3-319-89656-4_37.
Nevavuori, P., Narra, N., Lipping, T., 2019. Crop yield prediction with deep convolutional
neural networks. Comput. Electron. Agric. 163.
Nguyen, L.H., Zhu, J., Lin, Z., Du, H., Yang, Z., Guo, W., Jin, F., 2019. Spatial-temporal
multi-task learning for within-field cotton yield prediction. In: Pacific-Asia
Conference on Knowledge Discovery and Data Mining. Springer, Cham, pp. 343–354.
Osman, T., Psyche, S.S., Kamal, M.R., Tamanna, F., Haque, F., Rahman, R.M., 2017.
Predicting early crop production by analysing prior environment factors, pp.
470–479. https://doi.org/10.1007/978-3-319-49073-1_51.
Pantazi, X.E., Moshou, D., Mouazen, A.M., Kuang, B., Alexandridis, T., 2014. Application
of supervised self organising models for wheat yield prediction, pp. 556–565.
https://doi.org/10.1007/978-3-662-44654-6_55.
Pantazi, X.E., Moshou, D., Alexandridis, T., Whetton, R.L., Mouazen, A.M., 2016. Wheat
yield prediction using machine learning and advanced sensing techniques. Comput.
Electron. Agric. 121, 57–65. https://doi.org/10.1016/j.compag.2015.11.018.
Paul, M., Vishwakarma, S.K., Verma, A., 2015. Analysis of soil behaviour and prediction
of crop yield using data mining approach. In: 2015 International Conference on
Computational Intelligence and Communication Networks (CICN). IEEE, pp.
766–771. https://doi.org/10.1109/CICN.2015.156.
Rahman, M., Haq, N., n.d. Machine learning facilitated rice prediction in Bangladesh.
Ieeexplore.Ieee.Org. Retrieved from
https://ieeexplore.ieee.org/abstract/document/7113655/.
Rahnemoonfar, M., Sheppard, C., 2017. Real-time yield estimation based on deep
learning. In: Autonomous Air and Ground Sensing Systems for Agricultural
Optimization and Phenotyping II Vol. 10218. International Society for Optics and
Photonics, pp. 1021809.
Ranjan, A.K., Parida, B.R., 2019. Paddy acreage mapping and yield prediction using
sentinel-based optical and SAR data in Sahibganj district, Jharkhand (India). Spatial
Inf. Res. https://doi.org/10.1007/s41324-019-00246-4.
Rao, T., Manasa, S., n.d. Artificial Neural networks for soil quality and crop yield
prediction using machine learning. Ijfrcsce.Org. Retrieved from
http://www.ijfrcsce.org/download/browse/Volume_5/January_19_Volume_5_Issue_1/1547885118_19-01-2019.pdf.
Ren, S., He, K., Girshick, R., Sun, J., 2015. Faster R-CNN: Towards real-time object
detection with region proposal networks. In: Advances in neural information processing
systems, pp. 91–99.
Romero, J.R., Roncallo, P.F., Akkiraju, P.C., Ponzoni, I., Echenique, V.C., Carballido, J.A.,
2013. Using classification algorithms for predicting durum wheat yield in the
province of Buenos Aires. Comput. Electron. Agric. 96, 173–179.
https://doi.org/10.1016/j.compag.2013.05.006.
Ruder, S., 2017. An overview of multi-task learning in deep neural networks. arXiv
preprint arXiv:1706.05098.
Ruß, G., Kruse, R., 2010. Regression models for spatial data: an example from precision
agriculture, pp. 450–463. https://doi.org/10.1007/978-3-642-14400-4_35.
Ruß, G., Kruse, R., Schneider, M., Wagner, P., 2008. Data mining with neural networks for
wheat yield prediction. In: Lecture Notes in Computer Science (including subseries
Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol.
5077 LNAI. Springer Berlin Heidelberg, Berlin, Heidelberg, pp. 47–56.
https://doi.org/10.1007/978-3-540-70720-2_4.
Saravi, B., Nejadhashemi, A.P., Tang, B., 2019. Quantitative model of irrigation effect on
maize yield by deep neural network. Neural Comput. Appl. 1–14.
Schwalbert, R.A., Amado, T., Corassa, G., Pott, L.P., Prasad, P.V., Ciampitti, I.A., 2020.
Satellite-based soybean yield forecast: Integrating machine learning and weather data
for improving crop yield prediction in southern Brazil. Agric. For. Meteorol. 284.
Shah, A., Dubey, A., Hemnani, V., Gala, D., Kalbande, D.R., 2018. Smart Farming System:
Crop Yield Prediction Using Regression Techniques. Springer, Singapore, pp. 49–56.
https://doi.org/10.1007/978-981-10-8339-6_6.
Shekoofa, A., Emam, Y., Shekoufa, N., Ebrahimi, M., Ebrahimie, E., 2014. Determining
the most important physiological and agronomic traits contributing to maize grain
yield through machine learning algorithms: a new avenue in intelligent agriculture.
PLoS ONE 9 (5), e97288. https://doi.org/10.1371/journal.pone.0097288.
Shidnal, S., Latte, M.V., Kapoor, A., 2019. Crop yield prediction: two-tiered machine
learning model approach. Int. J. Inf. Technol. 1–9.
Šmite, D., Wohlin, C., Gorschek, T., Feldt, R., 2010. Empirical evidence in global software
engineering: a systematic review. Empirical Softw. Eng. 15 (1), 91–118.
https://doi.org/10.1007/s10664-009-9123-y.
Somvanshi, P., Mishra, B.N., 2015. Machine learning techniques in plant biology. In:
PlantOmics: The Omics of Plant Science. Springer India, New Delhi, pp. 731–754.
https://doi.org/10.1007/978-81-322-2172-2_26.
Su, Y.X., Xu, H., Yan, L.J., 2017. Support vector machine-based open crop model
(SBOCM): case of rice production in China. Saudi J. Biol. Sci. 24 (3), 537–547.
Sujatha, R., Isakki, P., 2016. A study on crop yield forecasting using classification
techniques. In: 2016 International Conference on Computing Technologies and Intelligent
Data Engineering, ICCTIDE 2016. https://doi.org/10.1109/ICCTIDE.2016.7725357.
Sun, J., Di, L., Sun, Z., Shen, Y., Lai, Z., 2019. County-level soybean yield prediction using
deep CNN-LSTM model. Sensors 19 (20), 4363.
Taherei-Ghazvinei, P., Hassanpour-Darvishi, H., Mosavi, A., Yusof, K.W., Alizamir, M.,
Shamshirband, S., Chau, K., 2018. Sugarcane growth prediction based on
meteorological parameters using extreme learning machine and artificial neural network.
Eng. Appl. Comput. Fluid Mech. 12 (1), 738–749.
https://doi.org/10.1080/19942060.2018.1526119.
Tedesco-Oliveira, D., da Silva, R.P., Maldonado Jr, W., Zerbato, C., 2020. Convolutional
neural networks in predicting cotton yield from images of commercial fields. Comput.
Electron. Agric. 171.
Terliksiz, A.S., Altılar, D.T., 2019. Use of deep neural networks for crop yield prediction: a case
study of soybean yield in Lauderdale County, Alabama, USA. In: 2019 8th International
Conference on Agro-Geoinformatics (Agro-Geoinformatics). IEEE, pp. 1–4.
Villanueva, M.B., Louella, M., Salenga, M., 2018. Bitter Melon Crop Yield Prediction using
Machine Learning Algorithm. (IJACSA) International Journal of Advanced Computer
Science and Applications, Vol. 9. Retrieved from www.ijacsa.thesai.org.
Vincent, P., Larochelle, H., Bengio, Y., Manzagol, P.A., 2008. Extracting and composing
robust features with denoising autoencoders. In: Proceedings of the 25th
international conference on Machine learning, pp. 1096–1103.
Wang, A., Tran, C., Desai, N., Lobell, D., n.d. Deep transfer learning for crop yield
prediction with remote sensing data. Dl.Acm.Org. Retrieved from
https://dl.acm.org/citation.cfm?id=3212707.
Wang, X., Huang, J., Feng, Q., Yin, D., 2020. Winter wheat yield prediction at county
level and uncertainty analysis in main wheat-producing regions of China with deep
learning approaches. Remote Sens. 12 (11), 1744.
Wang, A.X., Tran, C., Desai, N., Lobell, D., Ermon, S., 2018. Deep transfer learning for
crop yield prediction with remote sensing data. In: Proceedings of the 1st ACM
SIGCAS Conference on Computing and Sustainable Societies, pp. 1–5.
Wang, Y., Zhang, Z., Feng, L., Du, Q., Runge, T., 2020. Combining multi-source data and
machine learning approaches to predict winter wheat yield in the conterminous
United States. Remote Sens. 12 (8), 1232.
Witten, I.H., Frank, E., Hall, M.A., Pal, C.J., 2016. Data Mining: Practical Machine
Learning Tools and Techniques. https://doi.org/10.1016/c2009-0-19715-5.
Wolanin, A., Mateo-García, G., Camps-Valls, G., Gómez-Chova, L., Meroni, M., Duveiller,
G., Guanter, L., 2020. Estimating and understanding crop yields with explainable
deep learning in the Indian Wheat Belt. Environ. Res. Lett. 15 (2).
Xu, X., Gao, P., Zhu, X., Guo, W., Ding, J., Li, C., Wu, X., 2019. Design of an integrated
climatic assessment indicator (ICAI) for wheat production: a case study in Jiangsu
Province, China. Ecol. Ind. 101, 943–953.
https://doi.org/10.1016/j.ecolind.2019.01.059.
Yalcin, H., 2019. An approximation for a relative crop yield estimate from field images
using deep learning. In: 2019 8th International Conference on Agro-Geoinformatics
(Agro-Geoinformatics). IEEE, pp. 1–6.
Yang, Q., Shi, L., Han, J., Zha, Y., Zhu, P., 2019. Deep convolutional neural networks for
rice grain yield estimation at the ripening stage using UAV-based remotely sensed
images. Field Crops Res. 235, 142–153.
Ying-xue, S., Huan, X., Li-jiao, Y., 2017. Support vector machine-based open crop model
(SBOCM): Case of rice production in China. Saudi J. Biol. Sci. 24 (3), 537–547.
https://doi.org/10.1016/j.sjbs.2017.01.024.
You, J., Li, X., Low, M., Lobell, D., Ermon, S., 2017. Deep Gaussian process for crop yield
prediction based on remote sensing data. In: Proceedings of the Thirty-First AAAI
Conference on Artificial Intelligence (AAAI-17), 4559–4566.
https://doi.org/10.1109/MWSCAS.2006.381794.
Zhang, Y., Yang, Q., 2017. A survey on multi-task learning. arXiv preprint
arXiv:1707.08114.
Zhang, L., Zhang, Z., Luo, Y., Cao, J., Tao, F., 2020. Combining optical, fluorescence,
thermal satellite, and environmental data to predict county-level maize yield in China
using machine learning approaches. Remote Sens. 12 (1), 21.
Zhong, H., Li, Xiaocheng, Lobell, D., Ermon, S., Brandeau, M.L., 2018. Hierarchical
modeling of seed variety yields and decision making for future planting plans.
Environ. Syst. Decis. 38, 458–470.
... Despite its importance in the world diet, its yield in developing countries remains low. Many factors affect crop yields, including landscape, soil quality, pest infestations, genotype, water quality, accessibility, climate conditions, and harvest planning [10], [11], making prediction a difficult statistical task. However, it is essential to estimate crop yields accurately based on production factors that determine the yield. ...
... However, it is essential to estimate crop yields accurately based on production factors that determine the yield. Nowadays, crop prediction is receiving increasing attention from researchers around the world [11], particularly due to the growing rate of malnutrition in the world and global climate change. ...
... They found that computer vision and AI have the potential to improve crop management practices in precision agriculture. Klompenburg et al. [11] also performed a systematic literature review to extract and synthesize the algorithms and predictors used in crop yield prediction studies. Following their analysis, the authors found that temperature, rainfall, and soil type are the most used features and artificial neural networks are the most used algorithms for yield prediction. Bali et al. [16] explored and evaluated different ML and deep learning techniques used in crop yield forecasting and the performance of hybrid models trained by combining more than one technique. ...
Cereals are sensitive to small changes in complex combinations of biotic and abiotic factors. Such complexity can be deciphered using techniques such as machine learning (ML). Using the PRISMA approach, this paper explores the features and ML techniques in cereal yield prediction based on 115 articles from 2007 to 2023 in six databases. Results showed that most data in the articles were from secondary sources and only 28.68% used experiments or primary data. China (31) and the United States (18) contributed most. Wheat (48%), maize (33%), and rice (17%) represented the most studied cereals. Climate, remote sensing data, and soil parameters were the most used predictors. The most frequently used ML techniques for cereal prediction were support vector machine (SVM) (51%), multi-layer perceptron (MLP) (41%), linear regression (34%), random forest (RF) (24%), and XGBoost (20%). However, RF, MLP, and SVM models were the best-performing techniques to predict grain yield based on reported R-square and mean absolute error (MAE). The models in the studied articles generally performed well on test data, with an R-square between 0.7 and 1. The study further reveals that the availability and quality of data are the main obstacles to using ML models for crop prediction.
... Liakos et al. [14] published a comprehensive review paper discussing the application of machine learning in the agricultural domain. They performed an analysis by drawing from literature related to crop management, animal management, water management, and soil management. ...
... For instance, a statistical technique called principal component analysis (PCA) transforms a series of observations of potentially correlated variables into a set of principal component values, which are values of linearly uncorrelated variables (PCs) [18]. Machine learning (ML) methods, as compared to conventional mechanism-based methods, have long been used in a wide range of agricultural applications to investigate patterns and correlations due to their ability to address linear and non-linear issues from large numbers of inputs [19,20]. In [21] research to classify five major crops (e.g., corn, soybean, winter wheat, rice, and cotton) using earth observing-1 (EO-1) Hyperion HSI during the 2008-2015 years in the United States. ...
The advent of cloud computing and advanced processing technologies has elevated Deep Learning (DL) as a leading method for Hyper-Spectral Imaging (HSI) classification. Classifying crops accurately is vital for generating precise agricultural data to support informed decision-making. This study introduces a DL framework called HypsLiDNet, tailored for remote sensing activities. This model processes HSI in conjunction with innovative, comprehensive Light Detection and Ranging (LiDAR) data from Hungary to conduct thorough examinations of the Earth's surface. Integrating LiDAR attributes with HSI is anticipated to enhance classification accuracy beyond HSI-only techniques. LiDAR integration provides a significant advantage by adding structural details to spectral data, aiding in the correct identification of objects with similar spectral characteristics but different shapes. The HypsLiDNet method utilizes morphological operations on LiDAR data to extract features indicative of the land's shape and texture. These features are then combined with HSI data through an attention mechanism that selectively highlights key features from both data types, improving the model's accuracy in predictions. This is particularly beneficial for complex environmental assessments, like distinguishing between plant species. The attention mechanism also refines the feature selection process, prioritizing relevant information, which boosts computational efficiency and reduces the use of resources. Moreover, this method requires a smaller number of training samples. HypsLiDNet showcases its ability to classify with precision by harnessing the combined power of HSI and LiDAR data. Experimental results show a significant improvement in classification outcomes, outperforming traditional machine learning approaches by more than 14% and recent DL techniques by approximately 1-3%.
... In the future, within the pivotal domain of agricultural yield forecasting, the deployment of deep learning models will overcome the constraints inherent in conventional methodologies. These models have fostered the integration of diverse datasets, including satellite imagery, climatic data, and soil conditions, to perform a holistic analysis aimed at forecasting the production of crops [73][74][75]. ...
Accurate forecasting of crop yields holds paramount importance in guiding decision-making processes related to breeding efforts. Despite significant advancements in crop yield forecasting, existing methods often struggle with integrating diverse sensor data and achieving high prediction accuracy under varying environmental conditions. This study focused on the application of multi-sensor data fusion and machine learning algorithms based on unmanned aerial vehicles (UAVs) in wheat yield prediction. Five machine learning (ML) algorithms, namely random forest (RF), partial least squares (PLS), ridge regression (RR), k-nearest neighbor (KNN) and extreme gradient boosting decision tree (XGboost), were utilized for multi-sensor data fusion, together with three ensemble methods including the second-level ensemble methods (stacking and feature-weighted) and the third-level ensemble method (simple average), for wheat yield prediction. The 270 wheat hybrids were used as planting materials under full and limited irrigation treatments. A cost-effective multi-sensor UAV platform, equipped with red–green–blue (RGB), multispectral (MS), and thermal infrared (TIR) sensors, was utilized to gather remote sensing data. The results revealed that the XGboost algorithm exhibited outstanding performance in multi-sensor data fusion, with the RGB + MS + Texture + TIR combination demonstrating the highest fusion performance (R2 = 0.660, RMSE = 0.754). Compared with the single ML model, the employment of three ensemble methods significantly enhanced the accuracy of wheat yield prediction. Notably, the third-layer simple average ensemble method demonstrated superior performance (R2 = 0.733, RMSE = 0.668 t ha−1). It significantly outperformed both the second-layer ensemble methods of stacking (R2 = 0.668, RMSE = 0.673 t ha−1) and feature-weighted (R2 = 0.667, RMSE = 0.674 t ha−1), thereby exhibiting superior predictive capabilities. 
This finding highlighted the third-layer ensemble method’s ability to enhance predictive capabilities and refined the accuracy of wheat yield prediction through simple average ensemble learning, offering a novel perspective for crop yield prediction and breeding selection.
... Adaptive strategies like rainwater harvesting, afforestation, and using climate-resistant cultivars are being adopted by tea growers to mitigate these impacts and ensure sustainable production [5]. Additionally, studies in Dooars have shown that temperature variations during different seasons, excessive rainfall, and changes in solar radiation and soil temperature can either positively or negatively influence tea yield, emphasizing the need for proactive measures to safeguard tea plantations from the adverse effects of climate change [6][7][8]. Statistical and machine learning techniques have been compared in predicting Assam tea production. Studies have shown that machine learning algorithms, such as XGBoost regressor and random forest models, outperform statistical methods like multiple linear regression in tea yield prediction [9,10]. ...
Article
Full-text available
Climatic factors significantly impact Assam tea production. The tropical climate of Assam, characterized by high precipitation and temperatures up to 36°C during the monsoon, creates ideal conditions for tea cultivation, contributing to the region's unique malty flavor. This study compares statistical and machine learning models for predicting tea production and identifies the best-performing model among them. Data covering the last 23 years were collected from Biswanath College of Agriculture under Assam Agriculture University, situated at Biswanath Chariali district. The study found that the mean absolute percentage error of the random forest regression model is 6.49 percent, followed by the decision tree (7.3 percent) and the linear regression model (7.5 percent). Based on the evaluation metrics, the random forest algorithm fits better than the decision tree and linear regression. This study could be extended to a comparison among more predictive machine learning models.
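The mean absolute percentage error (MAPE) used to rank the models in this abstract can be computed as in the sketch below. The function name `mape` and the sample values are illustrative assumptions, not data from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Undefined when an actual value is zero, so such inputs are rejected.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical production figures; not data from the study.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]
score = mape(actual, predicted)  # relative errors: 10%, 5%, 0%
```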
... Examples include the deep belief network (DBN), the convolutional neural network (CNN), and the recurrent neural network (RNN), represented by the long short-term memory (LSTM) network [10]-[12]. However, these deep models are much more complex than traditional machine learning, and all have long training times and large numbers of parameters. ...
Article
Full-text available
Temperature forecasting of grain in storage is crucial for timely granary temperature control, mitigating adverse effects of extreme temperatures on grain quality. Although traditional machine learning methods are lightweight and relatively quick to train, they suffer from poor stability and high error rates in predicting grain storage temperature. Conversely, deep learning models, while more accurate, are time-consuming to train and have large numbers of parameters. To address these problems, this paper proposes a lightweight and accurate model in which a broad learning network (BLN) is combined with a one-dimensional convolution module and a multi-head self-attention mechanism (BLN-1DCNN-MHSA). Firstly, we employ a one-dimensional convolution module at the feature nodes of the model to extract local temporal correlations, compensating for the temporal sequence learning limitations of the BLN. Secondly, a multi-head self-attention mechanism at the enhancement nodes captures important feature dependencies and global temporal correlations. Lastly, our model achieves better prediction through the enhanced representation ability of the model nodes. The results with real grain storage temperature data demonstrate that the RMSE, MAPE, and MAE of the proposed model are 0.341, 0.54%, and 0.28, respectively, which represent a more than twofold improvement in accuracy compared to the BLN, and the model also reduces training time by more than 90% compared with LSTM and Transformer models. Additionally, the generalization and robustness of the improved approach are demonstrated through promising results in a classification experiment on the MNIST dataset. In general, the model demonstrates the feasibility of early warning of grain storage risks by predicting temperature trends.
Article
Accurate prediction of agricultural commodity prices plays an important role in ensuring food security, farm profitability, and well-informed decisions by both farmers and industry stakeholders. The proposed system makes most of its predictions for farmers. It aims to find the relationship between weather conditions and agricultural prices using a comprehensive dataset spanning past years, including historical price data (modal, maximum, and minimum prices), productivity, production, and key meteorological factors such as rainfall and temperature. The system also uses machine learning algorithms to classify the effects of climate factors on price variations. Besides showing that Random Forests have superior prediction capacity compared with Decision Trees, the project addresses a major concern in agricultural pricing. These findings offer farmers a sound basis for making secure decisions and facing the challenges of price volatility. In a world where the stability of food production and economic sustainability depends on price predictability, this project contributes a practical tool for improving price prediction in the agricultural sector.
Article
Full-text available
Timely and accurate forecasting of crop yields is crucial to food security and sustainable development in the agricultural sector. However, winter wheat yield estimation and forecasting on a regional scale remains challenging. In this study, we established a two-branch deep learning model to predict winter wheat yield in the main producing regions of China at the county level. The first branch of the model was constructed based on the Long Short-Term Memory (LSTM) networks with inputs from meteorological and remote sensing data. Another branch was constructed using Convolution Neural Networks (CNN) to model static soil features. The model was then trained using the detrended statistical yield data during 1982 to 2015 and evaluated by leave-one-year-out validation. The evaluation results showed a promising performance of the model with the overall R2 and RMSE of 0.77 and 721 kg/ha, respectively. We further conducted yield prediction and uncertainty analysis based on the two-branch model and obtained the forecast accuracy in one month prior to harvest of 0.75 and 732 kg/ha. Results also showed that while yield detrending could potentially introduce higher uncertainty, it had the advantage of improving the model performance in yield prediction.
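Leave-one-year-out validation, as named in this abstract, holds out all records from one year for testing and trains on the remaining years, repeating this for every year. The sketch below only generates the index splits; the helper `leave_one_year_out` and the toy year labels are assumptions for illustration and do not reproduce the study's pipeline.

```python
def leave_one_year_out(years):
    """Yield (held_out_year, train_indices, test_indices) per distinct year."""
    for held_out in sorted(set(years)):
        train = [i for i, y in enumerate(years) if y != held_out]
        test = [i for i, y in enumerate(years) if y == held_out]
        yield held_out, train, test


# Hypothetical year label for each county-year record.
years = [1982, 1982, 1983, 1984, 1984]
folds = list(leave_one_year_out(years))

for held_out, train_idx, test_idx in folds:
    # A real pipeline would fit on train_idx and score on test_idx here.
    print(held_out, train_idx, test_idx)
```

Splitting by year rather than at random keeps records from the same season out of both sets at once, which avoids leaking year-specific weather effects into the evaluation.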
Article
Full-text available
Predicting crop yield based on the environmental, soil, water and crop parameters has been a potential research topic. Deep-learning-based models are broadly used to extract significant crop features for prediction. Though these methods could resolve the yield prediction problem, they have the following inadequacies: they are unable to create a direct non-linear or linear mapping between the raw data and crop yield values, and their performance relies heavily on the quality of the extracted features. Deep reinforcement learning provides direction and motivation for addressing the aforementioned shortcomings. Combining the intelligence of reinforcement learning and deep learning, deep reinforcement learning builds a complete crop yield prediction framework that can map the raw data to the crop prediction values. The proposed work constructs a Deep Recurrent Q-Network model, which is a Recurrent Neural Network deep learning algorithm over the Q-learning reinforcement learning algorithm, to forecast the crop yield. The sequentially stacked layers of the Recurrent Neural Network are fed by the data parameters. The Q-learning network constructs a crop yield prediction environment based on the input parameters. A linear layer maps the Recurrent Neural Network output values to the Q-values. The reinforcement learning agent incorporates a combination of parametric features with the threshold that assist in predicting crop yield. Finally, the agent receives an aggregate score for the actions performed by minimizing the error and maximizing the forecast accuracy. The proposed model efficiently predicts the crop yield, outperforming existing models by preserving the original data distribution with an accuracy of 93.7%.
Article
Full-text available
Winter wheat (Triticum aestivum L.) is one of the most important cereal crops, supplying essential food for the world population. Because the United States is a major producer and exporter of wheat to the world market, accurate and timely forecasting of wheat yield in the United States (U.S.) is fundamental to national crop management as well as global food security. Previous studies mainly have focused on developing empirical models using only satellite remote sensing images, while other yield determinants have not yet been adequately explored. In addition, these models are based on traditional statistical regression algorithms, while more advanced machine learning approaches have not been explored. This study used advanced machine learning algorithms to establish within-season yield prediction models for winter wheat using multi-source data to address these issues. Specifically, yield driving factors were extracted from four different data sources, including satellite images, climate data, soil maps, and historical yield records. Subsequently, two linear regression methods, including ordinary least square (OLS) and least absolute shrinkage and selection operator (LASSO), and four well-known machine learning methods, including support vector machine (SVM), random forest (RF), Adaptive Boosting (AdaBoost), and deep neural network (DNN), were applied and compared for estimating the county-level winter wheat yield in the Conterminous United States (CONUS) within the growing season. Our models were trained on data from 2008 to 2016 and evaluated on data from 2017 and 2018, with the results demonstrating that the machine learning approaches performed better than the linear regression models, with the best performance being achieved using the AdaBoost model (R2 = 0.86, RMSE = 0.51 t/ha, MAE = 0.39 t/ha). 
Additionally, the results showed that combining data from multiple sources outperformed single source satellite data, with the highest accuracy being obtained when the four data sources were all considered in the model development. Finally, the prediction accuracy was also evaluated against timeliness within the growing season, with reliable predictions (R2 > 0.84) being able to be achieved 2.5 months before the harvest when the multi-source data were combined.
Article
Full-text available
This research is mainly based on the multilayer perceptron (MLP) neural network technique of data mining to forecast the wheat crop yield at the district level. There are many statistical and simulation models available, but the proposed algorithm with a new activation function provides promising results in a shorter time with more accuracy. Sigmoid and hyperbolic tangent activation functions are widely used in neural networks. The activation functions play an important role in the neural network learning algorithm. The main objective of the proposed work is to develop an amended MLP neural network with a new activation function and revised random weights and bias values for crop yield estimation by using different weather parameter datasets. The MLP model has been tested with existing activation functions and newly created activation functions in different cases, including different weights and bias values. In this research study, we evaluate the results of different activation functions and recommend some new simple activation functions, named DharaSig, DharaSigm and SHBSig, to improve the performance and accuracy of neural networks. Also, three new activation functions, named DharaSig1, DharaSig2 and DharaSig3, were created with small variations of the DharaSig function. In this research study, variable numbers of hidden layers are tested with a variable number of neurons per hidden layer for the agriculture dataset. Variable values of momentum, seed and learning rate are also used in this study. Experiments show that the newly created activation functions provide better results compared to the default ‘sigmoid’ neural network activation function for agriculture datasets.
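For reference, the two standard activation functions named in this abstract can be sketched as follows. The abstract does not define DharaSig, DharaSigm, or SHBSig, so they are not reproduced here; the numerically stable form of `sigmoid` is an implementation choice of this sketch.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # safe: x is negative, so exp(x) is at most 1
    return z / (1.0 + z)


def tanh(x):
    """Hyperbolic tangent, with outputs in (-1, 1)."""
    return math.tanh(x)


for x in (-2.0, 0.0, 2.0):
    print(x, sigmoid(x), tanh(x))
```

The two functions are related by tanh(x) = 2·sigmoid(2x) − 1, so tanh is a rescaled sigmoid centered at zero.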
Article
Full-text available
Crop yield estimates over large areas are conventionally made using weather observations, but a comprehensive understanding of the effects of various environmental indicators, observation frequency, and the choice of prediction algorithm remains elusive. Here we present a thorough assessment of county-level maize yield prediction in the U.S. Midwest using six statistical/machine learning algorithms (Lasso, Support Vector Regressor, Random Forest, XGBoost, Long-short term memory (LSTM), and Convolutional Neural Network (CNN)) and an extensive set of environmental variables derived from satellite observations, weather data, land surface model results, soil maps, and crop progress reports. Results show that seasonal crop yield forecasting benefits from both more advanced algorithms and a large composite of information associated with crop canopy, environmental stress, phenology, and soil properties (i.e. hundreds of features). The XGBoost algorithm outperforms other algorithms both in accuracy and stability, while deep neural networks such as LSTM and CNN are not advantageous. The compositing interval (8-day, 16-day or monthly) of time series variables does not have significant effects on the prediction. Combining the best algorithm and inputs improves the prediction accuracy by 5% when compared to a baseline statistical model (Lasso) using only basic climatic and satellite observations. Reasonable county-level yield forecasting is achievable from early June, almost four months prior to harvest. At the national level, early-season (June and July) prediction from the best model outperforms that of the United States Department of Agriculture (USDA) World Agricultural Supply and Demand Estimates (WASDE). This study provides insights into practical crop yield forecasting and the understanding of yield response to climatic and environmental conditions.
Chapter
Agricultural yield estimation from natural images is a challenging problem to which machine learning can be applied. Convolutional Neural Networks have advanced the state of the art in many machine learning applications such as computer vision, speech recognition and natural language processing. The proposed research uses convolution neural networks to develop models that can estimate the weight of grapes on a vine using an image. Trained and tested with a dataset of 60 images of grape vines, the system manages to achieve a cross-validation yield estimation accuracy of 87%.
Article
Rice yield is essential for more than half of the world’s population, and thus, accurate predictions of rice yield are of great importance for trade, development policies, humanitarian assistance, decision-makers, etc. However, traditional mechanistic models and statistical machine learning models need to identify features, making the research on and application of these models laborious and time-consuming. In this paper, a novel end-to-end prediction model that fuses two back-propagation neural networks (BPNNs) with an independently recurrent neural network (IndRNN), named BBI-model, is proposed to address these challenges. In stage one, BBI-model preprocesses the original area and meteorology data. In stage two, one BPNN and the IndRNN are used to learn deep spatial and temporal features in parallel. In stage three, another BPNN combines two kinds of deep features and learns the relationships between these deep features and rice yields to make predictions for summer and winter rice yields. The experimental results indicate that BBI-model achieved the lowest mean absolute error (MAE) and root mean square error (RMSE) for the summer rice prediction (0.0044 and 0.0057, respectively) and corresponding values of 0.0074 and 0.0192 for the winter rice prediction when the number of layers in the network was set to six. Moreover, the errors of the model using the combination of deep spatial-temporal features were significantly lower than when simply using deep temporal features. Furthermore, the model converged quickly with 100 iterations and then remained stable. These findings confirm that the model can make accurate predictions for summer and winter rice yields of 81 counties in the Guangxi Zhuang Autonomous Region, China.
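The two error metrics reported in this abstract, MAE and RMSE, can be computed as in the sketch below. The function names and sample values are illustrative assumptions and are unrelated to the study's data.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical values with a single large error on the last sample.
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 8.0]
mae_value = mae(actual, predicted)
rmse_value = rmse(actual, predicted)
```

Because RMSE squares each error before averaging, a single large miss raises it more than it raises MAE, which is why the two are often reported together.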
Article
One way to improve the quality of mechanized cotton harvesting is to change harvester settings and adjustments throughout the process, according to information obtained during the operation. We believe that yield predictions are important for managing the quality of operation, aiming at increasing efficiency and reducing losses. Therefore, this study aimed to develop an automated system for cotton yield prediction from color images acquired by a simple mobile device. We propose a robust approach to environmental conditions, training detection algorithms with images acquired at different times throughout the day, and evaluating three different scenarios (low-, average-, and high-demand computational resources). The experimental results for the average demand computational scenario, which are suitable for real-time deployment on low-cost devices such as smartphones and other ARM-processed devices, indicated the possibility of counting bolls using images acquired at different times throughout the day, with mean errors of 8.84% (∼5 bolls). Furthermore, we observed a 17.86% error when predicting yield using 205 images from the testing dataset, which is equivalent to about 19.14 g.