
Crop yield prediction using machine learning: A systematic literature review

October 2020
Computers and Electronics in Agriculture 177(10):105709


DOI:10.1016/j.compag.2020.105709

License
CC BY 4.0


Machine learning is an important decision support tool for crop yield prediction, including supporting decisions on what crops to grow and what to do during the growing season of the crops. Several machine learning algorithms have been applied to support crop yield prediction research. In this study, we performed a Systematic Literature Review (SLR) to extract and synthesize the algorithms and features that have been used in crop yield prediction studies. Based on our search criteria, we retrieved 567 relevant studies from six electronic databases, of which we selected 50 studies for further analysis using inclusion and exclusion criteria. We investigated these selected studies carefully, analyzed the methods and features used, and provided suggestions for further research. According to our analysis, the most used features are temperature, rainfall, and soil type, and the most applied algorithm in these models is Artificial Neural Networks. After this observation, based on the analysis of the 50 machine learning-based papers, we performed an additional search in electronic databases to identify deep learning-based studies, found 30 deep learning-based papers, and extracted the applied deep learning algorithms. According to this additional analysis, Convolutional Neural Networks (CNN) is the most widely used deep learning algorithm in these studies, and the other widely used deep learning algorithms are Long Short-Term Memory (LSTM) and Deep Neural Networks (DNN).



Thomas van Klompenburg a, Ayalew Kassahun a, Cagatay Catal b,⁎

a Information Technology Group, Wageningen University & Research, Wageningen, the Netherlands

b Department of Computer Engineering, Bahcesehir University, Istanbul, Turkey

ARTICLE INFO

Keywords:

Crop yield prediction

Decision support system

Systematic literature review

Machine learning

Deep learning


1. Introduction

Machine learning (ML) approaches are used in many fields, ranging from evaluating customer behavior in supermarkets (Ayodele, 2010) to predicting customers’ phone use (Witten et al., 2016). Machine learning has also been used in agriculture for several years (McQueen et al., 1995). Crop yield prediction is one of the challenging

problems in precision agriculture, and many models have been pro-

posed and validated so far. This problem requires the use of several

datasets since crop yield depends on many diﬀerent factors such as

climate, weather, soil, use of fertilizer, and seed variety (Xu et al.,

2019). This indicates that crop yield prediction is not a trivial task;

instead, it consists of several complicated steps. Nowadays, crop yield prediction models can estimate the actual yield reasonably well, but better performance in yield prediction is still desirable (Filippi et al., 2019a).

Machine learning, a branch of Artificial Intelligence (AI) that focuses on learning, is a practical approach that can provide better yield prediction based on several features. ML can determine patterns and correlations and discover knowledge from datasets. The models need to be trained using datasets in which the outcomes are known from past experience. The predictive model is built using several features, and the parameters of the model are determined from historical data during the training phase. In the testing phase, the part of the historical data that was not used for training is used to evaluate performance.
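The training and testing phases described above can be illustrated with a simple hold-out split. The following is a minimal sketch, not a procedure taken from any of the reviewed studies: the function name `holdout_split`, the 20% test fraction, and the toy yield records are all assumptions made for illustration.

```python
import random


def holdout_split(records, test_fraction=0.2, seed=0):
    """Shuffle the records and split them into (train, test) lists."""
    if not 0.0 < test_fraction < 1.0:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)
    # A seeded generator keeps the split reproducible.
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    # The first n_test shuffled records are held out for testing.
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical historical records: (season index, yield in t/ha).
history = [(season, 3.0 + 0.1 * season) for season in range(10)]
train, test = holdout_split(history, test_fraction=0.2)
print(len(train), "training records,", len(test), "test records")
```

The model parameters would be fitted on `train` only, and `test` would be used solely for performance evaluation. For yield time series, a chronological split may be preferable to a random one, since a random split can let information from later seasons leak into training.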

An ML model can be descriptive or predictive, depending on the research problem and research questions. While descriptive models are used to gain knowledge from the collected data and explain what has happened, predictive models are used to make predictions about the future (Alpaydin, 2010). ML studies face several challenges when aiming to build a high-performance predictive model. It is crucial to select the right algorithms for the problem at hand; in addition, the algorithms and the underlying platforms need to be capable of handling the volume of data.

To get an overview of what has been done on the application of ML in crop yield prediction, we performed a systematic literature review (SLR). An SLR reveals potential gaps in research on a particular problem area and guides both practitioners and researchers who wish to conduct a new study in that area. By following an SLR methodology, all relevant studies are retrieved from electronic databases, synthesized, and presented to answer the research questions defined in the study. An SLR leads to new perspectives and helps new researchers in the field to understand the state of the art.


Received 29 January 2020; Received in revised form 21 July 2020; Accepted 9 August 2020

⁎ Corresponding author. E-mail address: cagatay.catal@eng.bau.edu.tr (C. Catal).

Computers and Electronics in Agriculture 177 (2020) 105709

Available online 18 August 2020

An SLR study is expected to be replicable, which means that all the

steps taken need to be explained clearly, and the results should be

transparent for other researchers. The critical factors for a successful SLR

study are objectivity and transparency (Kitchenham et al., 2007). As its

name indicates, an SLR needs to be systematic and cover all the literature published so far. This study presents all the available literature published so far on the application of machine learning to the crop yield prediction problem. In this study, we present our empirical results and responses to the research questions defined as part of this review article.

The remainder of this paper is organized as follows: Section 2 presents the related work. Section 3 discusses the methodology. Section 4 presents the results of the SLR. Section 5 explains the deep learning-based crop yield prediction research. Section 6 presents the discussion, and Section 7 concludes this paper.

2. Related work

Crop yield prediction is an essential task for the decision-makers at

national and regional levels (e.g., the EU level) for rapid decision-

making. An accurate crop yield prediction model can help farmers to decide what to grow and when to grow it. There are different approaches to crop yield prediction. This review article has investigated

what has been done on the use of machine learning in crop yield pre-

diction in the literature.

During our analysis of the retrieved publications, one of the exclusion

criteria is that the publication is a survey or traditional review paper.

Those excluded publications are, in fact, related work and are discussed

in this section. Chlingaryan and Sukkarieh performed a review study on

nitrogen status estimation using machine learning (Chlingaryan et al.,

2018). The paper concludes that quick developments in sensing tech-

nologies and ML techniques will result in cost-eﬀective solutions in the

agricultural sector. Elavarasan et al. performed a survey of publications on machine learning models associated with crop yield prediction based on climatic parameters. The paper advises looking more broadly for additional parameters that account for crop yield (Elavarasan et al., 2018). Liakos

et al. (2018) published a review paper on the application of machine

learning in the agricultural sector. The analysis was performed with

publications focusing on crop management, livestock management,

water management, and soil management. Li, Lecourt, and Bishop per-

formed a review study on determining the ripeness of fruits to decide the

optimal harvest time and yield prediction (Li et al., 2018). Mayuri and

Priya addressed the challenges and methodologies that are encountered

in the ﬁeld of image processing and machine learning in the agricultural

sector and especially in the detection of diseases (Mayuri and Priya,

xxxx). Somvanshi and Mishra presented several machine learning ap-

proaches and their application in plant biology (Somvanshi and Mishra,

2015). Gandhi and Armstrong published a review paper on the appli-

cation of data mining in the agricultural sector in general, dealing with

decision making. They concluded that further research needs to be done

to see how the implementation of data mining into complex agricultural

datasets could be realized (Gandhi and Armstrong, 2016). Beulah per-

formed a survey on the various data mining techniques that are used for

crop yield prediction and concluded that the crop yield prediction could

be solved by employing data mining techniques (Beulah, 2019).

According to our survey of review articles, the signiﬁcant ones of

which are presented in this section, this paper is the ﬁrst SLR that fo-

cuses on the application of machine learning in the crop yield predic-

tion problem. The existing survey studies did not systematically review

the literature, and most of them reviewed studies on a speciﬁc aspect of

crop yield prediction. Also, we presented 30 deep learning-based stu-

dies in this article and discussed which deep learning algorithms have

been used in these studies.

3. Methodology

3.1. Review protocol

Before conducting the systematic review, we defined a review protocol. The review followed the well-known guidelines provided by Kitchenham et al. (2007). First, the research questions were defined. Once the research questions were ready, databases were searched to select the relevant studies. The databases used in this study are Science Direct, Scopus, Web of Science, Springer Link, Wiley, and Google Scholar. The retrieved studies were then filtered and assessed using a set of exclusion and quality criteria. All relevant data were extracted from the selected studies, and the extracted data were finally synthesized in response to the research questions. The approach we followed can be split into three parts: plan review, conduct review, and report review.

The ﬁrst stage is planning the review. In this stage, research questions

are identiﬁed, a protocol is developed, and eventually, the protocol is va-

lidated to see if the approach is feasible. In addition to the research ques-

tions, publication venues, initial search strings, and publication selection

criteria are also deﬁned. When all of this information is deﬁned, the pro-

tocol is revised one more time to see if it represents a proper review pro-

tocol. In Fig. 1, the internal steps of the Plan Review stage are represented.

The second stage is conducting the review, which is represented in Fig. 2. In this stage, the publications were selected by going through all the databases. Data were then extracted from each publication: the authors, year of publication, type of publication, and further information related to the research questions were recorded. After all the necessary data had been extracted, the data were synthesized to provide an overview of the relevant papers published so far.

In the final stage, reporting the review, the review was concluded by documenting the results and addressing the research questions, as shown in Fig. 3.

3.2. Research questions

This SLR aims to get insight into what studies have been published

in the domain of ML and crop yield prediction. To get insight, studies

have been analyzed from several dimensions. For this SLR study, the following four research questions (RQs) have been defined.

• RQ1- Which machine learning algorithms have been used in the literature for crop yield prediction?

• RQ2- Which features have been used in the literature for crop yield prediction using machine learning?

• RQ3- Which evaluation parameters and evaluation approaches have been used in the literature for crop yield prediction?

• RQ4- What are the challenges in the field of crop yield prediction using machine learning?

3.3. Search strategy

The search was narrowed down to the basic concepts that are relevant to the scope of this review. Machine learning has many application fields, which means that many published studies are probably outside the scope of this review article. The basic search was an automated search. The starting input for the search was “machine learning” AND “yield prediction”. Articles were retrieved, and abstracts were read to find synonyms of the keywords. The search was performed in six databases. The search input “machine learning” AND “yield prediction” was used to get a broad view of the studies. After the exclusion criteria had been applied and all the results had been processed, a more complex search string was built to avoid missing relevant studies. This final search string is as follows: ((“machine learning” OR “artificial intelligence”) AND “data mining” AND (“yield prediction” OR “yield forecasting” OR “yield estimation”)). After executing this search string, 567 studies were retrieved.

Fig. 1. Details of the Plan Review Step.

A specific description of the search strings per database is provided as follows:

Science Direct: The search string is [“machine learning” AND “yield prediction”] (Title, abstract, keywords) and [((“machine learning” OR “artificial intelligence”) AND “data mining” AND (“yield prediction” OR “yield forecasting” OR “yield estimation”))] (Title, abstract, keywords).

Scopus: The search string is [“machine learning” AND “yield prediction”] (Title, abstract, keywords) and [((“machine learning” OR “artificial intelligence”) AND “data mining” AND (“yield prediction” OR “yield forecasting” OR “yield estimation”))] (Title, abstract, keywords).

Web of Science: The search string is [“machine learning” AND “yield prediction”] (title, abstract, author keywords, and Keywords Plus).

Springer Link: The search string is [“machine learning” AND “yield prediction”] (anywhere) and [((“machine learning” OR “artificial intelligence”) AND “data mining” AND (“yield prediction” OR “yield forecasting” OR “yield estimation”))] (anywhere).

Wiley: The search string is [“machine learning” AND “yield prediction”] (anywhere).

Google Scholar: The search string is [“machine learning” AND “yield prediction”] (anywhere) and [((“machine learning” OR “artificial intelligence”) AND “data mining” AND (“yield prediction” OR “yield forecasting” OR “yield estimation”))] (anywhere).

For Web of Science and Wiley, the search string [((“machine

learning” OR “artiﬁcial intelligence”) AND “data mining” AND (“yield

prediction” OR “yield forecasting” OR “yield estimation”))] did not

result in any publications.
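The Boolean structure of the final search string in this section can be expressed as a simple predicate over a text field such as an abstract. This is only an illustrative sketch: the function name `matches_final_query` and the sample texts are hypothetical, and real databases apply their own tokenization and field rules, which plain substring matching does not reproduce.

```python
def matches_final_query(text):
    """Return True if the text satisfies the final search string of Section 3.3."""
    t = text.lower()
    # ("machine learning" OR "artificial intelligence")
    has_ml = "machine learning" in t or "artificial intelligence" in t
    # AND "data mining"
    has_dm = "data mining" in t
    # AND ("yield prediction" OR "yield forecasting" OR "yield estimation")
    has_yield = any(
        phrase in t
        for phrase in ("yield prediction", "yield forecasting", "yield estimation")
    )
    return has_ml and has_dm and has_yield


# Hypothetical abstracts used only to exercise the predicate.
hit = "We apply data mining and machine learning to wheat yield forecasting."
miss = "We apply machine learning to wheat yield prediction."  # lacks "data mining"
print(matches_final_query(hit), matches_final_query(miss))
```

The second sample shows why the initial, simpler query (“machine learning” AND “yield prediction”) was kept alongside the final one: the mandatory “data mining” term makes the final string considerably more restrictive.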

3.4. Exclusion criteria

To exclude irrelevant studies, the studies were analyzed and graded

based on exclusion criteria to set the boundaries for the systematic

review. The exclusion criteria (EC) are shown as follows:

Exclusion criterion 1 – Publication is not related to the agricultural sector and yield prediction combined with machine learning

Exclusion criterion 2 – Publication is not written in English

Exclusion criterion 3 – Publication is a duplicate or was already retrieved from another database

Exclusion criterion 4 – Full text of the publication is not available

Exclusion criterion 5 – Publication is a review/survey paper

Exclusion criterion 6 – Publication was published before 2008

After the ﬁrst three exclusion criteria were applied, only 77 studies

remained for further analysis. After applying all the six exclusion cri-

teria, 50 studies were selected for further analysis. In Table 1, we show

the number of initially retrieved papers and the number of papers after

selection criteria were applied, together with the distribution of the selected publications across the databases we searched. As shown in Table 1, most of the selected papers were retrieved from the Google Scholar, Scopus, and Springer Link databases.
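The percentage column of Table 1 follows directly from the per-database counts. The short sketch below recomputes it from the numbers reported in Table 1; the variable names are our own and carry no meaning beyond this illustration.

```python
# Counts taken from Table 1: papers initially retrieved and papers selected
# after applying the exclusion criteria, per database.
retrieved = {
    "Science Direct": 17,
    "Scopus": 68,
    "Web of Science": 32,
    "Springer Link": 132,
    "Wiley": 20,
    "Google Scholar": 298,
}
selected = {
    "Science Direct": 4,
    "Scopus": 11,
    "Web of Science": 0,
    "Springer Link": 10,
    "Wiley": 1,
    "Google Scholar": 24,
}

total_retrieved = sum(retrieved.values())
total_selected = sum(selected.values())

# Share of the selected papers contributed by each database, in percent.
percentages = {db: 100 * n / total_selected for db, n in selected.items()}

for db in retrieved:
    print(f"{db}: {retrieved[db]} retrieved, {selected[db]} selected, "
          f"{percentages[db]:.0f}% of selected papers")
print("Totals:", total_retrieved, total_selected)
```

The totals reproduce the 567 retrieved and 50 selected studies stated in the text.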

To answer the four research questions, data from the selected stu-

dies have been extracted and synthesized. The information retrieved

was focused on checking whether or not the studies meet the require-

ments stated in the exclusion criteria and on responding to the research

questions. The selected studies that passed the exclusion criteria are

presented in Appendix A. During the data synthesis, all the extracted

data have been combined and synthesized, and the research questions

were answered accordingly. The results are presented in Section 4.

4. Results

The selected publications are shown in Table 2. The table shows the

publication year, title, and algorithms used in these papers.

Fig. 4 shows the number of publications per year published in the

last ten years. This ﬁgure indicates that recently the number of papers

Fig. 2. Details of the Conducting Review Step.

Fig. 3. Details of the Reporting Review Step.

Table 1
Distribution of papers based on the databases.

Database | # of initially retrieved papers | # of papers after exclusion criteria | Percentage of papers (%)
Science Direct | 17 | 4 | 8
Scopus | 68 | 11 | 22
Web of Science | 32 | 0 | 0
Springer Link | 132 | 10 | 20
Wiley | 20 | 1 | 2
Google Scholar | 298 | 24 | 48
Total | 567 | 50 | 100

Fig. 4. Distribution of the selected publications per year.


Table 2
Selected publications.

Retrieved From | Reference | Title | Algorithm used | Year
Scopus | Ruß et al. (2008) | Data Mining with Neural Networks for Wheat Yield Prediction | Neural networks | 2008
Science Direct | Everingham et al. (2009) | Ensemble data mining approaches to forecast regional sugarcane crop production | Forward stagewise algorithm | 2009
Springer Link | Ruß & Kruse (2010) | Regression Models for Spatial Data: An Example from Precision Agriculture | Clustering, random forest, support vector machine | 2010
Springer Link | Baral et al. (2011) | Yield Prediction Using Artificial Neural Networks | Neural networks | 2011
Springer Link | Črtomir et al. (2012) | Application of Neural Networks and Image Visualization for Early Forecast of Apple Yield | Neural networks | 2012
Google Scholar | Johnson (2013) | Crop yield forecasting on the Canadian Prairies by remotely sensed vegetation indices and machine learning methods | Multiple linear regression, neural networks | 2013
Google Scholar | Romero et al. (2013) | Using classification algorithms for predicting durum wheat yield in the province of Buenos Aires | K-nearest neighbor, decision tree | 2013
Google Scholar | Ananthara et al. (2013) | CRY - an improved crop yield prediction model using bee hive clustering approach for agricultural data sets | Clustering | 2013
Scopus | Shekoofa et al. (2014) | Determining the most important physiological and agronomic traits contributing to maize grain yield through machine learning algorithms: A new avenue in intelligent agriculture | Decision tree, clustering | 2014
Scopus | Gonzalez-Sanchez et al. (2014) | Predictive ability of machine learning methods for massive crop yield prediction | M5-prime regression tree, k-nearest neighbor, support vector machine | 2014
Scopus | Pantazi et al. (2014) | Application of supervised self-organizing models for wheat yield prediction | Neural networks | 2014
Google Scholar | Cakir et al. (2014) | Yield prediction of wheat in south-east region of Turkey by using artificial neural networks | Neural networks, multivariate polynomial regression | 2014
Google Scholar | Rahman & Haq (2014) | Machine learning facilitated rice prediction in Bangladesh | Decision tree, neural networks, linear regression | 2014
Scopus | Kunapuli et al. (2015) | Yield prediction for precision territorial management in maize using spectral data | Polynomial regression, logistic regression | 2015
Google Scholar | Matsumura et al. (2015) | Maize yield forecasting by linear regression and artificial neural networks in Jilin, China | Neural networks, multiple linear regression | 2015
Google Scholar | Ahamed et al. (2015) | Applying data mining techniques to predict annual yield of major crops and recommend planting different crops in different districts in Bangladesh | Linear regression, neural networks, clustering, k-nearest neighbor | 2015
Google Scholar | Paul et al. (2015) | Analysis of soil behavior and prediction of crop yield using data mining approach | Naïve Bayes, k-nearest neighbor | 2015
Science Direct | Pantazi et al. (2016) | Wheat yield prediction using machine learning and advanced sensing techniques | Neural networks | 2016
Scopus | Jeong et al. (2016) | Random forests for global and regional crop yield predictions | Random forest, linear regression | 2016
Wiley | Mola-Yudego et al. (2016) | Spatial yield estimates of fast-growing willow plantations for energy based on climatic variables in northern Europe | Gradient boosting tree | 2016
Google Scholar | Everingham et al. (2016) | Accurate prediction of sugarcane yield using a random forest algorithm | Random forest | 2016
Scopus | Gandhi et al. (2016) | Rice crop yield prediction in India using support vector machines | Support vector machine | 2016
Google Scholar | Bose et al. (2016) | Spiking neural networks for crop yield estimation based on spatiotemporal analysis of image time series | Neural networks | 2016
Google Scholar | Gandhi et al. (2016) | Rice crop yield prediction using artificial neural networks | Neural networks | 2016
Google Scholar | Gandhi and Armstrong (2016) | Applying data mining techniques to predict yield of rice in Humid Subtropical Climatic Zone of India | Decision tree, logistic regression, k-nearest neighbor | 2016
Google Scholar | Sujatha and Isakki (2016) | A study on crop yield forecasting using classification techniques | Naïve Bayes, J48, random forest, neural networks, decision tree, support vector machines (No experimental results reported) | 2016
Google Scholar | Ying-xue et al. (2017) | Support vector machine-based open crop model (SBOCM): Case of rice production in China | Support vector machine | 2017
Google Scholar | Cheng et al. (2017) | Early yield prediction using image analysis of apple fruit and tree canopy features with neural networks | Neural networks | 2017
Google Scholar | Bargoti and Underwood (2017) | Image segmentation for fruit detection and yield estimation in apple orchards | Neural networks | 2017
Google Scholar | Fernandes et al. (2017) | Sugarcane yield prediction in Brazil using NDVI time series and neural networks ensemble | Neural networks | 2017
Google Scholar | You et al. (2017) | Deep Gaussian process for crop yield prediction based on remote sensing data | Neural networks and gaussian process, neural networks | 2017
Springer Link | Osman et al. (2017) | Predicting Early Crop Production by Analysing Prior Environment Factors | Neural networks, linear regression | 2017
Google Scholar | Ali et al. (2017) | Modeling managed grassland biomass estimation by using multitemporal remote sensing data machine learning approach | ANFIS, neural networks, multiple linear regression | 2017
Science Direct | Kouadio et al. (2018) | Artificial intelligence approach for the prediction of Robusta coffee yield using soil fertility properties | Extreme learning machine, multiple linear regression, random forest | 2018
Springer Link | Goldstein et al. (2018) | Applying machine learning on sensor data for irrigation recommendations: revealing the agronomists tacit knowledge | Gradient boosting tree, linear regression | 2018
Scopus | Zhong et al. (2018) | Hierarchical modeling of seed variety yields and decision making for future planting plan | Random forest, linear regression | 2018
Scopus | Crane-Droesch (2018) | Machine learning methods for crop yield prediction and climate change impact assessment in agriculture | Neural networks | 2018
Scopus | Villanueva et al. (2018) | Bitter melon crop yield prediction using Machine Learning Algorithm | Neural networks | 2018
Google Scholar | Girish et al. (2018) | Crop Yield and Rainfall Prediction in Tumakuru District using Machine Learning | Support vector machine, linear regression, k-nearest neighbor | 2018
Google Scholar | Khanal et al. (2018) | Integration of high resolution remotely sensed data and machine learning techniques for spatial prediction of soil properties and corn yield | Neural networks, support vector machine, random forest | 2018
Google Scholar | Taherei Ghazvinei et al. (2018) | Sugarcane growth prediction based on meteorological parameters using extreme learning machine and artificial neural network | Neural networks | 2018
Springer Link | Ahmad et al. (2018) | Yield Forecasting of Spring Maize Using Remote Sensing and Crop Modeling in Faisalabad-Punjab Pakistan | Support vector machine, random forest, decision tree | 2018


on crop yield prediction is increasing.

There were no exclusion criteria based on the type of publication;

therefore, conference papers were also included. The pie chart in Fig. 5

shows the distribution of types of publications. The ﬁgure shows that

most of the articles we accessed are journal articles; conference papers

and book chapters constitute less than 25% of the total number of pa-

pers.

To address research question two (RQ2), features used in the ma-

chine learning algorithms applied in the papers were investigated and

summarized. All features we were able to extract are shown in Table 3.

As shown in Table 3, the most used features are related to tem-

perature, rainfall, and soil type. Crop yield is the dependent variable. To

get a better overview of the independent variables (features), the features were grouped. The independent features can be grouped into soil information, crop information, humidity, nutrients, field management, solar information, and other features. The number of times these groups are used is presented in Table 4. As shown in this table, the most used feature groups are those related to soil, solar, and humidity information.

The feature group “soil information” consists of the following variables: soil maps, soil type, pH value, cation exchange capacity, and area of production. Whether soil maps were used, and the information content of those maps, differs among the publications. Soil maps contain general information about the nutrients in the soil, the soil type, and the location. Crop information refers to information about the crop itself, such as weight, growth during the growth process, plant variety, and crop density. Other measurements that indicate growth, for example, the leaf area index, are also included in this group. Humidity stands for the water in the field. The features that fall under the humidity group include rainfall, humidity, forecasted rainfall, and precipitation. Nutrients can be nutrients that are already in the soil or nutrients that are applied. These features measure the level of saturation. The measured nutrients are nitrogen, magnesium, potassium, sulphur, zinc, boron, calcium, manganese, and phosphorus. Field management groups the decisions that farmers make to adjust their fields. These features are irrigation and fertilization, and thus field management could also refer to the management of nutrients. The solar information group contains features related to radiation or temperature: gamma radiometrics, temperature, photoperiod, shortwave radiation, degree-days, and solar radiation. The feature group labeled ‘Other’ contains the features that cannot be put in any of the groups mentioned above. Most of these features are used only once or are calculated features (Measuring Vegetation (NDVI & EVI), 2000). The less frequently used features include wind speed, pressure, and images. The calculated features are MODIS Enhanced Vegetation Index (MODIS-EVI), Normalized Difference Vegetation Index (NDVI), and Enhanced Vegetation Index

Table 2 (continued)

Retrieved From | Reference | Title | Algorithm used | Year
Springer Link | Shah et al. (2018) | Smart Farming System: Crop Yield Prediction Using Regression Techniques | Support vector machine, random forest, multivariate polynomial regression | 2018
Springer Link | Monga (2018) | Estimating Vineyard Grape Yield from Images | Neural networks | 2018
Google Scholar | Wang et al. (2018) | Deep transfer learning for crop yield prediction with remote sensing data | Neural networks | 2018
Science Direct | Xu et al. (2019) | Design of an integrated climatic assessment indicator (ICAI) for wheat production: A case study in Jiangsu Province, China | Random forest, support vector machine | 2019
Scopus | Filippi et al. (2019b) | An approach to forecast grain crop yield using multi-layered, multi-farm data sets and machine learning | Random forest | 2019
Google Scholar | Rao & Manasa (2019) | Artificial Neural Networks for Soil Quality and Crop Yield Prediction using Machine Learning | Neural networks | 2019
Springer Link | Ranjan & Parida (2019) | Paddy acreage mapping and yield prediction using sentinel-based optical and SAR data in Sahibganj district, Jharkhand (India) | Linear regression | 2019
Springer Link | Charoen-Ung & Mittrapiyanuruk (2019) | Sugarcane Yield Grade Prediction Using Random Forest with Forward Feature Selection and Hyper-parameter Tuning | Random forest | 2019

Fig. 5. Distribution of the type of 50 primary publications.


(EVI) (Filippi et al., 2019a).
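The review does not state formulas for the calculated vegetation indices. As an illustration only, the sketch below uses the standard definitions of NDVI and of the MODIS-style EVI, with the commonly used coefficients G = 2.5, C1 = 6, C2 = 7.5, and L = 1; these definitions, the function names, and the sample reflectance values are assumptions that do not come from the reviewed studies, which may have computed the indices differently.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    denominator = nir + red
    if denominator == 0:
        raise ValueError("NDVI is undefined when nir + red is zero")
    return (nir - red) / denominator


def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, canopy_l=1.0):
    """Enhanced Vegetation Index with the commonly used MODIS coefficients."""
    denominator = nir + c1 * red - c2 * blue + canopy_l
    if denominator == 0:
        raise ValueError("EVI is undefined for these reflectance values")
    return gain * (nir - red) / denominator


# Hypothetical surface reflectances for a vegetated pixel (unitless, 0 to 1).
nir, red, blue = 0.5, 0.1, 0.05
print(f"NDVI = {ndvi(nir, red):.3f}, EVI = {evi(nir, red, blue):.3f}")
```

Both indices rise with the contrast between near-infrared and red reflectance, which is why they are used as proxies for vegetation vigor in yield models.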

To represent all the features gathered through this SLR study, we drew a feature diagram, depicted in Fig. 6, which shows the significant features and sub-features.

To address the first research question (RQ1), machine learning algorithms were investigated and summarized. The algorithms used more than once are listed in Table 5. As shown in the table, Neural Networks (NN) and Linear Regression are the two most frequently used algorithms. Random Forest (RF) and Support Vector Machines (SVM) are also widely used, according to Table 5.

To address research question three (RQ3), evaluation parameters

were identiﬁed. All the evaluation parameters that were used and the

number of times they were used are shown in Table 6. As the table

shows, Root Mean Square Error (RMSE) is the most used parameter in

the studies.
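For reference, the most frequently reported evaluation parameters in Table 6 can be computed as follows. This is a minimal sketch using the standard textbook definitions; the function names and the sample values are illustrative and are not taken from any of the reviewed studies.

```python
import math


def mse(y_true, y_pred):
    """Mean square error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error (assumes no observed value is zero)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination (assumes the observed values are not all equal)."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted yields (e.g., in t/ha).
observed = [2.0, 4.0, 6.0]
predicted = [2.5, 3.5, 6.0]
```

RMSE and MAE are expressed in the units of the yield itself, whereas MAPE and R-squared are unit-free, which makes the latter easier to compare across crops and regions.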

Apart from the evaluation parameters, several validation ap-

proaches were used as well. Most of the time, cross-validation is used.

The most used evaluation method was 10-fold cross-validation.
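The 10-fold cross-validation procedure can be sketched as below. This is an illustrative, library-free implementation; the function name, the shuffling seed, and the way folds are assigned are assumptions rather than details reported by the reviewed studies.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Assign every k-th shuffled index to the same fold so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


# Each sample is used for testing exactly once and for training k - 1 times.
splits = list(k_fold_indices(25, k=10))
```

In practice, a model is trained on each training split, evaluated on the matching test split, and the k scores are averaged.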

To address research question four (RQ4), the publications were read to see if they stated any problems or improvements for future models. In several studies, insufficient availability of data (too few data) was mentioned as a problem. The studies stated that their systems worked for the limited data that they had at hand and indicated that data with more variety should be used for further testing, that is, data with different climatic circumstances, different vegetation, and longer time series of yield data. Another suggested improvement is that more data sources should be integrated. Finally, the publications indicated that the use of machine learning in farm management systems should be explored. If the models work as required, software applications must be created that allow the farmer to make decisions based on the models.

5. Deep learning-based crop yield prediction

In the ﬁrst part of our research (i.e., Systematic Literature Review),

we observed that Artiﬁcial Neural Networks (ANN) is the most used

algorithm for crop yield prediction. Recently, deep learning, which is a

sub-branch of machine learning, has provided state-of-the-art results in

many different domains, such as face recognition and image classification. Deep Neural Network (DNN) algorithms build on the same concepts as ANN algorithms; however, they consist of many hidden layers instead of a single hidden layer and can include different hidden layer types, such as convolutional and pooling layers.

As such, in the second part of our research, we aimed to investigate

to what extent deep learning algorithms have been applied in crop yield

prediction. To broaden our analysis and reach recent applications of

deep learning algorithms in yield prediction, we designed a new search

criterion (i.e., “deep learning” AND “yield prediction”) and performed a

new search in the same electronic databases that were used during the

SLR study. We reached the following 30 papers shown in Table 7. We

investigated these articles in detail, extracted, and synthesized the deep

learning algorithms applied by researchers.

Fig. 7 shows the yearly distribution of deep learning-based papers. Although only about half of 2020 had passed at the time of this search, the number of papers from 2020 already equals the number of papers published in 2019. This suggests that the number of papers is increasing every year.

In Table 8, we show the distribution of deep learning-based papers per database. Most of the papers were retrieved from Google Scholar, and the second top database was Scopus. Science Direct and Springer Link each returned six deep learning-based papers.

In Table 9, we show the distribution of applied deep learning al-

gorithms in the identiﬁed papers list. The most applied deep learning

algorithm is Convolutional Neural Networks (CNN), and the other

widely used algorithms are Long-Short Term Memory (LSTM) and Deep

Neural Networks (DNN) algorithms. Since some papers applied more

than one deep learning algorithm, the total number of usages shown in

the second column is larger than the total number of papers.
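The percentages in Table 9 are therefore computed over the 33 algorithm usages rather than over the 30 papers. A short sketch of that arithmetic, with the counts copied from Table 9 (the variable names are illustrative):

```python
# Usage counts as reported in Table 9.
usages = {
    "CNN": 10,
    "LSTM": 7,
    "DNN": 7,
    "Hybrid": 4,
    "Autoencoder": 1,
    "Multi-Task Learning (MTL)": 1,
    "Deep Recurrent Q-Network (DQN)": 1,
    "3D CNN": 1,
    "Faster R-CNN": 1,
}
n_papers = 30

total_usages = sum(usages.values())  # exceeds n_papers because some papers use several algorithms
shares = {name: round(100 * count / total_usages, 2) for name, count in usages.items()}
```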

These deep learning algorithms are briefly described as follows:

• Deep Neural Networks (DNN): DNN algorithms are very similar to traditional Artificial Neural Networks (ANN) algorithms except for the number of hidden layers. In DNN networks, there are many hidden layers that are mostly fully connected, as in the case of ANN algorithms. However, other kinds of deep learning algorithms, such as CNN, also use different types of layers, such as the convolutional layer and the pooling layer.

• Convolutional Neural Networks (CNN): Compared to a fully connected network, a CNN has fewer parameters to learn. There are three

types of layers in a CNN model, namely convolutional layers,

pooling layers, and fully-connected layers. Convolutional layers

consist of ﬁlters and feature maps. Filters are the neurons of the

layer, have weighted inputs, and create an output value (Brownlee,

2016). A feature map can be considered as the output of one ﬁlter.

Pooling layers are applied to down-sample the feature map of the

previous layers, generalize feature representations, and reduce the

Table 3

All features used.

| Feature | # of times used |
|---|---|
| Temperature | 24 |
| Soil type | 17 |
| Rainfall | 17 |
| Crop information | 13 |
| Soil maps | 12 |
| Humidity | 11 |
| pH-value | 11 |
| Solar radiation | 10 |
| Precipitation | 9 |
| Images | 8 |
| Area of production | 8 |
| Fertilization | 7 |
| NDVI | 6 |
| Cation exchange capacity | 6 |
| Nitrogen | 6 |
| Irrigation | 5 |
| Potassium | 5 |
| Wind speed | 5 |
| Zinc | 3 |
| Magnesium | 3 |
| Shortwave radiation | 2 |
| Sulphur | 2 |
| Boron | 2 |
| Calcium | 2 |
| Organic carbon | 2 |
| EVI | 2 |
| Phosphorus | 2 |
| Gamma radiometrics | 1 |
| MODIS-EVI | 1 |
| Forecasted rainfall | 1 |
| Photoperiod | 1 |
| Climate | 1 |
| Degree-days | 1 |
| Time | 1 |
| Pressure | 1 |
| Leaf area index | 1 |
| Manganese | 1 |

Table 4

Grouped features.

| Group | # of times used |
|---|---|
| Soil information | 54 |
| Solar information | 39 |
| Humidity | 38 |
| Nutrients | 28 |
| Other | 24 |
| Crop information | 14 |
| Field management | 12 |


overﬁtting (Brownlee, 2019). Fully-connected layers are mostly

used at the end of the network for predictions. The general pattern

for CNN models is that one or more convolutional layers are fol-

lowed by a pooling layer, and this structure is repeated several

times, and ﬁnally, fully connected layers are applied (Brownlee,

2016, 2019).

• Long-Short Term Memory (LSTM): LSTM networks were designed

speciﬁcally for sequence prediction problems. There are several

LSTM architectures (Brownlee, 2017), namely vanilla LSTM, stacked

LSTM, CNN-LSTM, Encoder-Decoder LSTM, Bidirectional LSTM, and

Generative LSTM. There are several limitations of Multi-Layer

Fig. 6. Feature diagram.

Table 5

Most used machine learning algorithms.

| Algorithm | # of times used |
|---|---|
| Neural Networks | 27 |
| Linear Regression | 14 |
| Random Forest | 12 |
| Support Vector Machine | 10 |
| Gradient Boosting Tree | 4 |

Table 6

All evaluation parameters used.

| Key | Evaluation parameter | # of times used |
|---|---|---|
| RMSE | Root mean square error | 29 |
| R² | R-squared | 19 |
| MAE | Mean absolute error | 8 |
| MSE | Mean square error | 5 |
| MAPE | Mean absolute percentage error | 3 |
| RSAE | Reduced simple average ensemble | 3 |
| LCCC | Lin's concordance correlation coefficient | 1 |
| MFE | Multi factored evaluation | 1 |
| SAE | Simple average ensemble | 1 |
| rcv | Reference change values | 1 |
| MCC | Matthew's correlation coefficient | 1 |


Table 7

Deep learning-based publications.

| Retrieved from | Reference | Title | Deep learning algorithm(s) used | Year |
|---|---|---|---|---|
| Science Direct | Schwalbert et al. (2020) | Satellite-based soybean yield forecast: Integrating machine learning and weather data for improving crop yield prediction in southern Brazil | Long-Short Term Memory (LSTM) | 2020 |
| Science Direct | Chu and Yu (2020) | An end-to-end model for rice yield prediction using deep learning fusion | The combination of Back-Propagation Neural Networks (BPNNs) and Independently Recurrent Neural Network (IndRNN) | 2020 |
| Science Direct | Tedesco-Oliveira et al. (2020) | Convolutional neural networks in predicting cotton yield from images of commercial fields | Convolutional Neural Networks (CNN) | 2020 |
| Science Direct | Nevavuori et al. (2019) | Crop yield prediction with deep convolutional neural networks | Convolutional Neural Networks (CNN) | 2019 |
| Science Direct | Maimaitijiang et al. (2020) | Soybean yield prediction from UAV using multimodal data fusion and deep learning | Deep Neural Networks (DNN) | 2020 |
| Science Direct | Yang et al. (2019) | Deep convolutional neural networks for rice grain yield estimation at the ripening stage using UAV-based remotely sensed images | Convolutional Neural Networks (CNN) | 2019 |
| Google Scholar | Khaki and Wang (2019) | Crop Yield Prediction Using Deep Neural Networks | Deep Neural Networks (DNN) | 2019 |
| Google Scholar | Rahnemoonfar and Sheppard (2017) | Real-time yield estimation based on deep learning | Convolutional Neural Networks (CNN) | 2017 |
| Google Scholar | Chen et al. (2019) | Strawberry Yield Prediction Based on a Deep Neural Network Using High-Resolution Aerial Orthoimages | Faster Region-based Convolutional Neural Networks (Faster R-CNN) | 2019 |
| Google Scholar | Sun et al. (2019) | County-Level Soybean Yield Prediction Using Deep CNN-LSTM Model | The combination of Convolutional Neural Networks and Long-Short Term Memory Networks (CNN-LSTM) | 2019 |
| Google Scholar | Khaki et al. (2020) | A CNN-RNN Framework for Crop Yield Prediction | The combination of Convolutional Neural Networks and Recurrent Neural Networks (CNN-RNN) | 2020 |
| Google Scholar | Terliksiz and Altılar (2019) | Use Of Deep Neural Networks For Crop Yield Prediction: A Case Study Of Soybean Yield in Lauderdale County, Alabama, USA | 3D Convolutional Neural Networks (3D CNN) | 2019 |
| Google Scholar | Lee et al. (2019) | A Self-Predictable Crop Yield Platform (SCYP) Based On Crop Diseases Using Deep Learning | Convolutional Neural Networks (CNN) | 2019 |
| Google Scholar | Elavarasan and Vincent (2020) | Crop Yield Prediction Using Deep Reinforcement Learning Model for Sustainable Agrarian Applications | Deep Recurrent Q-Network | 2020 |
| Google Scholar | Wang et al. (2020) | Winter Wheat Yield Prediction at County Level and Uncertainty Analysis in Main Wheat-Producing Regions of China with Deep Learning Approaches | The combination of Convolutional Neural Networks and Long-Short Term Memory (CNN-LSTM) | 2020 |
| Google Scholar | Wolanin et al. (2020) | Estimating and understanding crop yields with explainable deep learning in the Indian Wheat Belt | Convolutional Neural Networks (CNN) | 2020 |
| Springer Link | Bhojani and Bhatt (2020) | Wheat crop yield prediction using new activation functions in neural network | Deep Neural Networks (DNN) | 2020 |
| Springer Link | Fathi et al. (2019) | Crop Yield Prediction Using Deep Learning in Mediterranean Region | Deep Neural Networks (DNN) | 2019 |
| Springer Link | Shidnal et al. (2019) | Crop yield prediction: two-tiered machine learning model approach | Convolutional Neural Networks (CNN) | 2019 |
| Springer Link | Khaki and Wang (2019) | Crop Yield Prediction Using Deep Neural Networks | Deep Neural Networks (DNN) | 2019 |
| Springer Link | Nguyen et al. (2019) | Spatial-Temporal Multi-Task Learning for Within-Field Cotton Yield Prediction | Spatial-Temporal Multi-Task Learning | 2019 |
| Springer Link | De Alwis et al. (2019) | Duo Attention with Deep Learning on Tomato Yield Prediction and Factor Interpretation | Duo Attention Long-Short Term Memory | 2019 |
| Wiley | Jiang et al. (2020) | A deep learning approach to conflating heterogeneous geospatial data for corn yield estimation: A case study of the US Corn Belt at the county level | Long-Short Term Memory (LSTM) | 2020 |
| Scopus | Saravi et al. (2019) | Quantitative model of irrigation effect on maize yield by deep neural network | Deep Neural Networks (DNN) | 2019 |
| Scopus | Zhang et al. (2020) | Combining Optical, Fluorescence, Thermal Satellite, and Environmental Data to Predict County-Level Maize Yield in China Using Machine Learning Approaches | Long-Short Term Memory (LSTM) | 2020 |
| Scopus | Kang et al. (2020) | Comparative assessment of environmental variables and machine learning algorithms for maize yield prediction in the US Midwest | Long-Short Term Memory (LSTM) and Convolutional Neural Networks (CNN) | 2020 |
| Scopus | Wang et al. (2020) | Combining Multi-Source Data and Machine Learning Approaches to Predict Winter Wheat Yield in the Conterminous United States | Deep Neural Networks (DNN) | 2020 |
| Scopus | Ju et al. (2020) | Machine learning approaches for crop yield prediction with MODIS and weather data | Long-Short Term Memory (LSTM), Convolutional Neural Networks (CNN), Stacked-Sparse AutoEncoder (SSAE) | 2020 |
| Scopus | Yalcin (2019) | An Approximation for A Relative Crop Yield Estimate from Field Images Using Deep Learning | Convolutional Neural Networks (CNN) | 2019 |
| Scopus | Wang et al. (2018) | Deep Transfer Learning for Crop Yield Prediction with Remote Sensing Data | Long-Short Term Memory (LSTM) for Transfer Learning | 2018 |


Perceptron (MLP) feedforward ANN algorithms, such as being stateless, being unaware of temporal structure, messy scaling, and requiring fixed-size inputs and fixed-size outputs (Brownlee, 2017). Compared to the MLP network, LSTM can be considered as the addition of loops to the network. LSTM is a special type of Recurrent Neural Network (RNN) algorithm. Because an LSTM has an internal state, is aware of the temporal structure in its inputs, can model parallel input series, and can process variable-length input to generate variable-length output, it differs substantially from MLP networks. The memory cell is the computational unit of the LSTM (Brownlee, 2017). These cells consist of weights (i.e., input weights, output weights, and internal state) and gates (i.e., forget gate, input gate, and output gate).

• 3D CNN: This network is a special type of CNN model in which the kernels move through height, length, and depth. As such, it produces 3D activation maps. This type of model was developed to improve the identification of motion, as in the case of security cameras and medical scans. 3D convolutions are performed in the convolutional layers of the CNN (Ji et al., 2012).

• Faster R-CNN: The Region-Based Convolutional Neural Network (R-

CNN) is a family of CNN models that were designed speciﬁcally for

object detection (Brownlee, 2019). There are four variations of R-

CNN, namely R-CNN, Fast R-CNN, Faster R-CNN, and Mask R-CNN.

In Faster R-CNN, a Region Proposal Network is added to interpret

features extracted from CNN (Ren et al., 2015).

• Autoencoder: Autoencoders are unsupervised learning approaches

that consist of the following four main parts: encoder, bottleneck,

decoder, and reconstruction loss. The architecture of autoencoders

can be designed based on simple feedforward neural networks, CNN,

or LSTM networks (Baldi, 2012; Vincent et al., 2008).

• Hybrid networks: It is possible to combine the strengths of different deep learning algorithms, and researchers have combined them in different ways. Chu and Yu (2020) combined Back-Propagation Neural Networks (BPNNs) and Independently Recurrent

Neural Network (IndRNN) and applied this model for crop yield

prediction. Sun et al. (2019) combined Convolutional Neural Net-

works and Long-Short Term Memory Networks (CNN-LSTM) for

soybean yield prediction. Khaki et al. (2020) combined Convolu-

tional Neural Networks and Recurrent Neural Networks (CNN-RNN)

for yield prediction. Wang et al. (2020) combined CNN and LSTM

(CNN-LSTM) networks for the wheat yield prediction problem.

• Multi-Task Learning (MTL): In multi-task learning, representations are shared between tasks to improve the performance of the models developed for these tasks (Ruder, 2017). It has been applied

in many diﬀerent domains, such as drug discovery, speech re-

cognition, and natural language processing. The aim is to improve

the performance of all the tasks involved instead of improving the

performance of a single task. Zhang and Yang (2017) reviewed

several multi-task learning approaches for supervised learning tasks

and also explained how to combine multi-task learning with other

learning categories, such as semi-supervised learning and re-

inforcement learning. They divided supervised MTL approaches into

the following categories: feature learning approach, low-rank ap-

proach, task clustering approach, task relation learning approach,

and decomposition approach.

• Deep Recurrent Q-Network (DQN): In reinforcement learning, agents observe the environment and act based on some rules and the available data. Agents receive rewards (positive or negative) based on their actions and try to maximize the cumulative reward. The environment and agents interact with each other continuously. The DQN algorithm was developed in 2015 by researchers at DeepMind, a company acquired by Google in 2014. This algorithm, which combines the power of reinforcement learning and deep neural networks, learned to play several Atari games in 2015. The classical Q-learning algorithm was enhanced with deep neural networks, and the experience replay technique was integrated (Mnih et al., 2015). Elavarasan and Vincent (2020) applied this algorithm for crop yield prediction.
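To make the last item more concrete, the classical Q-learning update and an experience replay buffer can be sketched as follows. This is a tabular illustration only: in a DQN, the lookup table `q` is replaced by a deep neural network, and the state names, learning rate, and discount factor used here are hypothetical values chosen for the example.

```python
import random
from collections import deque


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one classical Q-learning update to the table q and return the new value."""
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    target = reward + gamma * best_next
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (target - old)
    return q[(state, action)]


class ReplayBuffer:
    """Fixed-capacity store of past transitions, sampled at random for training."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def add(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size, rng):
        items = list(self.buffer)
        return rng.sample(items, min(batch_size, len(items)))


# Hypothetical two-action problem: replay stored transitions to update the table.
actions = ["a", "b"]
q_table = {}
buffer = ReplayBuffer(capacity=2)
for transition in [("s0", "a", 1.0, "s1"), ("s1", "b", 0.0, "s0"), ("s0", "a", 1.0, "s1")]:
    buffer.add(transition)
for state, action, reward, next_state in buffer.sample(2, random.Random(0)):
    q_update(q_table, state, action, reward, next_state, actions)
```

Sampling past transitions at random, rather than learning only from the most recent one, reduces the correlation between consecutive updates.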

The number of papers that apply deep learning for crop yield pre-

diction is increasing. As such, we expect to see more research in this

direction.
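Since CNN is the most applied algorithm in these papers, the two layer types that distinguish it from a fully connected network are illustrated below. The sketch is library-free and simplified (single channel, no padding, stride 1 for the convolution, non-overlapping pooling windows); the function names and the sample image and filter are illustrative assumptions.

```python
def conv2d(image, kernel):
    """Slide one filter over a 2D input and return the resulting feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Down-sample a feature map by taking the maximum of each size x size window."""
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [
        [
            max(feature_map[i + di][j + dj]
                for di in range(size) for dj in range(size))
            for j in cols
        ]
        for i in rows
    ]


# Hypothetical 4 x 4 single-band image and a 2 x 2 filter.
image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
kernel = [[1, 0], [0, -1]]
feature_map = conv2d(image, kernel)   # 3 x 3 feature map
pooled = max_pool2d(image, size=2)    # 2 x 2 down-sampled map
```

The filter here has four weights regardless of the image size, which illustrates why a convolutional layer has fewer parameters to learn than a fully connected layer over the same input.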

6. Discussion

• General discussion: Such research is susceptible to threats to validity, which can concern external validity, construct validity, and reliability (Šmite et al., 2010). The external validity

and construct validity are addressed for this SLR study since the

initial search string was broad, and the query returned a substantial

number of studies: 567 publications in total. The search string

covered the whole scope of the SLR. For reliability of the SLR, the

validity can be considered well-addressed since the process of the

SLR has been described clearly and is replicable. If this SLR is re-

plicated, it could return slightly diﬀerent selected publications, but

the diﬀerences would be a result of diﬀerent personal judgments.

However, it is highly unlikely that the overall ﬁndings would

change.

• Search-related discussion: There is a possibility that valuable

publications might have been missed. More synonyms could have

been used, and a broader search could have returned new studies.

However, the search string resulted in a high number of publications

Fig. 7. Yearly distribution of deep learning-based papers.

Table 8

Distribution of deep learning-based papers per database.

| Database | # of papers | Percentage of papers (%) |
|---|---|---|
| Science Direct | 6 | 20 |
| Scopus | 7 | 23.33 |
| Web of Science | 0 | 0 |
| Springer Link | 6 | 20 |
| Wiley | 1 | 3.33 |
| Google Scholar | 10 | 33.33 |
| Total | 30 | 100 |

Table 9

Distribution of deep learning algorithms.

| Algorithms used | # of usages | Percentage (%) |
|---|---|---|
| CNN | 10 | 30.30 |
| LSTM | 7 | 21.21 |
| DNN | 7 | 21.21 |
| Hybrid | 4 | 12.12 |
| Autoencoder | 1 | 3.03 |
| Multi-Task Learning (MTL) | 1 | 3.03 |
| Deep Recurrent Q-Network (DQN) | 1 | 3.03 |
| 3D CNN | 1 | 3.03 |
| Faster R-CNN | 1 | 3.03 |
| Total | 33 | 100 |


indicating a broad enough search.

• Analysis-related discussion: Another potential threat to validity is the way the analysis was conducted. For example, not all publications stated what kind of evaluation parameters were used, and sometimes just a few examples of features were explained. Thus, the information required to address the research questions sometimes could not be found in the paper. As a result, the data used to answer the research questions were derived from fewer publications than the total of 50 selected publications. To get more information about the publications, the authors could potentially have been contacted, but this line of action was not feasible within the context of this research and might not have solved all the issues.

• RQ1-related (algorithms) discussion: Linear Regression is the second most used algorithm, according to Table 5. Linear Regression is used as a benchmarking algorithm in most cases to check

whether the proposed algorithm is better than Linear Regression or

not. Therefore, although it is shown in many articles, it does not

mean that it is the best performing algorithm. Table 5 should be

interpreted carefully because “most used” does not mean the best-

performing ones. In fact, Deep Learning (DL), which is a sub-branch

of Machine Learning, has been used for the crop yield prediction

problem recently and is believed to be very promising. In this study,

we also identiﬁed several deep learning-based studies. There are

several additional promising aspects of DL methods, such as auto-

matic feature extraction and superior performance. We expect that

more research will be conducted on the use of DL approaches in crop

yield prediction in the near future due to the superior performance

of DL algorithms in other problem domains.

Among the selected publications, both classifiers and clustering algorithms are used. Since pictures are used for clustering in those publications, they relate to machine vision rather than ML using a numerical dataset. The use of clustering algorithms for this problem can be investigated in detail to find different research perspectives on this problem.

• RQ2-related (features) discussion: Groups are created for features

and algorithms to visualize the main features and algorithms. Due to

this decision, detailed information is lost, but clarity has been

maintained. The most used features are soil type, rainfall, and

temperature. Apart from those features that are used in several

studies, there are also features that were used in speciﬁc studies.

Those features are gamma radiometrics, MODIS-EVI, forecasted rainfall,

humidity, photoperiod, pH-value, irrigation, leaf area, NDVI, EVI,

and crop information. There are also studies that use diﬀerent nu-

trients as features, which are magnesium, potassium, sulphur, zinc,

nitrogen, boron, and calcium. The most used features are not always

the same kind of data. Temperature, for example, is measured as

average temperature, but more features like maximum temperature

and minimum temperature are also applied.

• RQ3-related (evaluation parameters and approaches) discussion: There are not many evaluation parameters reported in the selected papers. RMSE was the most frequently used measure of model quality (29 times, Table 6). Other evaluation parameters are MSE, R², and MAE. Some parameters were used only in specific studies; most of these resemble the previously mentioned parameters, with a small difference. These are MAPE, LCCC, MFE, SAE, rcv, RSAE, and MCC. Most of the models had outcomes with high accuracy values for their evaluation parameters, which means that the models made accurate predictions. As the evaluation approach, 10-fold cross-validation was preferred by researchers.

• RQ4-related (challenges) discussion: Challenges were reported

based on the explicit statements in the articles. However, there

might be additional challenges that were not stated in the identiﬁed

papers. The challenges are mainly in the ﬁeld of improvement of a

working model. When more data is gathered to train and test, much

more can be said about the precision of the model. Another challenge is the implementation of the models in farm management systems. Only when applications are available that the farmer can use can the models support decisions, including during the growing season. When specific parameters for the specific location are measured and added, predictions will have higher precision.
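The benchmarking practice noted in the RQ1 discussion, in which a proposed model is checked against Linear Regression, can be sketched as follows. This is a minimal illustration with a single feature and made-up numbers; the function names and the data are hypothetical and do not come from any reviewed study.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def beats_baseline(candidate_rmse, baseline_rmse):
    """A proposed model is only of interest if its error is below the baseline error."""
    return candidate_rmse < baseline_rmse


# Hypothetical data: seasonal rainfall (dm) versus yield (t/ha).
rainfall = [1.0, 2.0, 3.0, 4.0]
yields = [3.0, 5.2, 6.8, 9.0]
slope, intercept = fit_line(rainfall, yields)
baseline_predictions = [slope * x + intercept for x in rainfall]
baseline_rmse = rmse(yields, baseline_predictions)
```

In a real comparison, both errors would be computed on held-out data, for example with the cross-validation approach discussed under RQ3.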

7. Conclusion

This study showed that the selected publications use a variety of features, depending on the scope of the research and the availability of data. Every paper investigates yield prediction with machine learning, but the papers differ in the features used. The studies also differ in scale, geographical location, and crop. The choice of features depends on the availability of the dataset and the aim of the research. Studies also stated that models with more features did not always provide the best performance for yield prediction. To find the best performing model, models with more and fewer features should be tested. Many algorithms have been used in different studies. The results show that no specific conclusion can be drawn as to what the best model is, but they clearly show that some machine learning models are used more than others. The most used models are random forest, neural networks, linear regression, and gradient boosting tree. Most of the studies used a variety of machine learning models to test which model had the best prediction.

Since Neural Networks is the most applied algorithm, we also aimed

to investigate to what extent deep learning algorithms were used for

crop yield prediction. After the identiﬁcation of 30 papers that applied

deep learning, we extracted and synthesized the applied algorithms. We

observed that CNN, LSTM, and DNN algorithms are the most preferred

deep learning algorithms. However, other kinds of algorithms have also been applied to this problem. We expect that this article will pave the way for further research on the crop yield prediction problem.

In our future work, we aim to build on the outcomes of this study

and focus on the development of a DL-based crop yield prediction

model.

Declaration of Competing Interest

The authors declare that they have no known competing ﬁnancial

interests or personal relationships that could have appeared to inﬂu-

ence the work reported in this paper.

Appendix A

In Table A1, the features used per publication are shown. A ‘1’ in a box means that the specific feature was used.

In Table A2, the evaluation parameters used per publication are presented.


Table A1

Features used per selected publication.

Paper | Soil type | Gamma radiometrics | Soil maps | MODIS-EVI | Rainfall | Forecasted rainfall | Precipitation | Temperature | Humidity | Photoperiod | Fertilization | Climate | pH-value | Irrigation | Cation exchange capacity | Magnesium | Potassium | Area of production | Wind speed

Filippi et al., 2019 1 1 1 1 1 1

Jeong et al., 2016 1 1 1 1 1 1 1

Zhong et al., 2018 1 1 1 1

Villanueva and

Salenga, 2018

Crane-Droesch,

2018

1 1 1 1 1 1 1

Gonzalez-Sanchez

et al., 2014

1 1 1 1 1

Xu et al., 2019 1 1 1

Pantazi et al., 2016 1 1 1

Kouadio et al.,

2018

1 1 1 1 1

Kunapuli et al.,

2015

1 1

Shekoofa et al.,

2014

1 1

Pantazi et al., 2014 1 1 1 1

Goldstein et al.,

2018

1 1 1

Mola-Yudego et

al., 2016

1 1

Girish et al., 2018 1

Rao and Manasa,

2019

1 1 1 1 1 1

Khanal et al., 2018 1 1 1 1 1 1

Cheng et al., 2017

Everingham et al.,

2009

1 1

Everingham et al.,

2016

1 1

Bargoti and

Underwood,

2017

Fernandes and

Ebecken, 2017

Johnson et al.,

2013

Matsumura et al.,

2015

1 1 1

Taherei Ghazvinei,

2018

1 1 1 1 1


Romero et al.,

2013

Su et al., 2017 1 1 1 1 1 1 1 1

You et al., 2017

Ahmad et al., 2018 1 1

Črtomir et al.,

2012

Osman et al., 2017 1 1 1 1 1

Ranjan and Parida,

2019

1 1

Shah et al., 2018 1 1 1

Russ et al., 2008 1 1

Monga, 2018

Russ and Kruse,

2010

1 1 1

Baral et al., 2011 1 1 1

Ahamed et al.,

2015

1 1 1 1 1 1

Ali et al., 2017 1 1

Cakir et al., 2014 1 1 1 1

Gandhi et al., 2016 1 1 1

Wang et al., 2018

Charoen-Ung and

Mittrapiyanur-

uk, 2019

1 1 1 1

Ananthara et al.,

2013

1 1 1 1

Bose et al., 2016 1

Gandhi et al., 2016 1 1 1

Gandhi and

Armstrong,

2016

1 1 1 1

Paul et al., 2015 1 1

Rahman and Haq,

2014

1 1 1

Sujatha and Isakki,

2016

1 1

Paper | Shortwave radiation | Degree-days | Time | Solar radiation | Pressure | Sulphur | Zinc | Nitrogen | Boron | Calcium | Crop information | Leaf area index | Phosphorus | Manganese | Organic carbon | Images | NDVI | EVI

Filippi et al., 2019

Jeong et al., 2016

Zhong et al., 2018 1

Villanueva and

Salenga, 2018

Crane-Droesch,

2018

1 1 1


Gonzalez-Sanchez

et al., 2014

Xu et al., 2019 1

Pantazi et al., 2016

Kouadio et al.,

2018

1 1 1 1 1

Kunapuli et al.,

2015

1 1

Shekoofa et al.,

2014

Pantazi et al., 2014 1 1 1

Goldstein et al.,

2018

1 1

Mola-Yudego et

al., 2016

Girish et al., 2018

Rao and Manasa,

2019

1 1 1 1 1 1

Khanal et al., 2018

Cheng et al., 2017 1

Everingham et al.,

2009

Everingham et al.,

2016

1 1

Bargoti and

Underwood,

2017

Fernandes and

Ebecken, 2017

Johnson et al.,

2013

1 1

Matsumura et al.,

2015

Taherei Ghazvinei,

2018

Romero et al.,

2013

Su et al., 2017 1


You et al., 2017 1

Ahmad et al., 2018 1 1 1 1 1

Črtomir et al.,

2012

Osman et al., 2017 1

Ranjan and Parida,

2019

Shah et al., 2018

Russ et al., 2008 1

Monga, 2018 1

Russ and Kruse,

2010

Baral et al., 2011

Ahamed et al.,

2015

Ali et al., 2017 1 1 1

Cakir et al., 2014 1

Gandhi et al., 2016 1

Wang et al., 2018 1 1

Charoen-Ung and

Mittrapiyanur-

uk, 2019

Ananthara et al.,

2013

1 1

Bose et al., 2016 1 1

Gandhi et al., 2016 1

Gandhi and

Armstrong,

2016

Paul et al., 2015 1 1 1

Rahman and Haq,

2014

Sujatha and Isakki,

2016


Table A2

Evaluation parameters used per publication.

Paper | Root mean square error | Lin's concordance correlation coefficient | Mean square error | R-squared | Mean absolute error | Mean absolute percentage error | Multi factored evaluation | Simple average ensemble | Reference change values | Reduced simple average ensemble | Matthew's correlation coefficient

Filippi et al., 2019 1 1 1

Jeong et al., 2016 1

Zhong et al., 2018 1 1

Villanueva and Salenga,

2018

Crane-Droesch, 2018 1

Gonzalez-Sanchez et al.,

2014

1 1 1

Xu et al., 2019 1 1

Pantazi et al., 2016 1

Kouadio et al., 2018 1 1

Kunapuli et al., 2015 1

Shekoofa et al., 2014

Pantazi et al., 2014

Goldstein et al., 2018 1

Mola-Yudego et al., 2016 1 1

Girish et al., 2018

Rao and Manasa, 2019

Khanal et al., 2018 1 1

Cheng et al., 2017 1 1 1 1

Everingham et al., 2009 1 1 1

Everingham et al., 2016 1

Bargoti and Underwood,

2017

Fernandes and Ebecken,

2017

1 1

Johnson et al., 2013

Matsumura et al., 2015 1 1

Taherei Ghazvinei, 2018 1 1

Romero et al., 2013

Su et al., 2017 1

You et al., 2017 1

Ahmad et al., 2018 1 1 1 1

Črtomir et al., 2012 1


Osman et al., 2017 1

Ranjan and Parida, 2019

Shah et al., 2018 1 1 1

Russ et al., 2008 1 1

Monga, 2018 1 1

Russ and Kruse, 2010 1

Baral et al., 2011

Ahamed et al., 2015 1

Ali et al., 2017 1 1

Cakir et al., 2014 1

Gandhi et al., 2016 1 1 1 1

Wang et al., 2018 1 1

Charoen-Ung and Mittrapiyanuruk, 2019

Ananthara et al., 2013

Bose et al., 2016 1 1 1

Gandhi et al., 2016 1 1 1

Gandhi and Armstrong, 2016 1 1 1

Paul et al., 2015

Rahman and Haq, 2014 1

Sujatha and Isakki, 2016


Appendix B. Supplementary material

Supplementary data to this article can be found online at https://doi.org/10.1016/j.compag.2020.105709.

References

Ahamed, A.T.M.S., Mahmood, N.T., Hossain, N., Kabir, M.T., Das, K., Rahman, F., Rahman, R.M., 2015. Applying data mining techniques to predict annual yield of major crops and recommend planting different crops in different districts in Bangladesh. In: 2015 IEEE/ACIS 16th International Conference on Software Engineering, Artificial Intelligence, Networking and Parallel/Distributed Computing, SNPD 2015 - Proceedings, https://doi.org/10.1109/SNPD.2015.7176185.

Ahmad, I., Saeed, U., Fahad, M., Ullah, A., Habib-ur-Rahman, M., Ahmad, A., Judge, J., 2018. Yield forecasting of spring maize using remote sensing and crop modeling in Faisalabad-Punjab Pakistan. J. Indian Soc. Remote Sens. 46 (10), 1701–1711. https://doi.org/10.1007/s12524-018-0825-8.

Ali, I., Cawkwell, F., Dwyer, E., Green, S., 2017. Modeling managed grassland biomass estimation by using multitemporal remote sensing data—a machine learning approach. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 10 (7), 3254–3264. https://doi.org/10.1109/JSTARS.2016.2561618.

Alpaydin, E., 2010. Introduction to Machine Learning, 2nd ed. Retrieved from https://books.google.nl/books?hl=nl&lr=&id=TtrxCwAAQBAJ&oi=fnd&pg=PR7&dq=introduction+to+machine+learning&ots=T5ejQG_7pZ&sig=0xC_H0agN7mPhYW7oQsWiMVwRnQ#v=onepage&q=introduction to machine learning&f=false.

Ananthara, M.G., Arunkumar, T., Hemavathy, R., 2013. CRY-An improved crop yield prediction model using bee hive clustering approach for agricultural data sets. In: Proceedings of the 2013 International Conference on Pattern Recognition, Informatics and Mobile Engineering, PRIME 2013, 473–478. https://doi.org/10.1109/ICPRIME.2013.6496717.

Ayodele, T.O., 2010. Introduction to Machine Learning.

Baldi, P., 2012. Autoencoders, unsupervised learning, and deep architectures. In: Proceedings of ICML workshop on unsupervised and transfer learning, pp. 37–49.

Baral, S., Kumar Tripathy, A., Bijayasingh, P., 2011. Yield Prediction Using Artificial Neural Networks, pp. 315–317. https://doi.org/10.1007/978-3-642-19542-6_57.

Bargoti, S., Underwood, J.P., 2017. Image segmentation for fruit detection and yield estimation in apple orchards. J. Field Rob. 34 (6), 1039–1060. https://doi.org/10.1002/rob.21699.

Beulah, R., 2019. A survey on different data mining techniques for crop yield prediction. Int. J. Comput. Sci. Eng. 7 (1), 738–744. https://doi.org/10.26438/ijcse/v7i1.738744.

Bhojani, S.H., Bhatt, N., 2020. Wheat crop yield prediction using new activation functions in neural network. Neural Comput. Appl. 1–11.

Bose, P., Kasabov, N., Bruzzone, L., n.d. Spiking neural networks for crop yield estimation based on spatiotemporal analysis of image time series. Ieeexplore.Ieee.Org. Retrieved from https://ieeexplore.ieee.org/abstract/document/7524771/.

Brownlee, J., 2016. Deep learning with Python: develop deep learning models on Theano and TensorFlow using Keras. Machine Learning Mastery.

Brownlee, J., 2017. Long Short-term Memory Networks with Python: Develop Sequence Prediction Models with Deep Learning. Machine Learning Mastery.

Brownlee, J., 2019. Deep Learning for Computer Vision: Image Classification, Object Detection, and Face Recognition in Python. Machine Learning Mastery.

Cakir, Y., Kirci, M., Gunes, E.O., 2014. Yield prediction of wheat in south-east region of Turkey by using artificial neural networks. In: 2014 The 3rd International Conference on Agro-Geoinformatics, Agro-Geoinformatics 2014. https://doi.org/10.1109/Agro-Geoinformatics.2014.6910609.

Charoen-Ung, P., Mittrapiyanuruk, P., 2019. Sugarcane yield grade prediction using random forest with forward feature selection and hyper-parameter tuning, pp. 33–42. https://doi.org/10.1007/978-3-319-93692-5_4.

Chen, Y., Lee, W.S., Gan, H., Peres, N., Fraisse, C., Zhang, Y., He, Y., 2019. Strawberry yield prediction based on a deep neural network using high-resolution aerial orthoimages. Remote Sens. 11 (13), 1584.

Cheng, H., Damerow, L., Sun, Y., Blanke, M., 2017. Early yield prediction using image analysis of apple fruit and tree canopy features with neural networks. J. Imag. 3 (1), 6. https://doi.org/10.3390/jimaging3010006.

Chlingaryan, A., Sukkarieh, S., Whelan, B., 2018. Machine learning approaches for crop yield prediction and nitrogen status estimation in precision agriculture: a review. Comput. Electron. Agric. 151, 61–69. https://doi.org/10.1016/j.compag.2018.05.012.

Chu, Z., Yu, J., 2020. An end-to-end model for rice yield prediction using deep learning fusion. Comput. Electron. Agric. 174.

Crane-Droesch, A., 2018. Machine learning methods for crop yield prediction and climate change impact assessment in agriculture. Environ. Res. Lett. 13 (11), 114003. https://doi.org/10.1088/1748-9326/aae159.

Črtomir, R., Urška, C., Stanislav, T., Denis, S., Karmen, P., Pavlovič, M., Marjan, V., 2012. Application of neural networks and image visualization for early forecast of apple yield. Erwerbs-Obstbau 54 (2), 69–76. https://doi.org/10.1007/s10341-012-0162-y.

De Alwis, S., Zhang, Y., Na, M., Li, G., 2019. Duo attention with deep learning on tomato yield prediction and factor interpretation. In: Pacific Rim International Conference on Artificial Intelligence. Springer, Cham, pp. 704–715.

Elavarasan, D., Vincent, P.D., 2020. Crop yield prediction using deep reinforcement learning model for sustainable agrarian applications. IEEE Access 8, 86886–86901.

Elavarasan, D., Vincent, D.R., Sharma, V., Zomaya, A.Y., Srinivasan, K., 2018. Forecasting yield by integrating agrarian factors and machine learning models: a survey. Comput. Electron. Agric. 155, 257–282. https://doi.org/10.1016/j.compag.2018.10.024.

Everingham, Y., Sexton, J., Skocaj, D., Inman-Bamber, G., 2016. Accurate prediction of sugarcane yield using a random forest algorithm. Agron. Sustainable Dev. 36 (2). https://doi.org/10.1007/s13593-016-0364-z.

Everingham, Y.L., Smyth, C.W., Inman-Bamber, N.G., 2009. Ensemble data mining approaches to forecast regional sugarcane crop production. Agric. For. Meteorol. 149 (3–4), 689–696. https://doi.org/10.1016/J.AGRFORMET.2008.10.018.

Fathi, M.T., Ezziyyani, M., Ezziyyani, M., El Mamoune, S., 2019. Crop yield prediction using deep learning in Mediterranean Region. In: International Conference on Advanced Intelligent Systems for Sustainable Development. Springer, Cham, pp. 106–114.

Fernandes, J.L., Ebecken, N.F.F., Esquerdo, J.C.D.M., 2017. Sugarcane yield prediction in Brazil using NDVI time series and neural networks ensemble. Int. J. Remote Sens. 38 (16), 4631–4644. https://doi.org/10.1080/01431161.2017.1325531.

Filippi, P., Jones, E.J., Wimalathunge, N.S., Somarathna, P.D.S.N., Pozza, L.E., Ugbaje, S.U., Bishop, T.F.A., 2019a. An approach to forecast grain crop yield using multi-layered, multi-farm data sets and machine learning. Precis. Agric. 1–15. https://doi.org/10.1007/s11119-018-09628-4.

Filippi, P., Jones, E.J., Wimalathunge, N.S., Somarathna, P.D.S.N., Pozza, L.E., Ugbaje, S.U., Bishop, T.F.A., 2019b. An approach to forecast grain crop yield using multi-layered, multi-farm data sets and machine learning. Precis. Agric. https://doi.org/10.1007/s11119-018-09628-4.

Gandhi, N., Armstrong, L., 2016. Applying data mining techniques to predict yield of rice in humid subtropical climatic zone of India. In: Proceedings of the 10th INDIACom; 2016 3rd International Conference on Computing for Sustainable Global Development, INDIACom 2016, 1901–1906. Retrieved from https://ieeexplore.ieee.org/abstract/document/7724597/.

Gandhi, N., Armstrong, L.J., 2016b. A review of the application of data mining techniques for decision making in agriculture. In: Proceedings of the 2016 2nd International Conference on Contemporary Computing and Informatics, https://doi.org/10.1109/IC3I.2016.7917925.

Gandhi, N., Petkar, O., Armstrong, L.J., Tripathy, A.K., 2016. Rice crop yield prediction in India using support vector machines. In: 2016 13th International Joint Conference on Computer Science and Software Engineering, JCSSE 2016. https://doi.org/10.1109/JCSSE.2016.7748856.

Girish, L., Gangadhar, S., Bharath, T., Balaji, K., n.d. Crop Yield and Rainfall Prediction in Tumakuru District using Machine Learning. Ijream.Org. Retrieved from https://www.ijream.org/papers/NCTFRD2018015.pdf.

Goldstein, A., Fink, L., Meitin, A., Bohadana, S., Lutenberg, O., Ravid, G., 2018. Applying machine learning on sensor data for irrigation recommendations: revealing the agronomist’s tacit knowledge. Precis. Agric. 19 (3), 421–444. https://doi.org/10.1007/s11119-017-9527-4.

Gonzalez-Sanchez, A., Frausto-Solis, J., Ojeda-Bustamante, W., 2014. Predictive ability of machine learning methods for massive crop yield prediction. Spanish J. Agric. Res. 12 (2), 313–328. https://doi.org/10.5424/sjar/2014122-4439.

Jeong, J.H., Resop, J.P., Mueller, N.D., Fleisher, D.H., Yun, K., Butler, E.E., Kim, S.H., 2016. Random forests for global and regional crop yield predictions. PLoS ONE 11 (6). https://doi.org/10.1371/journal.pone.0156571.

Ji, S., Xu, W., Yang, M., Yu, K., 2012. 3D convolutional neural networks for human action recognition. IEEE Trans. Pattern Anal. Mach. Intell. 35 (1), 221–231.

Jiang, H., Hu, H., Zhong, R., Xu, J., Xu, J., Huang, J., Lin, T., 2020. A deep learning approach to conflating heterogeneous geospatial data for corn yield estimation: a case study of the US Corn Belt at the county level. Glob. Change Biol. 26 (3), 1754–1766.

Johnson, M.D., 2013. Crop Yield Forecasting on the Canadian Prairies by Satellite Data and Machine Learning Methods. Master’s Thesis, University of British Columbia, Atmospheric Science. Retrieved from https://www.sciencedirect.com/science/article/pii/S0168192315007546.

Ju, S., Lim, H., Heo, J., 2020. Machine learning approaches for crop yield prediction with MODIS and weather data. 40th Asian Conference on Remote Sensing: Progress of Remote Sensing Technology for Smart Future, ACRS 2019.

Kang, Y., Ozdogan, M., Zhu, X., Ye, Z., Hain, C.R., Anderson, M.C., 2020. Comparative assessment of environmental variables and machine learning algorithms for maize yield prediction in the US Midwest. Environ. Res. Lett.

Khaki, S., Wang, L., 2019. Crop yield prediction using deep neural networks. Front. Plant Sci. 10, 621.

Khaki, S., Wang, L., Archontoulis, S.V., 2020. A cnn-rnn framework for crop yield prediction. Front. Plant Sci. 10, 1750.

Khanal, S., Fulton, J., Klopfenstein, A., Douridas, N., Shearer, S., 2018. Integration of high resolution remotely sensed data and machine learning techniques for spatial prediction of soil properties and corn yield. Comput. Electron. Agric. 153, 213–225. https://doi.org/10.1016/J.COMPAG.2018.07.016.

Kitchenham, B., Charters, S., Budgen, D., Brereton, P., Turner, M., Linkman, S., Visaggio, G., 2007. Guidelines for performing Systematic Literature Reviews in Software Engineering. Retrieved from https://userpages.uni-koblenz.de/~laemmel/esecourse/slides/slr.pdf.

Kouadio, L., Deo, R.C., Byrareddy, V., Adamowski, J.F., Mushtaq, S., Phuong Nguyen, V., 2018. Artificial intelligence approach for the prediction of Robusta coffee yield using soil fertility properties. Comput. Electron. Agric. 155, 324–338. https://doi.org/10.1016/J.COMPAG.2018.10.014.

Kunapuli, S.S., Rueda-Ayala, V., Benavidez-Gutierrez, G., Cordova-Cruzatty, A., Cabrera, A., Fernandez, C., Maiguashca, J., 2015. Yield prediction for precision territorial management in maize using spectral data. In: Precision Agriculture 2015 - Papers Presented at the 10th European Conference on Precision Agriculture, ECPA 2015 (pp. 199–206). Retrieved from https://www.scopus.com/inward/record.uri?eid=2-s2.0-84947244569&partnerID=40&md5=241e9b9de12f2eb0fae3ed0ee2fd22c0.

Lee, S., Jeong, Y., Son, S., Lee, B., 2019. A self-predictable crop yield platform (SCYP) based on crop diseases using deep learning. Sustainability 11 (13), 3637.

Li, B., Lecourt, J., Bishop, G., 2018. Advances in non-destructive early assessment of fruit ripeness towards defining optimal time of harvest and yield prediction—a review. Plants 7 (1). https://doi.org/10.3390/plants7010003.

Liakos, K.G., Busato, P., Moshou, D., Pearson, S., Bochtis, D., 2018. Machine learning in agriculture: a review. Sensors (Switzerland) 18 (8). https://doi.org/10.3390/s18082674.

Maimaitijiang, M., Sagan, V., Sidike, P., Hartling, S., Esposito, F., Fritschi, F.B., 2020. Soybean yield prediction from UAV using multimodal data fusion and deep learning. Remote Sens. Environ. 237.

Matsumura, K., Gaitan, C.F., Sugimoto, K., Cannon, A.J., Hsieh, W.W., 2015. Maize yield forecasting by linear regression and artificial neural networks in Jilin, China. J. Agric. Sci. 153 (3), 399–410. https://doi.org/10.1017/S0021859614000392.

Mayuri, P.K., Priya, V.C., n.d. Role of image processing and machine learning techniques in disease recognition, diagnosis and yield prediction of crops: a review. Int. J. Adv. Res. Comput. Sci., 9(2). https://doi.org/10.26483/ijarcs.v9i2.5793.

McQueen, R.J., Garner, S.R., Nevill-Manning, C.G., Witten, I.H., 1995. Applying machine learning to agricultural data. Comput. Electron. Agric. 12 (4), 275–293. https://doi.org/10.1016/0168-1699(95)98601-9.

Measuring Vegetation (NDVI & EVI), 2000. Retrieved from https://earthobservatory.nasa.gov/features/MeasuringVegetation/measuring_vegetation_2.php.

Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Petersen, S., 2015. Human-level control through deep reinforcement learning. Nature 518 (7540), 529–533.

Mola-Yudego, B., Rahlf, J., Astrup, R., Dimitriou, I., 2016. Spatial yield estimates of fast-growing willow plantations for energy based on climatic variables in northern Europe. GCB Bioenergy 8 (6), 1093–1105. https://doi.org/10.1111/gcbb.12332.

Monga, T., 2018. Estimating vineyard grape yield from images, pp. 339–343. https://doi.org/10.1007/978-3-319-89656-4_37.

Nevavuori, P., Narra, N., Lipping, T., 2019. Crop yield prediction with deep convolutional neural networks. Comput. Electron. Agric. 163.

Nguyen, L.H., Zhu, J., Lin, Z., Du, H., Yang, Z., Guo, W., Jin, F., 2019. Spatial-temporal multi-task learning for within-field cotton yield prediction. In: Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, Cham, pp. 343–354.

Osman, T., Psyche, S.S., Kamal, M.R., Tamanna, F., Haque, F., Rahman, R.M., 2017. Predicting early crop production by analysing prior environment factors, pp. 470–479. https://doi.org/10.1007/978-3-319-49073-1_51.

Pantazi, X.E., Moshou, D., Mouazen, A.M., Kuang, B., Alexandridis, T., 2014. Application of supervised self organising models for wheat yield prediction, pp. 556–565. https://doi.org/10.1007/978-3-662-44654-6_55.

Pantazi, X.E., Moshou, D., Alexandridis, T., Whetton, R.L., Mouazen, A.M., 2016. Wheat yield prediction using machine learning and advanced sensing techniques. Comput. Electron. Agric. 121, 57–65. https://doi.org/10.1016/j.compag.2015.11.018.

Paul, M., Vishwakarma, S.K., Verma, A., 2015. Analysis of soil behaviour and prediction of crop yield using data mining approach. In: 2015 International Conference on Computational Intelligence and Communication Networks (CICN). IEEE, pp. 766–771. https://doi.org/10.1109/CICN.2015.156.

Rahman, M., Haq, N., n.d. Machine learning facilitated rice prediction in Bangladesh. Ieeexplore.Ieee.Org. Retrieved from https://ieeexplore.ieee.org/abstract/document/7113655/.

Rahnemoonfar, M., Sheppard, C., 2017. Real-time yield estimation based on deep learning. In: Autonomous Air and Ground Sensing Systems for Agricultural Optimization and Phenotyping II Vol. 10218. International Society for Optics and Photonics, pp. 1021809.

Ranjan, A.K., Parida, B.R., 2019. Paddy acreage mapping and yield prediction using sentinel-based optical and SAR data in Sahibganj district, Jharkhand (India). Spatial Inf. Res. https://doi.org/10.1007/s41324-019-00246-4.

Rao, T., Manasa, S., n.d. Artificial Neural networks for soil quality and crop yield prediction using machine learning. Ijfrcsce.Org. Retrieved from http://www.ijfrcsce.org/download/browse/Volume_5/January_19_Volume_5_Issue_1/1547885118_19-01-2019.pdf.

Ren, S., He, K., Girshick, R., Sun, J., 2015. Faster r-cnn: Towards real-time object detection with region proposal networks. In: Advances in neural information processing systems, pp. 91–99.

Romero, J.R., Roncallo, P.F., Akkiraju, P.C., Ponzoni, I., Echenique, V.C., Carballido, J.A., 2013. Using classification algorithms for predicting durum wheat yield in the province of Buenos Aires. Comput. Electron. Agric. 96, 173–179. https://doi.org/10.1016/j.compag.2013.05.006.

Ruder, S., 2017. An overview of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098.

Ruß, G., Kruse, R., 2010. Regression models for spatial data: an example from precision agriculture, pp. 450–463. https://doi.org/10.1007/978-3-642-14400-4_35.

Ruß, G., Kruse, R., Schneider, M., Wagner, P., 2008. Data mining with neural networks for wheat yield prediction. In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 5077 LNAI. Springer Berlin Heidelberg, Berlin, Heidelberg, pp. 47–56. https://doi.org/10.1007/978-3-540-70720-2_4.

Saravi, B., Nejadhashemi, A.P., Tang, B., 2019. Quantitative model of irrigation effect on maize yield by deep neural network. Neural Comput. Appl. 1–14.

Schwalbert, R.A., Amado, T., Corassa, G., Pott, L.P., Prasad, P.V., Ciampitti, I.A., 2020. Satellite-based soybean yield forecast: Integrating machine learning and weather data for improving crop yield prediction in southern Brazil. Agric. For. Meteorol. 284.

Shah, A., Dubey, A., Hemnani, V., Gala, D., Kalbande, D.R., 2018. Smart Farming System: Crop Yield Prediction Using Regression Techniques. Springer, Singapore, pp. 49–56. https://doi.org/10.1007/978-981-10-8339-6_6.

Shekoofa, A., Emam, Y., Shekoufa, N., Ebrahimi, M., Ebrahimie, E., 2014. Determining the most important physiological and agronomic traits contributing to maize grain yield through machine learning algorithms: a new avenue in intelligent agriculture. PLoS ONE 9 (5), e97288. https://doi.org/10.1371/journal.pone.0097288.

Shidnal, S., Latte, M.V., Kapoor, A., 2019. Crop yield prediction: two-tiered machine learning model approach. Int. J. Inf. Technol. 1–9.

Šmite, D., Wohlin, C., Gorschek, T., Feldt, R., 2010. Empirical evidence in global software engineering: a systematic review. Empirical Softw. Eng. 15 (1), 91–118. https://doi.org/10.1007/s10664-009-9123-y.

Somvanshi, P., Mishra, B.N., 2015. Machine learning techniques in plant biology. In: PlantOmics: The Omics of Plant Science. Springer India, New Delhi, pp. 731–754. https://doi.org/10.1007/978-81-322-2172-2_26.

Su, Y.X., Xu, H., Yan, L.J., 2017. Support vector machine-based open crop model (SBOCM): case of rice production in China. Saudi J. Biol. Sci. 24 (3), 537–547.

Sujatha, R., Isakki, P., 2016. A study on crop yield forecasting using classification techniques. In: 2016 International Conference on Computing Technologies and Intelligent Data Engineering, ICCTIDE 2016. https://doi.org/10.1109/ICCTIDE.2016.7725357.

Sun, J., Di, L., Sun, Z., Shen, Y., Lai, Z., 2019. County-level soybean yield prediction using deep CNN-LSTM model. Sensors 19 (20), 4363.

Taherei-Ghazvinei, P., Hassanpour-Darvishi, H., Mosavi, A., Yusof, K.W., Alizamir, M., Shamshirband, S., Chau, K., 2018. Sugarcane growth prediction based on meteorological parameters using extreme learning machine and artificial neural network. Eng. Appl. Comput. Fluid Mech. 12 (1), 738–749. https://doi.org/10.1080/19942060.2018.1526119.

Tedesco-Oliveira, D., da Silva, R.P., Maldonado Jr, W., Zerbato, C., 2020. Convolutional neural networks in predicting cotton yield from images of commercial fields. Comput. Electron. Agric. 171.

Terliksiz, A.S., Altılar, D.T., 2019. Use of deep neural networks for crop yield prediction: a case study of soybean yield in Lauderdale County, Alabama, USA. In: 2019 8th International Conference on Agro-Geoinformatics (Agro-Geoinformatics). IEEE, pp. 1–4.

Villanueva, M.B., Louella, M., Salenga, M., 2018. Bitter Melon Crop Yield Prediction using Machine Learning Algorithm. (IJACSA) International Journal of Advanced Computer Science and Applications, Vol. 9. Retrieved from www.ijacsa.thesai.org.

Vincent, P., Larochelle, H., Bengio, Y., Manzagol, P.A., 2008. Extracting and composing robust features with denoising autoencoders. In: Proceedings of the 25th international conference on Machine learning, pp. 1096–1103.

Wang, A., Tran, C., Desai, N., Lobell, D., n.d. Deep transfer learning for crop yield prediction with remote sensing data. Dl.Acm.Org. Retrieved from https://dl.acm.org/citation.cfm?id=3212707.

Wang, X., Huang, J., Feng, Q., Yin, D., 2020. Winter wheat yield prediction at county level and uncertainty analysis in main wheat-producing regions of china with deep learning approaches. Remote Sens. 12 (11), 1744.

Wang, A.X., Tran, C., Desai, N., Lobell, D., Ermon, S., 2018. Deep transfer learning for crop yield prediction with remote sensing data. In: Proceedings of the 1st ACM SIGCAS Conference on Computing and Sustainable Societies, pp. 1–5.

Wang, Y., Zhang, Z., Feng, L., Du, Q., Runge, T., 2020. Combining multi-source data and machine learning approaches to predict winter wheat yield in the conterminous United States. Remote Sens. 12 (8), 1232.

Witten, I.H., Frank, E., Hall, M.A., Pal, C.J., 2016. Data Mining: Practical Machine Learning Tools and Techniques. Data Mining: Practical Machine Learning Tools and Techniques. https://doi.org/10.1016/c2009-0-19715-5.

Wolanin, A., Mateo-García, G., Camps-Valls, G., Gómez-Chova, L., Meroni, M., Duveiller, G., Guanter, L., 2020. Estimating and understanding crop yields with explainable deep learning in the Indian Wheat Belt. Environ. Res. Lett. 15 (2).

Xu, X., Gao, P., Zhu, X., Guo, W., Ding, J., Li, C., Wu, X., 2019. Design of an integrated climatic assessment indicator (ICAI) for wheat production: a case study in Jiangsu Province, China. Ecol. Ind. 101, 943–953. https://doi.org/10.1016/j.ecolind.2019.01.059.

Yalcin, H., 2019. An approximation for a relative crop yield estimate from field images using deep learning. In: 2019 8th International Conference on Agro-Geoinformatics (Agro-Geoinformatics). IEEE, pp. 1–6.

Yang, Q., Shi, L., Han, J., Zha, Y., Zhu, P., 2019. Deep convolutional neural networks for rice grain yield estimation at the ripening stage using UAV-based remotely sensed images. Field Crops Res. 235, 142–153.

Ying-xue, S., Huan, X., Li-jiao, Y., 2017. Support vector machine-based open crop model (SBOCM): Case of rice production in China. Saudi J. Biol. Sci. 24 (3), 537–547. https://doi.org/10.1016/j.sjbs.2017.01.024.

You, J., Li, X., Low, M., Lobell, D., Ermon, S., 2017. Deep Gaussian process for crop yield prediction based on remote sensing data. In: Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), 4559–4566. https://doi.org/10.1109/MWSCAS.2006.381794.

Zhang, Y., Yang, Q., 2017. A survey on multi-task learning. arXiv preprint arXiv:1707.08114.

Zhang, L., Zhang, Z., Luo, Y., Cao, J., Tao, F., 2020. Combining optical, fluorescence, thermal satellite, and environmental data to predict county-level maize yield in china using machine learning approaches. Remote Sens. 12 (1), 21.

Zhong, H., Li, Xiaocheng, Lobell, D., Ermon, S., Brandeau, M.L., 2018. Hierarchical modeling of seed variety yields and decision making for future planting plans. Environ. Syst. Decis. 38, 458–470.

T. van Klompenburg, et al. Computers and Electronics in Agriculture 177 (2020) 105709

Machine Learning Techniques for Cereal Crops Yield Prediction: A Comprehensive Review

Article

Full-text available

Jun 2024

Cereals are sensitive to small changes in complex combinations of biotic and abiotic factors. Such a complexity can be deciphered using techniques such as Machine learning (ML). Using the PRISMA approach, this paper explores the features and ML techniques in cereal yield prediction based on 115 articles from 2007 to 2023 in six databases. Results showed that most data in the articles were from secondary sources and only 28.68% used experiments or primary data. China (31) and the United States (18) contributed most. Wheat (48%), maize (33%), and rice (17%) represented the most studied cereals. Climate, remote sensing data, and soil parameters were the most used predictors. The most frequently used ML techniques for cereal prediction were support vector machine (SVM) (51%), multi-layer perceptron (MLP) (41%), linear regression (34%), random forest (RF) (24%), and XGBoost (20%). However, RF, MLP, and SVM models were the best-performing techniques to predict grain yield based on reported R-square and mean absolute error (MAE). The models in the studied articles generally performed well from test data, with an R-square between 0.7 and 1. The study further reveals that the data's availability and quality are the main obstacles to using ML models for crop prediction.

Regional Prediction of Crop Yield Success Rate in the Philippines using Geographic Trend Analysis Algorithm

Conference Paper

Full-text available

Jan 2024

HypsLiDNet: 3D-2D CNN Model and Spatial–Spectral Morphological Attention for Crop Classification with DESIS and LiDAR Data

Article

Full-text available

Jan 2024

The advent of cloud computing and advanced processing technologies has elevated Deep Learning (DL) as a leading method for Hyper-Spectral Imaging (HSI) classification. Classifying crops accurately is vital for generating precise agricultural data to support informed decision-making. This study, introduces a DL framework called HypsLiDNet, tailored for remote sensing activities. This model processes HSI in conjunction with innovative, comprehensive Light Detection and Ranging (LiDAR) data from Hungary to conduct thorough examinations of the Earth's surface. Integrating LiDAR attributes with HSI is anticipated to enhance classification accuracy beyond HSIonly techniques. LiDAR integration provides a significant advantage by adding structural details to spectral data, aiding in the correct identification of objects with similar spectral characteristics but different shapes. The HypsLiDNet method utilizes morphological operations on LiDAR data to extract features indicative of the land's shape and texture. These features are then combined with HSI data through an attention mechanism that selectively highlights key features from both data types, improving the model's accuracy in predictions. This is particularly beneficial for complex environmental assessments, like distinguishing between plant species. The attention mechanism also refines the feature selection process, prioritizing relevant information, which boosts computational efficiency and reduces the use of resources. Moreover, this method requires a smaller number of training samples. HypsLiDNet showcases its ability to classify with precision by harnessing the combined power of HSI and LiDAR data. Experimental results show a significant improvement in classification outcomes, outperforming traditional machine learning approaches by more than 14% and recent DL techniques by approximately 1-3%.

Wheat Yield Prediction Using Machine Learning Method Based on UAV Remote Sensing Data

Article

Full-text available

Jun 2024

Accurate forecasting of crop yields holds paramount importance in guiding decision-making processes related to breeding efforts. Despite significant advancements in crop yield forecasting, existing methods often struggle with integrating diverse sensor data and achieving high prediction accuracy under varying environmental conditions. This study focused on the application of multi-sensor data fusion and machine learning algorithms based on unmanned aerial vehicles (UAVs) in wheat yield prediction. Five machine learning (ML) algorithms, namely random forest (RF), partial least squares (PLS), ridge regression (RR), k-nearest neighbor (KNN) and extreme gradient boosting decision tree (XGboost), were utilized for multi-sensor data fusion, together with three ensemble methods including the second-level ensemble methods (stacking and feature-weighted) and the third-level ensemble method (simple average), for wheat yield prediction. The 270 wheat hybrids were used as planting materials under full and limited irrigation treatments. A cost-effective multi-sensor UAV platform, equipped with red–green–blue (RGB), multispectral (MS), and thermal infrared (TIR) sensors, was utilized to gather remote sensing data. The results revealed that the XGboost algorithm exhibited outstanding performance in multi-sensor data fusion, with the RGB + MS + Texture + TIR combination demonstrating the highest fusion performance (R2 = 0.660, RMSE = 0.754). Compared with the single ML model, the employment of three ensemble methods significantly enhanced the accuracy of wheat yield prediction. Notably, the third-layer simple average ensemble method demonstrated superior performance (R2 = 0.733, RMSE = 0.668 t ha−1). It significantly outperformed both the second-layer ensemble methods of stacking (R2 = 0.668, RMSE = 0.673 t ha−1) and feature-weighted (R2 = 0.667, RMSE = 0.674 t ha−1), thereby exhibiting superior predictive capabilities. 
This finding highlighted the third-layer ensemble method’s ability to enhance predictive capabilities and refined the accuracy of wheat yield prediction through simple average ensemble learning, offering a novel perspective for crop yield prediction and breeding selection.

Assessing Predictive Models for Tea Yield: A Statistical and Machine Learning Approach in Assam's Biswanath Chariali District

Article

Full-text available

Jun 2024

Climatic factors significantly impact Assam tea production. The tropical climate of Assam, characterized by high precipitation and temperatures up to 36°C during the monsoon, creates ideal conditions for tea cultivation, contributing to the region's unique malty flavor. Here, in this study an attempt has been made to bring a comparison among statistical and machine learning models in prediction of tea production and evaluate an optimal model among them. A time span of last 23 years data were collected from Biswanath College of Agriculture under Assam Agriculture University situated at Biswanath Chariali district. The study has found that mean absolute percentage error of random forest regression model is 6.49 percent followed by decision tree (7.3 percent) and linear regression model (7.5 percent). From the evaluation metrics, random forest algorithm fits well in comparison to decision tree and linear regression. This study could be generalized to comparison among more predictive machine learning models.

Temperature Forecasting of Grain in Storage: An Improved Approach Based on Broad Learning Network

Article

Full-text available

Jan 2024

Temperature forecasting of grain in storage is crucial for timely granary temperature control, mitigating adverse effects of extreme temperatures on grain quality. Although traditional machine learning methods are lightweight and relatively quick to train, they suffer from poor stability and high error rates in predicting grain storage temperature. Conversely, deep learning models, while more accurate, are time-consuming and parameter-heavy. To address these problems, this paper proposes an improved model with light weight and good accuracy, in which a broad learning network is combined with a one-dimensional convolution module and a multi-head self-attention mechanism (BLN-1DCNN-MHSA). Firstly, we employ a one-dimensional convolution module at the feature nodes of the model to extract local temporal correlations, compensating for the temporal sequence learning limitations of the BLN. Secondly, a multi-head self-attention mechanism at the enhancement nodes captures important feature dependencies and global temporal correlations. Lastly, our model achieves better prediction through the enhanced representation ability of the model nodes. The results with real grain storage temperature data demonstrate that the RMSE, MAPE, and MAE of the proposed model are 0.341, 0.54%, and 0.28, respectively, which represents a more than twofold improvement in accuracy compared to the BLN, and it also reduces training time by more than 90% compared with LSTM and Transformer models. Additionally, the generalization and robustness of the improved approach are demonstrated through promising results in a classification experiment on the MNIST dataset. In general, the model provides a feasible basis for early warning of grain storage risks by predicting temperature trends.
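
To illustrate how a one-dimensional convolution picks up local temporal patterns, here is a minimal sketch of a single "valid" convolution pass over a temperature series. It is not the BLN-1DCNN-MHSA model itself; the kernel and readings are hypothetical, and a trained model would learn its kernel weights.

```python
def conv1d_valid(series, kernel):
    """Slide a kernel over a series without padding.

    Follows the deep-learning convention (cross-correlation): the kernel is
    not flipped. The output has len(series) - len(kernel) + 1 values.
    """
    k = len(kernel)
    if k == 0 or k > len(series):
        raise ValueError("kernel must be non-empty and no longer than the series")
    return [
        sum(series[i + j] * kernel[j] for j in range(k))
        for i in range(len(series) - k + 1)
    ]


# Hypothetical hourly grain temperatures and a kernel that responds to the
# change across a three-step window (first value minus last value).
temperatures = [20.0, 21.0, 23.0, 22.0]
local_trend = conv1d_valid(temperatures, [1.0, 0.0, -1.0])
```

Each output value depends only on a short window of neighbouring readings, which is why such a module captures local rather than global temporal correlations.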

Kisan Dhan - Crop Price Prediction Using Random Forest

Article

Jun 2024

Sonali Antad

Accurate prediction of agricultural commodity prices plays an important role in ensuring food security, farm profitability, and well-informed decisions for both farmers and industry stakeholders. The proposed system is aimed primarily at farmers. It seeks to find the relationship between weather conditions and agricultural prices using a comprehensive dataset spanning past years, including historical price data, modal, maximum and minimum prices, productivity, production, and key meteorological factors such as rainfall and temperature. The system also uses machine learning algorithms to classify the effects of climate factors on price variations. The results show that Random Forests have superior prediction capacity compared with Decision Trees. These findings can help farmers make more secure decisions and face the challenges of price volatility. In a world where the stability of food production and economic sustainability depends on price predictability, this project contributes a practical tool for improving price prediction in the agricultural sector.

Artificial intelligence and its role in soil microbiology and agricultural sustenance

Chapter

Jan 2024

AI-Enhanced Precision Crop Rotation Management for Sustainable Agriculture

Conference Paper

Apr 2024

An In‐Depth Review on Machine Learning Infusion in an Agricultural Production System

Chapter

Jun 2024

Machine learning and Artificial neural networks in weld quality prediction

Poster

Full-text available

Jul 2019

Sreelekshmi Soman G

Winter Wheat Yield Prediction at County Level and Uncertainty Analysis in Main Wheat-Producing Regions of China with Deep Learning Approaches

Article

Full-text available

May 2020

Timely and accurate forecasting of crop yields is crucial to food security and sustainable development in the agricultural sector. However, winter wheat yield estimation and forecasting on a regional scale remains challenging. In this study, we established a two-branch deep learning model to predict winter wheat yield in the main producing regions of China at the county level. The first branch of the model was constructed based on Long Short-Term Memory (LSTM) networks with inputs from meteorological and remote sensing data. The other branch was constructed using Convolutional Neural Networks (CNN) to model static soil features. The model was then trained using detrended statistical yield data from 1982 to 2015 and evaluated by leave-one-year-out validation. The evaluation results showed a promising performance of the model, with an overall R2 and RMSE of 0.77 and 721 kg/ha, respectively. We further conducted yield prediction and uncertainty analysis based on the two-branch model and obtained a forecast accuracy one month prior to harvest of 0.75 and 732 kg/ha. Results also showed that while yield detrending could potentially introduce higher uncertainty, it had the advantage of improving the model performance in yield prediction.
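
The leave-one-year-out validation mentioned in this abstract can be sketched as a generator of train/test index splits. The function name and the sample years are assumptions for illustration; the study's own data handling is not described in the abstract.

```python
def leave_one_year_out(years):
    """Yield (held_out_year, train_indices, test_indices) for each year.

    years[i] is the harvest year of sample i. Every distinct year is held
    out once while all other years are used for training.
    """
    for held_out in sorted(set(years)):
        train_idx = [i for i, y in enumerate(years) if y != held_out]
        test_idx = [i for i, y in enumerate(years) if y == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical county-year samples: two from 1982, one each from 1983 and 1984.
folds = list(leave_one_year_out([1982, 1982, 1983, 1984]))
```

Holding out whole years keeps samples from the same season out of both sets at once, so the score reflects performance on an unseen year rather than on unseen counties within a known year.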

Crop Yield Prediction Using Deep Reinforcement Learning Model for Sustainable Agrarian Applications

Article

Full-text available

May 2020

Predicting crop yield based on environmental, soil, water and crop parameters has been a promising research topic. Deep-learning-based models are broadly used to extract significant crop features for prediction. Although these methods can address the yield prediction problem, they have the following inadequacies: they are unable to create a direct non-linear or linear mapping between the raw data and crop yield values, and their performance relies heavily on the quality of the extracted features. Deep reinforcement learning offers a way to address these shortcomings. Combining the intelligence of reinforcement learning and deep learning, deep reinforcement learning builds a complete crop yield prediction framework that can map the raw data to the crop prediction values. The proposed work constructs a Deep Recurrent Q-Network model, which places a Recurrent Neural Network deep learning algorithm over the Q-Learning reinforcement learning algorithm, to forecast the crop yield. The sequentially stacked layers of the Recurrent Neural Network are fed with the data parameters. The Q-learning network constructs a crop yield prediction environment based on the input parameters. A linear layer maps the Recurrent Neural Network output values to the Q-values. The reinforcement learning agent incorporates a combination of parametric features with a threshold that assists in predicting crop yield. Finally, the agent receives an aggregate score for the actions performed by minimizing the error and maximizing the forecast accuracy. The proposed model efficiently predicts the crop yield, outperforming existing models by preserving the original data distribution, with an accuracy of 93.7%.
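
Two pieces named in this abstract, the linear layer that maps recurrent outputs to Q-values and the Q-learning update, can be sketched as follows. This uses the textbook Q-learning target; the weights, reward, and discount factor are hypothetical, and the paper's actual reward design and network sizes are not given in the abstract.

```python
def linear_q_values(hidden, weights, biases):
    """Map a hidden-state vector to one Q-value per action (a linear layer).

    weights holds one row per action; each row has len(hidden) entries.
    """
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(weights, biases)
    ]


def q_learning_target(reward, next_q_values, gamma=0.9):
    """Standard Q-learning target: reward plus discounted best next Q-value."""
    return reward + gamma * max(next_q_values)


# Hypothetical two-dimensional hidden state and two actions.
q_now = linear_q_values([1.0, 2.0], [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.5])
target = q_learning_target(reward=1.0, next_q_values=[0.0, 2.0], gamma=0.5)
```

In training, the Q-value of the chosen action would be moved toward `target`, for example by minimizing their squared difference.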

Combining Multi-Source Data and Machine Learning Approaches to Predict Winter Wheat Yield in the Conterminous United States

Article

Full-text available

Apr 2020

Winter wheat (Triticum aestivum L.) is one of the most important cereal crops, supplying essential food for the world population. Because the United States is a major producer and exporter of wheat to the world market, accurate and timely forecasting of wheat yield in the United States (U.S.) is fundamental to national crop management as well as global food security. Previous studies mainly have focused on developing empirical models using only satellite remote sensing images, while other yield determinants have not yet been adequately explored. In addition, these models are based on traditional statistical regression algorithms, while more advanced machine learning approaches have not been explored. This study used advanced machine learning algorithms to establish within-season yield prediction models for winter wheat using multi-source data to address these issues. Specifically, yield driving factors were extracted from four different data sources, including satellite images, climate data, soil maps, and historical yield records. Subsequently, two linear regression methods, including ordinary least square (OLS) and least absolute shrinkage and selection operator (LASSO), and four well-known machine learning methods, including support vector machine (SVM), random forest (RF), Adaptive Boosting (AdaBoost), and deep neural network (DNN), were applied and compared for estimating the county-level winter wheat yield in the Conterminous United States (CONUS) within the growing season. Our models were trained on data from 2008 to 2016 and evaluated on data from 2017 and 2018, with the results demonstrating that the machine learning approaches performed better than the linear regression models, with the best performance being achieved using the AdaBoost model (R2 = 0.86, RMSE = 0.51 t/ha, MAE = 0.39 t/ha). 
Additionally, the results showed that combining data from multiple sources outperformed single source satellite data, with the highest accuracy being obtained when the four data sources were all considered in the model development. Finally, the prediction accuracy was also evaluated against timeliness within the growing season, with reliable predictions (R2 > 0.84) being able to be achieved 2.5 months before the harvest when the multi-source data were combined.
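
The three scores reported in this abstract (R2, RMSE, MAE) can be computed from held-out observations as in the sketch below. The function name and sample values are illustrative assumptions.

```python
import math


def regression_metrics(actual, predicted):
    """Return R2, RMSE and MAE for paired observations and predictions."""
    n = len(actual)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        # R2 is undefined when the observations do not vary.
        raise ValueError("actual values must not all be identical")
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(a - p) for a, p in zip(actual, predicted)) / n,
    }


# Hypothetical yields in t/ha: only the last prediction is off, by 1 t/ha.
scores = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```

RMSE and MAE are in the units of the target (here t/ha), while R2 is unitless, which is why abstracts usually report them together.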

Wheat crop yield prediction using new activation functions in neural network

Article

Full-text available

Sep 2020
NEURAL COMPUT APPL

This research is mainly based on the multilayer perceptron (MLP) neural network technique of data mining to forecast the wheat crop yield at the district level. There are many statistical and simulation models available, but the proposed algorithm with a new activation function provides promising results in a shorter time with more accuracy. Sigmoid and hyperbolic tangent activation functions are widely used in neural networks. The activation functions play an important role in the neural network learning algorithm. The main objective of the proposed work is to develop an amended MLP neural network with a new activation function and revised random weights and bias values for crop yield estimation by using different weather parameter datasets. The MLP model has been tested with existing activation functions and newly created activation functions in different cases, including different weights and bias values. In this research study, we evaluate the results of different activation functions and recommend some new simple activation functions, named DharaSig, DharaSigm and SHBSig, to improve the performance and accuracy of neural networks. In addition, three new activation functions, named DharaSig1, DharaSig2 and DharaSig3, were created with small variations of the DharaSig function. In this research study, variable numbers of hidden layers are tested with variable numbers of neurons per hidden layer for the agriculture dataset. Variable values of momentum, seed and learning rate are also used in this study. Experiments show that the newly created activation functions provide better results than the default ‘sigmoid’ neural network activation function for agriculture datasets.
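
The role an activation function plays in an MLP can be shown with a single neuron, using the two standard functions the abstract names (sigmoid and hyperbolic tangent). The new functions proposed in the paper are not defined in the abstract, so they are not reproduced here; the inputs and weights are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic sigmoid, mapping any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron(inputs, weights, bias, activation):
    """One MLP neuron: weighted sum plus bias, passed through an activation."""
    pre_activation = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(pre_activation)


# Hypothetical weather features and weights; the weighted sum here is 0.0.
features = [1.0, 2.0]
weights = [0.5, -0.25]
out_sigmoid = neuron(features, weights, 0.0, sigmoid)
out_tanh = neuron(features, weights, 0.0, math.tanh)
```

Passing the activation as an argument makes it easy to swap in a different function and compare results, which is the kind of experiment the abstract describes.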

Comparative assessment of environmental variables and machine learning algorithms for maize yield prediction in the US Midwest

Article

Full-text available

Jun 2020
ENVIRON RES LETT

Crop yield estimates over large areas are conventionally made using weather observations, but a comprehensive understanding of the effects of various environmental indicators, observation frequency, and the choice of prediction algorithm remains elusive. Here we present a thorough assessment of county-level maize yield prediction in the U.S. Midwest using six statistical/machine learning algorithms (Lasso, Support Vector Regressor, Random Forest, XGBoost, Long-short term memory (LSTM), and Convolutional Neural Network (CNN)) and an extensive set of environmental variables derived from satellite observations, weather data, land surface model results, soil maps, and crop progress reports. Results show that seasonal crop yield forecasting benefits from both more advanced algorithms and a large composite of information associated with crop canopy, environmental stress, phenology, and soil properties (i.e. hundreds of features). The XGBoost algorithm outperforms other algorithms both in accuracy and stability, while deep neural networks such as LSTM and CNN are not advantageous. The compositing interval (8-day, 16-day or monthly) of the time series variables does not have significant effects on the prediction. Combining the best algorithm and inputs improves the prediction accuracy by 5% when compared to a baseline statistical model (Lasso) using only basic climatic and satellite observations. Reasonable county-level yield forecasting is achievable from early June, almost four months prior to harvest. At the national level, early-season (June and July) prediction from the best model outperforms that of the United States Department of Agriculture (USDA) World Agricultural Supply and Demand Estimates (WASDE). This study provides insights into practical crop yield forecasting and the understanding of yield response to climatic and environmental conditions.
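
The compositing interval discussed above refers to how a dense time series is aggregated into fixed-length periods. The sketch below averages daily values into consecutive windows; using the mean is an assumption for illustration, since the abstract does not state the aggregation rule, and the data are hypothetical.

```python
def composite(daily_values, interval_days):
    """Aggregate a daily series into consecutive fixed-length composites.

    Each composite is the mean of its window; a shorter final window is
    averaged over the days it actually contains.
    """
    if interval_days <= 0:
        raise ValueError("interval_days must be positive")
    windows = (
        daily_values[start:start + interval_days]
        for start in range(0, len(daily_values), interval_days)
    )
    return [sum(window) / len(window) for window in windows]


# Hypothetical 16 days of a vegetation index, composited two ways.
daily = [float(day) for day in range(1, 17)]
eight_day = composite(daily, 8)
sixteen_day = composite(daily, 16)
```

Longer intervals yield fewer features per season, so the finding that the interval has little effect suggests the coarser composites retain most of the predictive signal.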

Crop Prediction Using Machine Learning and Artificial Neural Network

Chapter

Jul 2023

Tanya Saraswat

Estimating Vineyard Grape Yield from Images

Chapter

Apr 2018

Tanya Monga

Agricultural yield estimation from natural images is a challenging problem to which machine learning can be applied. Convolutional Neural Networks have advanced the state of the art in many machine learning applications such as computer vision, speech recognition and natural language processing. The proposed research uses convolution neural networks to develop models that can estimate the weight of grapes on a vine using an image. Trained and tested with a dataset of 60 images of grape vines, the system manages to achieve a cross-validation yield estimation accuracy of 87%.

An end-to-end model for rice yield prediction using deep learning fusion

Article

Jul 2020
COMPUT ELECTRON AGR

Rice yield is essential for more than half of the world’s population, and thus, accurate predictions of rice yield are of great importance for trade, development policies, humanitarian assistance, decision-makers, etc. However, traditional mechanistic models and statistical machine learning models need to identify features, making the research on and application of these models laborious and time-consuming. In this paper, a novel end-to-end prediction model that fuses two back-propagation neural networks (BPNNs) with an independently recurrent neural network (IndRNN), named BBI-model, is proposed to address these challenges. In stage one, BBI-model preprocesses the original area and meteorology data. In stage two, one BPNN and the IndRNN are used to learn deep spatial and temporal features in parallel. In stage three, another BPNN combines two kinds of deep features and learns the relationships between these deep features and rice yields to make predictions for summer and winter rice yields. The experimental results indicate that BBI-model achieved the lowest mean absolute error (MAE) and root mean square error (RMSE) for the summer rice prediction (0.0044 and 0.0057, respectively) and corresponding values of 0.0074 and 0.0192 for the winter rice prediction when the number of layers in the network was set to six. Moreover, the errors of the model using the combination of deep spatial-temporal features were significantly lower than when simply using deep temporal features. Furthermore, the model converged quickly with 100 iterations and then remained stable. These findings confirm that the model can make accurate predictions for summer and winter rice yields of 81 counties in the Guangxi Zhuang Autonomous Region, China.

Convolutional neural networks in predicting cotton yield from images of commercial fields

Article

Apr 2020
COMPUT ELECTRON AGR

One way to improve the quality of mechanized cotton harvesting is to change harvester settings and adjustments throughout the process, according to information obtained during the operation. We believe that yield predictions are important for managing the quality of operation, aiming at increasing efficiency and reducing losses. Therefore, this study aimed to develop an automated system for cotton yield prediction from color images acquired by a simple mobile device. We propose an approach that is robust to environmental conditions, training detection algorithms with images acquired at different times throughout the day, and evaluating three different scenarios (low-, average-, and high-demand computational resources). The experimental results for the average-demand computational scenario, which is suitable for real-time deployment on low-cost devices such as smartphones and other ARM-processed devices, indicated the possibility of counting bolls using images acquired at different times throughout the day, with mean errors of 8.84% (∼5 bolls). Furthermore, we observed a 17.86% error when predicting yield using 205 images from the testing dataset, which is equivalent to about 19.14 g.

Recommended publications

Temporal convolutional network based rice crop yield prediction using multispectral satellite data

Deep learning for crop yield prediction: a systematic literature review

A Comprehensive Review on Deep learning Techniques for Crop Yield Prediction

Hybrid Deep Learning-based Models for Crop Yield Prediction