The ANP Network for Predicting Tennis Matches between Djokovic and Federer

... data for the two players playing. The process is explained in the following section. In the experiment, the alternatives are the world's top two tennis players, who are about to face each other in the US Open final. Before this match, the two players had met 41 times, starting on April 17, 2006, at the Monte Carlo Masters; Federer won 21 (51.2%) of those matches and Djokovic won 20 (48.8%). Their rankings were also very close (Djokovic: 1, Federer: 3), so it was hard to predict the winner.


Source publication
Article
This paper is about predicting the outcome of tennis matches of the Association of Tennis Professionals (ATP) and the Women’s Tennis Association (WTA) using both data and judgments. There are many factors that influence that outcome. An important question is which factors have significant influence on the outcome. We have identified numerous factor...

Contexts in source publication

Context 1
... decision-making network of criteria and alternative outcomes was constructed as shown in Figure 5. The object is to predict which player will win the final match in the US OPEN 2015 by incorporating expert judgment with historical data about the two players. ...
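The context above describes combining expert judgment with historical data in a decision-making network. As a minimal sketch of the pairwise-comparison step that AHP/ANP models generally rely on, the Python below derives a priority vector from a reciprocal judgment matrix by power iteration. The function name `priority_vector` and the judgment values are hypothetical illustrations; they are not the criteria, judgments, or network shown in Figure 5 of the source publication.

```python
def priority_vector(matrix, iterations=100):
    """Approximate the principal eigenvector of a pairwise comparison
    matrix by power iteration, normalized so the priorities sum to 1."""
    n = len(matrix)
    weights = [1.0 / n] * n
    for _ in range(iterations):
        # Multiply the matrix by the current weights.
        product = [
            sum(matrix[i][j] * weights[j] for j in range(n))
            for i in range(n)
        ]
        # Renormalize so the weights stay on a comparable scale.
        total = sum(product)
        weights = [value / total for value in product]
    return weights


# Hypothetical judgment (an assumption, not from the paper): on one criterion,
# player A is judged twice as strong as player B.
judgments = [
    [1.0, 2.0],
    [0.5, 1.0],
]
priorities = priority_vector(judgments)
print(priorities)
```

In a full ANP model, local priority vectors of this kind would be assembled into a supermatrix that captures dependencies among criteria and alternatives; that step is omitted here.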

Similar publications

Article
The Association of Tennis Professionals (ATP) distributes a considerable amount of money in prizes each year. Studies have shown that only the top 100 ranked players can self-finance; hence, it is convenient to introduce changes to the prize distribution to promote a more sustainable system. A Linear Programming model to distribute the tournament’s...

Citations

... This article proposes a more efficient alternative, using a combination method that relies on the binomial distribution, which achieves the same accuracy as the recursive method. Wei Gu et al. [2] proposed "Predicting the Outcome of a Tennis Tournament: Based on Both Data and Judgments". This article is based on a combination of data and judgment, establishing an ANP decision-making network to predict tennis matches. ...
Article
Predicting match results has always been an important problem in the field of sports. Athletes and coaches can benefit from accurately predicting the path of a match, allowing them to strategize ahead of time and take control during the match. The PSO-BP neural network combines the advantages of particle swarm optimization and backpropagation neural networks to improve the accuracy of the model in processing complex nonlinear data interactions. ARIMAX is well suited to evaluating data that exhibits linear trends and seasonality. The model combines an autoregressive integrated moving average component with external variables to account for linear trends, seasonality, and external influences in the data. Using the data at https://github.com/JeffSackmann/tennis_wta, this article establishes a combination of the two models to predict game scores. Variables include advantage, turnover, lead, distance traveled, and serve. Using K-fold cross-validation, the accuracy rates of the PSO-BP neural network and the ARIMAX prediction model were measured to be 82.89% and 80.12%, respectively. Applying the models to basketball games gives prediction accuracies of 77.12% and 76.21%. A random forest analysis was used to evaluate the significance of each variable in prediction. The significance of each indicator is as follows: 0.33, 0.26, 0.17, 0.15 and 0.09.
... The advantages of the LR model include simplicity of computation, ease of understanding and interpretation, fast training speed, and good handling of sparse data. In addition, the LR model is suitable for binary classification problems and maintains good performance with high feature dimensions [10]. ...
Article
In any competitive game, the most exciting thing for the audience is the change in the scores of both players. Quantifying the trend of those scores can help coaches provide timely guidance to players during a game and can also help players prepare for games. The author first obtains data on the players' psychological state, physical reserve and athletic skills during the game, then compares the LGBM, SVC, MLP and LR models by analyzing the accuracy and regression rate of each, and selects LGBM as the optimal model. The weights of each parameter are calculated with the selected model and summed to obtain the quantitative value of "momentum". Finally, the fluctuations of the two sides' "momentum" over the whole game are plotted, so that the scoring trend can be visualized and the turning points of the momentum can be marked. Such methods are an innovation for sports events: they can help coaches make decisions scientifically and can bring more exciting sports events to the audience, fully reflecting the athletes' style.
... They verified that bookmakers' odds are a good predictor of outcomes of both men's and women's tennis matches. Gu and Saaty (2019) predicted the outcome of tennis matches of Grand slam tournaments as well as of the ATP and the Women's Tennis Association (WTA) using both data and (unqualified, subjective) judgments, and this way identified numerous factors and systematically prioritized them subjectively and objectively, so as to improve the accuracy of the prediction. In McHale and Morton (2011), a Bradley-Terry type model was proposed for forecasting the top tier of the WTA and ATP competition. ...
Article
In this manuscript, different approaches for modeling and prediction of tennis matches in Grand Slam tournaments are proposed. The data used here contain information on 5,013 matches in men’s Grand Slam tournaments from the years 2011–2022. All approaches considered are based on regression models, modeling the probability of the first-named player winning. Several potential covariates are considered, including the players’ age, the ATP ranking and points, odds, and Elo rating, as well as two additional age variables, which take into account that the optimal age of a tennis player is between 28 and 32 years. We compare the different regression model approaches with respect to three performance measures, namely classification rate, predictive Bernoulli likelihood, and Brier score, in a 43-fold cross-validation-type approach for the matches of the years 2011 to 2021. The top five optimal models with the highest average ranks are then selected. In order to predict and compare the results of the tournaments in 2022 with the actual results, a comparison over a continuously updating data set via a “rolling window” strategy is used. The previously mentioned performance measures are again calculated. Additionally, we examine whether the assumption of non-linear effects or additional court- and player-specific abilities is reasonable.
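The three performance measures named in this abstract can be illustrated with a short sketch. It assumes commonly used definitions (share of correct calls at a 0.5 threshold, mean probability assigned to the observed outcome, and mean squared error of the predicted probability); the exact definitions in the cited manuscript may differ, and the probabilities and outcomes below are made-up values, not data from that study.

```python
def classification_rate(probs, outcomes):
    """Share of matches where the predicted favourite (p > 0.5) matches the outcome."""
    correct = sum(1 for p, y in zip(probs, outcomes) if (p > 0.5) == (y == 1))
    return correct / len(probs)


def bernoulli_likelihood(probs, outcomes):
    """Mean probability the model assigned to the outcome that actually occurred."""
    return sum(p if y == 1 else 1.0 - p for p, y in zip(probs, outcomes)) / len(probs)


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and outcome (lower is better)."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# Hypothetical predicted probabilities that the first-named player wins,
# and hypothetical outcomes (1 = first-named player won).
probs = [0.8, 0.6, 0.3]
outcomes = [1, 0, 0]

print(classification_rate(probs, outcomes))
print(bernoulli_likelihood(probs, outcomes))
print(brier_score(probs, outcomes))
```

Higher values are better for the first two measures, while a lower Brier score indicates better calibrated and sharper probabilities.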
... The very high accuracy reported in this study relative to others may indicate that future matches were incorrectly used to predict past matches. Gu and Saaty 30 predicted tennis match results using data and judgments. An analytic network process 31 that incorporated factor analysis and clustering was applied to 63 men's and 31 women's 2015 US Open matches, and achieved 85.1% accuracy. ...
Preprint
Elo ratings-based methods have been found to perform well when forecasting tennis match results; however, whether they can outperform ML has not been established. A comparative evaluation of the two types of methods is conducted using the Sports Result Prediction CRISP-DM experimental framework. The first full year of men’s tennis data (2006), in a dataset containing matches from 2005 to 2020, was set to be the initial training set and one year of data was incrementally added to this set to predict 14 test years, from 2007 to 2020. Features were ranked based on their average rank across five feature selection techniques. Of the five ML models, Alternating Decision Trees (ADTrees) and Logistic Regression achieved higher accuracies than Elo ratings and similar accuracies to predictions derived from betting odds. ADTrees show potential in this domain, with solid performance achieved with an interpretable decision tree that allows for variation in the average betting odds difference threshold.
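For readers unfamiliar with Elo-based forecasting, the following is a minimal sketch of the standard Elo expected-score formula and rating update. The K-factor of 32 and the starting ratings are illustrative assumptions; the preprint above does not state here which Elo variant or parameters it used.

```python
def expected_score(rating_a, rating_b):
    """Standard Elo win probability for player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_ratings(rating_a, rating_b, score_a, k=32.0):
    """Return updated ratings after one match.

    score_a is 1.0 if player A won and 0.0 if player A lost.
    k is the update step size (an assumed value, not taken from the preprint).
    """
    expected_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - expected_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - expected_a))
    return new_a, new_b


# Two hypothetical players with equal ratings; player A wins the match.
print(expected_score(1500.0, 1500.0))
print(update_ratings(1500.0, 1500.0, 1.0))
```

The expected score doubles as the forecast probability, so a match prediction simply picks the player whose expected score exceeds 0.5.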
... Sports are a part of our lives, and one of the things people love, whether as spectators or players. In real life, spectators are interested in predicting the outcomes of a match and verifying their predictions (Csató, 2021; Gu & Saaty, 2019; Healy & Kole, 2021; Sarlis & Tjortjis, 2020). Such prediction can be performed using many available approaches, including statistical, probabilistic and machine learning techniques (Constantinou & Fenton, 2017). ...
Article
Simulating and predicting tournament outcomes has become an increasingly popular research topic. The outcomes can be influenced by several factors, such as attack, defence and home advantage strength values, as well as tournament structures. However, the claim that different structures, such as knockout (KO), round-robin (RR) and hybrid structures, have their own time restraints and requirements has limited the evaluation of the best structure for a particular type of sports tournament using quantitative approaches. To address this issue, this study develops a decision support system (DSS) using Microsoft Visual Basic, based on the object-oriented programming approach, to simulate and forecast the impact of the various tournament structures on soccer tournament outcomes. The DSS utilized the attack, defence and home advantage values of the teams involved in the Malaysia Super League 2018 to make better predictions. The rankings produced by the DSS were then compared to the actual rankings using Spearman correlation to reveal the simulated accuracy level. The results indicate that a double RR produces a higher correlation value than a single RR, indicating that more matches played provide more data to create better predictions. Additionally, a random KO predicts better than a ranking KO, suggesting that pre-ranking teams before a tournament starts does not significantly impact the prediction. The findings of this study can help tournament organizers plan forthcoming games by simulating various tournament structures to determine the most suitable one for their needs.
... Other studies collected data from websites that gathered match data independently (Kovalchik and Ingram, 2018; Fagan et al., 2019; Ingram, 2019; Makino et al., 2020). In addition, there were some studies that had no information about the data source (Pollard et al., 2006; Newton and Aslam, 2009; Tudor et al., 2014; Gu and Saaty, 2019; Stefani, 2020). A characteristic of these studies was their large data size. ...
... In relation to racket sports analytics, various works about predicting tennis match outcomes [4], [8], [12], [17] have also been published. The works are similar in that the match outcome prediction is based on the players' past performance data (aces, scored, successful first serves, etc.), situational data (tournament level, court surface type, etc.) and the players' basic demographic information (seed, height, etc.). ...
... Since the survey subjects have different subjective priorities regarding parameters importance, the results of their judgments should take into account the subjective priority, and after that the geometric mean is formed. Consideration of the relationship between events using the AHP model was carried out by Saaty in another paper [15], where the authors examined the mutual influence of gains and losses. A pairwise comparison of events was carried out, and they proved the need of using correlation dependencies in the AHP model. ...
... A pairwise comparison of events was carried out, and they proved the need of using correlation dependencies in the AHP model. Comparison of the model results and actual data showed a high coincidence (85.10%) [15]. ...
... In fact, the model [14,15] already uses correlation dependencies. However, the model [14] does not make it possible to identify conflicting pairs of subjects of a survey. ...
Article
In this paper, a method of identifying conflict relations between the subjects of business processes is presented. The proposed solution seems quite important due to the high sensitivity of modern high-tech enterprises’ business processes to negative factors, as well as the need to develop correct management decisions in conflict situations. A company’s ability to identify internal conflicts and to take them into account during management decision-making is a feature of an effective business process. Modern methods of conflict detection that are available for practical use are able to identify conflict situations only at the stage of open conflict. In this case, the impact of the conflict on the business process is already material and may lead to deterioration in the company’s performance. Unfortunately, existing methods have a significant disadvantage: they are not able to identify conflicts at an early stage, when the impact of the situation on the business process is not noticeable. An innovative approach based on analytical processing of survey-based data is proposed. This approach is able to identify hidden conflicts among employees of the enterprise. Identifying a conflict situation at an early stage makes it possible to manage conflict and reduce subsequent financial loss.
... For the game of tennis, a machine-learning-based match outcome prediction model was proposed by Wei Gu and Thomas L. Saaty [31]. Their proposed model combines data analysis and human judgement to predict the win or loss outcome of 2015 US Open tennis matches. ...
Article
There has been rapid growth in the domains of artificial intelligence, data mining and machine learning during the last few years. Machine learning techniques are now used extensively for outcome prediction and classification in different spheres of research. Machine learning shows excellent performance for outcome prediction and classification in the domains of medicine, cyber security, banking fraud, drug discovery, etc. However, in the field of sports, particularly for the game of badminton, outcome prediction with the aid of artificial intelligence and machine learning is still unexplored. Machine learning techniques for outcome prediction have been used for only a limited number of games. This paper presents a machine-learning-based technique for badminton match outcome prediction with fewer input attributes. A supervised learning approach with feature reduction techniques is proposed for badminton match outcome prediction. The raw data related to the Australian Open, Malaysian Open, German Open and Singapore Open badminton tournaments from 2016 to 2019 are collected from internet sources (official websites and other websites). A CSV file is built from the scraped data, with thirty features in total for singles tournaments and thirty-four features for doubles tournaments. The Correlation Feature Selection Method, Info Gain Attribute Selection Method, ReliefF Attribute Selection Method, Probabilistic Significance Attribute Evaluation Method and Symmetrical Uncertainty Attribute Evaluation feature reduction techniques are employed to evaluate feature significance. Fourteen significant features are selected as input predictors for three machine learning classifiers for badminton match result prediction. The classifiers' performance for match outcome prediction is evaluated in terms of accuracy, root mean square error, receiver operating characteristics and other confusion matrix parameters. Results for each tournament with reduced features are analysed and compared with the full feature dataset. It has been observed that Naïve Bayes with correlation-based feature weighting shows remarkable performance in contrast to the other proposed classifiers in match outcome prediction for the reduced feature dataset.
... Many popular sports such as football (soccer), basketball, boxing, table tennis, volleyball, bowling, American football, and handball involve matches between two teams or players where each team has the possibility of scoring points throughout the match. Several research papers seek to predict the end match result (e.g., Karlis and Ntzoufras (2003), Groll et al. (2019), Gu and Saaty (2019), Cattelan et al. (2013)) in order to infer the match winner and potentially the winner of a tournament (Ekstrøm et al. 2020; Baboota and Kaur 2018). While the overall match result is highly interesting, it conveys very little information about the individual development and trends throughout the match, and modeling approaches that allow finer granularity of the running score difference throughout the match are needed. ...
Article
Many popular sports involve matches between two teams or players where each team has the possibility of scoring points throughout the match. While the overall match winner and result are interesting, they convey little information about the underlying scoring trends throughout the match. Modeling approaches that accommodate a finer granularity of the score difference throughout the match are needed to evaluate in-game strategies, discuss scoring streaks, team strengths, and other aspects of the game. We propose a latent Gaussian process to model the score difference between two teams and introduce the Trend Direction Index as an easily interpretable probabilistic measure of the current trend in the match as well as a measure of post-game trend evaluation. In addition, we propose the Excitement Trend Index, the expected number of monotonicity changes in the running score difference, as a measure of overall game excitement. Our proposed methodology is applied to all 1143 matches from the 2019-2020 National Basketball Association season. We show how the trends can be interpreted in individual games and how the excitement score can be used to cluster teams according to how exciting they are to watch. Supplementary information: The online version contains supplementary material available at 10.1007/s10182-022-00452-w.