
Friday, 4 May 2018

Anthony Constantinou's football prediction system wins second spot in international competition

 
Anthony Constantinou

QMUL lecturer Dr Anthony Constantinou of the RIM research group has come second in an international competition to produce the most accurate football prediction system. Moreover, the winners (whose predictive accuracy was only very marginally better) actually based their model on the previously published pi-ratings system of Constantinou and Fenton.





Anthony's model Dolores was developed for the International Machine Learning for Soccer Competition hosted by the Machine Learning journal.

All participants were provided with the results of matches from 52 different leagues around the world, with some missing data as part of the challenge. They had to produce a single model before the end of March 2017 that would be tested on its accuracy in predicting 206 future match outcomes from 26 different leagues, played from March 31 to April 9 in 2017.

Dolores was ranked 2nd with a predictive accuracy almost the same as the top ranked system (there was less than 1% error rate difference between the two; the error rate was nearly 120% lower than the participants ranked lowest among those that passed the basic criteria).

Dolores is designed to predict football match outcomes in one country by observing football matches in multiple other countries. It is based on a) dynamic ratings and b) Hybrid Bayesian Networks.

Unlike past academic literature, which tends to focus on a single league or tournament, Dolores provides empirical evidence that a model can make a good prediction for a match outcome between teams 𝑥 and 𝑦 even when the prediction is derived from historical match data that neither 𝑥 nor 𝑦 participated in. This implies that we can still predict, for example, the outcome of English Premier League matches based on training data from Japan, New Zealand, Mexico, South Africa, Russia, and other countries, in addition to data from the English Premier League.

The Machine Learning journal has published the descriptions of the highest ranked systems in its latest issue published online today. The full reference for Anthony's paper is:

Constantinou, A. (2018). Dolores: A model that predicts football match outcomes from all over the world. Machine Learning, 1-27, DOI: https://doi.org/10.1007/s10994-018-5703-7

The full published version can be viewed (for free) at https://rdcu.be/Nntp. An open access pre-publication version (pdf format) is available for download here.

This work was partly supported by the European Research Council (ERC), research project ERC-2013-AdG339182-BAYES_KNOWLEDGE.
The DOLORES Hybrid Bayesian Network was built and run using the AgenaRisk software.

The full reference for the pi-ratings model (used by the competition's winning team) is:
Constantinou, A. C. & Fenton, N. E. (2013). Determining the level of ability of football teams by dynamic ratings based on the relative discrepancies in scores between adversaries. Journal of Quantitative Analysis in Sports. Vol. 9, Iss. 1, 37–50. DOI: http://dx.doi.org/10.1515/jqas-2012-0036
Open access version here.

Monday, 30 April 2018

Bayesian Nets to Determine Impact of Agricultural Development Policy



An interesting paper, describing the use of Bayesian nets to determine the impact of agricultural development policy on household nutrition in Uganda, uses the new 'Value of Information' functionality developed in BAYES-KNOWLEDGE.

Full reference:
Cory W. Whitney, Denis Lanzanova, Caroline Muchiri, Keith D. Shepherd, Todd S. Rosenstock, Michael Krawinkel, John R. S. Tabuti, Eike Luedeling (2018), "Probabilistic Decision Tools for Determining Impacts of Agricultural Development Policy on Household Nutrition", Earth's Future (Open Access) https://doi.org/10.1002/2017EF000765

Tuesday, 17 April 2018

Explaining Bayesian Networks through a football management problem



Today's Significance Magazine (the magazine of the Royal Statistical Society and the American Statistical Association) has published an article by Anthony Constantinou and Norman Fenton that explains, through the use of an example from football management, the kind of assumptions required to build useful Bayesian networks (BNs) for complex decision-making. The article highlights the need to fuse data with expert knowledge, and describes the challenges in doing so. It also explains why, for fully optimised decision-making, extended versions of BNs, called Bayesian decision networks, are required.

The published pdf (open access) is also available here and here.

Full article details:
Constantinou, A., Fenton, N. E., "Things to know about Bayesian networks", Significance, 15(2), 19-23, April 2018, https://doi.org/10.1111/j.1740-9713.2018.01126.x

Wednesday, 14 March 2018

Tuesday, 6 March 2018

Two coins: one fair one biased

Alexander Bogomolny tweeted this problem:


If there is no reason to assume in advance that either coin is more likely to be the coin tossed once (i.e. the first coin) then all the (correct) solutions show that the first coin is more likely to be biased with a probability of 9/17 (=0.52941). Here is an explicit Bayesian network solution for the problem:


The above figure shows the result after entering the 'evidence' (i.e. one Head on the coin tossed once and two Heads on the coin tossed three times). The tables displayed are the conditional probability tables defined for the associated variables.

This model took just a couple of minutes to build in AgenaRisk and requires absolutely no manual calculations, as the Binomial distribution is one of many pre-defined functions. The model (which can be run in the free version of AgenaRisk) is here. The nice thing about this solution compared to the others is that it is much more easily extensible. It also shows the reasoning very clearly.
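For readers without AgenaRisk, the same posterior can be checked with a few lines of Python. This is only a rough sketch, not the AgenaRisk model: the tweet's image is not reproduced here, so the bias of the unfair coin is an assumption. A Heads probability of 2/3 is used because it is consistent with the 9/17 answer quoted above. The function names are made up for illustration.

```python
from fractions import Fraction
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k Heads in n tosses when P(Heads) = p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def posterior_first_coin_biased(p_biased=Fraction(2, 3), prior=Fraction(1, 2)):
    """P(the coin tossed once is the biased one | 1 Head in 1 toss, 2 Heads in 3).

    p_biased is an ASSUMED bias (not stated in the text above);
    prior is the prior probability that the first coin is the biased one.
    """
    p_fair = Fraction(1, 2)
    # Hypothesis A: first coin biased, second coin fair.
    like_a = binomial_pmf(1, 1, p_biased) * binomial_pmf(2, 3, p_fair)
    # Hypothesis B: first coin fair, second coin biased.
    like_b = binomial_pmf(1, 1, p_fair) * binomial_pmf(2, 3, p_biased)
    # Bayes' theorem over the two hypotheses.
    return prior * like_a / (prior * like_a + (1 - prior) * like_b)


posterior = posterior_first_coin_biased()
print(posterior)  # 9/17
```

Changing `prior` or `p_biased` shows how the answer depends on those two inputs, which is the kind of extension the Bayesian network makes easy.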


Monday, 12 February 2018

An Improved Method for Solving Hybrid Influence Diagrams

Most decisions are made in the face of uncertain factors and outcomes. In a typical decision problem, uncertainties involve both continuous factors (e.g. amount of profit) and discrete factors (e.g. presence of a small number of risk events). Tools such as decision trees and influence diagrams are used to cope with uncertainty regarding decisions, but most implementations of these tools can only deal with discrete or discretized factors and ignore continuous factors and their distributions.

A paper just published in the International Journal of Approximate Reasoning presents a novel method that overcomes a number of these limitations. The method is able to solve decision problems with both discrete and continuous factors in a fully automated way. It requires that the decision problem is modelled as a Hybrid Influence Diagram, which is an extension of influence diagrams containing both discrete and continuous nodes, and solves it by using a state-of-the-art inference algorithm called Dynamic Discretization. The optimal policies calculated by the method are presented in a simplified decision tree.
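To give a feel for what a decision problem with both discrete and continuous factors looks like, here is a toy sketch in Python. It is not the paper's method: it uses a fixed, uniform discretization of the continuous variable, whereas Dynamic Discretization adapts the bins automatically. The decision ("invest" or not), the Normal profit distribution, the single risk event, the utility function and all numbers are hypothetical and chosen only for illustration.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a Normal(mu, sigma) variable."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


def discretize_normal(mu, sigma, n_bins=200, width=6.0):
    """Uniform (static) grid over mu +/- width*sigma: list of (midpoint, mass)."""
    lo = mu - width * sigma
    step = 2.0 * width * sigma / n_bins
    bins = []
    for i in range(n_bins):
        a, b = lo + i * step, lo + (i + 1) * step
        mass = normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)
        bins.append(((a + b) / 2.0, mass))
    return bins


def utility(profit):
    """Hypothetical risk-averse utility: a loss hurts twice as much as a gain."""
    return profit if profit >= 0 else 2.0 * profit


def expected_utility_invest(mu, sigma, p_risk, loss):
    """Expected utility of investing: continuous profit plus a discrete risk event."""
    bins = discretize_normal(mu, sigma)
    total = 0.0
    for risk_occurs, p in ((True, p_risk), (False, 1.0 - p_risk)):
        shift = -loss if risk_occurs else 0.0
        total += p * sum(mass * utility(x + shift) for x, mass in bins)
    return total


def best_decision(mu=10.0, sigma=5.0, p_risk=0.1, loss=30.0):
    """Pick the option with the highest expected utility."""
    options = {
        "invest": expected_utility_invest(mu, sigma, p_risk, loss),
        "do_not_invest": 0.0,
    }
    return max(options, key=options.get), options


decision, options = best_decision()
print(decision, round(options["invest"], 2))
```

With a fixed grid the accuracy depends on how the bins are chosen by hand; removing that manual step is what the automated approach described above is for.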



The full reference is:

Yet, B., Neil, M., Fenton, N., Dementiev, E., & Constantinou, A. (2018). "An Improved Method for Solving Hybrid Influence Diagrams". International Journal of Approximate Reasoning. DOI: 10.1016/j.ijar.2018.01.006  Preprint (open access) available here.
UPDATE (22 Feb 2018): The full published version of the paper is available online for free for 50 days here: https://authors.elsevier.com/c/1Wc6D,KD6ZG8y-

Acknowledgements: Part of this work was performed under the auspices of EU project ERC-2013-AdG339182-BAYES_KNOWLEDGE

Friday, 9 February 2018

Decision-making under uncertainty: computing "Value of Information"


Information gathering is a crucial part of decision making under uncertainty. Whether to collect additional information, and how much to invest in it, are vital questions for successful decision making. For example, before making a treatment decision, a physician has to evaluate the benefits and risks of additional imaging or laboratory tests and decide whether to ask for them. Value of Information (VoI) is a quantitative decision analysis technique for answering such questions based on a decision model. It is used to prioritise the parts of a decision model where additional information is expected to be useful for decision making.

However, computing VoI in decision models is challenging, especially when the problem involves both discrete and continuous variables. A new paper in the IEEE Access journal illustrates a simple and practical approach that can calculate VoI using Influence Diagram models that contain both discrete and continuous variables. The proposed method can be applied to a wide variety of decision problems, as most decisions can be modelled as an influence diagram, and many decision modelling tools, including Decision Trees and Markov models, can be converted to an influence diagram.
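The basic idea behind VoI can be shown on a purely discrete toy problem. The sketch below computes the expected value of perfect information for a single uncertain variable: the gain from knowing the state before deciding rather than after. It is not the paper's method, which handles hybrid models and partial information. The treatment scenario, the utilities, the prior and the function names are all hypothetical.

```python
from fractions import Fraction


def expected_utility(utilities, state_probs):
    """Expected utility of one decision option over the uncertain states."""
    return sum(p * u for p, u in zip(state_probs, utilities))


def value_of_perfect_information(utility_table, state_probs):
    """EVPI = E[best utility once the state is known] - best expected utility now.

    utility_table maps each decision option to a list of utilities,
    one per state, in the same order as state_probs.
    """
    # Decide now, without extra information: pick the best option on average.
    best_without_info = max(
        expected_utility(row, state_probs) for row in utility_table.values()
    )
    # Decide after learning the state: pick the best option in each state.
    best_with_info = sum(
        p * max(row[s] for row in utility_table.values())
        for s, p in enumerate(state_probs)
    )
    return best_with_info - best_without_info


# Hypothetical example: states are (disease present, disease absent).
state_probs = [Fraction(1, 5), Fraction(4, 5)]
utility_table = {
    "treat": [90, 70],
    "do_not_treat": [20, 100],
}

evpi = value_of_perfect_information(utility_table, state_probs)
print(evpi)  # 14
```

In this toy case a perfect test would be worth up to 14 utility units, so a test costing more than that would not be worth ordering. A real test is imperfect, and so is worth less than this upper bound.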

The full reference is:

Yet, B., Constantinou, A., Fenton, N., & Neil, M. (2018). Expected Value of Partial Perfect Information in Hybrid Models using Dynamic Discretization.  IEEE Access. DOI: 10.1109/ACCESS.2018.2799527

Acknowledgements: Part of this work was performed under the auspices of EU project ERC-2013-AdG339182-BAYES_KNOWLEDGE, EPSRC project EP/P009964/1: PAMBAYESIAN, and ICRAF Contract No SD4/2012/214 issued to Agena.