Well, another topic about the MDM... ^^

Azby |

22 hours ago

[b]For those who really don't have the time:[/b]

The engine currently appears to operate on a discrete grid of 11 xG steps (0.22 to 0.32). Below 0.25 the conversion rate is flat at 25%; above that it rises linearly to 32%.

A patch in April shifted the distribution upwards: from 32% to 71% of chances greater than or equal to 0.28 xG. Result: +20% goals per match, all competitions combined.

My own proposals, which are in line with those of Tomasm and Woods, are to raise the xG ceiling to 0.50 to reward really big chances, and to increase the frequency of chances to reduce the variance of inconsistent matches. The two together, without touching the conversion, which is well done.

_________________

Following the many recent discussions on the match engine, with the traditional carrot topic but also Tomasm's improvement project (https://www.virtuafoot.com/#forum?topic=170250) and Woods' message on the frequency of updates, I wanted to share an analysis based on the data available in the public database. The idea is not to call into question the work of the MDJ, who has made some defensible and rather elegant game design choices, as we shall see, but to bring some statistically analysed elements to the debate. I've cross-checked my results with those of another manager, Misha, who ran his own regression on around 300,000 chances, and our figures converge almost perfectly, which gives me confidence in what follows. Brewen has also worked on this and has dug deeper on a number of points. This is a 'macro' statistical analysis: we don't go into the details of individual matches. It's an attempt at an explanation, not the truth.

I'm sorry, but there will be mathematical terms, as is obligatory in this kind of discussion. Images are included to make it more meaningful for those who find it hardest to get to grips with the maths.

It's also a hell of a lot of work.

[b]For the method, we worked on around 800,000 events from the public database[/b] (chances, goals, fouls, cards, etc.), with our own time filter: we keep only intervals of less than 10 minutes between consecutive events, to avoid half-time and substitution artefacts. The hypothesis behind the filter: a team's xG rises continuously during a match (aggressiveness, fouls, invisible micro-events), and a displayed event merely updates the counter at time T. Over long intervals, the delta calculated between two chances is no longer pure xG but xG plus latent residue. The 10-minute filter limits this contamination.

I've checked that this choice of filter doesn't bias the results: repeated at 10, 20, 30 min and without any filter, the difference between the periods I'm going to compare remains the same to within one point. Note also that a short filter mechanically excludes more events in defensive matches (longer intervals) than in prolific matches. It's a selection bias but it acts identically before and after the patches.
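For anyone who wants to reproduce the filter, here is a minimal Python sketch of the idea. The tuple layout `(match_id, team, second, cumulative_xg)` and the function name are hypothetical, chosen for illustration only; the real export format is not shown here. It assumes the xG counter is cumulative per team, so each delta is the difference between two consecutive displayed events.

```python
from collections import defaultdict

MAX_GAP_SECONDS = 10 * 60  # the 10-minute filter described above


def xg_deltas(events, max_gap=MAX_GAP_SECONDS):
    """Return per-event xG deltas, keeping only short intervals.

    `events` is an iterable of (match_id, team, second, cumulative_xg)
    tuples; this layout is hypothetical, not the real export format.
    Pass max_gap=None to disable the filter.
    """
    by_side = defaultdict(list)
    for match_id, team, second, cum_xg in events:
        by_side[(match_id, team)].append((second, cum_xg))

    kept = []
    for (match_id, team), rows in by_side.items():
        rows.sort()  # chronological order within one team's match
        for (t0, xg0), (t1, xg1) in zip(rows, rows[1:]):
            gap = t1 - t0
            if max_gap is None or gap < max_gap:
                kept.append((match_id, team, t1, round(xg1 - xg0, 2)))
    return kept


sample = [
    ("m1", "home", 300, 0.00),
    ("m1", "home", 700, 0.25),   # 400 s after the previous event: kept
    ("m1", "home", 2900, 0.60),  # 2200 s later: dropped by the filter
    ("m1", "home", 3100, 0.88),  # 200 s later: kept
]
print(xg_deltas(sample))
```

Re-running with `max_gap=20 * 60`, `30 * 60` or `None` is the kind of robustness check mentioned above.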

[b]An important clarification:[/b] for xG values below 0.22, I cannot say that they are "real" chances in the sense of the engine. These are deltas calculated between consecutive events, which may include latent residuals (any event that "raises xG" without triggering a visible chance). On the other hand, on the main grid we'll be looking at, i.e. 0.22-0.32 xG, the figures are striking and reproducible.

We'll start with Misha's data and analysis. Here we can see that the engine works with a discrete grid of xG steps ranging from 0.22 to 0.32 (if we choose to use steps of 0.01 xG). Here's how the conversion rate behaves as a function of the step:

Table and raw curve of goal occurrences per xG generated by the chance (from 0.00 to 0.32). Source: Misha's analysis of 300k chances.

This can be summarised in two lines:

Below 0.25 xG: strictly flat conversion at around 25%, regardless of the quality of the chance.
Above 0.25 xG: linear progression at a slope of +1 point per notch, up to 32% for the 0.32 level.
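
Read literally, those two lines amount to "conversion = max(25%, xG)" on the grid. A small Python sketch of that reading follows; the function name is mine, and this is only a restatement of the observed curve, not the engine's actual code.

```python
def conversion_rate(xg_hundredths: int) -> float:
    """Observed goal rate for a chance, per the two-line summary above.

    The step is passed in hundredths of xG (22..32) to avoid float noise.
    This mirrors the observed curve; it is not the engine's real formula.
    """
    return max(25, xg_hundredths) / 100


# Walk the 11 steps of the grid, from 0.22 to 0.32 xG.
for step in range(22, 33):
    print(f"{step / 100:.2f} xG -> {conversion_rate(step):.0%}")
```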

Smoothed curve of goal occurrences: you can see the floor around 25%, then the progression from 0.26.

To show this even more clearly, here is the same data broken down into two separate linear regressions:

Left: regression on 0.00-0.24, i.e. R² = 0, completely flat conversion at 25.2%. Right: regression on 0.25-0.32, i.e. R² = 0.99, slope of +1 point per notch.

The image speaks for itself: below 0.25 xG, the quality of the opportunity is mathematically useless. Above 0.25 xG, there is a real linear progression that rewards good opportunities. If we also look at off-grid values (below 0.22 xG), we find a conversion rate that fluctuates around 25%, with no clear trend. As a reminder, this is precisely the zone that I refrain from analysing because these values are probably a mixture of real mini-opportunities and indistinguishable latent residues.

This is probably not accidental. It's a deliberate design, which seems to want to balance determinism and chance: enough chance for each opportunity to retain a value, enough slope for the best ones to be rewarded. Mathematically, it's pretty clever. But also frustrating.

With that said, we need to talk a bit about the recent changes. For this I'm switching to my own data (covering matches played between 4 March and 7 April 2026). Two patches can be inferred from it. The first, at the end of March, looks like a small trial balloon; the second, on 2 April, leads to visible changes in the engine:

Night of 19/20 March: the proportion of chances + goals awarded in the so-called "premium" tiers (≥ 0.28 xG) rises from 32% to 43%.
2 April: another jump, this time from 43% to 71%!

To visualise this change, here's the distribution before and after the patches:

Distribution of around 340,000 opportunities BEFORE patches and their conversion rate, 10-minute interval.

Distribution and conversion rate AFTER patches. The low tiers (0.22-0.24) saw their volume plummet. Premium tiers (0.28-0.32) have almost doubled.

This is a real collapse in absolute terms, not a simple redistribution in percentage terms: the low tiers have fallen from 1.85 opportunities + goals per match to 0.36 (-81%), the medium tiers from 1.88 to 0.38 (-80%), and the premium tiers have exploded from 2.93 to 7.22 (+147%). And this lost mass has not gone elsewhere in the form of fouls or cards: their volumes per match are stable to within 1%. The engine simply produces more premium chances per match.

Now let's look at the matches themselves. In the league (ch=1), the overall statistics before and after the patches speak for themselves: from 2.85 goals per match before 20 March to 3.45 after 2 April, an increase of 21% in 13 days. The rate of games without a goal fell from 11.3% to 8.8%, and the proportion of games with 4 goals or more climbed from 33.6% to 46.3%.

There is one caveat, however: we've only just come out of the league break, so the post-patch volume is low, with only 272 matches in my database. For those who remain sceptical about the statistical basis, let's look at friendlies, where the volume is much more solid. In friendlies (ch=0), there are 16,040 matches before 20 March and 5,210 after 2 April. Goals per match rose from 2.93 to 3.52, an increase of 20%. The 0-0 rate fell from 10.8% to 7.7%, and the proportion of prolific matches (4+ goals) rose from 36.4% to 48.1%. The relative amplitude is almost identical to that observed in the league.

Over all matches in all competitions (21,656 before, 6,306 after), the number of goals per match rose from 2.84 to 3.40, again an increase of 20%.
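
For those who want to check the percentages, here is the arithmetic as a short Python sketch. It uses only the rounded goals-per-match figures quoted above, so the results can differ from the unrounded data by a fraction of a point; the helper name is mine.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Goals per match before 20 March and after 2 April, as quoted above.
goals_per_match = {
    "league (ch=1)": (2.85, 3.45),
    "friendlies (ch=0)": (2.93, 3.52),
    "all competitions": (2.84, 3.40),
}

for label, (before, after) in goals_per_match.items():
    print(f"{label}: {pct_change(before, after):+.1f}%")
```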

To dig a little deeper, I also looked at the IE separately. Here, the effect is measurably attenuated: we go from 2.20 to 2.50 goals per game, or +13%. But the IE (as you can see from the export stats) is the highest-level competition, with teams on average twice as strong as in the friendlies. The matches are tighter and more defensive, and that's where the boost from the patches is least felt.

Well, we'll calculate a few other things here... but I think we're pretty well there already ^^.

[b]So once we've said that, how do we make the link with the improvement project behind it?[/b]

Tomasm suggested a few weeks ago doubling the xG cap to 0.54 to reduce the carrot feeling. Woods has more recently suggested going back to the old allocation probabilities and increasing the number of updates per match. Both approaches have the same objective: to make matches more readable and less frustrating. And the data I have lends credence to both.

Recent patches have indeed increased the number of goals, but by twisting the xG distribution upwards rather than tackling the underlying problems: the ceiling hasn't moved much, remaining at 0.32 xG at present. The frequency of chances has barely changed: the median interval between chances has dropped from 388s to 368s, i.e. -5%, and I couldn't tell whether this is a real effect or just noise. The feeling of carrot and inconsistency remains.

Tomasm is right about the cap. As long as the cap stays at 0.32, a big chance is worth a 32% success rate at best. The engine simply doesn't have the mathematical room to reward a clear-cut chance, and that's probably the structural reason why we have all howled at the MDM during a spell of heavy domination. It's also, in my opinion, what forced the hand of the recent patches: not wanting to raise the ceiling, the MDJ had to compress the whole grid upwards to generate more goals. A clever workaround, but not a permanent solution. Raising the cap to 0.50 would give the engine the room it needs to distinguish between a genuine clear-cut chance and a half-chance, without having to twist the distribution. The most important point for me is that freeing up the ceiling doesn't eliminate the overall randomness: a shot at 0.5 xG during a period of great domination is still missed half the time. You keep all the element of surprise that (in my opinion) is the joy of football.

Woods is right about variance. With a constant frequency of chances (around 4 per team per match), the variance per match is mechanically enormous: with 4 shots at 25%, you come up empty-handed around 32% of the time, whatever your level. Moving to 6-8 chances per team would reduce this variance and generate fewer matches with inconsistent results. Randomness is still present, and there will always be matches that you 'should' have won and will lose, but to a lesser extent.
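
The 32% figure is just 0.75^4. As a rough Python sketch, assuming every chance is independent and converts at a flat 25% (a simplification: the real grid goes up to 32%), here is how the blank rate and the relative spread of goals move with the number of chances. The function names are mine.

```python
from math import sqrt


def p_no_goal(n_chances: int, p_goal: float) -> float:
    """Probability of scoring zero goals from n independent chances."""
    return (1 - p_goal) ** n_chances


def relative_spread(n_chances: int, p_goal: float) -> float:
    """Standard deviation of goals divided by expected goals (binomial)."""
    mean = n_chances * p_goal
    std = sqrt(n_chances * p_goal * (1 - p_goal))
    return std / mean


# Columns: chances per team, chance of a blank, relative spread of goals.
for n in (4, 6, 8):
    print(n, f"{p_no_goal(n, 0.25):.1%}", f"{relative_spread(n, 0.25):.2f}")
```

Note that holding the conversion at 25% while adding chances also raises the expected number of goals, so this only illustrates the variance argument, not a balanced proposal.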

And for me, the two tracks go together. The ceiling alone opens the door to frustrating "0.5 xG misses" if the volume remains low.

The frequency alone brings us back to the same grid compression that the patches have just produced.

Taken together, they make the engine more readable without reducing its randomness. This seems to me to be fairly faithful to what MDJ has already built around what we perceive of its match engine (which isn't 'rubbish', far from it).

If anyone has a few minutes to take a look at this analysis and check that we don't have any methodological biases... I'd be happy to share other figures if I can. And if other managers have data or thoughts to add, don't hesitate. It's by pooling our analyses that we'll be able to make collective progress.


Galywat |

21 hours ago

The analysis is interesting (more so than I thought when you brought it up). However, I still have this reservation:

there's no analysis (and it's not your fault) of the updates where xG is generated without a chance or goal. That's the whole problem, in my opinion: the chance-to-goal conversion rate is nice and you learn a few things from it, but it only allows very limited analysis. And the game is largely made up of updates (with xG) that produce no chance at all. (You only have to look at the number of people complaining that nothing happens in matches, and rightly so, since sometimes nothing is displayed for dozens of minutes.) Yet an update at 0.2 xG that doesn't generate a chance is, on paper, no less dangerous than a chance at 0.1 xG. The chance is just a display.

If there were to be an evolution, I'd tend to agree with woodz. The impact of variance is much less significant when you increase the number of events. And I hope that this kind of analysis will make Aymeric want to give the evolution of xG for each event, not just for each action, even if it means reducing the size of the sample of matches.


aloisio |

20 hours ago

Great job Azby, a future data analyst if you're not one already!

In any case, if I've understood correctly, recently the match engine seems to be much more punitive as soon as a premium opportunity is created.
That's why I made the link with LR's 260 points, because as the dominant cartel, it wouldn't have been surprising if they'd benefited from this 'bonus' in terms of concretisation.

As for the rest, the options are on the table, even if there may be (i) counter-measures to these options (such as reducing the number of chances scored on goal to something more dispersed) or (ii) other options: creating a malus on afk, dealing with the bugs of attacks that are too strong on the flanks, etc.

Well done!


jul068 |

13 hours ago

CP has come a long way 😁


Magpie |

12 hours ago

Or it must have been strange for him to have an audience that understood his paintings 😁

Either way, hats off to the artists. Even if you can have several angles of reading, and an opinion on the raw data, at least it gives you something to think about.


Ced90 |

12 hours ago

Very clean analysis, well done 👍🏼

It confirms the feeling, and I also think that the various solutions proposed can improve the situation.


Azby |

11 hours ago

That said, it's quite possible that we've got the basic premise wrong, in which case the analysis no longer holds. 🤷

It's really just a thought exercise.


Adrimax |

11 hours ago

Well done, and thanks for your work.

Has pieutte understood anything?


myforsans |

8 hours ago

Nice work ...even if I didn't understand everything ;( ;( ;(

And well done for the investment and, I'd even go so far as to say, the self-sacrifice, because all this work, which is likely to have taken several hours, can be wiped out in 5 seconds with a small, discreet and unannounced change to a programme line or even a single digit in one of the programme lines that manages the MDM!

Unfortunately in this game, the truths of today are not the truths of yesterday, nor are they the truths of tomorrow. For example, you only have to look at the current releases of players from the CDF, detections and players from the VF store!


pieutte |

7 hours ago

Adrimax: Well done, and thanks for your work.

Has pieutte understood anything?

I wrote it...


OMstar83 |

7 hours ago

Adrimax: Well done, and thanks for your work.

Has pieutte understood anything?

You already have the answer, admit it.


zejl |

7 hours ago

What a job this analysis has done, and even if it's difficult to understand in detail, it allows the majority to understand how the whole thing works, the creator to have another vision, and the more mathematically minded to get their bearings and bounce back.

Most of us wanted a change and we got it, and that's a very positive thing.

So yes, there are bound to be some adjustments, but we're moving in the right direction and we have the impression that we're being heard.

For the first time in a long time I'm thinking that maybe I won't end up getting bored with this game in the medium term, and that's great.


Deck |

7 hours ago

jul068: CP has come a long way 😁

It's an average.

I'll let you imagine the idiots we have to catch up with. And right now I'm going to propose an idea so that you understand what I'm talking about 😅

Idea: to avoid the impression that nothing is happening, have regular "actions": in addition to fouls, cards, chances and goals, we add pink rectangles with comments such as "The teams are neutralising each other", "Absolutely nothing is happening today", "What a boring match..."

I realise as I'm writing this how boring it is 😄

Listen to the others instead, have a good day


brewen |

5 hours ago

Galywat: The analysis is interesting (more so than I thought when you brought it up; the time filter is clever, by the way). However, I still have this reservation:

there's no analysis (and it's not your fault) of the updates where xG is generated without a chance or goal. That's the whole problem, in my opinion: the chance-to-goal conversion rate is nice and you learn a few things from it, but it only allows very limited analysis. And the game is largely made up of updates (with...

This is precisely the major point I noted when reading Misha and Azby's analyses, and which I've tried to explore further.

What we have:

The raw xG of the update generated in the file (the analysis here focuses on that)

What we don't see (in the style of the bias that made us think that a foul generated xG):

The probability that the event which updated the xG had of being drawn.
The unknown influence of independent factors on xG (styles of play, aggressiveness, attacking vs. defending, etc.).

So my hypothesis was that a displayed action between 0.01 and 0.25 xG has a 25% chance of being blue and a 75% chance of being green. Even so, the probability of seeing a displayed pair [green, blue] at all for an update at 0.01 xG would potentially be 4%, compared with 80% for an update at 0.2 xG. So we could have the following distribution of draws per 100 events:
0.01xG: 1 goal, 3 actions, 96 events with no indications
0.2xG: 20 goals, 60 actions, 20 events with no indications

On paper, the xG would therefore be respected, but the display would show that 25% of displayed chances [action, goal] result in a goal (1/4 vs 20/80). However, I don't think this ratio can be analysed because, unless I'm mistaken, it hasn't been quantified. Furthermore, we don't know how certain parameters could influence these ratios. For example, aggressiveness could very well change the split for a 0.01 xG update as follows:
0.01xG: 1 goal, 3 actions, 96 events with no indications (no aggressiveness)
Aggressiveness++: 20 goals, 20 actions, 20 yellow cards, 20 red cards, 20 fouls

The only analysis I've found that could be used to answer this question slightly is based on the figures in Misha's first table. The first thing that caught my eye was the 'Opportunities' column (shown in blue, including green and blue actu), where there are several trends that go with the narrative:

Up to 0.22xG: Gradual increase in the number of chances (as if the probability of goals increased, but also the probability of chances)
0.22-0.24: We reach a sort of threshold in the number of chances and goal conversions, which does not vary particularly.
0.25+: The number of chances increases considerably, with the conversion rate equivalent to the xG overall.

I therefore ran a purely empirical test, assuming that xG really is the probability of scoring, independently of the 25% conversion rate of the subset [goals, actions] detailed above. Dividing the number of goals by the xG of each level gives "xEvents" (the number of events expected per xG increment, including goals, actions and non-visible artefacts), shown in green. Note that this value is biased for very low xG: dividing by very small increments produces very large swings on the Y axis, and xEvents only stabilises from around 0.03 xG. Ideally, these levels should either be excluded or split into subgroups of 0.001 xG. I then simply plotted the ratio of chances to xEvents, to see empirically what share of events at each xG level are chances (in red). Empirically, we can see that:

Up to 0.25 xG, the probability of a chance is not 100%. There may be other events, outside the subset [action, goal], whose existence we don't know about. If we follow the curve, we would have xOccasion = 4xG, xBut = xG, xAction = 3xG (a chance has a 25% probability of ending up as a goal and 75% as an action), and xArtefact = 1 - 4xG (an event that is not a chance).
From 0.25 xG, every event seems to be a chance [action, goal], which corresponds to Azby and Misha's "premium" bracket. Here we would have xBut = xG (Azby/Misha analysis) and xAction = 1 - xG, with no artefact events.
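
To make the hypothesis concrete, here is a minimal Python sketch of that split. It is only my reading of the model above (goal = xG; action = 3 × xG below 0.25 and 1 − xG from 0.25; the rest is a hidden artefact), and the function names are hypothetical. It is not the engine's code.

```python
def event_probabilities(xg: float) -> dict:
    """Split one xG update into goal / action / hidden-artefact odds.

    Follows the hypothesis above: below 0.25 xG a displayed chance
    appears with probability 4 * xG; from 0.25 xG it is always displayed.
    Expects xg > 0.
    """
    p_goal = xg
    if xg < 0.25:
        p_action = 3 * xg
    else:
        p_action = 1 - xg
    p_artefact = max(0.0, 1 - p_goal - p_action)
    return {"goal": p_goal, "action": p_action, "artefact": p_artefact}


def displayed_conversion(xg: float) -> float:
    """Share of displayed chances (goal or action) that are goals."""
    p = event_probabilities(xg)
    return p["goal"] / (p["goal"] + p["action"])


for xg in (0.01, 0.10, 0.20, 0.25, 0.28, 0.32):
    print(xg, round(displayed_conversion(xg), 2))
```

Under these assumptions the displayed conversion stays at 25% up to 0.25 xG and then follows the xG itself, which is the shape Azby and Misha measured.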

The xG would therefore really represent the probability of scoring, despite the fact that an opportunity always has a 25% chance of scoring.... assuming that no parameter can be used to considerably increase the probability of generating an opportunity, without modifying the xG. And above all, it generates a feeling of gag goals all the time, because 25% of the "gagesque" chances you'll see displayed will end up as goals, without seeing the "artefact" news package.


Pierabou |

4 hours ago

Hats off to you guys for your analysis!


Galywat |

4 hours ago

brewen: This is precisely the major point I noted when reading Misha and Azby's analyses, and which I've tried to explore further.

What we have:

The raw xG of the update generated in the file (the analysis here focuses on that)

What we don't see (in the style of the bias that made us think that a foul generated xG):

The probability that the event which updated the xG had of being drawn.

The unknown influence of independent factors on xG (styles of play, aggressiveness, of...

Yes, I think it's a pretty good method. My main point was that you can't get much out of this analysis.

So yes, there's not much to analyse for updates without chances or goals, because you can't always see their xG. That said, Aymeric's file also lists the updates with fouls (which don't change xG) and shows the xG at that moment. So at the start of a match you can see which updates carried xG without a chance or goal.

Based on match starts, you can see the share of each outcome (chance / goal / no chance) by xG. It's not very detailed because the sample is small, but it does give a trend that may help some people understand xG better.

![image](https://i.imgur.com/w8K2zlv.png)


Galywat |

2 hours ago

> **Azby**: For those who really don't have the time:
>
> The engine currently appears to operate on a discrete grid of 11 xG steps (0.22 to 0.32). Below 0.25 the conversion rate is flat at 25%; above that it rises linearly to 32%.
>
> A patch in April shifted the distribution upwards: from 32% to 71% of chances greater than or equal to 0.28 xG. Result: +20% goals per match, all competitions combined.
>
> My own proposals, which are in line with...

In fact, I'm only thinking about it now because I didn't realise it last night when I read the post. But it's perfectly normal for the opportunity/goal ratio to be higher than 0.25.

If we take a 25% goal rate among displayed chances, it just means that 0.1 xG gives a 10% probability of a goal, and we can estimate a 30% probability of a (missed) chance, so 40% for goal or chance combined.

At 0.25 xG: 25% goal, 75% chance (100% for goal or chance combined).

Beyond that, of course, the share of missed chances has to shrink: for example, 0.28 xG gives you a 28% probability of a goal, but you can't have an 84% probability of a chance, so you get 72% instead (and your ratio goes from 25% to 28%). But that's just mathematics, there's no real analysis to be done.

Besides, on the graph I've put up it's pretty clear.


Galywat |

1 hour ago


Socrate |

1 hour ago

Where is @demi brain ... ? 🧐🤓

PS: Nice work guys it's strong, very strong!
PS2 : No bickering on this post please


Rull43 |

15 minutes ago

You're in luck, we're on the CP discord every day and there are several of them, they're multiplying.

I just wanted to have a mug at the start. In the end, I'm drinking alone and colouring while they chat.

