Moneyball and the CS:GO stat problem

Can we understand CS:GO better by the numbers?

Photo via DreamHack

On May 28, Jarek “DeKay” Lewis reported that Epsilon was seeking a ludicrous $200,000 buyout for Fredrik “REZ” Sterner. In contrast, Timothy “Autimatic” Ta’s contract was sold to TSM for just $35,000 at the beginning of 2016.


As prize money, salaries, and transfer fees continue to increase in Counter-Strike: Global Offensive and across esports, we should expect the amount of time and effort spent scouting an incoming player to increase proportionately. There are simply greater financial risks. And with these ever-increasing risks, you can expect that, sooner or later (with stops and starts), someone will try to push aside experts and “the eye test” in favor of a more “objective” methodology.

In a recent interview with theScore esports, Niclas “enkay J” Krumhorn, an analyst for G2 Esports, suggested that using stats in CS:GO could already be a “useful tool” for picking up players and determining their strengths and weaknesses. “Sure, players will most likely know themselves because they are professionals,” he said. “But it’s something else if you hear it from ‘just an analyst’ or if you actually have numbers backing this up, because numbers don’t lie.”

It’s understandable why stats have gained such reverence in CS:GO. Unlike a MOBA such as League of Legends, where teamfights are infrequent and laners are engaged in a more continuous struggle, in CS:GO a series of fights and duels takes place every round, and a broader sequence of rounds makes up the larger mosaic of the match. Numbers and statistics help us make sense of the large number of interactions between players that happen on each map.

But numbers do lie, even when you don’t torture them. Currently, stats in CS:GO are largely presented “raw,” without any further weighting or context. For example, if an unarmored player with a Glock somehow kills an armored player with an AK-47 and a full grenade set, that is counted as one kill, 100 damage. The reverse is also true. If a fully bought player easily mows down an ecoing player, it’s counted exactly the same: one kill, 100 damage. Now, some stats do help contextualize these kills. HLTV might tell you whether the kill was a headshot, or whether it was the first kill of the round, or something else along these lines. But the difficulty and context of each kill are still largely unexamined.

Kills, deaths, and damage-based statistics would be more informative if they were weighted with respect to equipment differential, the difficulty of the duel, and the strength of the opponent. Fifteen eco frags scored against Echo Fox shouldn’t be counted exactly the same as 15 kills found in gun rounds against Astralis. The difficulty of the kill or damage done aside, these statistics could also be examined and weighted with respect to their effect on the larger game. If the bomb is three seconds away from going off on the B site in the 15th round of a half, a duel on the A site has no discernible effect on the game, so it should not be counted equally with a high-impact kill, such as a Terrorist eliminating a solo site defender mid-round.
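To make the idea concrete, here is a minimal sketch of what such a weighting might look like. The function, the equipment values, and the clamping constants are all assumptions invented for illustration; nothing here is an established CS:GO metric.

```python
# Hypothetical sketch: weight a single kill by equipment differential and
# opponent strength. All weights and thresholds are illustrative assumptions.

def weighted_kill_value(killer_equip: int, victim_equip: int,
                        opponent_rating: float) -> float:
    """Return a kill value scaled by how 'hard' the kill was.

    killer_equip / victim_equip: round equipment value in dollars.
    opponent_rating: crude strength of the opposing team (1.0 = average).
    """
    # Killing a fully bought player while under-equipped counts for more;
    # mowing down an eco counts for less. Clamp so one kill never counts
    # as more than two raw kills or less than a quarter of one.
    equip_factor = victim_equip / max(killer_equip, 1)
    equip_factor = min(max(equip_factor, 0.25), 2.0)
    return equip_factor * opponent_rating

# A pistol-and-armor player (~$1,000 of gear) killing a full buy (~$4,700)
# against a strong opponent is worth far more than the reverse.
print(weighted_kill_value(1000, 4700, 1.15))  # ~2.3
print(weighted_kill_value(4700, 1000, 0.85))  # ~0.21
```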

But perhaps simply talking about “good” stats versus “bad” stats, or weighted stats versus non-weighted stats, misses the point entirely. If you watched the very Hollywood movie take on Michael Lewis’s Moneyball (starring Brad Pitt as the Oakland A’s GM Billy Beane), you would probably walk away thinking that the A’s were simply using better metrics, such as on-base percentage, to scout and evaluate players and, in turn, understand their price tags. But that’s only the surface. Yes, some stats are certainly better at capturing a player’s contribution than others, but most of these numbers focus solely on outcomes and results, which obviously misses most of the action.

Let’s say a fastball is hit hard 320 feet down the left field foul line. Is it a home run? At Minute Maid Park the answer is yes, but that’s not the case at the slightly larger Chase Field. Depending on the size of the stadium, the weather conditions, and the opposing defense, a hit could be recorded as a home run, an out, or anything in between, yet the stat is recorded as the product of the hitter alone. Instead of adopting the age-old, ultra-simplistic adage that “it all evens out in the end,” Billy Beane and the A’s front office adopted a radical new approach: they stopped using stats altogether, at least in the traditional sense. In the book, Michael Lewis explains how the A’s instead employed the services of AVM Systems or Stats Inc. to better capture the game without using traditional outcome-dependent statistics. Instead of trying to create better and better versions of certain statistics, such as RBIs or batting average, AVM’s system recorded every event that occurred on the field and evaluated each with an “expected run value.”

With zero men on base and zero outs, the offense can expect a certain number of runs in an inning based on historical statistics (the exact number ranges from 0.45 to 0.55 depending on the dataset and methodology). Let’s say a hit with a certain velocity and trajectory lands at point No. 115 on the grid as defined by AVM. Looking at 2,000 essentially identical hits, the system could say that this specific hit will be a double 100 percent of the time. If we know the value of having no outs and no men on base is 0.5, and the expected value of having a man on second with no outs is 1.2, then we know the exact “value” of this hit: 0.7 runs (1.2 - 0.5 = 0.7).
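In code, that arithmetic is nothing more than a lookup table and a subtraction. A toy version might look like this; the table values are illustrative placeholders, not numbers from AVM or any real dataset.

```python
# Toy run-expectancy table: (base state, outs) -> expected runs this inning.
# Values are illustrative only.
RUN_EXPECTANCY = {
    ("empty", 0): 0.50,   # bases empty, no outs
    ("second", 0): 1.20,  # runner on second, no outs
}

def hit_value(before: tuple, after: tuple) -> float:
    """Expected runs created by moving the game from one state to another."""
    return RUN_EXPECTANCY[after] - RUN_EXPECTANCY[before]

# A hit that always turns "bases empty, no outs" into "runner on second,
# no outs" is worth about 0.7 expected runs, regardless of what happens next.
print(hit_value(("empty", 0), ("second", 0)))  # ~0.7
```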

Now, if the hit has a chance of being caught for an out, the expected value might be lower, say 0.35, but whether the fielder gets to it or not doesn’t really matter when evaluating the batter. If the fielder happens to make a phenomenal catch, he gets credit for blocking 0.35 runs, but the hitter still gets credit for those 0.35 runs, and the pitcher is still debited 0.35. Striking out credits the batter with a negative fraction of a run, while the pitcher gets a matching positive number. Hitting an easily caught pop fly has more or less the same effect. Stealing a base adds expected value for the runner; picking him off adds positive value for the catcher or pitcher, and so on. At the end of the season, a player can be evaluated by how many runs he created or prevented. While not perfect, it’s a fair system that takes a lot of luck and circumstance out of the equation.
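Extending the same sketch, the season-long bookkeeping is just a running ledger of those credits and debits. The player names, the event function, and the run values below are made up purely for illustration.

```python
# Hypothetical ledger: every event credits or debits the players involved
# with the change in expected runs, whether or not the play "worked out".
from collections import defaultdict

runs_created = defaultdict(float)

def record_event(batter, pitcher, delta, fielder=None, saved=0.0):
    """Credit the batter and debit the pitcher with the expected-run change;
    optionally credit a fielder for the expected runs he took away."""
    runs_created[batter] += delta
    runs_created[pitcher] -= delta
    if fielder is not None:
        runs_created[fielder] += saved

# A hard-hit ball worth 0.35 expected runs, caught by a great play: the
# hitter still gets +0.35, the pitcher -0.35, the fielder +0.35 saved.
record_event("hitter", "pitcher", 0.35, fielder="left fielder", saved=0.35)
# A strikeout is a negative fraction of a run for the batter.
record_event("hitter", "pitcher", -0.27)

print(dict(runs_created))
```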

This is how stats should work in CS:GO. Looking just at who won each duel is far too simplistic. Unlike baseball, which mostly functions as a series of one-on-ones between batter and pitcher, CS:GO is a free-flowing five-on-five game made up of thousands of tiny movements and decisions each round that lead up to those key confrontations. Players use sound cues, information plays, and deductive reasoning to figure out the positioning of the other side, while the opposing side is doing the exact same thing in the complex but crucial dance of the mid-round. None of that shows up on a stat sheet. And the duels themselves are not simply a two-player stare-down where the player who clicks first wins. There is crosshair placement, positioning, movement, and spray control, all made more complicated by the use of utility and the presence of teammates. Additionally, there are broader, more macro concepts concerning the economy and the pick and ban phase. It’s a complicated game; reducing it to a series of outcomes feels far too reductive.

Instead, what if, like baseball, every single action of a player were evaluated with respect to its effect on the game? How does waiting for a push positively affect your team’s chances of winning the round or winning the map? What’s the value of spamming this smoke? What’s the value of throwing this flashbang? And what’s the negative value of not lining up a smoke correctly, or of not checking a certain corner during the execute?
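At its core, such a system would be doing nothing fancier than measuring how much each action moved the team’s win probability. A hypothetical sketch, with made-up probabilities and no claim about how those probabilities would actually be estimated, might look like this.

```python
# Hypothetical sketch: the value of an action is the change in the team's
# round-win probability it produced. The probabilities here are invented.

def action_value(win_prob_before: float, win_prob_after: float) -> float:
    """Value of an action = how much it moved the team's round-win chances."""
    return win_prob_after - win_prob_before

# A well-placed flashbang that raises a T-side execute's round-win chance
# from 48 percent to 57 percent is "worth" +0.09.
print(round(action_value(0.48, 0.57), 2))  # 0.09

# A missed smoke lineup that drops the round-win probability gets debited.
print(round(action_value(0.52, 0.44), 2))  # -0.08
```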

I don’t know how a program would be able to calculate all of these events, and I don’t even know if creating such a system is possible. But this process of evaluation is exactly what we already do on our own when we watch a player’s demo, review a VOD, or just watch the match. It’s the eye test. We see a series of good and bad plays and try to evaluate who’s performing well, who isn’t, and by how much. Those mental calculations are subject to a whole host of human cognitive biases and hampered by our limited biological faculties, but I don’t think we have crossed, or even come near crossing, the Deep Blue barrier implicitly projected forward by Michael Lewis’s Moneyball.

Billy Beane and his sabermetric cohorts would argue that baseball is better understood by the numbers. To them, ball- and player-tracking technology, both then and now, understands the game better than any human can. But we’re not there yet in CS:GO. That’s not to say that CS:GO statistics currently have no illustrative power whatsoever. They do a good job of giving a broad overview of a player’s output. But, right now, they still can’t tell the difference between a single and a home run, let alone the velocity and trajectory of the moon ball that might just clear the fence. So what can they really teach us?

For my money, the movement of corked balls, wooden bats, and men is better understood by computers, while this computer game is somehow still better interpreted by man.
