Wednesday, August 10, 2011

Cheater Cheater, Pumpkin Eater!

By the time you read this article, no doubt, you will have already heard about the ESPN Magazine story accusing the Blue Jays of stealing signs.  According to the story, last year, four unnamed visiting players witnessed a man dressed in white, sitting just to the right of the centerfield batter’s eye, raising his hands over his head to signal to the batter whenever an off-speed pitch (i.e. a non-fastball) was being thrown.  The article speculates that this man may have been receiving the signs via Bluetooth and then relaying the information to Jays hitters.

Stealing signs has long been a part of baseball.  In fact, many would suggest that if you are a runner on 2nd base and aren’t trying to steal signs, then you aren’t really trying.  But there is a palpable difference between players on the field stealing signs, and employing a third party (and perhaps additional technology) to do so.  For the record, I’m not sure how I feel about this if one presumes the allegations are true.  But my own feelings aside, this is going to be (or may already be by the time you read this) a very big story.

If the story ended there, with the allegations from the four anonymous opponents, I’m not sure anyone would care.  But the report goes on to do some analysis on the numbers from 2010, and the results are intriguing.  The one piece of information that REALLY caught my attention was the fact that the differential in home run rate at Rogers Centre last year between the home and visiting teams was the 3rd highest, in any stadium, in the last 60 years.

That’s kind of hard to ignore.

Sure, one could argue that this means that there were two other seasons where a team enjoyed an even bigger advantage, and the story doesn’t discuss who those teams were and whether there were any sign stealing allegations in those cases.  At least in this sense, I would agree with the criticism being leveled that the story is not complete yet, and more analysis should have been done.
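For anyone who wants to poke at this kind of number themselves, here is a rough Python sketch of how a home-versus-visitor home run rate differential might be computed, and how one park-season could be ranked against others.  To be clear, this is my own guess at the calculation: the story doesn’t spell out its exact method, the function names are invented for illustration, and every number below is made up rather than taken from the article.

```python
def hr_rate_differential(home_hr, home_pa, visitor_hr, visitor_pa):
    """Home team's home runs per plate appearance minus the visitors',
    both measured in the same park over the same season.
    (Assumed definition; the story may have used a different one.)"""
    if home_pa <= 0 or visitor_pa <= 0:
        raise ValueError("plate appearances must be positive")
    return home_hr / home_pa - visitor_hr / visitor_pa


def rank_among(value, all_values):
    """1-based rank of `value` among `all_values`, where 1 is the largest."""
    return 1 + sum(1 for v in all_values if v > value)


# Hypothetical park-seasons: (label, home HR, home PA, visitor HR, visitor PA).
# These are invented numbers, NOT real totals for any team.
park_seasons = [
    ("Park A", 150, 3000, 90, 3100),
    ("Park B", 120, 3050, 60, 3000),
    ("Park C", 160, 3100, 95, 3050),
    ("Park D", 100, 3000, 98, 3000),
]

differentials = {
    label: hr_rate_differential(h_hr, h_pa, v_hr, v_pa)
    for label, h_hr, h_pa, v_hr, v_pa in park_seasons
}

# Print each park-season with its differential and its rank in the sample.
for label, diff in differentials.items():
    rank = rank_among(diff, differentials.values())
    print(f"{label}: differential {diff:+.4f} HR/PA, rank {rank}")
```

Note that a ranking like this only says where a season sits relative to the others; by itself it says nothing about why the gap exists, which is exactly why the two seasons ahead of the Jays matter.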

However, some of the knee-jerk reaction on Twitter dismissing this article is perhaps not well thought out.  The article itself looks at some of the players with very large home/road splits last year in support of its theory.  As some have pointed out, a few players, including John Buck and Edwin Encarnacion, enjoyed better seasons on the road.  While this is indeed true, it doesn’t change the impact of the large-sample, whole-team data.  One would expect, even with a sign stealing advantage, that a couple of players would still hit better on the road: perhaps they weren’t involved in the scheme, perhaps they are just not well suited to hitting in the Rogers Centre, perhaps they had one or two unusual hot streaks on the road (anyone recall E5 hitting 6 home runs in a 3 game series in Arizona last year?), or perhaps it is just random statistical variance.  In any event, that does nothing to convince me the story isn’t accurate (which isn’t to say I believe it is, either).

Some have also pointed to the fact that the statistical advantage has significantly decreased in 2011, despite allegations that the sign stealing continues.  But the report also reveals that several teams, including the Yankees and Red Sox, have begun mixing their signs even without men on base.  As more teams defend against the alleged sign stealing, one would expect the advantage to diminish, and perhaps that’s what we’re seeing.

Another criticism knocks the article as being incomplete because it admits the evidence is circumstantial.  This is only sort of true: the article says the numbers, by themselves, are circumstantial, and then suggests that when taken in context with the evidence from the unnamed players, they amount to something more.  Not to get too technical here, but whether evidence is categorized as circumstantial or direct is nothing more than a label.  Circumstantial evidence (which, by the way, includes such things as DNA and other forensic evidence) can be more than compelling enough to convict even in a criminal trial.  In any event, there IS direct evidence here: the eyewitness accounts of the four players (which in my mind are far less compelling than the "circumstantial" statistical evidence).

But this is not a criminal trial.  This is the court of public opinion, and insofar as it relates to baseball, the standard of proof is far lower than beyond a reasonable doubt.  The real question is, do you believe the evidence is strong enough?  I’m not sure that I’m convinced, but I greatly look forward to what I’m sure will be countless articles that provide far more in-depth analysis to try to get at the truth.

And apparently, that begins at 3:45 when AA addresses these allegations in a press conference.  This should be interesting…


  1. The thing about stats is that, with enough of them generated, many of them are going to be outliers. The 2010 Toronto Blue Jays hit 257 home runs, which I believe is the third or fourth highest total of all time. As they play in a hitter's park, if they didn't have something close to the 3rd highest differential in the past 60 years, that would be a surprise. I'm pretty sure the 05 Rangers, 97 Mariners and 96 Orioles are also in the top ten on that home advantage list, since they round out the top 4 home run totals of all time.

  2. I'm not a huge sports fan and much of the evidence is incomprehensible to me, but what I do wonder is, even if they cheated, does it really matter? They didn't even make the playoffs. So even with a hitting advantage (earned or stolen), the result had no impact on the final outcome of the season.

  3. There's a reason why at least three years of data are used to calculate park factors...

    Correlation does not imply causation.

  4. Samantha - I really do agree with what you're saying. As I mentioned at the outset of my post, I don't know that I really care.

    Anonymous and Percy - my biggest problem with the article is also the only thing I found compelling, which is the statistical analysis. When measuring park effect, the statistics are SUPPOSED to factor out the overall offensive ability of the team (either home or road) and measure how the park itself altered the results. The problem with the statistics in the story is that we don't have enough information to really understand the source of the numbers and what they actually mean other than some fairly ambiguous language.

    I was really hoping this would spark a good deal of statistical analysis from FanGraphs, BP and other sites, but so far, the only reasonably in-depth analysis after this article was by Parkes over at Getting Blanked.

  5. James, for the stats in the article to prove anything in support of the thesis that the Blue Jays are stealing signs, we would have to assume that the Blue Jays were stealing signs in every home at-bat for every player last year.

    For example, those 2010 season stats also include the at bats where the anonymous sources say the man in white had left the game. You can't use the results of at bats where they weren't stealing signs as proof that they were.

    I just don't think it's possible for the Jays to have done that, and neither do the authors. They make a point of citing the fact that the Yankees and Red Sox are using mixed signs with no one on, the implication being that mixing signs allows you to defend against the Blue Jays' mischievous ways. This makes sense and is the exact reason why mixed signs are used when runners are on base.

    If we assume that the Jays can't "cheat" when signs are being mixed, then we also can't use the results of 2010 at-bats where mixed signs were being used (i.e. any time a runner is on base) as proof of anything.

    I agree with you that that particular piece of information REALLY catches one's attention. But when you think about it, it can't show what the authors want it to at all. So all you're really left with as proof are the anonymous sources, which I think are much weaker without the stats backing them up.

    (Aside #1: A good contrast is steroids. You could reasonably make these types of inferences about anomalous stats and steroid use, because you can generally assume that a player would be on the juice, or at least be benefitting from its effects, for every at-bat.)

    (Aside #2: I actually think the Jays probably steal signs, but do so in the same manner as everyone else, which is with runners on base.)

  6. The fact that they had the third highest ratio of home-to-visitor home runs is used to lend credence to their thesis that the Blue Jays are stealing signs. Whether we take this as an outlier in the data or as a sign that they were in fact stealing signs, what about the other two teams higher on that list?