Dynamics for the following provided here and here
Any rigorous statistical evaluation of a wide array of factors will obviously include variables that directly impact other variables. Imagine a team executing an offensive scheme that calls for only a dozen rushing plays per game; the rest of the plays have to come from somewhere, most likely the forward pass. Rushing attempts and passing attempts therefore have a fairly strong inverse relationship.
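To make the rushing and passing trade-off concrete, here is a minimal sketch on synthetic data. The 70 plays per game, the range of rushing attempts, and the noise are all assumptions for illustration, not real team numbers.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(0)   # fixed seed so the sketch is repeatable
PLAYS_PER_GAME = 70      # assumed, roughly constant play count

# Hypothetical game logs: rushing attempts vary by scheme, and passing
# attempts fill the remainder of the plays, give or take a few snaps.
rush_att = [rng.randint(12, 50) for _ in range(200)]
pass_att = [PLAYS_PER_GAME - r + rng.randint(-3, 3) for r in rush_att]

r_rush_pass = pearson(rush_att, pass_att)
print("rush vs. pass attempts:", round(r_rush_pass, 3))
```

Under these assumptions the coefficient comes out strongly negative, which is the inverse relationship described above.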
Where these relationships break down is when variables are thrown into a multi-correlated setting (multicollinearity, in regression terms). Several variables can each correlate highly with one particular variable when isolated, yet lead to an incompatible aggregation when evaluated together.
I’ve discovered that this thoroughly unwelcome behavior occurs whenever an array of college football factors is forced to interact in explaining how a line is set.
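One common way to put a number on this problem is the variance inflation factor (VIF). The sketch below uses made-up data in which two hypothetical predictors, `ypp_diff` and `ypa_diff`, both track a hidden team-strength value; the names, scales, and noise levels are assumptions. With only two predictors, the VIF reduces to 1 / (1 - r²), where r is the correlation between them.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

rng = random.Random(1)
n = 300

# Synthetic, illustrative data only: a latent "strength" drives everything.
strength = [rng.gauss(0, 1) for _ in range(n)]
ypp_diff = [s + rng.gauss(0, 0.2) for s in strength]     # yards/play differential
ypa_diff = [s + rng.gauss(0, 0.2) for s in strength]     # yards/attempt differential
avg_line = [10 * s + rng.gauss(0, 2) for s in strength]  # average line

r_ypp_line = pearson(ypp_diff, avg_line)
r_ypa_line = pearson(ypa_diff, avg_line)
r_between = pearson(ypp_diff, ypa_diff)

# Two-predictor case: VIF = 1 / (1 - r^2) for either predictor.
vif = 1.0 / (1.0 - r_between ** 2)

print("ypp vs. line:", round(r_ypp_line, 3))
print("ypa vs. line:", round(r_ypa_line, 3))
print("ypp vs. ypa :", round(r_between, 3), " VIF:", round(vif, 1))
```

Each predictor looks excellent on its own, but the large VIF says the two carry nearly the same information, so a model asked to weigh both has little basis for splitting the credit.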
Below are the highest average line correlation coefficients for each year from 2002-2009, mostly using the offense to defense differentials:
One very encouraging thing is that some of the coefficients above can be seen as measures of efficiency. Yards per attempt, yards per play, yards per point, and completion percentage all have connotations of effectiveness. This is good.
Where and why might these inconsistencies of significance occur? To simplify, not every team uses the same offensive or defensive philosophy. Let’s use Texas Tech and Army, diametric opposites in terms of offensive style. For Texas Tech, passing offense is far more crucial to team wins and team performance, whereas Army’s option-style offense involves very little passing, much more rushing, and unfortunately far fewer wins. This is just one example, but teams throughout the league all have their own styles, offensively and defensively, so different measures of performance carry more weight for different teams.
When a set of independent variables is mixed into a system to estimate the value of one single variable (average line), the aggregation undergoes an intense struggle for explanatory superiority. The weight assigned to each explanatory factor is easily shifted by the inclusion of one other variable that is highly correlated with it, and in this scenario no combination of variables strikes a delicate balance that would provide a nice equilibrium. The one constant that preserves its level of consistency is PPG differential, which is very unsettling since it’s overly simple, when complexity is what whets the appetite.
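The struggle for explanatory superiority can be sketched with a small least-squares example. Everything here is synthetic and assumed: `ppg_diff` drives the line, and `ypp_diff` is just a noisy echo of `ppg_diff`. The helper functions are hand-rolled for illustration and are not taken from any library.

```python
import random

def centered(values):
    m = sum(values) / len(values)
    return [v - m for v in values]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def slope_one(x, y):
    """OLS slope of y on a single predictor x (intercept included)."""
    xc, yc = centered(x), centered(y)
    return dot(xc, yc) / dot(xc, xc)

def slopes_two(x1, x2, y):
    """OLS slopes of y on x1 and x2 jointly, via the 2x2 normal equations."""
    a, b, c = centered(x1), centered(x2), centered(y)
    s11, s22, s12 = dot(a, a), dot(b, b), dot(a, b)
    s1y, s2y = dot(a, c), dot(b, c)
    det = s11 * s22 - s12 * s12
    return (s22 * s1y - s12 * s2y) / det, (s11 * s2y - s12 * s1y) / det

rng = random.Random(3)
n = 400

# Synthetic, illustrative data: only ppg_diff truly drives the line.
ppg_diff = [rng.gauss(0, 10) for _ in range(n)]
ypp_diff = [0.1 * p + rng.gauss(0, 0.3) for p in ppg_diff]
avg_line = [0.9 * p + rng.gauss(0, 3) for p in ppg_diff]

b_ypp_alone = slope_one(ypp_diff, avg_line)
b_ppg_alone = slope_one(ppg_diff, avg_line)
b_ppg_joint, b_ypp_joint = slopes_two(ppg_diff, ypp_diff, avg_line)

print("ypp slope alone :", round(b_ypp_alone, 2))
print("ypp slope joint :", round(b_ypp_joint, 2))
print("ppg slope alone :", round(b_ppg_alone, 2))
print("ppg slope joint :", round(b_ppg_joint, 2))
```

In this setup the yards-per-play slope looks large when fitted alone and largely collapses once PPG differential enters the model, while the PPG slope barely moves. That is the same pattern described above, reproduced under assumptions chosen to show it.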
Again with Texas Tech, passing statistics easily dominate rushing statistics on the scale of significance, yet the opposite is true for other teams. And this isn’t just true for passing and rushing metrics. Imagine a team with immaculate special teams play, a horrible offense, and a solid defense, or in other words, Virginia Tech. Virginia Tech is often ranked outside the top 100 in all offensive numbers. However, their superb special teams and overall defense account for a solid showing year in and year out. (I should note that special teams statistics throughout the league as a whole are not highly correlated with average line or winning percentage.) When their statistics are run through a regression model, though, very large inconsistencies materialize, and extrapolating an average line from highly correlated statistics offers zero indication of what the linesmakers think of the Hokies, what the line shows, or even how good the Hokies are. Now we have reached a point where offense and defense are approaching a mathematical singularity. Therefore it would seem that regressions, or other tests of determination of the average line variable, would have to belie the actual philosophy and makeup of each individual team. Or perhaps I should sort the teams into categories based on style of play and offensive and defensive schemes. This would in turn breed smaller sample sizes, so we’ve disappointingly arrived at a paradoxical bind.
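The sample-size cost of sorting teams by style is easy to see in a rough sketch. The 120-team pool, the pass-share values, and the 40% and 55% cutoffs below are all assumptions picked for illustration; real categories would need real play-calling data.

```python
import random
from collections import defaultdict

def style_bucket(pass_share, low=0.40, high=0.55):
    """Label a team by the share of its plays that are passes (assumed cutoffs)."""
    if pass_share < low:
        return "run-heavy"
    if pass_share > high:
        return "pass-heavy"
    return "balanced"

rng = random.Random(2)
N_TEAMS = 120  # assumed size of the team pool

# Hypothetical teams with made-up pass shares between 20% and 75%.
teams = [
    {"team": f"team_{i:03d}", "pass_share": rng.uniform(0.20, 0.75)}
    for i in range(N_TEAMS)
]

buckets = defaultdict(list)
for t in teams:
    buckets[style_bucket(t["pass_share"])].append(t["team"])

group_sizes = {style: len(members) for style, members in buckets.items()}
for style, size in sorted(group_sizes.items()):
    print(f"{style:10s} n = {size}")
```

Any regression fitted within one bucket sees only that bucket's teams, so every additional style category buys a better fit to philosophy at the price of fewer observations per model.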
More at length.