Pitch F/X with PHP and Octave
Posted by Rufio Magillicutty in MLB, Octave, PHP, PitchFX on June 8, 2012
Generally, sabermetric nerds use R (or Excel lmfao) to indulge their desire to waste time. I, however, prefer Octave, an open-source alternative to MATLAB. It's more intellectually stimulating than R, in my opinion. By that I mean Octave has no built-in mechanism for downloading thousands of user-submitted functions and scripts the way R does (believe it or not, there is even a Pitch F/X library to use with R). There are degrees of laziness, and while I fall in the upper percentile, people who use R really would rather remain rigid regarding routes to regale their ridiculous yet reasonably remunerative run value research.
Actually, the primary reason I’m using Octave is that it's the program of choice for ML-Class. It's also more syntactically convenient.
This isn’t important. What is important, however, is something that is very easy to implement. The motivation behind using PHP with Octave is the lack of MySQL compatibility in Octave. I was having trouble getting it to work, as have others, judging by a Google search. Fortunately, one can call Octave from PHP, or open up a shell to run essentially anything, with:
system('path/to/bin path/to/file');
Visit php.net for more info.
For Octave users on Linux (Windows users may have to specify the full path to Octave):
system('octave -q path/to/file');
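If the script path might contain spaces, or you want to know whether Octave actually ran, a slightly more defensive version of the call above could look like the sketch below. The helper names build_octave_command and run_octave are my own, and it assumes 'octave' is on the PATH unless a full path is passed in.

```php
<?php
// Build the shell command for running an Octave script.
// $octaveBin is an assumption: 'octave' must be on the PATH, otherwise
// pass the full path to the binary (as Windows users may need to).
function build_octave_command($scriptPath, $octaveBin = 'octave')
{
    // escapeshellarg() quotes the path so spaces or shell
    // metacharacters in it cannot break the command.
    return $octaveBin . ' -q ' . escapeshellarg($scriptPath);
}

// Run the script; return array(exit status, captured output lines).
function run_octave($scriptPath, $octaveBin = 'octave')
{
    $output = array();
    $status = 0;
    // '2>&1' folds Octave's error messages into the captured output.
    exec(build_octave_command($scriptPath, $octaveBin) . ' 2>&1', $output, $status);
    return array($status, $output);
}
```

Something like list($status, $out) = run_octave('pfx_octave.m'); then lets the PHP side check for a non-zero status and print $out instead of failing silently.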
Octave script files use the extension “.m”, and in PHP we can easily write to a file and save as “filename.m”:
$octave_file = fopen('octave_file.m','w');
fwrite($octave_file,"This is octave code to do stuff really cool");
fclose($octave_file);
This is elementary, though vital, PHP code.
I’m going to assume the reader has a Pitch F/X database already. If not, grab my script at github here (it's set up to grab from yesterday; comment out that section and change the start and end dates), or download here (pbp2.zip).
Octave is going to be used for all the math, so the Octave script will include code to open a data file, a file that is initialized in the PHP code:
$file = 'pfx_data.txt';
if (file_exists($file)) unlink($file);
$f = fopen($file,'a');
It is important that ‘unlink’ is called and that the file is opened in ‘a’ (append) mode. Each time the PHP code loops through the query results, a row is appended to the file, and each time the PHP code is run, the old file is deleted before anything is written to it.
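To see the delete-then-append behavior in isolation, here is a small sketch that runs against a throwaway temp file rather than real data. open_fresh is a hypothetical helper name, not something from my script.

```php
<?php
// Remove any data file left over from a previous run, then open it
// for appending. open_fresh() is a hypothetical helper name.
function open_fresh($path)
{
    if (file_exists($path)) {
        unlink($path);           // start from an empty file
    }
    return fopen($path, 'a');    // 'a' creates the file if it is missing
}

// Demonstration on a throwaway temp file rather than real data.
$demo = tempnam(sys_get_temp_dir(), 'pfx');
file_put_contents($demo, "stale line from an earlier run\n");

$handle = open_fresh($demo);
fwrite($handle, "first row\n");
fwrite($handle, "second row\n");  // appended after the first row
fclose($handle);

$contents = file_get_contents($demo);
unlink($demo);                    // clean up the temp file
```

The stale line is gone and both new rows survive, which is the combination the loop further down relies on.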
After connecting to a database, here is a rather dull example query:
$query = "SELECT p.px, p.pz,
    SUM(IF(p.event IN ('Single', 'Double', 'Triple', 'Home Run'),1,0))/SUM(IF(p.event NOT LIKE '%Sac %',1,0)) AS BA
  FROM pitches AS p
  JOIN (SELECT id FROM batters WHERE name_display_first_last LIKE '%$player_name%' LIMIT 1) AS t ON t.id = p.ab_id
  WHERE LEFT(p.gameName,4) BETWEEN ".$yr_start." AND ".$yr_end."
    AND p.type = 'X' AND p.px IS NOT NULL
  GROUP BY p.px, p.pz";
This is extracting a specified player’s contact Batting Average. (Different from BABIP, which excludes Home Runs, hence the phrase “Balls in Play.” Any sort of “Sacrifice” will be undefined, and ignored below.)
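Written out, the ratio the query computes for each (px, pz) group is roughly the following. The symbols are my own labels, not column names: the N terms in the numerator count events of each hit type, N_X counts pitches with type 'X', and N_Sac counts events matching '%Sac %'.

```latex
\[
\mathrm{cBA} = \frac{N_{1B} + N_{2B} + N_{3B} + N_{HR}}{N_{X} - N_{Sac}}
\]
```

When a group contains nothing but sacrifices, the denominator is zero and MySQL returns NULL for that row.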
Then, on each pass through the loop, the current row of query results is appended to the file “pfx_data.txt”.
$result = mysql_query($query);
while ($row = mysql_fetch_assoc($result)) {
    // Skip rows where BA is NULL (nothing but sacrifices at that location)
    if ($row['BA'] === null || $row['BA'] === '') continue;
    fwrite($f, $row['px'].','.$row['pz'].','.$row['BA']."\n");
}
mysql_free_result($result);
mysql_close($connection);
fclose($f);
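The row-writing step can be tried without a database. The sketch below pulls it into a function and feeds it a few made-up rows shaped like mysql_fetch_assoc() output. The function name and the sample values are illustrative only, and an in-memory stream stands in for pfx_data.txt.

```php
<?php
// Write one "px,pz,BA" line per row, skipping rows whose BA is NULL.
// Returns the number of lines written.
function write_pitch_rows($handle, $rows)
{
    $written = 0;
    foreach ($rows as $row) {
        if ($row['BA'] === null || $row['BA'] === '') {
            continue;            // nothing but sacrifices at this location
        }
        fwrite($handle, $row['px'] . ',' . $row['pz'] . ',' . $row['BA'] . "\n");
        $written++;
    }
    return $written;
}

// Made-up rows for illustration only, shaped like mysql_fetch_assoc() output.
$rows = array(
    array('px' => '-0.512', 'pz' => '2.310', 'BA' => '1.0000'),
    array('px' => '0.204',  'pz' => '1.875', 'BA' => null),
    array('px' => '0.733',  'pz' => '3.020', 'BA' => '0.0000'),
);

// php://memory stands in for the real data file.
$mem = fopen('php://memory', 'w+');
$count = write_pitch_rows($mem, $rows);
rewind($mem);
$csv = stream_get_contents($mem);
fclose($mem);
```

The comma-separated layout matters because Octave's load() reads the file straight into a three-column matrix.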
I’ll provide a file so one can decipher all the Octave code themselves. For now, here’s an example plot in Octave (default is the gnuplot graphing library):
data=load('".$file."');
X = data(:, [1, 2]); y = data(:, 3);
hits=find(y==1); outs=find(y==0);
plot(X(hits,1),X(hits,2),'rx','MarkerSize',7)
hold on
plot(X(outs,1),X(outs,2),'bo')
title('".$player_name." pitch position on cBA (".$yr_start." - ".$yr_end.")');
set(gca,'xlim',[-2,2]); set(gca,'ylim',[0,5]);
legend('hits','outs');
xlabel('Horizontal position (ft)'); ylabel('Vertical position (ft)');
text(-1.8,4.8,['N = ',num2str(size(data,1))]);
text(-1.8,4.5,['hits = ',num2str(length(hits)),', outs = ',num2str(length(outs))],'fontsize',6)
Obviously, some of the variables are PHP assignments, and wouldn’t be used in Octave. In PHP, just condense what would have been written in a script or in the Octave terminal into the call to ‘fwrite’, overwriting any text that was there before:
$file1 = 'pfx_octave.m';
$f1 = fopen($file1,'w');
fwrite($f1,"data=load('".$file."');\nX=data(:,1:2); y=data(:,3);\nhits=find(y==1); outs=find(y==0);\nplot(X(hits,1),X(hits,2),'rx','MarkerSize',7)\nhold on;\nplot(X(outs,1),X(outs,2),'bo')\ntitle('".$player_name." pitch position on cBA (".$yr_start." - ".$yr_end.")'); set(gca,'xlim',[-2,2]); set(gca,'ylim',[0,5]);\nlegend('hits','outs'); xlabel('Horizontal position (ft)'); ylabel('Vertical position (ft)'); text(-1.8,4.8,['N = ',num2str(size(data,1))]); text(-1.8,4.5,['hits = ',num2str(length(hits)),', outs = ',num2str(length(outs))],'fontsize',6)\n");
fclose($f1);
(The ‘\n’ character is just to make ‘pfx_octave.m’ look pretty. One can still use the file after closing PHP.)
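One long concatenated string gets hard to read quickly. As an alternative sketch, the script text can be assembled from an array of lines, with a small helper that escapes single quotes so a name like O'Brien doesn't break the Octave title() call. The function names and the sample player name are my own inventions. The Octave commands are the same ones used in this post.

```php
<?php
// Quote a PHP string as an Octave single-quoted string literal.
// Inside single quotes Octave treats '' as one literal quote.
function octave_quote($s)
{
    return "'" . str_replace("'", "''", $s) . "'";
}

// Build the plotting script text. Function and parameter names are
// illustrative; the Octave commands are the ones used in this post.
function render_plot_script($dataFile, $playerName, $yrStart, $yrEnd)
{
    $title = $playerName . ' pitch position on cBA (' . $yrStart . ' - ' . $yrEnd . ')';
    $lines = array(
        'data = load(' . octave_quote($dataFile) . ');',
        'X = data(:, 1:2); y = data(:, 3);',
        'hits = find(y == 1); outs = find(y == 0);',
        "plot(X(hits,1), X(hits,2), 'rx', 'MarkerSize', 7);",
        'hold on;',
        "plot(X(outs,1), X(outs,2), 'bo');",
        'title(' . octave_quote($title) . ');',
        "set(gca, 'xlim', [-2,2]); set(gca, 'ylim', [0,5]);",
        "legend('hits', 'outs');",
    );
    return implode("\n", $lines) . "\n";
}

// Hypothetical player name, chosen only to show the quote escaping.
$script = render_plot_script('pfx_data.txt', "J. O'Brien", 2010, 2011);
```

Passing $script to file_put_contents('pfx_octave.m', $script) would then replace the fopen/fwrite/fclose trio.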
The result is something like this:
I threw together a virtual strike zone and some hot and cold zones (TY to Mike Fast for his regularized strike zones, the file at the bottom provides code for LHB and RHB strike zone deviations):
hold on
plot([-1,1],[1.5,1.5],'k--')
plot([-1,1],[3.5,3.5],'k--')
plot([-1,-1],[1.5,3.5],'k--')
plot([1,1],[1.5,3.5],'k--')
plot([-1,1],[1.5+2/3,1.5+2/3],'k--')
plot([-1,1],[1.5+4/3,1.5+4/3],'k--')
plot([-1+2/3,-1+2/3],[1.5,3.5],'k--')
plot([-1+4/3,-1+4/3],[1.5,3.5],'k--')
hold off
To calculate and plot hot/cold zones:
z = []; x = []; x1 = []; x2 = [];
for i = 1:3
  x=find(X(:,1)>=-1 & X(:,1)<=-1+2/3 & X(:,2)>=1.5+(i-1)*2/3 & X(:,2)<=1.5+i*2/3);
  x1=find(X(:,1)>=-1+2/3 & X(:,1)<=-1+4/3 & X(:,2)>=1.5+(i-1)*2/3 & X(:,2)<=1.5+i*2/3);
  x2=find(X(:,1)>=-1+4/3 & X(:,1)<=1 & X(:,2)>=1.5+(i-1)*2/3 & X(:,2)<=1.5+i*2/3);
  z=[mean(y(x,1)) mean(y(x1,1)) mean(y(x2,1)) z];
end
hold on
text(-.8,3.5-1/3,num2str(z(1),3)); text(-1+2/3+.2,3.5-1/3,num2str(z(2),3)); text(1-2/3+.2,3.5-1/3,num2str(z(3),3));
text(-.8,2.5,num2str(z(4),3)); text(-1+2/3+.2,2.5,num2str(z(5),3)); text(1-2/3+.2,2.5,num2str(z(6),3));
text(-.8,1.5+1/3,num2str(z(7),3)); text(-1+2/3+.2,1.5+1/3,num2str(z(8),3)); text(1-2/3+.2,1.5+1/3,num2str(z(9),3));
hold off
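For a quick sanity check of the zone numbers on the PHP side, the same 3x3 binning can be sketched as below. It is not a replacement for the Octave code, and it differs in one respect: each pitch lands in exactly one zone, whereas the >= / <= tests above count a pitch sitting exactly on an interior line in both neighboring zones. Function names are my own.

```php
<?php
// Map a pitch location (feet) to a zone number 1..9, counted left to
// right, top row first, or null when the pitch is outside the zone.
// Zone edges follow the post: x in [-1, 1], z in [1.5, 3.5].
function zone_index($px, $pz)
{
    if ($px < -1 || $px > 1 || $pz < 1.5 || $pz > 3.5) {
        return null;
    }
    $cell = 2 / 3;
    // min(2, ...) keeps points on the far edge in the last column/row.
    $col = min(2, (int) floor(($px + 1) / $cell));
    $row = min(2, (int) floor(($pz - 1.5) / $cell));   // 0 = bottom row
    return (2 - $row) * 3 + $col + 1;
}

// Average the third column (BA) of each [px, pz, BA] row by zone.
// Zones with no pitches get null (Octave's mean() would give NaN).
function zone_averages($rows)
{
    $sum = array_fill(1, 9, 0.0);
    $n   = array_fill(1, 9, 0);
    foreach ($rows as $r) {
        $z = zone_index($r[0], $r[1]);
        if ($z === null) {
            continue;
        }
        $sum[$z] += $r[2];
        $n[$z]++;
    }
    $avg = array();
    for ($z = 1; $z <= 9; $z++) {
        $avg[$z] = $n[$z] > 0 ? $sum[$z] / $n[$z] : null;
    }
    return $avg;
}
```

The zone numbering matches the order of z(1) through z(9) in the Octave snippet: top row first, left to right.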
Here is a multi-plot figure with a surface plot of Jeter’s data:
Rick Reed’s strike zone from 2008-2011:
Just an example of some of the things Octave can do. Copy the code directly into the PHP file and insert into a variable or directly into ‘fwrite’ and the Octave script can be opened directly from PHP.
All of this should work fine with MATLAB, or even FreeMat (another open-source alternative).



