Sports Options Live Odds

Extracting Data/Stats from StatFox for College Football


College Football season is shortly finding its way into our interests once again, the season is only about ten weeks away.  For me, this spells chaos since I suck at betting college football.  And I’m determined to find a way to get better, or at least be somewhat serviceable.  I need to start building a database.

How does one do this without going step by step, page by page copy and paste or the excel table import function, taking hours and even days to get all the information you want?

Loops.

Open excel and record a macro of yourself extracting a web table using the built in function(Data>>Source>>From Web).  Once the data is imported, stop the macro and briefly survey the VBA code.  You’ll get a good idea how the code is formatted, and the objects/properties involved in processing information by way of macro.

Find a nice list of teams from somewhere, preferably Statfox so as to remain consistent with the pages to be included.  Fortunately enough Statfox formats their page links with identical parameters surrounding the name of the team.  All spaces in the team names are simply replaced with “+”.

After having put a list of teams together in excel, replacing any spaces with “+”, name the range whatever, maybe something like “teams”, and stick a FOR…NEXT loop around the VBA code.  Then wait.

Sub GetDataFromStatfox()
'
' Grabbing CFB Data from Statfox Macro
'
'
Dim i As Integer
Dim cnstr As String
i = 1
For i = 1 To Range("teams").Rows.Count
cnstr = "URL;http://www.statfox.com/cfb/gamelog~teamid~" & Range("teams").Cells(i, 1).Value & ".htm"
Sheets.Add.Name = Range("teams").Cells(i, 1).Value
With ActiveSheet.QueryTables.Add(Connection:=cnstr, Destination:=Range("$A$1"))
.Name = Range("teams").Cells(i, 1).Value
.FieldNames = True
.RowNumbers = False
.FillAdjacentFormulas = False
.PreserveFormatting = True
.RefreshOnFileOpen = False
.BackgroundQuery = True
.RefreshStyle = xlInsertDeleteCells
.SavePassword = False
.SaveData = True
.AdjustColumnWidth = True
.RefreshPeriod = 0
.WebSelectionType = xlEntirePage
.WebFormatting = xlWebFormattingNone
.WebPreFormattedTextToColumns = True
.WebConsecutiveDelimitersAsOne = True
.WebSingleBlockTextImport = False
.WebDisableDateRecognition = True
.WebDisableRedirections = False
.Refresh BackgroundQuery:=False
End With
ActiveWindow.SmallScroll Down:=0
Next i
End Sub

My computer is piss poor therefore it takes about 3-5 minutes for the process to arrive at its completion.  A fast computer should complete the task within 20 or 30 seconds.  Now if you want to take it a step further and extract more data (a severe invasive extraction, sorry statfox), maybe from previous years, or get all stats from 1997 to 2009 (Statfox only goes back to 1997), then what do you do?  Stick another FOR…NEXT loop in there and this time, you want to add new workbooks as well as new worksheets within those workbooks.

Dim yearstart as Integer
Dim year end as Integer
Dim k as Integer
Dim wbname as String
 
yearstart = 1997
yearend = 2009
k = 0
For k = 0 to (yearend - yearstart)
wbname = yearstart + k & ".xls"
Workbook.Add
ActiveWorkbook.SaveAs Filename: = wbname , FileFormat:=Normal _
ReadonlyRecommended:=False, Createbackup:=False
Windows(wbname).Active
'Code from above
Next k

Once again you can even take it a step further and write a macro that only finds the rows that contain a particular piece of data. Most games that will be displayed on the statfox page will correspond with a date. The macro elements ‘IsDate’ or ‘InStr(string1, string2, date)’, or ‘IsNumber’ if you just want the line, wrapped around an If…Then statement within the inner FOR…NEXT loop, and whatever information or set of statistics that you want to operate with can be manipulated.

  • Share/Bookmark

Related posts:

  1. Scraped Pinnacle Baseball Lines via PHP
  2. Sabermetric Pitching Stats and Correlations
  3. College Football 2009-2010 Preseason top 10
  4. College Baseball Ratings Resource
  5. MLB W/L Compared to Units Won
  1. No comments yet.

You must be logged in to post a comment.