Free Player Injury Database
by Josh Hermsmeyer
When I started working on Rotobase I knew I wanted to generate notes like the following:
Negatives:
Injured wrist last season. 12% chance of re-injury. Players saw power drop by 17% in the year following this injury.
However, all the databases out there are either proprietary or don’t allow easy access to the underlying data for analysis. So I started developing my own database, starting with 2009. It was a bear. Scouring the news sites and local papers for injury info is not fun. It gives me complete respect for the work Corey Dawkins put in to generate his tool, but it also upsets me to know that the work had already been done and wasn’t available for other folks to analyze.
So two months ago I asked some gentlemen in India to help me. Last week they blew me away and delivered a pretty good data set of all players back to 2002. The database uses the data model developed by Tom Tango, where each injury is broken out by body part injured, side, and description. This allows for a pretty granular analysis.
The work was hard, but I’m pleased to report my portion of it is finished (for now). I’m making the CSV file and the SQL dump available for download here. I’m hoping the community will find it useful enough to help me keep it updated.
All players are referenced using retroIDs.
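As a rough sketch of the kind of analysis this layout allows, the Python below reads injury rows from CSV, counts injuries by body part, and estimates a naive re-injury rate. The column names (`retro_id`, `season`, `body_part`, `side`, `description`) and the sample rows are assumptions made up for illustration; check the headers in the actual file before adapting it.

```python
import csv
import io
from collections import Counter

# Made-up sample rows. The column names and player IDs are assumptions,
# not taken from the real database file.
SAMPLE_CSV = """retro_id,season,body_part,side,description
aaaaa001,2008,wrist,left,sprain
bbbbb001,2008,elbow,right,inflammation
aaaaa001,2009,wrist,left,fracture
ccccc001,2009,hamstring,right,strain
"""


def load_injuries(handle):
    """Read injury rows from an open CSV file into a list of dicts."""
    return list(csv.DictReader(handle))


def count_by_body_part(rows):
    """Count how many injury rows mention each body part."""
    return Counter(row["body_part"] for row in rows)


def reinjury_rate(rows, body_part):
    """Share of player-seasons with this injury that were followed by the
    same body part being injured again in the next season.

    Naive on purpose: the final season in the data has no following season
    to check, so it pulls the rate down.
    """
    seasons_by_player = {}
    for row in rows:
        if row["body_part"] == body_part:
            seasons_by_player.setdefault(row["retro_id"], set()).add(int(row["season"]))

    injured = 0
    repeats = 0
    for seasons in seasons_by_player.values():
        for season in seasons:
            injured += 1
            if season + 1 in seasons:
                repeats += 1
    return repeats / injured if injured else 0.0


rows = load_injuries(io.StringIO(SAMPLE_CSV))
print(count_by_body_part(rows).most_common())
print(reinjury_rate(rows, "wrist"))
```

To run it against the real download, replace the `io.StringIO(SAMPLE_CSV)` handle with `open(path, newline="")` on the extracted CSV and adjust the column names to match.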
There are only two restrictions on using the data, and folks can use it for commercial purposes if they choose.
1. You must post a link back to Rotobase.
2. You must make any additions to the database public in CSV or SQL dump form for others to use and enjoy.
I look forward to seeing what the saber community does with the info. For my part, I’ll be posting my analysis of some of the data in the next few days.
Enjoy!
UPDATE: File is offline while I update the database.
Injury_database.zip
Comments
Thank you for this, Josh.
Wow, Josh. I think you just made my week. Also, I just got my copy of Final Fantasy XII from eBay today.
I’m in nerd heaven.
Wow, very interesting. Not sure how this can be effectively used, but I look forward to checking it out.
Thx for the kind words Brad and Nick.
@ Kyle
My use for it is above in the post. Corey Dawkins was looking to combine it with some of the work Josh Kalk was doing with PITCHf/x to predict pitcher arm injuries. He also wrote a good piece in the Hardball Times 2010 annual that you might check out.
HTH
Josh
This is awesome. Congrats, Josh!
[...] out there to find that, but this newest one might be the best. Via Tango, Rotoblog has developed a free player injury database. It comes in both CSV and SQL dump formats, so nerds of all levels can bask in the glory of their [...]
[...] Josh Hermsmeyer’s free injury database, Jeff Zimmerman over at Beyond the Box Score has compiled a list of the top 100 players that have [...]
@ Josh:
I meant how I might be able to use it with my database of MLB pitchers that I have on video!
[...] his performance going forward? Rotobase also has this free, downloadable player injury database here. Check it out. Categories: Resources Comments (0) Trackbacks (0) Leave a comment [...]
[...] Free Player Injury Database | Rotoblog Interesting idea – free database of baseball injuries dating back to 2001. (tags: sql database baseball injury) [...]
[...] completely floored by the quality and velocity of the research that has already come out using the injury database. I’ll be using this post to catalogue links to all of the analyst’s amazing [...]
Thanks Josh! Excited to use this!
p.s.
I’m having trouble downloading the zip file… it downloads as a web page, then says access is denied when I open it
[...] 22, 2010 by dingers Beyond the Boxscore compiled a bevy of information on injuries (compiled by roto blog, who deserves a lot of credit for the free work he did) and calculated the most money spent on the [...]
Any idea when this database will be back up?
[...] a week ago, using Josh Hermsmeyer’s injury database that lists player injuries from 2002 to 2008, Dan Turkenkopf of Beyond the Box Score examined the [...]
+1
Yeah, eager to see this thing!
Ditto.
how do i link the playerID to a team?
I recently stumbled on this site looking for exactly this type of information but I cannot download the file either. Is the file still offline for updating? Any idea when it will be back up?