Another Statistical Look at the Draft

As we all know, Rockets fans love to talk about the draft. We either want to trade picks or argue about who to pick. So here are my two cents. A while back I asked the good folks at TDS for a little input on which advanced statistic is best for judging players. I disregarded all of your input. This isn't because I didn't think it was relevant. I did it because I decided to use a different technique. Let me explain.

The question I think we all care about is "how can we get an All-Star?" There are two ways: through the draft or through a trade. In this write-up I'm talking about the draft.

Because we want to know how to get All-Stars, we have to look at All-Star voting. So what is the relationship between All-Star voting and draft order? Each year there are basically 30 players who are selected (roughly the top 30 in voting) for the All-Star game, 60 whose vote totals are published, and 120 whose names appear on the ballot. By splitting these into groups (top 10, top 20, top 30, top 60, top 120) and giving each player a binary outcome for whether or not he reached a given group, you can estimate proportions. Combining these with draft position, I built a statistical model of the relationship.
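Here is a minimal sketch of that grouping step. The function names are mine, the player rows are made up purely for illustration, and I am assuming each player is scored by his best finish in All-Star voting (None if he never made a ballot).

```python
# Group cutoffs from the post: top 10, 20, 30, 60 and 120 in All-Star voting.
GROUPS = (10, 20, 30, 60, 120)

def group_flags(best_vote_rank):
    """Return {cutoff: 0 or 1} for one player.

    best_vote_rank is the player's best finish in All-Star voting,
    or None if he never appeared on a ballot (an assumed encoding).
    """
    return {g: int(best_vote_rank is not None and best_vote_rank <= g)
            for g in GROUPS}

def raw_proportions(players, cutoff):
    """Share of players at each draft slot who reached the given group."""
    hits, counts = {}, {}
    for pick, rank in players:
        counts[pick] = counts.get(pick, 0) + 1
        hits[pick] = hits.get(pick, 0) + group_flags(rank)[cutoff]
    return {pick: hits[pick] / counts[pick] for pick in sorted(counts)}

# Made-up (pick, best_vote_rank) rows, NOT real draft data.
toy = [(1, 4), (1, 25), (1, None), (2, 8), (2, 70), (14, None), (14, 110)]
print(raw_proportions(toy, 30))
```

Each value is just "hits divided by players" for that draft slot, which is the raw proportion before any modeling.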

When looking at the uncertainty of proportions, however, unless you have a very large sample there is a HUGE margin of error. For example, in the tables below you will see that the probability of getting a top 10 player in All-Star votes is about 30%, plus or minus 10%. Without the model the estimate is 40%, plus or minus 30%, meaning that it's basically meaningless. Because of this I used logistic regression to build a model that treats draft position as a continuous variable and estimates the probability that a player drafted in each position ends up in each of the groups (top 10, 20, 30...).
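To make both ideas concrete, here is a rough sketch in plain Python. It is not my actual model code. The margin of error uses the usual normal approximation (an assumption, and only one of several ways to build the interval), the logistic fit uses the raw pick number as the only predictor and is solved with Newton's method, and the data are made up.

```python
import math

def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for a raw proportion p from n players."""
    return z * math.sqrt(p * (1 - p) / n)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def fit_logistic(picks, outcomes, iterations=25):
    """Fit P(outcome = 1) = sigmoid(a + b * pick) by Newton's method."""
    a = b = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(picks, outcomes):
            p = sigmoid(a + b * x)
            w = p * (1 - p)
            g0 += y - p            # gradient, intercept
            g1 += (y - p) * x      # gradient, slope
            h00 += w               # entries of the 2x2 information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b

# 15 drafts (1996-2010) means only 15 players per draft slot.
print(round(margin_of_error(0.40, 15), 2))  # about 0.25

# Made-up outcomes, NOT real draft data: three fake drafts of 30 picks,
# with HITS[pick] players at that slot reaching the group.
HITS = {1: 2, 2: 2, 3: 1, 4: 1, 5: 1, 7: 1, 10: 1, 15: 1}
picks, outcomes = [], []
for pick in range(1, 31):
    for i in range(3):
        picks.append(pick)
        outcomes.append(1 if i < HITS.get(pick, 0) else 0)

a, b = fit_logistic(picks, outcomes)
for pick in (1, 5, 14, 30):
    print(pick, round(sigmoid(a + b * pick), 3))
```

The point of the model is that every pick borrows information from its neighbors, so the estimate for one slot no longer rests on 15 players alone.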

Any reasonable Rockets fan knows that picks aren't percentages. They are players, and we have more data on them than this table summarizes. I think of these numbers as estimates of pick value projected into the future, because at any given time leading up to the draft we have no idea which player will be selected where.

We also know that not every team drafts as well as the others, which means there is something more interesting to look at. For example, what is the probability that one of the top 3 players in a draft is still available at a given pick? Again using logistic regression, I got the following proportion estimates (labeled below as Probability of Top 3 Player Still Available, or PT3SA).
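As a rough sketch of the outcome behind PT3SA: for each draft, a pick gets a 1 if at least one of that draft's top 3 players had not yet been taken when the pick came up. The function names and the "vote score" used to rank players are assumptions for illustration, and the two drafts are made up. Averaging the flags across drafts gives the raw proportions that a logistic regression would then smooth.

```python
def top3_available_flags(draft, n_picks=30):
    """draft: list of (pick, vote_score) rows for one draft class.

    Returns {pick: 0 or 1}, where 1 means at least one of the class's
    three best players (by vote_score, an assumed ranking) was still
    on the board when that pick was made.
    """
    top3 = sorted(draft, key=lambda row: row[1], reverse=True)[:3]
    latest = max(pick for pick, _ in top3)  # slot of the last top-3 player taken
    return {k: int(k <= latest) for k in range(1, n_picks + 1)}

def availability_by_pick(drafts, n_picks=30):
    """Average the 0/1 flags across drafts to get a raw proportion per pick."""
    totals = {k: 0 for k in range(1, n_picks + 1)}
    for draft in drafts:
        for k, flag in top3_available_flags(draft, n_picks).items():
            totals[k] += flag
    return {k: totals[k] / len(drafts) for k in totals}

# Two made-up draft classes, NOT real data: (pick, vote_score).
draft_a = [(1, 90), (2, 10), (3, 80), (5, 0), (9, 70), (20, 5)]
draft_b = [(1, 50), (2, 60), (4, 40), (10, 5), (25, 1)]

pt3sa_raw = availability_by_pick([draft_a, draft_b])
for pick in (1, 4, 5, 9, 10):
    print(pick, pt3sa_raw[pick])
```

In the toy data the last top 3 player goes 9th in one class and 4th in the other, so the raw proportion is 1.0 through pick 4, 0.5 through pick 9, and 0 after that.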

The table shows the estimated probability that a player drafted in each spot will be voted into a given group for the All-Star game, along with the probability, also based on All-Star voting, that a top 3 player is still available at each spot. Data from the 1996-2010 drafts.



I'll be honest, this is a little dirty right now. I should be able to tighten it up once I clean up the data. I just couldn't hold this back from you all any longer. If you want to ask about any of the technical details, I guess you can email me.

