|This page last modified: 2011-05-07|
Have you ever wondered if your dice, as you use them in board games, are fair?
This page suggests a statistical test to answer that question. The test criterion is the maximum number of occurrences of one or more faces in a given number of throws/rolls.
Firstly we model the die and briefly discuss the model, then make some simplifications and finally develop the test. This includes determining a number of throws/rolls needed to check for fairness.
For example, in rolling a die kk=7 times, C(6+7-1,7)=792 different combinations can be obtained. This is analogous to sampling with replacement, not taking the order of the throws into account. Here C(nn,kk) denotes the binomial coefficient nn!/(kk!(nn-kk)!).
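The count above can be checked with a few lines of Python. This is only a sketch and not part of the original page (whose plots were made with R); `combination_count` is a hypothetical helper name.

```python
from math import comb

def combination_count(throws, faces=6):
    """Number of unordered outcomes of `throws` rolls of a die with `faces` faces."""
    # Sampling with replacement, order ignored: C(faces + throws - 1, throws)
    return comb(faces + throws - 1, throws)

for kk in (6, 7, 10):
    print(kk, combination_count(kk))
```

For 6, 7 and 10 throws this gives 462, 792 and 3003 combinations respectively.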
|Sum of occurrences|
|Summarize: Sort all rows in descending order, for example|
the two rows above summarize into twice the following one:
To simplify the annotation in the graph, we adopt the following:
The blue stair-like function shows the probability of all possible combinations. The combinations are annotated above each step, summarized as described above. For a fair die, all combinations that summarize into the same sorted combination have the same probability. The probabilities are sorted in ascending order. At the left the graph starts with 6 points at (1/6)^6, one for each face appearing 6 times. These combinations 600000 060000 ... summarize into 600000.
The green lines indicate the test criterion at the significance level of the step closest to 5%. That means a risk of ≈5% of a Type I error: rejecting a true hypothesis. Looking at the plot: in this case the hypothesis of a fair die is accepted if no face occurs more than 3 times (below 411000) and if no two faces occur 3 times each (below 330000). We can see how weak the criterion is: one face is allowed to show up to 3 times, half the number of throws!
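A minimal Python sketch of the summarizing step and of the probability attached to each step of the graph. The function names are hypothetical, and the multinomial formula is assumed to be the one behind the plot, since it reproduces the (1/6)^6 quoted for 600000.

```python
from collections import Counter
from math import factorial

def summarize(counts):
    """Sort the six face counts in descending order, e.g. 060000 -> 600000."""
    return tuple(sorted(counts, reverse=True))

def fair_probability(counts):
    """Multinomial probability of one specific count vector for a fair die."""
    n = sum(counts)
    ways = factorial(n)
    for c in counts:
        ways //= factorial(c)      # number of throw sequences giving these counts
    return ways / 6 ** n

def arrangements(summary):
    """How many distinct count vectors share one summarized combination."""
    result = factorial(len(summary))
    for multiplicity in Counter(summary).values():
        result //= factorial(multiplicity)
    return result

print(summarize((0, 6, 0, 0, 0, 0)))
print(arrangements((6, 0, 0, 0, 0, 0)), fair_probability((6, 0, 0, 0, 0, 0)))
```

The last line reports 6 arrangements for 600000, each with probability (1/6)^6, which corresponds to the 6 points at the left of the graph.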
The criterion lies at 30.5% of all possible combinations (at 141 out of a maximum of 462 on the x-axis). For higher numbers of throws this percentage increases and the test gets better. We will see the quality of the test further below when looking at Type II errors (accepting a false hypothesis).
For 10 throws the green lines of the test criterion move to 46.6% of all possible combinations (1401/3003), up from 30.5% for 6 throws. This makes it a better test, as we shall see later.
Furthermore, for some probability levels there is more than one summarized combination. For example, the combinations in 541000 and 622000 have the same probability: 10!/(5!*4!*1!)*(1/6)^10 = 10!/(6!*2!*2!)*(1/6)^10 = 2.083810e-05. To indicate this, 622000 is noted to the right of and below 541000 on the plot.
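The position of the green lines can be recomputed with the following sketch. It assumes that the steps are formed by grouping combinations of equal probability and that the acceptance limit is the step whose cumulated probability is closest to 5%, as described above; all function names are hypothetical.

```python
from collections import Counter, defaultdict
from fractions import Fraction
from math import factorial

FACES = 6

def summaries(n, parts=FACES, largest=None):
    """Yield every descending count tuple of length `parts` that sums to n."""
    if largest is None:
        largest = n
    if parts == 0:
        if n == 0:
            yield ()
        return
    for first in range(min(n, largest), -1, -1):
        for rest in summaries(n - first, parts - 1, first):
            yield (first,) + rest

def weight(summary):
    """Number of throw sequences producing one specific count vector."""
    w = factorial(sum(summary))
    for c in summary:
        w //= factorial(c)
    return w

def arrangements(summary):
    """Number of count vectors that summarize into `summary`."""
    a = factorial(len(summary))
    for m in Counter(summary).values():
        a //= factorial(m)
    return a

def acceptance_limit(n, alpha=Fraction(5, 100)):
    """Return (cumulated combinations, cumulated probability) at the step closest to alpha."""
    steps = defaultdict(int)            # weight -> number of combinations with that weight
    for s in summaries(n):
        steps[weight(s)] += arrangements(s)
    total = FACES ** n
    best, combos, sequences = None, 0, 0
    for w in sorted(steps):             # ascending probability per combination
        combos += steps[w]
        sequences += steps[w] * w
        prob = Fraction(sequences, total)
        if best is None or abs(prob - alpha) < abs(best[1] - alpha):
            best = (combos, prob)
    return best

combos, prob = acceptance_limit(6)
print(combos, float(prob))
```

For 6 throws the sketch stops at 141 cumulated combinations with a cumulated probability of about 5.9%, in line with the 141 out of 462 read from the plot.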
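The equality of the two probabilities is easy to confirm numerically. The short Python check below uses a hypothetical helper `multinomial` for the number of throw sequences behind a count vector.

```python
from math import factorial

def multinomial(counts):
    """n! / (c1! * c2! * ...) for the given face counts."""
    result = factorial(sum(counts))
    for c in counts:
        result //= factorial(c)
    return result

p_541000 = multinomial((5, 4, 1, 0, 0, 0)) / 6 ** 10
p_622000 = multinomial((6, 2, 2, 0, 0, 0)) / 6 ** 10
print(p_541000, p_622000)
```

Both coefficients equal 1260, so both probabilities are 1260/6^10.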
Now we introduce an unfair die, which is needed to check the quality of the test. We keep the unfair die very simple: to the probability of one face we add an "unfairness", and from each of the remaining faces we subtract 1/5 of the unfairness. We add the plot for the unfair die to that of the fair die. However, we do not sort by ascending probability but plot the probabilities of the unfair die against the summarized combinations already obtained for the fair die. Obviously more steps appear for the unfair die, as 820000 no longer has the same probability as 280000 if the first face has the added unfairness.
The unfair die shown has an unfairness of 0.3. The following table shows the probabilities for the fair and unfair die:
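The per-face probabilities follow directly from the rule described above. A small Python sketch, with `die_probabilities` as a hypothetical helper and the first face arbitrarily chosen as the favoured one:

```python
def die_probabilities(unfairness, faces=6):
    """Face probabilities when one face gains `unfairness` and the others share the loss."""
    base = 1 / faces
    favoured = base + unfairness
    others = base - unfairness / (faces - 1)
    return [favoured] + [others] * (faces - 1)

fair = die_probabilities(0.0)
unfair = die_probabilities(0.3)
for face, (p_fair, p_unfair) in enumerate(zip(fair, unfair), start=1):
    print(face, round(p_fair, 4), round(p_unfair, 4))
```

With an unfairness of 0.3 the favoured face gets 1/6+0.3≈0.4667 and each other face 1/6-0.06≈0.1067, so the probabilities still sum to 1.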
As shown on the plot, the fair die has 100%-5%=95% of its cumulated probability to the right of the acceptance limit marked by the green lines. This is the 5% (100%-95%) significance level. The unfair die has a β=60.4% risk of acceptance (Type II error). One also speaks of the "power of the test" μ=1-β=39.6%. β should be no more than maybe 10%, so it is much too high here. The test cannot differentiate well between a die with unfairness 0.3 and a fair die. We need to increase the number of throws. We could also increase the unfairness to see from which level of unfairness a test with 10 throws becomes effective.
The acceptance limit 522100 allows one face to show 5 times, i.e. half the number of throws, which is another indication of the weakness of the test.
We will now do exactly what we considered in the previous paragraph: Vary unfairness and numbers of throws. Then we plot β, the risk of accepting an unfair die as fair (a Type II error), versus the unfairness for different numbers of throws. The following unfairnesses are included: 0.05 0.10 0.20 0.30 0.40 0.50.
Looking at the graph on the left, we see that the higher the number of throws, the lower the risk of a Type II error. At the beginning of the curves this is not so clear, because the significance level for each number of throws is at the step closest to 5% in the density function and therefore varies around 5%.
β should at least be below 10%. So for the considered unfairnesses (they could go further up, up to 5/6=0.833), the test with 6 throws is of no use, as the minimum β is 31%. The test with 26 throws can be used to exclude unfairnesses of about 0.35 and above.
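The β values discussed above can be approximated with the sketch below. It is not the original R computation: it assumes the acceptance region is fixed by the fair die at the step closest to 5%, and that β is the probability of the unfair die falling into that region. All names are hypothetical.

```python
from fractions import Fraction
from math import factorial

FACES = 6

def count_vectors(n, faces=FACES):
    """Yield every tuple of `faces` non-negative counts summing to n."""
    if faces == 1:
        yield (n,)
        return
    for first in range(n + 1):
        for rest in count_vectors(n - first, faces - 1):
            yield (first,) + rest

def weight(counts):
    """Number of throw sequences producing this count vector."""
    w = factorial(sum(counts))
    for c in counts:
        w //= factorial(c)
    return w

def rejection_threshold(n, alpha=Fraction(5, 100)):
    """Largest weight still rejected, taking the probability step closest to alpha."""
    per_weight = {}
    for counts in count_vectors(n):
        w = weight(counts)
        per_weight[w] = per_weight.get(w, 0) + w   # sequences at this probability level
    total = FACES ** n
    best_w, best_gap, cumulated = None, None, 0
    for w in sorted(per_weight):
        cumulated += per_weight[w]
        gap = abs(Fraction(cumulated, total) - alpha)
        if best_gap is None or gap < best_gap:
            best_w, best_gap = w, gap
    return best_w

def beta(n, unfairness):
    """Probability that the unfair die lands in the fair die's acceptance region."""
    threshold = rejection_threshold(n)
    p = [1 / FACES + unfairness] + [1 / FACES - unfairness / (FACES - 1)] * (FACES - 1)
    accepted = 0.0
    for counts in count_vectors(n):
        w = weight(counts)
        if w > threshold:                          # combination lies beyond the green lines
            prob = float(w)
            for p_i, c in zip(p, counts):
                prob *= p_i ** c
            accepted += prob
    return accepted

for n in (6, 10):
    print(n, [round(beta(n, u), 3) for u in (0.05, 0.10, 0.20, 0.30, 0.40, 0.50)])
```

For 6 throws and an unfairness of 0.5 the sketch gives a β of roughly 31%, the minimum quoted above. Larger numbers of throws can be passed in the same way, at the cost of a longer run time.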
From this table of the combinations around the acceptance limit, we can see again how weak the test is: in the acceptance part of the table, which is the bottom half, we find the summarized combination 13 4 4 2 2 1, which allows one face to show up to 13 times, half of the total number of throws. This can hardly be a fair die.
To improve the test further, the number of throws needs to be increased more, or we could consider increasing the significance level, that is, the risk of rejecting a fair die. With the means currently employed, the computations needed for this graph already take several hours, so we will stop here for now.
Finally we threw the die in the picture at the top of this page 26 times: 16226423231446322662216432. The occurrences sorted in decreasing order are: 9 6 4 4 3 0. For 26 throws it is not possible to plot the summarized combinations on a graph in a readable way. So we have to look up this sorted combination in the calculated summarized combinations of a fair die for 26 throws. An extract is shown in the table on the right with the obtained summarized combination highlighted in red. It is at a cumulated probability of 3.48%, which is less than 5%, and the hypothesis of a fair die is therefore REJECTED.
Again looking at the maximum permissible number of occurrences of one face, this is now 11 (in the table just below the acceptance limit), and therefore better than half the number of throws as in the previous tests.
With this test we cannot accept our die as being fair!
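The sorted occurrences can be tallied from the recorded throws with a few lines of Python. This is a sketch only; `sorted_occurrences` is a hypothetical helper name.

```python
from collections import Counter

def sorted_occurrences(throws, faces=6):
    """Count how often each face appears and sort the counts in descending order."""
    tally = Counter(int(ch) for ch in throws)
    counts = [tally.get(face, 0) for face in range(1, faces + 1)]
    return counts, sorted(counts, reverse=True)

counts, summary = sorted_occurrences("16226423231446322662216432")
print(counts)    # occurrences of faces 1..6
print(summary)   # summarized combination
```

For the 26 recorded throws the per-face counts are 3 9 4 4 0 6, which sort into 9 6 4 4 3 0.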
|Table of sorted/summarized combinations around the 5% significance level acceptance limit for a fair die for 26 throws|
For usual commercialised game dice, the sum of opposite faces is 7. This has not been observed for the die used in this test and shown at the top of the page (contributed by edm).
|Die images Dice-X.png||Wikimedia commons, Nein Arimasen||GNU Free Documentation License|
|Probability plots||Own work using R: A language and environment for statistical computing. R Development Core Team (2008), R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0||GPL, NO WARRANTY|
|Text||Own work using statistical test chapter of Advanced Engineering Mathematics 8th edition by Erwin Kreyszig, published by Wiley 1999||GPL, NO WARRANTY|