k0 ∈ N0 — the number of failures before the experiment is stopped, p ∈ Rm — m-vector of "success" probabilities, p0 = 1 − (p1+…+pm) — the probability of a "failure".
Suppose we have an experiment that generates m+1≥2 possible outcomes, {X0,...,Xm}, each occurring with non-negative probabilities {p0,...,pm} respectively. If sampling proceeded until n observations were made, then {X0,...,Xm} would have been multinomially distributed. However, if the experiment is stopped once X0 reaches the predetermined value k0, then the distribution of the m-tuple {X1,...,Xm} is negative multinomial. These variables are not multinomially distributed because their sum X1+...+Xm is not fixed, being a draw from a negative binomial distribution.
Properties
Marginal distributions
If m-dimensional k is partitioned as follows
and accordingly
and let
The marginal distribution of is . That is the marginal distribution is also negative multinomial with the removed and the remaining p's properly scaled so as to add to one.
The univariate marginal is the negative binomial distribution.
The table below shows an example of 400 melanoma (skin cancer) patients where the Type and Site of the cancer are recorded for each subject.
Type
Site
Totals
Head and Neck
Trunk
Extremities
Hutchinson's melanomic freckle
22
2
10
34
Superficial
16
54
115
185
Nodular
19
33
73
125
Indeterminant
11
17
28
56
Column Totals
68
106
226
400
The sites (locations) of the cancer may be independent, but there may be positive dependencies of the type of cancer for a given location (site). For example, localized exposure to radiation implies that elevated level of one type of cancer (at a given location) may indicate higher level of another cancer type at the same location. The Negative Multinomial distribution may be used to model the cancer rates at a given site and help measure some of the cancer type dependencies within each location.
If denote the cancer rates for each site () and each type of cancer (), for a fixed site () the cancer rates are independent Negative Multinomial distributed random variables. That is, for each column index (site) the column-vector X has the following distribution:
.
Different columns in the table (sites) are considered to be different instances of the random multinomially distributed vector, X. Then we have the following estimates of expected counts (frequencies of cancer):
Example:
For the first site (Head and Neck, j=0), suppose that and . Then:
and therefore,
Notice that the pair-wise NM correlations are always positive, whereas the correlations between multinomial counts are always negative. For large , the Negative Multinomial counts behave as Poisson random variables.
The marginal distribution of each of the variables is negative binomial, as the count (considered as success) is measured against all the other outcomes (failure). But jointly, the distribution of is negative multinomial, i.e., .
Parameter estimation
Method of Moments
If we let the mean vector of the negative multinomial be
then it is easy to show through properties of determinants that
. From this, it can be shown that
and
Substituting sample moments yields the method of moments estimates
and
Example
Estimation of the mean (expected) frequency counts () of each outcome () using maximum likelihood is possible. If we have a single observation vector , then If we have several observation vectors, like in this case we have the cancer type frequencies for 3 different sites, then the MLE estimates of the mean counts are , where is the cancer-type index and the summation is over the number of observed (sampled) vectors (I). For the cancer data above, we have the following MLE estimates for the expectations for the frequency counts:
Hutchinson's melanomic freckle type of cancer () is .
, we can replace the expected-means () by their estimates, , and replace denominators by the corresponding negative multinomial variances. Then we get the following test statistic for negative multinomial distributed data:
.
Next, we can estimate the parameter by varying the values of in the expression and matching the values of this statistic with the corresponding asymptotic chi-squared distribution. The following protocol summarizes these steps using the cancer data above.
Median: The median of a chi-squared random variable with 6 df is 5.261948.
Mean Counts Estimates: The mean counts estimates () for the 4 different cancer types are:
; ; and .
Thus, we can solve the equation above for the single variable of interest -- the unknown parameter . In the cancer example, suppose . Then, the solution is an asymptotic chi-squared distribution driven estimate of the parameter .
.
Solving this equation for provides the desired estimate for the last parameter.
Mathematica provides 3 distinct () solutions to this equation: {50.5466, -21.5204, 2.40461}. Since there are 2 candidate solutions.
Estimates of probabilities: Assume and , then:
Hence, , and , , and .
Therefore, the best model distribution for the observed sample is
^ abLe Gall, F. The modes of a negative multinomial distribution, Statistics & Probability Letters, Volume 76, Issue 6, 15 March 2006, Pages 619-624, ISSN 0167-7152, 10.1016/j.spl.2005.09.009.
^Zelterman, Daniel (2002). Advanced log-linear models using SAS. SAS Publishing. p. 196. ISBN978-1-59047-080-0.
Waller LA and Zelterman D. (1997). Log-linear modeling with the negative multi-
nomial distribution. Biometrics 53: 971-82.
Further reading
Johnson, Norman L.; Kotz, Samuel; Balakrishnan, N. (1997). "Chapter 36: Negative Multinomial and Other Multinomial-Related Distributions". Discrete Multivariate Distributions. Wiley. ISBN978-0-471-12844-1.