Probability multivariate distribution
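| Quantity | Value |
|---|---|
| Notation | $\operatorname{DNM}(x_0, \alpha_0, \boldsymbol{\alpha})$ |
| Parameters | $x_0 \in \mathbb{R}_{>0}$, $\alpha_0 \in \mathbb{R}_{>0}$, $\boldsymbol{\alpha} \in \mathbb{R}_{>0}^{m}$ |
| Support | $x_i \in \{0, 1, 2, \ldots\}$, $1 \le i \le m$ |
| PDF | $\dfrac{\mathrm{B}(x_\bullet, \alpha_\bullet)}{\mathrm{B}(x_0, \alpha_0)} \prod_{i=1}^{m} \dfrac{\Gamma(x_i + \alpha_i)}{x_i!\,\Gamma(\alpha_i)}$, with $x_\bullet = \sum_{i=0}^{m} x_i$ and $\alpha_\bullet = \sum_{i=0}^{m} \alpha_i$, where Γ(x) is the Gamma function and B is the beta function. |
| Mean | $\dfrac{x_0}{\alpha_0 - 1}\,\boldsymbol{\alpha}$ for $\alpha_0 > 1$ |
| Variance | $\operatorname{Var}(X_i) = \dfrac{x_0 (x_0 + \alpha_0 - 1)\,\alpha_i (\alpha_i + \alpha_0 - 1)}{(\alpha_0 - 2)(\alpha_0 - 1)^2}$ and $\operatorname{Cov}(X_i, X_j) = \dfrac{x_0 (x_0 + \alpha_0 - 1)\,\alpha_i \alpha_j}{(\alpha_0 - 2)(\alpha_0 - 1)^2}$ ($i \ne j$), for $\alpha_0 > 2$ |
| MGF | undefined |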
In probability theory and statistics, the Dirichlet negative multinomial distribution is a multivariate distribution on the non-negative integers. It is a multivariate extension of the beta negative binomial distribution. It is also a generalization of the negative multinomial distribution (NM(k, p)) that allows for heterogeneity or overdispersion in the probability vector. It is used in quantitative marketing research to flexibly model the number of household transactions across several brands.
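If the parameters of the Dirichlet distribution are $(\alpha_0, \boldsymbol{\alpha})$, and if

$$X \mid \mathbf{p} \sim \operatorname{NM}(x_0, \mathbf{p}),$$

where

$$\mathbf{p} \sim \operatorname{Dir}(\alpha_0, \boldsymbol{\alpha}),$$

then the marginal distribution of $X$ is a Dirichlet negative multinomial distribution:

$$X \sim \operatorname{DNM}(x_0, \alpha_0, \boldsymbol{\alpha}).$$

In the above, $\operatorname{NM}(x_0, \mathbf{p})$ is the negative multinomial distribution and $\operatorname{Dir}(\alpha_0, \boldsymbol{\alpha})$ is the Dirichlet distribution.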
Motivation
Dirichlet negative multinomial as a compound distribution
The Dirichlet distribution is a conjugate distribution to the negative multinomial distribution. This fact leads to an analytically tractable compound distribution.
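For a random vector of category counts $\mathbf{x} = (x_1, \ldots, x_m)$, distributed according to a negative multinomial distribution, the compound distribution is obtained by integrating over the distribution of $\mathbf{p}$, which can be thought of as a random vector following a Dirichlet distribution:

$$\Pr(\mathbf{x} \mid x_0, \alpha_0, \boldsymbol{\alpha}) = \int_{\mathbf{p}} \operatorname{NegMult}(\mathbf{x} \mid x_0, \mathbf{p}) \, \operatorname{Dir}(\mathbf{p} \mid \alpha_0, \boldsymbol{\alpha}) \, d\mathbf{p}$$

$$\Pr(\mathbf{x} \mid x_0, \alpha_0, \boldsymbol{\alpha}) = \frac{\Gamma\left(\sum_{i=0}^{m} x_i\right)}{\Gamma(x_0) \prod_{i=1}^{m} x_i!} \, \frac{1}{\mathrm{B}(\boldsymbol{\alpha}_+)} \int_{\mathbf{p}} \prod_{i=0}^{m} p_i^{x_i + \alpha_i - 1} \, d\mathbf{p}$$

which results in the following formula:

$$\Pr(\mathbf{x} \mid x_0, \alpha_0, \boldsymbol{\alpha}) = \frac{\Gamma\left(\sum_{i=0}^{m} x_i\right)}{\Gamma(x_0) \prod_{i=1}^{m} x_i!} \, \frac{\mathrm{B}(\mathbf{x}_+ + \boldsymbol{\alpha}_+)}{\mathrm{B}(\boldsymbol{\alpha}_+)}$$

where $\mathbf{x}_+$ and $\boldsymbol{\alpha}_+$ are the $(m+1)$-dimensional vectors created by appending the scalars $x_0$ and $\alpha_0$ to the $m$-dimensional vectors $\mathbf{x}$ and $\boldsymbol{\alpha}$ respectively, and $\mathrm{B}$ is the multivariate version of the beta function. We can write this equation explicitly as

$$\Pr(\mathbf{x} \mid x_0, \alpha_0, \boldsymbol{\alpha}) = x_0 \, \frac{\Gamma\left(\sum_{i=0}^{m} x_i\right) \Gamma\left(\sum_{i=0}^{m} \alpha_i\right)}{\Gamma\left(\sum_{i=0}^{m} (x_i + \alpha_i)\right)} \prod_{i=0}^{m} \frac{\Gamma(x_i + \alpha_i)}{\Gamma(x_i + 1)\,\Gamma(\alpha_i)}.$$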
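Alternative formulations exist. One convenient representation[1] is

$$\Pr(\mathbf{x} \mid x_0, \alpha_0, \boldsymbol{\alpha}) = \frac{\Gamma(x_\bullet)}{\Gamma(x_0) \prod_{i=1}^{m} \Gamma(x_i + 1)} \times \frac{\Gamma(\alpha_\bullet)}{\prod_{i=0}^{m} \Gamma(\alpha_i)} \times \frac{\prod_{i=0}^{m} \Gamma(x_i + \alpha_i)}{\Gamma(x_\bullet + \alpha_\bullet)}$$

where $x_\bullet = x_0 + x_1 + \cdots + x_m$ and $\alpha_\bullet = \alpha_0 + \alpha_1 + \cdots + \alpha_m$.

This can also be written

$$\Pr(\mathbf{x} \mid x_0, \alpha_0, \boldsymbol{\alpha}) = \frac{\mathrm{B}(x_\bullet, \alpha_\bullet)}{\mathrm{B}(x_0, \alpha_0)} \prod_{i=1}^{m} \frac{\Gamma(x_i + \alpha_i)}{x_i!\,\Gamma(\alpha_i)}.$$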
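The probability mass function can be evaluated on the log scale to avoid overflow in the gamma functions. The following is a minimal sketch using only the Python standard library; the function names `log_beta`, `dnm_logpmf` and `dnm_logpmf_gamma_form` are illustrative choices, not part of any established package. It assumes the pmf in the beta-function form $\mathrm{B}(x_\bullet, \alpha_\bullet)/\mathrm{B}(x_0, \alpha_0) \prod_{i=1}^{m} \Gamma(x_i + \alpha_i)/(x_i!\,\Gamma(\alpha_i))$ and, for comparison, the three-factor gamma form.

```python
import math


def log_beta(a, b):
    """Log of the two-argument beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def dnm_logpmf(x, x0, alpha0, alpha):
    """Log-pmf of DNM(x0, alpha0, alpha) at the count vector x (beta-function form)."""
    if len(x) != len(alpha):
        raise ValueError("x and alpha must have the same length")
    if any(xi < 0 or xi != int(xi) for xi in x):
        return -math.inf  # outside the support
    x_dot = x0 + sum(x)          # x_0 + x_1 + ... + x_m
    a_dot = alpha0 + sum(alpha)  # alpha_0 + alpha_1 + ... + alpha_m
    out = log_beta(x_dot, a_dot) - log_beta(x0, alpha0)
    for xi, ai in zip(x, alpha):
        out += math.lgamma(xi + ai) - math.lgamma(xi + 1) - math.lgamma(ai)
    return out


def dnm_logpmf_gamma_form(x, x0, alpha0, alpha):
    """The same quantity via the three-factor gamma representation."""
    xs = [x0, *x]
    als = [alpha0, *alpha]
    x_dot, a_dot = sum(xs), sum(als)
    out = math.lgamma(x_dot) - math.lgamma(x0) - sum(math.lgamma(xi + 1) for xi in x)
    out += math.lgamma(a_dot) - sum(math.lgamma(a) for a in als)
    out += sum(math.lgamma(xi + a) for xi, a in zip(xs, als)) - math.lgamma(x_dot + a_dot)
    return out
```

As a sanity check, with $m = 1$ and $x_0 = \alpha_0 = \alpha_1 = 1$ the success probability is uniform on $(0, 1)$, so $\Pr(X = 0) = \mathrm{E}[p_0] = 1/2$, and `math.exp(dnm_logpmf([0], 1.0, 1.0, [1.0]))` should agree with this up to floating-point error.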
Properties
Marginal distributions
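To obtain the marginal distribution over a subset of Dirichlet negative multinomial random variables, one only needs to drop the irrelevant $\alpha_i$'s (those belonging to the variables that one wants to marginalize out) from the $\boldsymbol{\alpha}$ vector. The joint distribution of the remaining random variates is $\operatorname{DNM}(x_0, \alpha_0, \boldsymbol{\alpha}_{(-)})$, where $\boldsymbol{\alpha}_{(-)}$ is the vector $\boldsymbol{\alpha}$ with those $\alpha_i$'s removed.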
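The marginalization rule can be checked numerically in a small case. The sketch below, written with the standard library only, sums a bivariate pmf over its second coordinate and compares the result with the univariate pmf obtained by dropping the second $\alpha$ entry. The helper names and the truncation point `upper` are assumptions made for illustration; because the sum is truncated, the agreement is approximate.

```python
import math


def log_beta(a, b):
    """Log of the two-argument beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def dnm_logpmf(x, x0, alpha0, alpha):
    """Log-pmf of DNM(x0, alpha0, alpha) at the count vector x."""
    out = log_beta(x0 + sum(x), alpha0 + sum(alpha)) - log_beta(x0, alpha0)
    for xi, ai in zip(x, alpha):
        out += math.lgamma(xi + ai) - math.lgamma(xi + 1) - math.lgamma(ai)
    return out


def marginal_by_summation(x1, x0, alpha0, alpha, upper=5000):
    """Sum the bivariate pmf over x2 = 0..upper (a truncated, approximate sum)."""
    return sum(
        math.exp(dnm_logpmf([x1, x2], x0, alpha0, alpha))
        for x2 in range(upper + 1)
    )


def marginal_by_dropping(x1, x0, alpha0, alpha):
    """Univariate pmf obtained by dropping alpha_2 from the parameter vector."""
    return math.exp(dnm_logpmf([x1], x0, alpha0, alpha[:1]))
```

With, say, $x_0 = 2$, $\alpha_0 = 3$ and $\boldsymbol{\alpha} = (1.5, 2)$, the two functions should return nearly identical values for small `x1`; the remaining gap comes from the truncated tail.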
Conditional distributions
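If the $m$-dimensional vector $\mathbf{x}$ is partitioned as follows

$$\mathbf{x} = \begin{bmatrix} \mathbf{x}^{(1)} \\ \mathbf{x}^{(2)} \end{bmatrix} \text{ with sizes } \begin{bmatrix} q \times 1 \\ (m - q) \times 1 \end{bmatrix}$$

and accordingly

$$\boldsymbol{\alpha} = \begin{bmatrix} \boldsymbol{\alpha}^{(1)} \\ \boldsymbol{\alpha}^{(2)} \end{bmatrix} \text{ with sizes } \begin{bmatrix} q \times 1 \\ (m - q) \times 1 \end{bmatrix}$$

then the conditional distribution of $X^{(1)}$ given $X^{(2)} = \mathbf{x}^{(2)}$ is $\operatorname{DNM}(x_0', \alpha_0', \boldsymbol{\alpha}^{(1)})$, where

$$x_0' = x_0 + \sum_{i=1}^{m-q} x_i^{(2)}$$

and

$$\alpha_0' = \alpha_0 + \sum_{i=1}^{m-q} \alpha_i^{(2)}.$$

That is,

$$\Pr(\mathbf{x}^{(1)} \mid \mathbf{x}^{(2)}, x_0, \alpha_0, \boldsymbol{\alpha}) = \frac{\mathrm{B}(x_\bullet, \alpha_\bullet)}{\mathrm{B}(x_0', \alpha_0')} \prod_{i=1}^{q} \frac{\Gamma(x_i^{(1)} + \alpha_i^{(1)})}{x_i^{(1)}!\,\Gamma(\alpha_i^{(1)})},$$

with $x_\bullet = \sum_{i=0}^{m} x_i$ and $\alpha_\bullet = \sum_{i=0}^{m} \alpha_i$.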
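The conditioning rule amounts to a parameter update: the observed counts are added to $x_0$ and their $\alpha$ entries are added to $\alpha_0$. The following sketch (standard library only; `conditional_params` is a hypothetical helper name) computes the updated parameters, so that the conditional log-pmf can be compared with the log of the joint pmf minus the log of the marginal pmf of the conditioning block.

```python
import math


def log_beta(a, b):
    """Log of the two-argument beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def dnm_logpmf(x, x0, alpha0, alpha):
    """Log-pmf of DNM(x0, alpha0, alpha) at the count vector x."""
    out = log_beta(x0 + sum(x), alpha0 + sum(alpha)) - log_beta(x0, alpha0)
    for xi, ai in zip(x, alpha):
        out += math.lgamma(xi + ai) - math.lgamma(xi + 1) - math.lgamma(ai)
    return out


def conditional_params(x2, x0, alpha0, alpha2):
    """Updated (x0', alpha0') after observing the block X^(2) = x2."""
    return x0 + sum(x2), alpha0 + sum(alpha2)


def conditional_logpmf(x1, x2, x0, alpha0, alpha1, alpha2):
    """Log-pmf of X^(1) = x1 given X^(2) = x2, i.e. DNM(x0', alpha0', alpha1)."""
    x0_new, alpha0_new = conditional_params(x2, x0, alpha0, alpha2)
    return dnm_logpmf(x1, x0_new, alpha0_new, alpha1)
```

For example, with $x_0 = 2$, $\alpha_0 = 3$, $\boldsymbol{\alpha}^{(2)} = (2.5)$ and $\mathbf{x}^{(2)} = (3)$, `conditional_params` returns `(5, 5.5)`.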
Aggregation
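If

$$X = (X_1, \ldots, X_m) \sim \operatorname{DNM}(x_0, \alpha_0, \alpha_1, \ldots, \alpha_m)$$

then, if the random variables with positive subscripts $i$ and $j$ are dropped from the vector and replaced by their sum,

$$X' = (X_1, \ldots, X_i + X_j, \ldots, X_m) \sim \operatorname{DNM}(x_0, \alpha_0, \alpha_1, \ldots, \alpha_i + \alpha_j, \ldots, \alpha_m).$$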
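The aggregation property can be verified exactly for a given point, because the event $X_i + X_j = s$ is a finite union of outcomes. The sketch below (standard library only, with illustrative function names) merges the first two components of a trivariate vector and compares the summed joint pmf with the pmf of the merged vector.

```python
import math


def log_beta(a, b):
    """Log of the two-argument beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def dnm_logpmf(x, x0, alpha0, alpha):
    """Log-pmf of DNM(x0, alpha0, alpha) at the count vector x."""
    out = log_beta(x0 + sum(x), alpha0 + sum(alpha)) - log_beta(x0, alpha0)
    for xi, ai in zip(x, alpha):
        out += math.lgamma(xi + ai) - math.lgamma(xi + 1) - math.lgamma(ai)
    return out


def merged_pmf_by_summation(s, rest, x0, alpha0, alpha):
    """P(X_1 + X_2 = s, remaining counts = rest), summing over all splits of s."""
    return sum(
        math.exp(dnm_logpmf([k, s - k, *rest], x0, alpha0, alpha))
        for k in range(s + 1)
    )


def merged_pmf_closed_form(s, rest, x0, alpha0, alpha):
    """The same probability from DNM with alpha_1 and alpha_2 replaced by their sum."""
    merged_alpha = [alpha[0] + alpha[1], *alpha[2:]]
    return math.exp(dnm_logpmf([s, *rest], x0, alpha0, merged_alpha))
```

The two functions should agree up to rounding error for any admissible parameters, for instance $x_0 = 2$, $\alpha_0 = 3$, $\boldsymbol{\alpha} = (0.5, 1.5, 2)$, $s = 4$ and a remaining count of 1.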
Conditional on the sum
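The conditional distribution of a Dirichlet negative multinomial distribution given $\sum_{i=1}^{m} x_i = n$ is a Dirichlet-multinomial distribution with parameters $n$ and $\boldsymbol{\alpha}$. That is,

$$\Pr\left(\mathbf{x} \,\Big|\, \sum_{i=1}^{m} x_i = n, x_0, \alpha_0, \boldsymbol{\alpha}\right) = \frac{n!\,\Gamma\left(\sum_{i=1}^{m} \alpha_i\right)}{\Gamma\left(n + \sum_{i=1}^{m} \alpha_i\right)} \prod_{i=1}^{m} \frac{\Gamma(x_i + \alpha_i)}{x_i!\,\Gamma(\alpha_i)}.$$

Notice that the expression does not depend on $x_0$ or $\alpha_0$.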
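This result can be checked by dividing the joint pmf by the pmf of the total count. The sketch below (standard library only; function names are illustrative) assumes that the total $\sum_{i=1}^{m} X_i$ follows a univariate Dirichlet negative multinomial distribution with the single parameter $\sum_{i=1}^{m} \alpha_i$, which follows from applying the aggregation property repeatedly.

```python
import math


def log_beta(a, b):
    """Log of the two-argument beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def dnm_logpmf(x, x0, alpha0, alpha):
    """Log-pmf of DNM(x0, alpha0, alpha) at the count vector x."""
    out = log_beta(x0 + sum(x), alpha0 + sum(alpha)) - log_beta(x0, alpha0)
    for xi, ai in zip(x, alpha):
        out += math.lgamma(xi + ai) - math.lgamma(xi + 1) - math.lgamma(ai)
    return out


def dirichlet_multinomial_logpmf(x, alpha):
    """Log-pmf of the Dirichlet-multinomial with n = sum(x) and parameters alpha."""
    n, a_sum = sum(x), sum(alpha)
    out = math.lgamma(n + 1) + math.lgamma(a_sum) - math.lgamma(n + a_sum)
    for xi, ai in zip(x, alpha):
        out += math.lgamma(xi + ai) - math.lgamma(xi + 1) - math.lgamma(ai)
    return out


def logpmf_given_total(x, x0, alpha0, alpha):
    """log P(X = x | sum(X) = sum(x)), computed as joint minus total-count pmf."""
    joint = dnm_logpmf(x, x0, alpha0, alpha)
    total = dnm_logpmf([sum(x)], x0, alpha0, [sum(alpha)])
    return joint - total
```

Changing `x0` or `alpha0` in `logpmf_given_total` should leave the result unchanged up to rounding, in line with the remark that the conditional pmf does not involve these parameters.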
Correlation matrix
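For $\alpha_0 > 2$ the entries of the correlation matrix are

$$\rho(X_i, X_i) = 1$$

$$\rho(X_i, X_j) = \frac{\operatorname{cov}(X_i, X_j)}{\sqrt{\operatorname{var}(X_i) \operatorname{var}(X_j)}} = \sqrt{\frac{\alpha_i \alpha_j}{(\alpha_0 + \alpha_i - 1)(\alpha_0 + \alpha_j - 1)}}.$$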
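A short sketch of the correlation matrix follows, assuming the closed form $\rho(X_i, X_j) = \sqrt{\alpha_i \alpha_j / ((\alpha_0 + \alpha_i - 1)(\alpha_0 + \alpha_j - 1))}$ for $i \ne j$ and $\alpha_0 > 2$. The function name `dnm_correlation` is a hypothetical choice. Note that the correlations depend only on $\alpha_0$ and $\boldsymbol{\alpha}$, not on $x_0$.

```python
import math


def dnm_correlation(alpha0, alpha):
    """Correlation matrix of DNM(x0, alpha0, alpha) as a list of rows (needs alpha0 > 2)."""
    if alpha0 <= 2:
        raise ValueError("correlations are only defined for alpha0 > 2")
    m = len(alpha)
    corr = [[1.0] * m for _ in range(m)]  # the diagonal stays at 1
    for i in range(m):
        for j in range(m):
            if i != j:
                num = alpha[i] * alpha[j]
                den = (alpha0 + alpha[i] - 1) * (alpha0 + alpha[j] - 1)
                corr[i][j] = math.sqrt(num / den)
    return corr
```

For $\alpha_0 = 3$ and $\boldsymbol{\alpha} = (1, 2)$ the off-diagonal entry is $\sqrt{2 / (3 \cdot 4)} = \sqrt{1/6} \approx 0.408$. All off-diagonal correlations are positive, since the components share the same random probability vector.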
Heavy tailed
The Dirichlet negative multinomial is a heavy-tailed distribution. It does not have a finite mean for $\alpha_0 \le 1$ and it has an infinite covariance matrix for $\alpha_0 \le 2$. It therefore has an undefined moment generating function.
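The thresholds on $\alpha_0$ can be made concrete with a small sketch of the first two moments. The formulas used here are assumptions in the sense that they are not derived in this section: they follow from the law of total expectation and total covariance applied to the negative multinomial moments, giving $\mathrm{E}[X_i] = x_0 \alpha_i / (\alpha_0 - 1)$ and a covariance matrix with common factor $x_0 (x_0 + \alpha_0 - 1) / ((\alpha_0 - 2)(\alpha_0 - 1)^2)$. The function names are hypothetical, and returning `math.inf` outside the valid range is a design choice of this sketch.

```python
import math


def dnm_mean(x0, alpha0, alpha):
    """Mean vector of DNM(x0, alpha0, alpha); infinite when alpha0 <= 1."""
    if alpha0 <= 1:
        return [math.inf] * len(alpha)
    return [x0 * a / (alpha0 - 1) for a in alpha]


def dnm_covariance(x0, alpha0, alpha):
    """Covariance matrix of DNM(x0, alpha0, alpha); infinite when alpha0 <= 2."""
    m = len(alpha)
    if alpha0 <= 2:
        return [[math.inf] * m for _ in range(m)]
    scale = x0 * (x0 + alpha0 - 1) / ((alpha0 - 2) * (alpha0 - 1) ** 2)
    cov = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for j in range(m):
            if i == j:
                cov[i][j] = scale * alpha[i] * (alpha[i] + alpha0 - 1)
            else:
                cov[i][j] = scale * alpha[i] * alpha[j]
    return cov
```

Dividing an off-diagonal covariance by the square root of the product of the two variances cancels the common factor and leaves $\sqrt{\alpha_i \alpha_j / ((\alpha_0 + \alpha_i - 1)(\alpha_0 + \alpha_j - 1))}$, which matches the correlation formula.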
Applications
Dirichlet negative multinomial as an urn model
The Dirichlet negative multinomial can also be motivated by an urn model in the case when $x_0$ is a positive integer. Consider a sequence of independent and identically distributed multinomial trials, each of which has $m + 1$ outcomes. Call one of the outcomes a "success", and suppose it has probability $p_0$. The other $m$ outcomes, called "failures", have probabilities $p_1, \ldots, p_m$. If the vector $\mathbf{x}$ counts the $m$ types of failures before the $x_0$-th success is observed, then $\mathbf{x}$ has a negative multinomial distribution with parameters $(x_0, \mathbf{p})$.
If the parameters $\mathbf{p}$ are themselves sampled from a Dirichlet distribution with parameters $(\alpha_0, \boldsymbol{\alpha})$, then the resulting distribution of $\mathbf{x}$ is Dirichlet negative multinomial. The resultant distribution has $m + 2$ parameters.
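The two-stage construction translates directly into a simulation. The sketch below uses only the Python standard library and assumes that the number of required successes is a positive integer; `sample_dirichlet` and `sample_dnm` are hypothetical helper names. The Dirichlet draw is obtained by normalizing independent gamma variates, a standard construction, and the trials are then run one at a time until the required number of successes is reached.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw a probability vector from Dir(alphas) by normalizing gamma variates."""
    gammas = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(gammas)
    return [g / total for g in gammas]


def sample_dnm(x0, alpha0, alpha, rng):
    """Draw one count vector from DNM(x0, alpha0, alpha) for a positive integer x0."""
    if x0 != int(x0) or x0 < 1:
        raise ValueError("this trial-by-trial sketch needs a positive integer x0")
    # Stage 1: sample the outcome probabilities; index 0 is the "success".
    p = sample_dirichlet([alpha0, *alpha], rng)
    outcomes = range(len(p))
    counts = [0] * len(alpha)
    successes = 0
    # Stage 2: run multinomial trials until the x0-th success is observed.
    while successes < x0:
        k = rng.choices(outcomes, weights=p)[0]
        if k == 0:
            successes += 1
        else:
            counts[k - 1] += 1  # failure of type k
    return counts
```

For example, `sample_dnm(3, 20.0, [2.0, 5.0], random.Random(0))` returns a list of two non-negative failure counts. Averaging many such draws should approach $x_0 \alpha_i / (\alpha_0 - 1)$ when $\alpha_0 > 1$, although convergence can be slow for small $\alpha_0$ because of the heavy tails.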
References
- [1] Farewell, Daniel; Farewell, Vernon (2012). "Dirichlet negative multinomial regression for overdispersed correlated count data". Biostatistics (Oxford, England). 14. doi:10.1093/biostatistics/kxs050.