# Binomial Probabilities

Many phenomena of interest can be described by the presence or absence of a characteristic. Opinion polls are conducted to determine whether people support or oppose a bill in Congress or a candidate. A marketing study may be conducted to determine whether people are aware of a particular brand name or not. A sample may be taken of parts in a factory to determine the percentage that conform to specifications. In each case, we are concerned with binomial data. Statisticians use the term binomial experiment to define the conditions that generate binomial data. A binomial experiment is a process in which:

1. The experiment consists of n identical trials.
2. There are only two outcomes on each trial of the experiment. One outcome is usually referred to as a success, and the other as a failure. The terms success and failure are a poor choice of terminology since the assignment to outcomes is arbitrary and there is no implication that the outcomes are good or bad.
3. The probability of success in a trial is p and does not change throughout the experiment. The probability of a failure is q = 1 - p and also does not change throughout the experiment.
4. The trials are independent.
5. The binomial random variable is the count of the successes in the n trials.
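The five conditions above can be illustrated with a small simulation. This is only a sketch: the function name `binomial_trial_count`, the seed, and the values n = 10 and p = 0.3 are arbitrary choices made for illustration, not part of the definition.

```python
import random

def binomial_trial_count(n, p, rng):
    """Run n independent trials with success probability p and count the successes."""
    successes = 0
    for _ in range(n):
        # rng.random() is uniform on [0, 1), so this test succeeds with probability p.
        # The same p is used on every trial, and the trials do not influence each other.
        if rng.random() < p:
            successes += 1
    return successes

rng = random.Random(12345)  # fixed, arbitrary seed so repeated runs give the same counts
counts = [binomial_trial_count(10, 0.3, rng) for _ in range(5)]
print(counts)  # five realizations of the binomial random variable
```

Each number printed is one value of the binomial random variable: the count of successes in one run of the n trials.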

Note that item 3 requires either that we are conducting trials from an infinite pool of items or that we are sampling with replacement. In practical terms, if we are randomly taking items off an assembly line for testing, we can reasonably assume that this represents sampling from an infinite population. If we are taking a sample from a batch of items, however, then after testing each item we must put it back in the batch, mix the batch up, and take our next sample. This is referred to as sampling with replacement. If we are not sampling with replacement, then the hypergeometric distribution describes the probabilities. However, if we are taking samples from a batch of parts without replacement and the sample size is 10% or less of the batch size, then the binomial provides a very good approximation of the hypergeometric distribution.
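To get a feel for how close the approximation is, the sketch below compares the two distributions for a small sample drawn without replacement. The batch size (200), number of defectives (20), and sample size (10, or 5% of the batch) are hypothetical numbers chosen for illustration, and the two functions use the standard textbook formulas for each distribution.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n trials with constant success probability p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def hypergeometric_pmf(x, n, batch_size, defectives):
    """Probability of exactly x defectives in a sample of n drawn without replacement."""
    # math.comb returns 0 when asked to choose more items than are available,
    # so impossible counts automatically get probability 0.
    return comb(defectives, x) * comb(batch_size - defectives, n - x) / comb(batch_size, n)

# Hypothetical batch: 200 parts, 20 defective, sample of 10 (5% of the batch).
batch_size, defectives, n = 200, 20, 10
p = defectives / batch_size

for x in range(4):
    exact = hypergeometric_pmf(x, n, batch_size, defectives)
    approx = binomial_pmf(x, n, p)
    print(f"x={x}  hypergeometric={exact:.4f}  binomial={approx:.4f}")
```

With these assumed numbers the two columns should agree to within a couple of percentage points; shrinking the sample relative to the batch brings them closer together.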

## Example

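Suppose we toss a coin 4 times and call a "success" the result of the coin landing with the head showing. The binomial variable is the number of "heads" in the four coin tosses, which can take on the values 0, 1, 2, 3, or 4. The probabilities of the possible outcomes for this binomial variable can be determined by enumerating all possible outcomes (H = heads, T = tails):

| Number of heads | Outcomes | Count |
|---|---|---|
| 0 | TTTT | 1 |
| 1 | HTTT, THTT, TTHT, TTTH | 4 |
| 2 | HHTT, HTHT, HTTH, THHT, THTH, TTHH | 6 |
| 3 | HHHT, HHTH, HTHH, THHH | 4 |
| 4 | HHHH | 1 |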

We can see in the table above that there are 16 possible outcomes.  Only one of these outcomes (TTTT) results in 0 heads, so the probability of having zero heads in four coin tosses is 1/16.  Similarly, there are only four outcomes that produce one head, so the probability of getting exactly one head is 4/16 = 1/4.
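The same enumeration can be done mechanically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Every sequence of four tosses, written as a string such as "HTTH".
outcomes = ["".join(tosses) for tosses in product("HT", repeat=4)]

# Tally how many sequences contain 0, 1, 2, 3, or 4 heads.
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

for heads in sorted(heads_counts):
    # Each sequence is equally likely with a fair coin, so the
    # probability is (sequences with this many heads) / (all sequences).
    probability = Fraction(heads_counts[heads], len(outcomes))
    print(f"{heads} heads: {heads_counts[heads]} outcomes, probability {probability}")
```

Using `Fraction` keeps the probabilities as exact fractions rather than rounded decimals.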

## The formula

Enumerating all of the possibilities is only practical for a small number of trials. Suppose we are talking about the number of defective parts in a batch of 50 steel stampings. Since there are 2 possible conditions for each part (good or defective), inspecting all of the parts can produce 2^50 = 1,125,899,906,842,624 possible outcomes. Obviously we don't want to list them all out just to calculate probabilities.

The formula for calculating probabilities from a binomial experiment is:

$$P(X = x) = \frac{n!}{x!\,(n - x)!}\; p^{x} q^{\,n - x}$$

The exclamation mark (!) represents the factorial function. For any positive integer n the factorial function is:

$$n! = n \times (n - 1) \times (n - 2) \times \cdots \times 2 \times 1$$

For example, 3! = 3*2*1 = 6, 4!=4*3*2*1 = 24 and 5! = 5*4*3*2*1 = 120.  Obviously factorials get large rather quickly.  After 11!, 8-digit hand calculators have to go into scientific notation mode, and above 69! most hand calculators have trouble, since 70! = 1.197857166996989179607278372169e+100, which has more than two digits in the exponent.  A special case is 0!, which is defined to equal 1.
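These factorial values, and how quickly they grow, can be checked with `math.factorial` from the Python standard library. The digit counts below are a quick sketch of why 8-digit displays and two-digit exponents run out where they do.

```python
from math import factorial

# The small examples from the text.
print(factorial(3), factorial(4), factorial(5))

# The special case: 0! is defined to equal 1.
print(factorial(0))

# Number of decimal digits in n! for the values discussed above.
for n in (11, 12, 69, 70):
    digits = len(str(factorial(n)))
    print(f"{n}! has {digits} digits")
```

Python integers have arbitrary precision, so `factorial(70)` is computed exactly rather than in scientific notation.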

In the formula above, the term P(X = x) means the probability that the binomial random variable X (which is the count of the number of "successes") is equal to a specific value x.  In other words, in a specific application we would write P(X = 3) or P(X = 2).  In the coin tossing example above, from the table with all outcomes enumerated we have:

P(X = 0) = 1/16

P(X = 1) = 1/4

P(X = 2) = 3/8

P(X = 3) = 1/4

P(X = 4) = 1/16
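These five values can be cross-checked against the binomial formula, which multiplies the number of ways to place x successes among n trials, n!/(x!(n - x)!), by p^x q^(n - x). The sketch below is one way to write that in Python; the function name `binomial_pmf` is an illustrative choice, and `Fraction` is used so the results come out as exact fractions.

```python
from fractions import Fraction
from math import factorial

def binomial_pmf(x, n, p):
    """P(X = x) for n trials with success probability p."""
    q = 1 - p
    # Number of distinct ways to arrange x successes among n trials.
    ways = factorial(n) // (factorial(x) * factorial(n - x))
    return ways * p**x * q**(n - x)

p = Fraction(1, 2)  # fair coin
for x in range(5):
    print(f"P(X = {x}) = {binomial_pmf(x, 4, p)}")
```

The function also accepts an ordinary float for `p`, in which case the result is a floating-point probability instead of a fraction.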

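Given the number of "successes" x, we next need to know the probability of a success p. In the coin toss example it is 0.5, since the chance of heads or tails with a fair coin is 0.5, and therefore q = 1 - p = 0.5 as well. Given that the number of trials is n = 4, we can apply the formula to the coin toss example. For P(X = 0) we have:

$$P(X = 0) = \frac{4!}{0!\,(4 - 0)!}\,(0.5)^{0}\,(0.5)^{4} = \frac{24}{1 \times 24} \times 1 \times 0.0625 = 0.0625 = \frac{1}{16}$$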

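Similarly, for P(X = 1) we have:

$$P(X = 1) = \frac{4!}{1!\,(4 - 1)!}\,(0.5)^{1}\,(0.5)^{3} = \frac{24}{1 \times 6} \times 0.5 \times 0.125 = 0.25 = \frac{1}{4}$$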