Many phenomena of interest can
be described by the presence or absence of a characteristic. Opinion polls are
conducted to determine whether people support or oppose a bill in Congress or a
candidate. A marketing study may be conducted to determine whether people are
aware of a particular brand name. A sample of parts may be taken in a
factory to determine the percentage that conform to specifications. In each
case, we are concerned with binomial data. Statisticians use the term **binomial
experiment** to define the conditions that generate binomial data. A
binomial experiment is a process in which:

- The experiment consists of **n** identical trials.
- There are only two outcomes on each trial of the experiment. One outcome is usually referred to as a **success**, and the other as a **failure**. The terms success and failure are a poor choice of terminology, since the assignment to outcomes is arbitrary and there is no implication that the outcomes are good or bad.
- The probability of success in a trial is **p** and does not change throughout the experiment. The probability of a failure is **q = 1 - p** and also does not change throughout the experiment.
- The trials are independent.
- The **binomial random variable** is the count of the successes in the **n** trials.
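The conditions above can be illustrated with a small simulation. This is only a sketch: the function name `run_binomial_experiment` and the values n = 10 and p = 0.3 are assumptions chosen for illustration, not part of the definition.

```python
import random


def run_binomial_experiment(n, p, rng):
    """Run n independent trials, each succeeding with probability p.

    Returns the binomial random variable: the count of successes.
    """
    return sum(1 for _ in range(n) if rng.random() < p)


# Hypothetical settings: 10 trials, success probability 0.3.
n, p = 10, 0.3
rng = random.Random(42)  # fixed seed so the run is repeatable

# Repeat the whole experiment many times and look at the average count.
counts = [run_binomial_experiment(n, p, rng) for _ in range(10_000)]
mean_count = sum(counts) / len(counts)
print(f"average number of successes: {mean_count:.2f} (n * p = {n * p})")
```

Each call produces one value of the binomial random variable, and over many repetitions the average count should settle near n × p.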

Note that item 3 (a constant probability of success) requires that we are either
drawing trials from an infinite pool of items or sampling with replacement.
In practical terms, if we are randomly taking items off an assembly line for
testing, we can reasonably assume that this represents sampling from an infinite
population. If we are taking a sample from a batch of items, however, then
after testing each item we must put it back in the batch, mix the batch up, and take
our next sample. This is referred to as **sampling with replacement**.
If we are not sampling with replacement, then the **hypergeometric**
distribution describes the probabilities. However,
if we are taking samples from a batch of parts without replacement and the
sample size is 10% or less of the batch size, then the binomial provides a
very good approximation of the hypergeometric distribution.
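To get a feel for how close the approximation is, the two distributions can be compared numerically. The sketch below uses the standard formulas for both distributions; the batch of 1,000 parts with 50 defectives and the sample of 20 are hypothetical numbers chosen so that the sample is well under 10% of the batch.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def hypergeometric_pmf(x, n, batch_size, defectives):
    """P(X = x) defectives in a sample of n drawn without replacement."""
    good = batch_size - defectives
    return comb(defectives, x) * comb(good, n - x) / comb(batch_size, n)


# Hypothetical batch: 1000 parts, 50 of them defective, sample of 20 (2% of batch).
batch_size, defectives, n = 1000, 50, 20
p = defectives / batch_size

for x in range(4):
    b = binomial_pmf(x, n, p)
    h = hypergeometric_pmf(x, n, batch_size, defectives)
    print(f"x={x}: binomial={b:.4f}  hypergeometric={h:.4f}")
```

With these assumed numbers the two probabilities differ by well under 0.01 for each value of x.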

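Suppose we are tossing a coin four times and call a "success" the result of the coin landing with the head showing. The binomial variable is the number of "heads" in the four coin tosses, which can take on the values 0, 1, 2, 3, or 4. The probabilities of the possible outcomes for this binomial variable can be determined by enumerating all possible outcomes:

| Number of heads | Outcomes | Number of outcomes |
|---|---|---|
| 0 | TTTT | 1 |
| 1 | HTTT, THTT, TTHT, TTTH | 4 |
| 2 | HHTT, HTHT, HTTH, THHT, THTH, TTHH | 6 |
| 3 | HHHT, HHTH, HTHH, THHH | 4 |
| 4 | HHHH | 1 |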

We can see in the table above that there are 16 possible outcomes. Only one of these outcomes (TTTT) results in 0 heads, so the probability of having zero heads in four coin tosses is 1/16. Similarly, there are only four outcomes that produce one head, so the probability of getting exactly one head is 4/16 = 1/4.
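The same enumeration can be done mechanically. The following is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Every sequence of four tosses, e.g. "HHTH".
outcomes = ["".join(tosses) for tosses in product("HT", repeat=4)]
total = len(outcomes)

# Group the outcomes by how many heads they contain.
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

for heads in sorted(heads_counts):
    probability = Fraction(heads_counts[heads], total)
    print(f"{heads} heads: {heads_counts[heads]} of {total} outcomes, P = {probability}")
```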

Enumerating all of the possibilities is only practical for a
small number of trials. Suppose we are talking about the number of
defective parts in a batch of 50 steel stampings. Since there are two
possible conditions for each part (good or defective), inspecting all of the
parts can produce $2^{50}$ = 1,125,899,906,842,624 possible outcomes.
Obviously we don't want to list them all out just to calculate probabilities.

The formula for calculating probabilities from a binomial experiment is:

$$P(X = x) = \frac{n!}{x!\,(n - x)!}\, p^{x} q^{\,n - x}$$

The exclamation mark (!) represents the factorial function.
For any positive integer *n*, the factorial function is:

$$n! = n \times (n - 1) \times (n - 2) \times \cdots \times 2 \times 1$$

For example, 3! = 3 × 2 × 1 = 6, 4! = 4 × 3 × 2 × 1 = 24 and 5! = 5 × 4 × 3 × 2 × 1 = 120. Obviously factorials get large rather quickly. After 11!, 8-digit hand calculators have to go into scientific notation mode, and above 69! most hand calculators have trouble, since 70! = 1.197857166996989179607278372169e+100, which has more than two digits in the exponent. A special case is 0!, which is defined to equal 1.
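These values are easy to check with exact integer arithmetic. The sketch below uses Python's standard `math.factorial`; the particular values of *n* printed are just the ones mentioned in the text.

```python
from math import factorial

# Small cases, the special case 0!, and the 8-digit calculator boundary.
for n in (0, 3, 4, 5, 11, 12):
    print(f"{n}! = {factorial(n):,}")

# Count the digits of 69! and 70! to see where a two-digit exponent runs out.
digits_69 = len(str(factorial(69)))
digits_70 = len(str(factorial(70)))
print(f"69! has {digits_69} digits, 70! has {digits_70} digits")
```

Because Python integers have arbitrary precision, the large factorials are computed exactly rather than in scientific notation.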

In the formula above, the term P(X = *x*) means the probability that the binomial random variable X (which is the count of the number of "successes") is equal to a specific value *x*. In other words, in a specific application we would write P(X = 3) or P(X = 2). In the coin tossing example above, from the table with all outcomes enumerated we have:

P(X = 0) = 1/16

P(X = 1) = 1/4

P(X = 2) = 3/8

P(X = 3) = 1/4

P(X = 4) = 1/16

Given the number of "successes" *x*, we next need to know the probability of a success, *p*. In the coin toss example it is 0.5, since the chance of heads with a fair coin is 0.5. Given that the number of trials is *n* = 4, we can apply the formula to the coin toss example. For P(X = 0) we have:

$$P(X = 0) = \frac{4!}{0!\,(4 - 0)!}\,(0.5)^{0}(0.5)^{4} = 1 \times 1 \times 0.0625 = \frac{1}{16}$$

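Similarly, for P(X = 1) we have:

$$P(X = 1) = \frac{4!}{1!\,(4 - 1)!}\,(0.5)^{1}(0.5)^{3} = 4 \times 0.5 \times 0.125 = \frac{1}{4}$$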
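The hand calculations for the coin toss example can be repeated for every value of *x* with a few lines of code. This is a sketch of the standard binomial formula, P(X = x) = n! / (x! (n - x)!) × p^x × q^(n - x); the function name `binomial_probability` is an illustrative choice, and `Fraction` is used only so the results print as exact fractions.

```python
from fractions import Fraction
from math import factorial


def binomial_probability(x, n, p):
    """Probability of exactly x successes in n trials with success probability p."""
    q = 1 - p  # probability of failure
    # Number of ways to place x successes among n trials: n! / (x! (n - x)!)
    combinations = factorial(n) // (factorial(x) * factorial(n - x))
    return combinations * p**x * q ** (n - x)


# Coin toss example: four tosses of a fair coin.
n, p = 4, Fraction(1, 2)
for x in range(n + 1):
    print(f"P(X = {x}) = {binomial_probability(x, n, p)}")
```

The printed values should agree with the probabilities obtained by enumerating the outcomes, and they sum to 1.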