Lesson 31 – Yes or No: The language of Bernoulli trials

Downtown Miami will be flooded due to hurricane Irma.

Your vehicle will pass the inspection test this year.          

Each toss of a coin results in either a head or a tail.          

Did you notice that I am looking for an answer, an outcome that is “yes” or “no.” We often summarize data as the occurrence (or non-occurrence) of an event in a sequence of trials. For example, if you are designing dikes for flood control in Miami, you may want to look at the sequence of floods over several years to analyze the number of events, and the rate at which they occur.

There are two possibilities, a hit (event occurred – success) or miss (event did not occur – failure). A yes or a no. These events can be represented as a sequence of 0’s and 1’s (0001100101001000) called Bernoulli trials with a probability of occurrence of p. This probability is constant over all the trials, and the trials itself are assumed to be independent, i.e., the occurrence of one event does not influence the occurrence of the subsequent event.

Now, imagine these outcomes, 0’s or 1’s can be represented using a random variable X. In other words, X is a random variable that can take 0 or 1 with a probability p. If in Miami, there were ten extreme flood events in the last 100 years, the sequence will have 90 0’s and 10 1’s in some order. The probability of the event is hence 0.1. If the probability is 0.5, then, in a sequence of 100 trials (coin tosses for example), you will see 50 heads on average. We can derive the expected value of X and the variance of X as follows:

Since the Bernoulli trials are independent, the probability of a sequence of events happening will be equal to the product of the probability of each event. For instance, the probability of observing a sequence of No Flood, No Flood, No Flood and Flood over the last four years is 0.9*0.9*0.9*0.1 = 0.072 (assuming p = 0.1).

Bernoulli trials form the basis for deriving several discrete probability distributions that we will learn over the next few weeks.

While you ponder over what these distributions are, their mathematical forms, and how they represent the variation in the data, I will leave you with this image of the daily rainfall data from Miami International Airport. An approximate 6.38 inches of rain (~160mm/day) is forecasted for Sunday. Notice how you can remap the data into a sequence of 0’s (if rain is less than 160) and 1’s (if rain is greater than 160).

After tomorrow, when you hear “unprecedented rains” in the news, keep in mind that we seek the historical sequence data like this precisely because our memory is weak.

If you find this useful, please like, share and subscribe.
You can also follow me on Twitter @realDevineni for updates on new lessons.

error

Enjoy this blog? Please spread the word :)