# Lesson 41 – Struck by a smooth function

Review lesson 32.

If X is a random variable that represents the number of successes in a Bernoulli sequence of n trials, each with success probability p, then X follows a binomial distribution. The probability that X takes the value k, i.e., the probability of exactly k successes in n trials, is:

$$P(X = k) = \binom{n}{k} p^{k} (1 - p)^{n - k}$$
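
As a quick refresher, here is a minimal Python sketch of the binomial probability, assuming the standard formula C(n, k) p^k (1 - p)^(n - k). The function name `binomial_pmf` and the numbers n = 10, p = 0.5 are illustrative choices, not values from the lesson.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k): probability of exactly k successes in n Bernoulli trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical example: 10 trials, success probability 0.5
prob_3 = binomial_pmf(3, 10, 0.5)

# The probabilities over all possible k (0 through n) should add up to 1
total = sum(binomial_pmf(k, 10, 0.5) for k in range(11))

print(prob_3, total)
```

Summing over every possible k is a handy sanity check that the function really describes a probability distribution.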

Review lesson 33.

If we consider independent Bernoulli trials of 0s and 1s with some probability of occurrence p, and assume X to be a random variable that measures the number of trials it takes to see the first success, then X is said to be geometrically distributed. The probability of the first success in the kth trial is:

$$P(X = k) = (1 - p)^{k - 1} p$$
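
The same idea can be sketched in Python, assuming the standard geometric formula (1 - p)^(k - 1) p: k - 1 failures followed by one success. The name `geometric_pmf` and the value p = 0.2 are hypothetical.

```python
def geometric_pmf(k: int, p: float) -> float:
    """P(X = k): the first success occurs on trial k (k = 1, 2, 3, ...)."""
    return (1 - p) ** (k - 1) * p


p = 0.2  # hypothetical probability of success in each trial

# Two failures followed by a success: 0.8 * 0.8 * 0.2
first_on_third = geometric_pmf(3, p)

# X has no upper limit, so we can only add up a long (but finite) run of terms
partial_sum = sum(geometric_pmf(k, p) for k in range(1, 201))

print(first_on_third, partial_sum)
```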

Review lesson 36.

The number of times an event occurs (counts) in an interval follows a Poisson distribution. If λ is the average number of occurrences in the interval, the probability that X takes the value k is:

$$P(X = k) = \frac{e^{-\lambda} \lambda^{k}}{k!}$$
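
A minimal Python sketch of the Poisson probability, assuming the standard form e^(-λ) λ^k / k! where λ is the average count per interval. The name `poisson_pmf` and the value λ = 2 are made up for illustration.

```python
from math import exp, factorial


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k): probability of exactly k occurrences when the mean count is lam."""
    return exp(-lam) * lam**k / factorial(k)


lam = 2.0  # hypothetical average number of events per interval

p_zero = poisson_pmf(0, lam)  # no events at all: exp(-lam)

# Counts can be arbitrarily large, so add up enough terms to get close to 1
partial_sum = sum(poisson_pmf(k, lam) for k in range(50))

print(p_zero, partial_sum)
```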

The characteristic feature in all these distributions is that the random variable X is discrete. The possible outcomes are distinct numbers, which is why we called them discrete probability distributions.

Have you asked yourself, “what if the random variable X is continuous?” What is the probability that X can take any particular value x on the real number line which has infinite possibilities?

I am going to ask you to draw a ball at random from a box of ten balls. I am also going to ask you, “what is the probability of selecting any particular ball?”

Your answer will include, “not again,” and “since the balls are all identical, and there are ten in the box, the probability of selecting any particular ball is one-tenth (1/10); P(X = any ball out of ten balls) = 1/10.”

As I am about to ask my next question, you will interrupt me and give me the answer. “And if there are 20 balls, the probability will be 1/20.” You might also say, “spare your next question, because the answer is 1/100, and the visual for increasing number of balls looks like this.”

I am sure you have recognized the pattern here. As the number of possible outcomes (n) becomes large, the probability of any one value approaches zero. For a continuous random variable, the number of possible outcomes is infinite, hence,

P(X = x) = 0.
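
The shrinking pattern from the box of balls can be tabulated with a few lines of Python. This is only a sketch of the 1/n arithmetic above; the function name `prob_any_one` and the larger values of n are illustrative.

```python
def prob_any_one(n: int) -> float:
    """Probability of picking one particular ball out of n identical balls."""
    return 1 / n


# As the number of balls grows, the chance of any one ball heads toward zero
for n in (10, 20, 100, 10_000, 1_000_000):
    print(f"n = {n:>9,}: P = {prob_any_one(n):.6f}")
```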

For continuous random variables, the probability is defined in an interval between two values. It is computed using continuous probability distribution functions.

If you go back to lesson 15, you will recall how we made frequency plots. We partitioned the real number space into intervals or groups, recorded the number of observations (values) that fall into each group and used this grouping to build stacks.

Based on the number of observations in each interval, we can compute the probability that the random variable will occur in that interval. For example, if there are ten observations out of 100 observations in a group, we estimate the probability that the variable occurs in this group as 10/100.
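
Here is a small Python sketch of that grouping step. The 100 values are simulated stand-ins, not the lesson's data, and the group width of 10 is an assumed choice; the point is only that each group's estimated probability is its count divided by the total.

```python
import math
import random
from collections import Counter

random.seed(41)  # fixed seed so the sketch is repeatable

# Hypothetical stand-in for 100 observed values (not the lesson's data)
data = [random.gauss(55, 17) for _ in range(100)]

width = 10  # assumed group (interval) width

# Label each observation by the left edge of the group it falls into
counts = Counter(math.floor(x / width) * width for x in data)

# Estimated probability of a group = observations in the group / total
proportions = {left: c / len(data) for left, c in sorted(counts.items())}

for left, prop in proportions.items():
    print(f"[{left}, {left + width}): {prop:.2f}")
```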

For continuous random variables, as the number of observations n grows, the proportion of observations in a group approaches the probability of being in that group, and we can let the size of the group (interval range) approach zero. For a large n, we can imagine a large number of very small intervals.

Is it too abstract?

If so, let’s take some data and observe this behavior.

We will use the same data that we used last week: daily temperature data for New York City. We have this data from 1869 to 2017, a large sample of 54227 values. We can assume that temperature is a continuous random variable with infinitely many possible values on the real number line.

I will take 500 data points at a time and place them on the number line. If there are two or more observations with the same temperature value, I will stack them. Recall that this is how we create histograms; the only difference is that I am not grouping. Each value is placed on its own.

Observe this animation.

As the number of data points (sample size) increases, the stacks get denser and denser with overlaps. The final compact histogram can be approximated using a smooth function – a continuous probability distribution function.

Since, for continuous random variables, the proportion of observations in a group approaches the probability of being in that group, the area of the interval block, or the area under the curve of the smooth function over that interval, is the probability that X is in that interval.

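Finally, from calculus, you can see that the probability that a continuous random variable X lies between a and b, where f(x) is its continuous probability distribution function, is:

$$P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$$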

An example area computation between -1 and 2 is shown.
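
To make the area idea concrete, here is a minimal Python sketch that approximates the area under a curve between -1 and 2 by adding up thin rectangles. The lesson does not say which smooth function the figure uses, so the standard normal density from Python's `statistics.NormalDist` is an assumption, and `area_under` is a hypothetical helper name.

```python
from statistics import NormalDist

# Assumed smooth function: the standard normal density (mean 0, std dev 1)
std_normal = NormalDist(0, 1)


def area_under(f, a: float, b: float, steps: int = 10_000) -> float:
    """Midpoint Riemann sum approximating the integral of f from a to b."""
    h = (b - a) / steps  # width of each thin rectangle
    return sum(f(a + (i + 0.5) * h) for i in range(steps)) * h


# P(-1 <= X <= 2) as an area under the density curve
area = area_under(std_normal.pdf, -1, 2)

# Cross-check against the difference of cumulative probabilities
check = std_normal.cdf(2) - std_normal.cdf(-1)

print(area, check)
```

The two numbers should agree closely; the rectangle sum is just the histogram idea taken to very small interval widths.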

Continuous probability distribution functions must obey the property of unit probability: the total area under the curve is 1.

$$\int_{-\infty}^{\infty} f(x)\,dx = 1$$

The limits of the integral are negative to positive infinity.
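
A quick numerical check of unit probability, sketched in Python. The triangular density used here is a made-up example (it is not a function from the lesson), and a wide finite range stands in for the infinite limits, which works because this density is zero outside [-1, 1].

```python
def f(x: float) -> float:
    """Hypothetical density: a triangle peaking at 0, zero outside [-1, 1]."""
    return max(0.0, 1.0 - abs(x))


def integrate(func, a: float, b: float, steps: int = 100_000) -> float:
    """Midpoint Riemann sum approximating the integral of func from a to b."""
    h = (b - a) / steps
    return sum(func(a + (i + 0.5) * h) for i in range(steps)) * h


# A wide finite range stands in for "negative to positive infinity"
total = integrate(f, -50.0, 50.0)

print(total)
```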

Now, we can integrate this function, f(x), up to any value x to get the cumulative distribution function, F(x):

$$F(x) = \int_{-\infty}^{x} f(u)\,du$$

Since the cumulative distribution function is the area under the curve up to a value of x, we are essentially computing P(X ≤ x).
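
The "area up to x" idea can be sketched in Python as a running total of thin strips. The density f(x) = e^(-x) for x ≥ 0 is an assumed example chosen because its cumulative function, 1 - e^(-x), is known exactly and can be used as a check; the grid spacing is also an arbitrary choice.

```python
from itertools import accumulate
from math import exp

h = 0.001  # assumed strip width
xs = [i * h for i in range(10_001)]  # grid from 0 to 10

# Area of each thin strip, using the height of the assumed density exp(-x)
# at the strip's midpoint. The density is zero below 0, so we start there.
strips = [exp(-(x + h / 2)) * h for x in xs[:-1]]

# Running total of strip areas: F[i] approximates P(X <= xs[i])
F = [0.0] + list(accumulate(strips))

print(F[1000], 1 - exp(-1))  # numerical vs. exact value of F(1)
print(F[-1])  # close to 1 at the right end of the grid
```

F starts at 0, never decreases, and levels off near 1, which is what we expect of any cumulative distribution function.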

Having this cumulative function is handy for computing the percentiles of the random variable.

Do you remember the concept of percentiles?

We learned in lesson 14 that percentiles are order statistics that can be used to summarize the data. The 75th percentile is the value of x below which 75% of the data fall. In other words, F(x) = 0.75.

Can you see how the cumulative distribution function, F(x) = P(X ≤ x), can be used to compute the percentiles?
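
One way to see it: a percentile is the x at which F(x) reaches the desired proportion, so we can search for that x. Below is a minimal Python sketch using bisection. The cumulative function 1 - e^(-x) is an assumed example (not the lesson's exercise), and `percentile` is a hypothetical helper name.

```python
from math import exp, log


def cdf(x: float) -> float:
    """Assumed example: F(x) = 1 - exp(-x) for x > 0, and 0 otherwise."""
    return 1 - exp(-x) if x > 0 else 0.0


def percentile(q: float, lo: float = 0.0, hi: float = 50.0, tol: float = 1e-10) -> float:
    """Find x with F(x) = q by bisection; this works because F never decreases."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if cdf(mid) < q:
            lo = mid  # not enough area yet, look to the right
        else:
            hi = mid  # enough area, look to the left
    return (lo + hi) / 2


p75 = percentile(0.75)  # 75th percentile: F(x) = 0.75
median = percentile(0.50)  # the median is the 50th percentile

print(p75, log(4))  # for this assumed F, the exact answer is ln(4)
print(median, log(2))  # and the exact median is ln(2)
```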

Over the next few weeks, we will learn some special types of continuous distribution functions. Since you’ve been struck by smooth functions today, I will invite you to solve this.

If X is a random variable with a probability distribution function defined as

• What is the median of X?
• What is the probability that X is between 0.2 and 0.3?
• What is the probability that X will exceed 0.9?