This summer, I installed an air conditioner in my apartment to adapt to a changing climate. One month into the season, I was startled to see my electricity bill double. I checked my AC specs; sure enough, its energy efficiency ratio is 9.7 BTU/(h·W), not the modern Energy Star baby by any standard.

I began to wonder how much such low-efficiency units contribute to increased energy usage, especially in New York City, where window ACs are the norm in residential buildings. While getting the number of older units is challenging, Local Law 84 on Municipal Energy and Water Data Disclosure requires owners of large buildings to report their energy and water consumption annually. You can get the 2015 disclosure data from here. I am showing a simple frequency plot of the weather-normalized electricity intensity in kBtu per square foot of building area here.

There are 13223 buildings in the data file; 254 of them have an electricity intensity greater than 200 kBtu/sqft. I am not showing them here. The NY Stock Exchange building, Bryant Park Hotel, and Rockefeller University are notable among them.

I want to know the variety in energy use. Longtime readers might guess correctly from the title that I am talking about the variability in energy usage data. We can assume that energy consumption is a random variable *X*, i.e., *X* represents the possible energy consumption values (infinite and continuous). The data we downloaded are sample observations *x*. We are interested in the variance of *X* → **V[X]**.

In lesson 24 and lesson 25, we learned that the expected value (**E[X]**) is a descriptive quantity of the average (center) of a random variable *X* with a probability distribution function *f(x)*. In the same way, a measure of the variability, i.e. deviation from the center, of the random variable is the variance V[X].

It is defined as the expected value of the squared deviation from the average.

$$V[X] = E[(X - \mu)^2]$$

$\mu$ is the expected value of the random variable → E[X]. $(X - \mu)$ measures the deviation from this value. We square these deviations and get the expected value of the squared deviations. If you remember lesson 17, this is exactly the equation for the variance of the data sample. Here, we generalize it for a random variable *X*.

With some derivation, we can get a useful alternative for computing the variance.

$$V[X] = E[X^2] - (E[X])^2$$
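For readers who want the steps, here is a sketch of that derivation. I am writing $\mu$ as shorthand for E[X] and using only the linearity of expectation, treating $\mu$ as a constant:

$$
\begin{aligned}
V[X] &= E[(X - \mu)^2] \\
     &= E[X^2 - 2\mu X + \mu^2] \\
     &= E[X^2] - 2\mu E[X] + \mu^2 \\
     &= E[X^2] - 2\mu^2 + \mu^2 \\
     &= E[X^2] - (E[X])^2
\end{aligned}
$$

In words, the variance is the expected value of the square minus the square of the expected value.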

If we know the probability distribution function *f(x)* of the random variable *X*, we can also write the variance as

$$V[X] = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx, \quad \text{where } \mu = E[X]$$

*f(x)* for this data might look like the thick black line on the frequency plot.

The expected value of the consumption for 2015 of the 12969 buildings is 83 kBtu/sqft; the variance is 1062 (kBtu/sqft)². A better way to understand this is through the **standard deviation**, the **square root of the variance**, which is roughly 32.6 kBtu/sqft. You can see from the frequency plot that the data has high variance: there are buildings with very low electricity intensity and buildings with high electricity intensity.
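If you want to try this kind of calculation yourself, here is a minimal Python sketch. The numbers in `x` are made-up intensities for illustration, not the Local Law 84 data, and the 200 kBtu/sqft cutoff simply mirrors the one used for the plot. It computes the variance both from the definition and from the alternative formula, so you can see that they agree.

```python
import numpy as np

# Hypothetical electricity intensities in kBtu/sqft.
# These are made-up values for illustration, NOT the Local Law 84 data.
x = np.array([35.0, 60.0, 72.0, 83.0, 95.0, 110.0, 140.0, 250.0])

# Drop values above 200 kBtu/sqft, mirroring the cutoff used for the plot.
x = x[x <= 200]

mean = x.mean()                        # sample estimate of E[X]
var_def = np.mean((x - mean) ** 2)     # definition: E[(X - E[X])^2]
var_alt = np.mean(x ** 2) - mean ** 2  # alternative: E[X^2] - (E[X])^2
std = np.sqrt(var_def)                 # standard deviation, same units as x

print(f"mean = {mean:.1f} kBtu/sqft")
print(f"variance (definition)  = {var_def:.1f} (kBtu/sqft)^2")
print(f"variance (alternative) = {var_alt:.1f} (kBtu/sqft)^2")
print(f"standard deviation = {std:.1f} kBtu/sqft")
```

Note that this sketch divides by *n*, which is also what `np.var(x)` does by default. If you prefer the *n − 1* version of the sample variance, `np.var(x, ddof=1)` gives that instead.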

What is your building’s energy consumption this year? Does your city have this cool feature?

It came at a price though. With another local law, we can ban all the low EER AC units to solve the energy problem.

*If you find this useful, please like, share and subscribe.*

*You can also follow me on Twitter @realDevineni for updates on new lessons.*