## CalTRACK Site-level Monthly Weather Normalized, Metered Energy Savings Estimation Technical Guideline

### Methodological Overview

Site-level weather normalized, metered energy savings using monthly billing data (both electricity and gas) will use a two-stage estimation approach that closely follows methodological recommendations in the technical appendices of the Uniform Methods Project for Whole Home Building Analysis and the California Evaluation Project, with some modifications and more specific guidance developed through empirical testing to ensure consistency and replicability of results.

The idea behind two-stage site-level models is to model the energy use of each house before and after an energy efficiency retrofit.

More formally, the two-stage approach first fits **two** separate parametric models to daily average energy use, one on the pre-intervention (baseline) period and one on the post-intervention (reporting) period for a single site using an ordinary least squares regression of the general form:

Where

is average use (gas in therms, electricity in kWh) per day during billing period m for site i.

is the mean use for site , or intercept.

is the coefficient site on average heating degree days per day.

is the coefficient or site on average cooling degree days per day.

is the average number of heating degree days per day in billing period , which is a function of a fixed base temperature, the average daily temperatures from the weather station matched to site during the billing period , and the number of days in billing period with matched usage and weather data for site .

is the average number of cooling degree days per day in month , which is a function of a selected base temperature, the average daily temperatures from the weather station matched to site during month , and the number of days in month with matched usage and weather data for site .

is the site specific error term for a given month.

In the second stage, using parameter estimates from the first stage equation, weather normalized savings for both the baseline period and reporting period can be computed by using corresponding temperature normals for the relevant time period (typical year weather normalized metered energy savings), or by using current-year weather to project forward baseline period use (current year weather normalized metered energy savings) and differencing between baseline and reporting period estimated or actual use, depending on the quantity of interest.

This site-level two-stage approach without the use of a comparison group was decided by the technical working group to be appropriate for the two main use cases for CalTRACK, which emphasize effects on the grid and feedback to software vendors, rather than causal programatic effects. In addition to its long history of use in the EM&V literature, it draws on a methodological foundation developed in the more general literature on piecewise linear regression or segmented regression for policy analysis and effect estimates that is used in fields as diverse as public health, medical research, and econometrics.

We now proceed with a detailed technical treatment of the steps for monthly savings estimation.

### Technical guidelines for implementing two-stage estimation on monthly electric and gas usage data for CalTRACK

CalTRACK savings estimation begins with gas and electric usage data, project data, and weather data that have been cleaned and combined according to the Data Cleaning and Integration technical specification. Starting with the prepared data, site-level monthly weather normalized metered energy savings analysis is performed by implementing the following steps:

#### 1. Generate Use Per Day values and separate usage data into a pre- and a post-intervention data series

The CalTRACK monthly weather normalized metered energy savings analysis uses average use per day () values for each month by taking the bill-period usage values, then dividing by the number of days in that bill period, as follows:

Where

is the average use per day for a given month

is the sum of all daily use values for a given month

is the total number of daily use values provided in the usage series that are between the first calendar day of month and the last calendar day of month

*Note: If daily use data for gas or electric is not available, monthly billing data can be used for the monthly billing analysis. However, modifications of the denominators for average use per day and for average HDD and CDD per day are necessary.*

Now split the series of values into pre- and post-intervention periods according to the following rules:

*Pre-intervention period*: all UPDm values from the beginning of the series up to the the complete billing month prior to the `work_start_date`

. The month containing `work_start_date`

is excluded from this series.

*Post-intervention period*: all UPDm values from the first billing month after the `work_end_date`

to the end of the series.

**Final data sufficiency qualification check**: All qualifying sites must have at least 12 months of contiguous UPDm values in the pre-intervention series and at least 12 months of contiguous post-intervention UPDm values starting with the month after `work_end_date`

.

**All sites not meeting these minimum data requirements are thrown out of the analysis**

#### 2. Set fixed degree day base temperature and calculated HDD and CDD

Next you calculate total HDD and CDD for the each billing period in the series. CalTRACK will use a fixed degree day base for monthly billing analysis. The following balance point temperatures will be use:

HDD base temp: 60 F

CDD base temp: 70 F

HDD and CDD values are calculated as follows

Where

= Average heating degree days per day for billing period

= the number of days with both weather and usage data

= the sum of the degree over each day in billing period

= the maximum of the two values in ()

= the average temperature for day

And

Where

= Cooling degree days for billing period

= the number of days with both weather and usage data

= the sum of values in {} over each day in billing period

= the maximum of the two values in ()

= the average temperature for day

*Daily average temperatures are taken from the GSOD average data temperature dataset provided by NOAA*

#### 3. Fit All Candidate Models and Apply Qualification Criteria

For each site, all allowable models will be run as candidate models and then have minimum fitness criteria set for qualification.

For CalTRACK electric monthly savings analysis, the following candidate models are fit:

with the constraints

For electric, qualifying models for selection must have each parameter estimate meet the minimum significance criteria of p < 0.1 and are strictly positive. All qualifying models are considered for final model selection.

For CalTRACK gas monthly savings analysis, the following candidate models are fit:

with the constraints

If each parameter estimate meets minimum significance criteria and are strictly positive, then the model is a qualifying model for inclusion in model selection.

#### 4. Select the best for pre-intervenion and post-intervention periods for use in second-stage savings estimation

All qualifying pre-intervention models are compared to each other and among qualifying models, the model with the maximum adjusted R-squared will be selected for second-stage savings estimation.

For the monthly billing analysis, because we are using fixed degree days instead of variable degree days, adjusted R-squared will be defined as

Where

is the sum of squares of residuals

is the degrees of freedom of the estimate of the underlying population error variance, and is calculated using `n-p-1`

, where `n`

is the number of observations in the sample used to estimate the model and `p`

is the number of explanatory variables, not including the constant term and not including degree day base temperature as a parameter because itâ€™s fixed

is the total sum of squares

is the degrees of freedom of the estimate of the population variance of the dependent variable, and is calculated as `n-1`

, were `n`

is the size of the sample use to estimate the model

All qualifying post-intervention models are compared to each other and among qualifying models, the model with the maximum adjusted R-squared will be selected for second-stage savings estimation.

#### 5. Estimate second-stage weather normalized metered energy savings quantities based on selected first stage pre- and post-intervention models

During the second stage, up to five savings quantities will be estimated for each site that meets the minimum data sufficiency criteria for that savings statistic.

Cumulative weather normalized metered energy savings over entire performance period Year one annualized actual weather normalized metered energy savings in the the reporting (post-intervention) period Year two annualized actual weather normalized metered energy savings in the the reporting (post-intervention) period Year one annualized weather normalized metered energy savings in the normal year Year two annualized weather normalized metered energy savings in the normal year

These site-level second stage quantities are calculated as follows:

#### Cumulative weather normalized metered energy savings over entire performance period (site-level)

- Compute
`predicted_baseline_use`

for each complete billing period after`work_end_date`

using parameter estimates from the stage one model from the pre-intervention (baseline) period model and the associated average degree days for each month in the post-intervention (reporting) period, ensuring that the same degree day values calculated for stage one model fits are use in stage two estimation. - Compute
`monthly_gross_savings`

=`predicted_baseline_monthly_use - actual_monthly_use`

for every complete billing periods after`work_end_date`

for project - Sum
`monthly_gross_savings`

over every complete billing period since`work_end_date`

.

#### Year one weather normalized metered energy savings from 1 to 12 months after site visit. (site-level)

- Compute
`predicted_baseline_use`

for each complete billing period after`work_end_date`

until 12 billing periods after`work_end_date`

using parameter estimates from the stage one model from the pre-intervention (baseline) period model and the associated average degree days for each month in the post-intervention (reporting) period, ensuring that the same degree day values calculated for stage one model fits are use in stage two estimation. - Compute
`monthly_gross_savings`

=`predicted_baseline_monthly_use - actual_monthly_use`

for 12 complete billing periods after`work_end_date`

for project - Sum
`monthly_gross_savings`

over the 12 billing periods since`work_end_date`

.

#### Year two weather normalized metered energy savings from 13 to 24 months after site visit. (site-level)

- Compute
`predicted_baseline_use`

for each complete billing period starting 13 months after`work_end_date`

until 24 billing periods after`work_end_date`

using parameter estimates from the stage one model from the pre-intervention (baseline) period model and the associated average degree days for each month in the post-intervention (reporting) period, ensuring that the same degree day values calculated for stage one model fits are use in stage two estimation. - Compute
`monthly_gross_savings`

=`predicted_baseline_monthly_use - actual_monthly_use`

for month 13 to month 24 after`work_end_date`

for project - Sum
`monthly_gross_savings`

over the 12 billing periods from 13 months after`work_end_date`

to 24 months.

#### Year one site-level annualized weather normalized metered energy savings in the normal year

- Compute
`predicted_baseline_monthly_use`

using the stage one model from the baseline period and average degree days from the CZ2010 normal weather year. Use the full month of available values when calculating the average degree days per billing period for the normal year. - Compute
`predicted_reporting_monthly_use`

using a`stage one`

model fit to only the first 12 months of post-intervention values and degree days from the CZ2010 normal weather year file. Use the full month of available values when calculating the average degree days per billing period for the normal year. - Compute
`monthly_normal_year_gross_savings`

=`predicted_baseline_monthly_use - predicted_reporting_monthly_use`

for normal year months - Sum
`monthly_normal_year_gross_savings`

over entire normal year.

#### Year two site-level annualized weather normalized metered energy savings in the normal year

- Compute
`predicted_baseline_monthly_use`

using the stage one model from the baseline period and degree days from the CZ2010 normal weather year. - Compute
`predicted_reporting_monthly_use`

using a`stage one`

model fit to only the 13th-24th months of post-intervention values and degree days from the CZ2010 normal weather year file for the relevant months. - Compute
`monthly_normal_year_gross_savings`

=`predicted_baseline_monthly_use - predicted_reporting_monthly_use`

for each normal year month. - Sum
`monthly_normal_year_gross_savings`

over entire normal year.

### Post-estimation steps and portfolio aggregation

The goal of CalTRACK is to develop replicable, consistent, and methodologically defensible estimators of savings over **portfolios of homes**. In order to do that, the above site-level savings quantities must be aggregated to get portfolio-level totals, means, and variances. Taking the site-level estimates, CalTRACK then performs a set of aggregation steps that are specified here.