Methodology for Health Costs for Consumers

NHID HealthCost Analysis Methodology


Below is a description of the methodology for calculating the estimated costs for health care services reported on the New Hampshire Insurance Department (NHID) HealthCost website.  The estimates are based on the median amounts paid (by both the insurance carrier and the patient) using claims data from the New Hampshire Comprehensive Health Information System (NHCHIS) database.  The cost amount is often referred to as the "allowed rate" of payment to health care providers.

It has been well documented in the published literature that there is substantial variation in the cost of health care, even when provided by the same provider.  There are many factors that contribute to the variation, and the NHID uses several tools to help address these issues when reporting "costs" to the patient.  When the patient is insured, the cost to the patient for covered services is based on a contract between the provider and the insurance company.  When the patient is not insured, or the services are not covered under the patient's health plan, the cost is based on charges minus any discount the provider offers uninsured patients.   

The methodology used in HealthCost is consistent across payers and providers by treatment type.  The selection and exclusion criteria for including or removing observations are based on statistical measures and calculations that are applied consistently from one provider to another and from one payer to another.

Included Costs

The focus is on the total cost and the difference in total costs to the patient between providers who provide a similar service.  The cost to the patient is a combined total that does not distinguish between what is paid to the hospital (or clinic, ambulatory surgery center, or any other facility), and each physician or multiple physicians who treat the patient.   The "lead provider" associated with the costs is considered to be the most easily recognized entity that care is received from.  In many cases, it is a hospital, even though the patient actually receives treatment from several different physicians who could also be considered the provider of care.  The overall cost is determined by several variables, including: the treatment provided, the contract between the insurer and the lead provider, contracts with all other providers the patient is treated by, the volume of primary and incidental services provided (both those necessary and unnecessary), the typical illness burden of patients treated by the provider(s), and how efficient the providers are when the cost is shown as a bundle.

Calculation of Cost Estimate

The median treatment cost based on patient experience is reported instead of the average.  Consistent with the purpose of HealthCost, the median is a better measure of central tendency when predicting the cost liability to the patient and health plan.  The median is influenced less than the average by outlier observations that may skew the results.  The median also makes determining actual contract terms for payments between the insurer and the provider more difficult.

In this example, both insurance carriers would have the same median cost reported in HealthCost:

Reimbursement Contract Rates

[Table.  Columns: Proportion of Patients at Make Believe Hospital, Insurance Company A, Insurance Company B.  Summary rows: Median, Average, Total Annual Payments (1,000 Visits).  The table values are not available.]

representation of the "value" of the contract to the insurer and the provider.  However, $100 is a better estimate of the total cost for most patients, regardless of which insurance carrier they are covered by.
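The point that two carriers can share a median while their averages and totals differ can be sketched with a few lines of Python.  The payment amounts below are made up for illustration; they are not the values from the table above, and the variable names are hypothetical.

```python
from statistics import mean, median

# Hypothetical allowed amounts per visit (10 visits per carrier).
# These are illustrative only, NOT the figures from the original table.
carrier_a = [100] * 8 + [150] * 2   # most visits pay 100, a few pay more
carrier_b = [60] * 2 + [100] * 8    # most visits pay 100, a few pay less

for name, payments in (("A", carrier_a), ("B", carrier_b)):
    # The median is identical for both carriers, while the average and
    # the total reflect the different contract terms.
    print(name, "median:", median(payments),
          "average:", mean(payments), "total:", sum(payments))
```

In this sketch the median describes what most patients would see, while the average and total are closer to the overall value of each contract.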

Cost on the site is shown using either bundled or unbundled calculations.  Bundled costs consist of all (or nearly all) of the individual charges associated with the procedure shown on the website.  When the cost shown is calculated using the bundled method, it appears on the site with a button icon and a statement indicating that the cost of the procedure is an aggregate of the typical costs a patient will likely pay.  When a procedure is shown as an unbundled cost, the cost shown is the cost associated with the named procedure only, and a patient may or may not have additional services provided at the same time.  The unbundled procedures have a notation on the website to indicate that the cost is a stand-alone cost but that the patient may experience more than one procedure at a time.


Whenever rates are reported, the NHID will include information on the variability of the rate.  If the historical data show low variability, then this is indicated as Precision of the Cost Estimate = "HIGH."  Likewise, if the data show extensive variation, the estimate will indicate the precision level is "LOW."  When the precision level is low, the experience of an individual patient is more likely to differ from what is reported.

The measure of variation in the rate is based on two values: the coefficient of variation for charges, including all payers, and the difference between the median charge for the insurance company product line and the overall median for all insurance companies and product lines at the provider identified.  These values, both percentages, are summed together and translated into an ordinal scale.  Like most ordinal scales, the distinction between the values at neighboring points on the scale is not necessarily the same.  For instance, the range within Very Low and Low might be much less than that in Medium and High.  The scale is determined based on how the variability compares to other reported insurance carrier line of business (LOB) calculations within the health care service selected.  The breakdown is based on percentiles, using the 75th, 50th, and 25th percentiles as break points.

When variability in the data is high (Precision of the Cost Estimate = "VERY LOW") and there are fewer than four patients in the analysis, then the output for that payer product line is not reported.
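The precision calculation described above can be sketched as follows.  This is a minimal illustration under stated assumptions, not the NHID implementation: the function names are hypothetical, the median difference is expressed as a percentage of the overall median, the population standard deviation is used for the coefficient of variation, and the assignment of the highest-variability quartile to "VERY LOW" is inferred from the text.

```python
from statistics import mean, median, pstdev, quantiles

def variability_score(all_payer_charges, product_line_charges):
    """Sum of two percentages: the coefficient of variation of charges
    across all payers, and the gap between the product-line median and
    the overall median (assumed here to be relative to the overall median)."""
    cv_pct = 100 * pstdev(all_payer_charges) / mean(all_payer_charges)
    overall = median(all_payer_charges)
    gap_pct = 100 * abs(median(product_line_charges) - overall) / overall
    return cv_pct + gap_pct

def precision_labels(scores):
    """Translate scores for one health care service into ordinal labels,
    using the 25th, 50th, and 75th percentiles of those scores as cuts."""
    q25, q50, q75 = quantiles(scores, n=4, method="inclusive")

    def label(score):
        if score <= q25:
            return "HIGH"       # least variable, most precise
        if score <= q50:
            return "MEDIUM"
        if score <= q75:
            return "LOW"
        return "VERY LOW"       # most variable, least precise

    return [label(s) for s in scores]

def is_reported(label, n_patients):
    """Suppress output when precision is VERY LOW and fewer than four
    patients are in the analysis."""
    return not (label == "VERY LOW" and n_patients < 4)
```

Because the cut points come from the scores of the other carrier and product-line results for the same service, the same raw score could receive a different label for a different service.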

Risk Adjustment

Risk adjustment is used in HealthCost by adding a column called Patient Complexity for all costs that are provided as a bundled cost.  Risk adjustment provides a relative measure for the difference in the illness burden of patients in the analysis and treated by the selected providers.  Risk adjustment can be used to explain why the historical costs at one provider may exceed that at another provider.  Risk adjustment considers more than the diagnoses for the visit of interest.  Instead, all of the diagnoses throughout the period of the analysis are considered so that the effect of multiple comorbidities can be considered in evaluating how one patient population differs from another.  Examples of the conditions checked for in a patient's history are:  congestive heart failure, epilepsy, primary pulmonary hypertension, diabetes, and cancer.  Patient populations that average more comorbidity or have the most severe forms of disease are expected to need greater health care resources than a less complex patient population.

The application of risk adjustment is specific for the patients with the identified condition.  For example, Hospital A attracts a very "average" patient population when all treatments are considered, but Hospital A attracts very complex patients for breast biopsies.  When viewing the cost rates for breast biopsies, the Patient Complexity at Hospital A would be described as "HIGH."

The risk adjustment calculation is a relative index measure, where 1.00 is the mid-point, and values above or below are a calculated difference in expected resource consumption.  For the HealthCost website, the index measure is translated to an ordinal scale based on the index value when compared to other reported insurance carrier LOB calculations within the health care service selected.  The breakdown is based on percentiles, using the 90th, 75th, 25th, and 10th separation points.  Like most ordinal scales, the distinction between the values at neighboring points on the scale is not necessarily the same.
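The translation from index to ordinal scale can be sketched as below.  This is an illustration only: the function name is hypothetical, and of the five label names only "HIGH" appears in this document; the others are placeholders.

```python
from statistics import quantiles

# Placeholder labels; only "HIGH" is confirmed by the text above.
LABELS = ["VERY LOW", "LOW", "AVERAGE", "HIGH", "VERY HIGH"]

def complexity_label(index, peer_indexes):
    """Place a risk index on a five-point ordinal scale using the 10th,
    25th, 75th, and 90th percentiles of the peer index values reported
    for the same health care service."""
    cuts = quantiles(peer_indexes, n=100, method="inclusive")
    # cuts[k - 1] is the k-th percentile.
    p10, p25, p75, p90 = cuts[9], cuts[24], cuts[74], cuts[89]
    if index < p10:
        return LABELS[0]
    if index < p25:
        return LABELS[1]
    if index <= p75:
        return LABELS[2]
    if index <= p90:
        return LABELS[3]
    return LABELS[4]
```

The middle category spans half of the peer values (25th to 75th percentile), while each outer category covers only the top or bottom ten percent, which is one way the steps on the scale end up unequal.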

The rates provided in HealthCost are not risk adjusted.  They are the actual calculated rates based on the NHCHIS data and the HealthCost algorithms.  The risk adjustment field is provided to offer a possible explanation of why the costs shown may differ from those of another provider.  When the rates are reported as an unbundled cost, variability will not be available since the cost of these services will be the same regardless of patient complexity.


A process exists to remove outliers.  Outliers are data values that do not represent the typical experience for a particular service at a particular provider location, and they can exist for several reasons.  In some cases the historical claims experience is incomplete.  These circumstances may exist when the providers have not billed for all services, or the insurance carrier has not processed all of the claims submitted for the visit.  Alternatively, human error may result in a particular service that is coded incorrectly.  An extreme example might be a service related to a kidney transplant that is coded as a kidney stone removal.  In this example the cost for the kidney stone removal would appear to be excessive.  Because the median is calculated instead of the average, outliers have a small effect on the estimated costs reported in HealthCost, but they can have a substantial impact in the formula used to assess the variability in the rates.

Removal of the outliers takes place at two points.  First, a ceiling for total charges in the analyses is established.  The ceiling is the value below which 95 percent of all charges fall, across all providers.  Observations above the ceiling are removed.

The second point where outliers are removed is after analyzing a specific provider's experience.  Patients with total charges in the lowest one percent or the highest five percent are removed from the analysis.  The percentiles are calculated using standard statistical conventions, so if the observation values do not vary much from each other, it is unlikely any will be removed.
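The two-stage removal can be sketched as follows.  This is a minimal illustration, not the production algorithm: the function names and the (provider, total charge) data layout are hypothetical, and the interpolated "inclusive" percentile method is an assumption about what "standard statistical conventions" means here.

```python
from statistics import quantiles

def percentile(values, k):
    """k-th percentile (1 to 99) using linear interpolation."""
    return quantiles(values, n=100, method="inclusive")[k - 1]

def remove_outliers(visits):
    """visits: list of (provider, total_charge) tuples."""
    # Stage 1: ceiling at the 95th percentile across all providers.
    ceiling = percentile([charge for _, charge in visits], 95)
    kept = [(p, c) for p, c in visits if c <= ceiling]

    # Stage 2: within each provider, drop the lowest 1 percent and the
    # highest 5 percent of the remaining total charges.
    by_provider = {}
    for provider, charge in kept:
        by_provider.setdefault(provider, []).append(charge)

    result = []
    for provider, charges in by_provider.items():
        if len(charges) < 2:
            # Too few observations to compute percentiles; keep as-is.
            result.extend((provider, c) for c in charges)
            continue
        low, high = percentile(charges, 1), percentile(charges, 95)
        result.extend((provider, c) for c in charges if low <= c <= high)
    return result
```

With inclusive bounds, a provider whose charges are all identical loses no observations, which matches the remark that little is removed when values do not vary much.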

Outpatient Procedures:

Records are selected based on the American Medical Association's Current Procedural Terminology (CPT) code.  Since many of the codes are quite specific, a record count by CPT code is performed among codes that are for a similar service (e.g. all CPT codes for mammograms) and the frequency distribution is evaluated to see which codes are the most common procedures within the health care service.  A review of the CPT code descriptions takes place, to determine the simplest and most easily recognized procedure by a layperson.  A combination of frequency, simplicity, and consumer familiarity is used to determine which procedure code is selected to identify visits.  When available, clinical insight is also considered.

Once the procedure code is selected, all other procedures, services, supplies, and other items billed on the same day are added together to compile a visit.  This includes procedures performed by different providers.  If the visit includes any codes that are known to dramatically affect the visit but are performed only some of the time, then that patient's entire visit is excluded from the analysis.

Individual patient records are summarized for the day of service so that total charges and total amounts paid by the insurance company and patient can be reported.

A lead provider is assigned to the visit as the one entity responsible for all of the treatment costs.  This is necessary for comparison purposes, and is most often the facility where the procedure took place.  If there is no facility, then it will be a physician's office or clinic.
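Compiling a visit and assigning a lead provider can be sketched as below.  The record layout (keys such as "patient", "date", "cpt", "is_facility") and the function name are hypothetical, and the tie-breaking rule (first facility found, otherwise the provider that billed the selected procedure) is an assumption rather than the documented rule.

```python
from collections import defaultdict

def compile_visits(claim_lines, index_code):
    """Group claim lines by patient and day of service, keep the days that
    include the selected procedure code, and total charges and payments."""
    by_day = defaultdict(list)
    for line in claim_lines:
        by_day[(line["patient"], line["date"])].append(line)

    visits = []
    for (patient, date), lines in by_day.items():
        index_lines = [l for l in lines if l["cpt"] == index_code]
        if not index_lines:
            continue  # the selected procedure was not performed that day
        facilities = [l["provider"] for l in lines if l["is_facility"]]
        # Assumed rule: prefer a facility, else the billing provider of
        # the selected procedure (a physician's office or clinic).
        lead = facilities[0] if facilities else index_lines[0]["provider"]
        visits.append({
            "patient": patient,
            "date": date,
            "lead_provider": lead,
            "total_charge": sum(l["charge"] for l in lines),
            "total_paid": sum(l["paid"] for l in lines),
        })
    return visits
```

Everything billed on the same day counts toward the visit total, including lines from providers other than the lead provider.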

A statistical analysis of the data takes place prior to separating the data by payer (and payer insurance product).  The number of observations, mean, median, mode, coefficient of variation, skewness, kurtosis, extreme observation values, and graphical distributions (stem-and-leaf plot, boxplot, and normal probability plot) of the data are evaluated.  The data are reviewed to determine whether the median can be a useful estimate of cost.  Although the median is reported, the evaluation of variability is around the mean.  Data that are not considered acceptable usually show one of the following: a distribution that is not normal, a bimodal distribution, unusual skew, a high kurtosis value, a mean substantially less than the median, or a very high degree of variation.
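Several of these checks can be computed directly, as in the sketch below.  The function name and every threshold are hypothetical, since the document does not state numeric cutoffs.  Bimodality is not tested here; in the process described above it is judged from the graphical distributions.

```python
from statistics import mean, median, pstdev

def screen_distribution(charges, median_gap=0.05, max_cv=1.0,
                        max_abs_skew=2.0, max_excess_kurtosis=7.0):
    """Summary statistics plus warning flags.  All thresholds are
    placeholder values chosen for illustration."""
    n = len(charges)
    mu, med, sd = mean(charges), median(charges), pstdev(charges)
    if sd == 0:
        skew = kurt = 0.0
    else:
        # Population skewness and excess kurtosis from central moments.
        skew = sum((x - mu) ** 3 for x in charges) / (n * sd ** 3)
        kurt = sum((x - mu) ** 4 for x in charges) / (n * sd ** 4) - 3

    flags = []
    if mu < med * (1 - median_gap):
        flags.append("mean below median")
    if mu > 0 and sd / mu > max_cv:
        flags.append("high variation")
    if abs(skew) > max_abs_skew:
        flags.append("unusually skewed")
    if kurt > max_excess_kurtosis:
        flags.append("high kurtosis")
    return {"n": n, "mean": mu, "median": med, "skewness": skew,
            "kurtosis": kurt, "flags": flags}
```

A data set with many low charges and a larger cluster of high charges would raise the "mean below median" flag, while typical right-skewed cost data would not.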

The following is a real example of a diagnostic mammogram procedure that at this time we do not feel meets the criteria for inclusion in the HealthCost website (CPT 76091).
The summary statistics for one provider are:

N= 221
mean= $546
median= $703
mode= $703

A graphical representation of the data looks like:


[Figure: stem-and-leaf plot of the summary statistics.  The dollar values range from $175 to $925, each mark on the chart represents up to three counts, and the most common amount is $725 while the least common is $925.]  In this case $725 is above the expected $546, which means hospitals are probably charging too much for this procedure.

The numbers on the left represent the various charges for the procedures, and the numbers on the right the frequency.  The frequency is also represented by the number of asterisks running from left to right.  The first warning that there is a problem for HealthCost is that the mean is less than the median.  Usually when looking at health care cost data, the distribution is skewed to the right, or positive.  That means there are high cost outliers that pull the average charge up, even when the charges for most of the procedures are much lower.  When the median exceeds the mean, this is a sign the distribution is not typical of what we would expect when looking at the data.  If the distribution is not what we expect, then our assumptions in the model may not be correct.

The second major issue is that the distribution is bimodal in nature.  A bimodal distribution typically indicates that the distribution is in fact the sum of two different distributions, each with a single notable peak.  However, it can be difficult to find the differentiating factor between the samples in one distribution and those in the other.  It may be patient age, prior medical history, or any factor that may influence clinical judgment and the services provided.  The summary statistics do not show the multiple charge distributions for this procedure code.  Therefore, we cannot make a reliable prediction whether a patient will be faced with a procedure that has a charge close to $775, or less than $325.

After an initial statistical analysis, the data falling above the 95th percentile and below the first percentile are removed from the analysis.

After excluding any extreme observations, the statistical analysis is performed again, and the same measures are checked to see if there are problems with the data distribution.  Since the median is the primary calculation of interest, removing outliers normally has a minimal impact to the reported figures.  Calculation of the median charge and median allowed are then performed for each payer.

An additional review of the output takes place to determine if the results are reasonable.  Unless it can be explained, major differences in charge amounts between payers for the same service would be considered an issue.  We assume patients will not face different charges due to which insurance company they are covered by.  Major deviations from the expected costs would also undermine the use of the payment data.  Such deviations may include small insurance companies with dramatically lower payment rates, or unlikely differences between managed care and indemnity lines of business within the same insurer.  Usually the smallest insurance companies have the least favorable contracts, and managed care insurance products have the deepest discounts.

The following is an example of how the data are selected to report on outpatient bilateral mammograms:

Inpatient mammograms are removed.

Patient records with a bilateral mammogram CPT code of 76092 (mammogram "screening") are selected.  Then, anything else the provider(s) performed during the visit is bundled into the analysis.  
All patients who had a bilateral mammogram 76091 ("mammogram diagnostic") or G0202 ("diagnostic digitization") on the same day are removed from the analysis.  They are expected to cost more, inflate the results, and create comparability issues.  
Patients with total charges in the top five percent (across all providers) are removed.

Patients with total charges in the lowest one percent or the highest five percent are removed from the analysis (specific to the provider organization).

Results are reviewed and the median calculations are checked for reasonableness.
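The selection and exclusion steps for this mammogram example can be sketched as a simple filter.  The visit layout (a "codes" collection and an "inpatient" flag) and the function name are hypothetical; the procedure codes are the ones named above.  The percentile-based removals are left out of this sketch.

```python
INDEX_CODE = "76092"                 # screening mammogram (selected code)
EXCLUDE_CODES = {"76091", "G0202"}   # same-day codes that remove the visit

def select_screening_visits(visits):
    """Keep outpatient visits that include the selected code and none of
    the excluded codes.  Each visit is a dict with 'codes' (all codes
    billed that day) and 'inpatient' (bool)."""
    return [
        v for v in visits
        if not v["inpatient"]
        and INDEX_CODE in v["codes"]
        and not (EXCLUDE_CODES & set(v["codes"]))
    ]
```

Any other codes billed on the same day stay with the visit and are bundled into its total, as described in the steps above.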


The pharmacy costs are calculated using the same methodology as medical costs but are shown at retail charge only.  Due to the complexities of pharmacy formularies that can vary greatly from plan to plan under the same carrier, we are not able to show the actual out of pocket cost to the consumer at this time.

The brand name and generic drugs have also been crosswalked using the FDA database to find the substance name (active ingredients), so that brand and generic drugs can be matched and similar drugs shown in tandem for ease of understanding (since many generic drug names are not familiar).  Dose, unit, and form information was derived from the FDA database.  Oral chemotherapy (cancer) drugs were further researched using the MassHealth (Massachusetts Medicaid) formulary: after their use was identified in CHIS, the NDC was reviewed using this database to ensure its use as an oral treatment rather than IV.

Quality Indicators

Performance measurement data report on National Quality Improvement Goals (NQIGs) for hospitals and are collected by The Joint Commission, a recognized and award-winning international leader with a long, proven ability to identify, test, and specify standardized performance measures.  These goals allow hospitals to report on key quality of care indicators in up to five treatment areas: heart attack, heart failure, community acquired pneumonia, pregnancy and related conditions, and surgical infection prevention.  These conditions are the most common reasons that patients go to the hospital, and they affect hundreds of thousands of patients each year.  Patients who are treated according to these guidelines are more likely to improve and have good outcomes of care.

More information on the methodology used by the Joint Commission can be found here:

Patient experience quality information is collected from actual patients who were treated at the hospitals and recorded as part of the Centers for Medicare & Medicaid Services (CMS) Hospital Quality Initiative.  The Hospital Quality Initiative uses a variety of tools to help stimulate and support improvements in the quality of care delivered by hospitals.  The intent is to help improve hospitals' quality of care by distributing objective, easy to understand data on hospital performance, and quality information from consumer perspectives.  Timeliness of care and other CMS measures methodology can be found here:

Hospitals are shown as better than average, below average and near average.  These ratings indicate how the hospitals compare to the national average.  If a hospital is within 5% of the national average, it is given a yellow circle that indicates near the average.  Greater than 5% in either direction results in a triangle which indicates either better or below average.
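The rating rule can be sketched as below.  Two points are assumptions: "within 5%" is read as a relative difference from the national average, and whether a higher value counts as better depends on the measure, so it is passed in as a parameter.  The function name is hypothetical.

```python
def rating(hospital_rate, national_rate, higher_is_better=True, tolerance=0.05):
    """Classify a hospital measure against the national average.
    'Within 5%' is interpreted here as a relative difference."""
    relative_diff = (hospital_rate - national_rate) / national_rate
    if abs(relative_diff) <= tolerance:
        return "near average"          # shown as a yellow circle
    is_better = relative_diff > 0 if higher_is_better else relative_diff < 0
    return "better than average" if is_better else "below average"  # triangle
```

For a measure where a lower value is preferable, calling the function with higher_is_better=False reverses which direction counts as better.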

Updated 11/23/2015