How to Weight Percentages – MeasuringU

Absolute Percentages from Binary Data

In the task completion example above, failures can be coded as 0 and successes as 1 for each participant, so the underlying data are binary. With five failures and five successes the average is (0 + 0 + 0 + 0 + 0 + 1 + 1 + 1 + 1 + 1) / 10 = 5 / 10 = 0.5 = 50%. Because these percentages can take values from 0 to 100%, they are absolute percentages.
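As a minimal sketch in Python (the variable names are ours, chosen for illustration), the absolute percentage is simply the mean of the 0/1 codes:

```python
# Task outcomes coded as 0 (failure) or 1 (success), one value per participant.
outcomes = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# The completion rate is the arithmetic mean of the binary codes.
completion_rate = sum(outcomes) / len(outcomes)

print(f"{completion_rate:.0%}")  # 50%
```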

Net Percentages from Trinary Data

When outcomes at the individual level can be assigned values of −1, 0, or +1, the underlying data are trinary. Percentages computed from trinary data can range from −100% to +100%. One way to compute the well-known Net Promoter Score (NPS) is to assign a −1 to detractors, 0 to passives, and +1 to promoters, then average the individual scores.

For example, in 2019, we collected and analyzed data on meeting software, including the NPS. For GotoMeeting, there were 8 detractors, 13 passives, and 15 promoters (n = 36). The resulting NPS was (8(−1) + 13(0) + 15(1)) / 36 = (−8 + 15) / 36 = 7 / 36 ≈ .194 = 19.4%.
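The same calculation can be sketched in Python. The dictionary names below are illustrative assumptions; only the counts come from the example above.

```python
# Counts reported for GotoMeeting in the example above.
counts = {"detractor": 8, "passive": 13, "promoter": 15}

# Trinary coding: detractor = -1, passive = 0, promoter = +1.
codes = {"detractor": -1, "passive": 0, "promoter": 1}

n = sum(counts.values())

# Averaging the individual codes is the same as summing code * count and dividing by n.
nps = sum(codes[group] * counts[group] for group in counts) / n

print(f"n = {n}, NPS = {nps:.1%}")  # n = 36, NPS = 19.4%
```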

Averaging Multiple Percentages

When you have a set of percentages, you can compute their mean by adding them and dividing by the number of percentages. For example, suppose you’ve conducted usability studies on three different accounting systems with the following percentages of successful installation:

System A: 40 / 50 = 80%
System B: 36 / 45 = 80%
System C: 30 / 60 = 50%

The average of those success rates is (80% + 80% + 50%) / 3 = 70%.

Absolute and net percentages can be weighted at the group or case (individual participant) level. The process of group-level weighting is simpler than case-level weighting. The advantage of case-level weighting is that it enables more complex analyses such as weighted regression, but if there are no plans for such complex analyses, group-level weighting will usually suffice.

Group-Level Weighting

Prior experience with a product is one of the most consistently strong predictors of differences in UX metrics. People with more experience tend to be more successful at completing tasks than those with low to no experience.

Consequently, weighting by experience levels is a common and legitimate use of weighting in UX research. Consider a hypothetical dataset with a hundred cases that represents successful completions for a task evaluated in a typical unmoderated study for an accounting software package (e.g., creating a balance sheet).

As summarized in Table 1, the sample consists of 25 participants with low experience (three or fewer years of experience with the software) and 75 with high experience (more than three years of experience). But the distribution in the sample differs from the reference population, with percentages determined from a very large sample of customer data (75% low experience, 25% high experience).

Experience	Success	n	Sample	Reference
Low	36.8%	25	0.25	0.75
High	73.3%	75	0.75	0.25

Table 1: Example task success percentages for Low and High levels of experience, showing sample and reference proportions.

To weight the overall percentage to the reference population, simply multiply each group percentage by the reference proportion, then add them together. When you use this method, the weights should sum to 1 (.75 + .25 = 1). The overall sample size for the weighted percentage is the sum of the group sample sizes, 25 + 75 = 100.

WeightedPercentage = .75(36.8%) + .25(73.3%) = 27.6% + 18.3% = 45.9%

In contrast, the unweighted percentage at the group level is 55.1% (a simple average of 36.8% and 73.3%), about 9 percentage points higher than the weighted percentage, because the higher success rate came from the group that is smaller in the reference population.
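A short Python sketch of group-level weighting with the Table 1 values follows. The variable names are assumptions made for illustration, and the rates are the rounded values from the table.

```python
# Group success rates and reference proportions from Table 1.
success = {"Low": 0.368, "High": 0.733}
reference = {"Low": 0.75, "High": 0.25}

# The reference proportions act as weights, so they should sum to 1.
assert abs(sum(reference.values()) - 1.0) < 1e-9

# Weighted percentage: multiply each group rate by its reference proportion, then add.
weighted = sum(success[g] * reference[g] for g in success)

# Unweighted group-level percentage: simple average of the two group rates.
simple_average = sum(success.values()) / len(success)

print(f"weighted: {weighted:.3f}")
print(f"simple average: {simple_average:.4f}")
```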

Case-Level Weighting

To weight at the case level, divide the reference proportion by the sample proportion.

For this example, the weights for low-experience cases would be .75 / .25 = 3; for high-experience cases, the weights would be .25 / .75 = 1 / 3. This has the effect of increasing the influence of low-experience cases and decreasing the influence of high-experience cases on the weighted percentage while keeping the sum of the hundred case weights equal to the original sample size (Low = 25(3) = 75; High = 75(1 / 3) = 25; 75 + 25 = 100).

We know from Table 1 that the low percentage is 36.8% with a sample size of 25, and the high percentage is 73.3% with a sample size of 75. If we computed an unweighted average across those 100 cases, then the percentage would be (36.8%(25) + 73.3%(75)) / 100 = 64.2%. Note that this is the same value you would get by multiplying the Low completion percentage by its sample proportion and adding that to the High completion percentage multiplied by its sample proportion (.25(36.8%) + .75(73.3%) = 9.2% + 55.0% = 64.2%).

With case weights applied, the percentage would be (36.8%(25)(3) + 73.3%(75)(1/3)) / 100 = (2760% + 1833%) / 100 = 45.9%—the same as the result obtained by weighting the group percentages with the reference proportions.
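The case-weight arithmetic can be sketched in Python as follows. This is a sketch under the assumption that we work from the rounded group rates in Table 1 rather than the individual cases; the dictionary layout and names are ours.

```python
# Table 1 values: success rate, sample size, and reference proportion per group.
groups = {
    "Low": {"success": 0.368, "n": 25, "reference": 0.75},
    "High": {"success": 0.733, "n": 75, "reference": 0.25},
}
total_n = sum(g["n"] for g in groups.values())

# Case weight = reference proportion / sample proportion.
for g in groups.values():
    g["weight"] = g["reference"] / (g["n"] / total_n)

# Every case in a group shares one weight, so each group contributes n * weight.
sum_of_weights = sum(g["n"] * g["weight"] for g in groups.values())

unweighted = sum(g["success"] * g["n"] for g in groups.values()) / total_n
case_weighted = (
    sum(g["success"] * g["n"] * g["weight"] for g in groups.values()) / sum_of_weights
)

print(f"sum of weights: {sum_of_weights:.1f}")
print(f"unweighted: {unweighted:.1%}, case-weighted: {case_weighted:.1%}")
```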

We can use the formula above to compute the case-weighted percentage because the same weight was applied to each case in a given group. Here’s the detail:

Low-Group Cases

3(Low₁) + 3(Low₂) + … + 3(Low₂₅) = 3(Low₁ + Low₂ + … + Low₂₅)

= 3(25)((Low₁ + Low₂ + … + Low₂₅) / 25) = 3(25)(LowGroupPercentage)

High-Group Cases

(1 / 3)(High₁) + (1 / 3)(High₂) + … + (1 / 3)(High₇₅) = (1 / 3)(High₁ + High₂ + … + High₇₅)

= (1 / 3)(75)((High₁ + High₂ + … + High₇₅) / 75) = (1 / 3)(75)(HighGroupPercentage)

Case-Weighted Percentage

CaseWeightedPercentage = (3(25)(LowGroupPercentage) + (1 / 3)(75)(HighGroupPercentage)) / (SumOfWeights)

= ((3)(25)(36.8%) + (1 / 3)(75)(73.3%)) / 100 = 45.9%
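The agreement between the two methods is not specific to this example. As a sketch in our own notation (not the article's), let p_g be the success rate in group g, n_g its sample size, N the total sample size, s_g = n_g / N its sample proportion, r_g its reference proportion, and w_g its case weight:

```latex
\[
w_g = \frac{r_g}{s_g}, \qquad s_g = \frac{n_g}{N}
\quad\Longrightarrow\quad
w_g\, n_g = N\, r_g
\]
\[
\text{CaseWeightedPercentage}
  = \frac{\sum_g w_g\, n_g\, p_g}{\sum_g w_g\, n_g}
  = \frac{N \sum_g r_g\, p_g}{N \sum_g r_g}
  = \sum_g r_g\, p_g
\]
```

The last step uses the fact that the reference proportions sum to 1, and the final expression is the group-level weighted percentage.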

Note that the best estimates of the percentages for the Low and High experience groups are those shown in Table 1. The effect of weighting is meaningful only for adjusting the overall percentage to better match the reference population and works well only when there is a good reference population from which to obtain weights.

Suppose instead of just two levels of experience, there were the six groups shown in Table 2 (group percentages and sample proportions computed from the sample data available for download in the appendix, reference population proportions made up for this example).

Experience (months)	Success	n	Sample	Reference	Case Weight
1–6	0%	6	0.06	0.38	6.3333
7–12	20%	5	0.05	0.17	3.4000
13–24	40%	5	0.05	0.11	2.2000
25–36	50%	9	0.09	0.08	0.8889
37–48	50%	13	0.13	0.09	0.6923
49–60	77%	62	0.62	0.17	0.2742

Table 2: Data for six experience groups (experience levels are for months of subscription to accounting software).

Group-Level Weighting

For this example, the most egregious mismatches are at the lower and upper experience levels where, according to the accounting software company’s records, 38% of their subscribers have used the software for 1–6 months and 17% of subscribers have used the software for 4–5 years, but their representation in the sample was, respectively, 6% and 62%.

Using the reference proportions, the weighted percentage is:

0%(.38) + 20%(.17) + 40%(.11) + 50%(.08) + 50%(.09) + 77%(.17) = 29%

Even though almost 2/3 of the sample was in the most experienced group, with 77% successful task completions, the group-weighted percentage of 29% was much lower due to the large weight for the least experienced and least successful group.
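The six-group calculation follows the same pattern. In this minimal Python sketch, the lists hold the rounded success rates used in the formula above and the reference proportions from Table 2, ordered from least to most experienced; the names are illustrative.

```python
# Rounded group success rates as used in the weighted-percentage formula above.
success = [0.00, 0.20, 0.40, 0.50, 0.50, 0.77]

# Reference population proportions from Table 2.
reference = [0.38, 0.17, 0.11, 0.08, 0.09, 0.17]

# Weights should sum to 1 before they are applied.
assert abs(sum(reference) - 1.0) < 1e-9

weighted = sum(p * r for p, r in zip(success, reference))

print(f"{weighted:.0%}")  # 29%
```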

Case-Level Weighting

First, let’s check to see if the sum of the 100 weights equals 100.

6(6.3333) + 5(3.4000) + 5(2.2000) + 9(0.8889) + 13(0.6923) + 62(.2742) = 100

The sum of the weights is 100, so the case-weighted percentage is:

(6(6.3333)(0%) + 5(3.4000)(20%) + 5(2.2000)(40%) + 9(0.8889)(50%) + 13(0.6923)(50%) + 62(.2742)(77%)) / 100 = (0% + 340% + 440% + 400% + 450% + 1316%) / 100 = 29%

The case-level process has more steps than the group-level process, but the result is the same.
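To illustrate, the Python sketch below derives the case weights from the Table 2 sample sizes and reference proportions, checks their sum, and computes the case-weighted percentage. It assumes the rounded success rates used in the formula above, and the variable names are ours.

```python
# Sample sizes and reference proportions from Table 2, least to most experienced.
n = [6, 5, 5, 9, 13, 62]
reference = [0.38, 0.17, 0.11, 0.08, 0.09, 0.17]

# Rounded group success rates as used in the formula above.
success = [0.00, 0.20, 0.40, 0.50, 0.50, 0.77]

total_n = sum(n)

# Case weight = reference proportion / sample proportion.
case_weights = [r / (k / total_n) for r, k in zip(reference, n)]

# Each group contributes (number of cases) * (case weight) to the sum of weights.
sum_of_weights = sum(k * w for k, w in zip(n, case_weights))

case_weighted = (
    sum(k * w * p for k, w, p in zip(n, case_weights, success)) / sum_of_weights
)

print([round(w, 4) for w in case_weights])
print(f"sum of weights: {sum_of_weights:.1f}, case-weighted: {case_weighted:.0%}")
```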

If this were a real example, it would be important to investigate why the discrepancy was so large between the low and high groups in the sample and the reference population, and we caution against using these results for any important business decisions until fixing the sampling problem.

As with any other set of numbers, you can add percentages together and divide by how many there are to get their mean. This works perfectly well as long as all the percentages were computed from ratios with the same denominator (base value).

Things get trickier when the percentages have different base values. Returning to the example in the introduction, suppose you’ve measured success rates for an installation task for three different accounting systems with the following results:

System A: 40 / 50 = 80%
System B: 36 / 45 = 80%
System C: 30 / 60 = 50%

If you view these values as being the individual best estimates for each system and you want to know the average across systems, then you don’t care about the differences in the denominators. In that case, you’d compute the simple average of (80% + 80% + 50%) / 3 = 70%.

Alternatively, if you want to take the different sample sizes into account, as long as you have access to the original ratios, you could combine them to get an average weighted by sample size:

WeightedAverage = (40 + 36 + 30) / (50 + 45 + 60) = 106 / 155 ≈ 68%

Or you could use the proportion of the sample for each system for a weighted average to get the same result. The total sample size was 155, so the weights for Systems A, B, and C, in order, are 50 / 155 = .3226, 45 / 155 = .2903, and 60 / 155 = .3871 (which sum to 1), so the weighted average is:

WeightedAverage = .3226(80%) + .2903(80%) + .3871(50%) ≈ 26% + 23% + 19% = 68%

This weighted average is slightly lower than the simple average, reflecting the influence of the largest sample size for the lowest success rate.
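The three ways of averaging the installation success rates can be compared in a short Python sketch. The data structure and names are illustrative assumptions; the counts are those listed for Systems A, B, and C.

```python
# (successes, sample size) for each accounting system.
systems = {"A": (40, 50), "B": (36, 45), "C": (30, 60)}

rates = {name: s / n for name, (s, n) in systems.items()}
total_n = sum(n for _, n in systems.values())

# Simple average: every system counts equally.
simple_average = sum(rates.values()) / len(rates)

# Pooled ratio: total successes over total attempts.
pooled = sum(s for s, _ in systems.values()) / total_n

# Sample-proportion weights applied to the individual rates.
weighted = sum(rates[name] * n / total_n for name, (_, n) in systems.items())

print(f"simple: {simple_average:.0%}, pooled: {pooled:.0%}, weighted: {weighted:.0%}")
```

The pooled ratio and the proportion-weighted average are algebraically identical, so they differ only by floating-point rounding.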

For another example, in a previous article, we computed the no-show rate from 24 usability studies, 10 in-person and 14 remote. The average no-show rate for in-person studies was 5.8%, and for remote studies, it was 4.5%. The simple average across conditions is (5.8% + 4.5%) / 2 = 5.15%. Weighting the average by sample size results in:

(10(5.8%) + 14(4.5%)) / 24 ≈ 5.04%

The sample-size weighted average is slightly lower than the simple average because the condition with the lower no-show rate (remote) had four more studies. Rounding to the nearest percentage gives an estimate of 5% with either method.
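A small helper makes this kind of comparison reusable. The function below, weighted_mean, is a hypothetical name of our own and not part of any library; it is a sketch that assumes non-negative weights with a positive sum.

```python
def weighted_mean(values, weights):
    """Return sum(value * weight) / sum(weight) for paired sequences."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# No-show rates and number of studies: in-person first, then remote.
no_show_rates = [0.058, 0.045]
study_counts = [10, 14]

simple_average = sum(no_show_rates) / len(no_show_rates)
weighted_average = weighted_mean(no_show_rates, study_counts)

print(f"simple: {simple_average:.2%}, weighted: {weighted_average:.2%}")
```

Passing equal weights to weighted_mean returns the simple average, which matches the point that weighted and unweighted averages coincide when the sample sizes are the same.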

For these examples, there wasn’t much difference between the unweighted and weighted means, but there also wasn’t much difference in the group sample sizes. When group sample sizes and their percentages differ more, the gap between weighted and unweighted averages will be larger.

In this article, we focused on the most common situation that UX researchers encounter—when there is a need to use weights to adjust sample percentages to better approximate a reference population (matching against a single variable). We also discussed weighting a set of percentages based on their proportions of a total sample size. We covered the following key points:

Percentages and proportions are two different ways of displaying the same number. Any computations (including the use of weights) that use proportions are also applicable to percentages (and vice versa).

Percentages that can be conceptualized as means can be treated as means for the purpose of weighting. Absolute percentages are fundamentally the mean of a set of 0s and 1s (binary data). Net percentages can be computed as means of trinary data after assigning −1 to low values, 0 to intermediate values, and +1 to high values (e.g., NPS detractors, passives, and promoters). Sets of percentages can be averaged (weighted or unweighted) like any other number.

Percentages can be weighted at a group or case level. For group-level weighting, multiply each group percentage by the reference proportion, then add them together. To get weights for case-level analysis, divide the reference proportion by the sample proportion for each group, then use those weights when conducting other analyses with individual data, including means, standard deviations, and even regression. When all you need are weighted percentages, the group level is simpler, but for more complex analyses, you need case weights.

When averaging a set of percentages, analysts need to decide whether it’s more appropriate to compute an unweighted average or a weighted average based on group sample sizes. The basis for this decision is whether the percentages are the best estimates for their conditions or are multiple estimates of the same condition. When they’re the best estimates for multiple conditions, an unweighted average is appropriate. When the percentages are multiple estimates of the same condition, a weighted average based on sample sizes is more appropriate. Note that when the sample sizes for each estimate are the same, the weighted and unweighted averages will be the same.

What should you do if you are trying to match a sample to a reference population against more than one variable? There are methods for doing this (e.g., raking), but they are more complicated and require specialized software. We’ll cover such methods in a future article.

This appendix provides a link to download the hundred cases used in the examples in this article. Interested readers can replicate our analyses using their preferred statistical package. The data is available for download at:

Weighting Exercise Sample Data

In the sample data, 6Groups designates the six levels of product experience in months; 2Groups has just two groups, where the Low group is made up of the 25 cases from the first four groups in 6Groups, and the High group is made up of the 75 cases from the remaining two groups in 6Groups.
