The scoring benchmark shows how your candidates scored relative to other candidates who have taken the same tests.
Let’s say you manage a digital marketing team and you need to hire a social media specialist. You have worked extensively in social media yourself, and know that you’re very skilled at it. You put together an assessment, take it yourself, and achieve a raw score of 80%. In this situation, you can use yourself as a benchmark, comparing all candidates to your raw score — your percentage of correct answers — and searching for those who score close to or better than you.
A few months later you need an SEO specialist. You know you're bad at SEO, so you don't want to use yourself as a benchmark. What do you do?
In this article, we explain how you can use different scoring benchmarks to help make sense of candidate scores. This article is for all users of all plans.
Approx. reading time: 5 minutes
In this article
- What is a scoring benchmark?
- Benchmarks offered
- Changing the benchmark
- Common questions
What is a scoring benchmark?
Benchmarking helps measure the performance of your candidate on the tests you gave them — after all, a candidate’s score doesn’t tell you much if you have nothing to compare it to. If you’re looking to hire an SEO specialist, it would be helpful to know how other SEO specialists perform on the tests you have chosen for your assessment.
TestGorilla handles this through our scoring benchmark feature. We present candidate scores as percentiles, benchmarking each candidate against different norm groups.
To really understand this, it helps to recap some terms:
A norm group is a collection of similarly skilled candidates. In TestGorilla's case, norm groups are sorted by education level, business function, and seniority.
By default, we compare against All candidates — meaning all other candidates who have taken the same test. You can select a norm group more specific to your job role.
A percentile rank indicates the percentage of candidates in the selected norm group whose raw score is less than or equal to the raw score of the candidate in question.
So if a candidate's score shows as 75%, this means they are in the 75th percentile: they did as well as or better than 75% of the other candidates in the norm group.
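The calculation behind a percentile rank is straightforward. The sketch below is purely illustrative (the function name and sample scores are made up for this example, not TestGorilla's actual implementation):

```python
def percentile_rank(candidate_score, norm_group_scores):
    """Percentage of norm-group scores less than or equal to the candidate's raw score."""
    at_or_below = sum(1 for score in norm_group_scores if score <= candidate_score)
    return round(100 * at_or_below / len(norm_group_scores))

# A candidate with an 80% raw score, compared against a hypothetical norm group:
group = [55, 60, 70, 80, 85, 90, 65, 75, 50, 80]
print(percentile_rank(80, group))  # 8 of the 10 scores are <= 80, so the rank is 80
```

Note that because the comparison is "less than or equal to," a candidate is always counted in their own percentile, which is why a top scorer can reach the 100th percentile.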
Emily’s candidate, Maria, takes a test and achieves an 80% raw score. Is Maria a high-performing candidate or not?
Emily chooses to change the scoring benchmark to compare Maria to other candidates in the TestGorilla system who are in marketing. Maria’s score now shows as 60%. This means that while Maria did as well as or better than 60% of the candidates who took this same test, 40% of candidates performed better than her.
The average score shown for the overall assessment is the statistical mean of the individual percentile ranks for each test within the assessment, rounded to the nearest whole number.
As a refresher, the statistical mean is found by adding up all the numbers in a data set and dividing by the total number of data points. So the statistical mean of 20, 15, 25, 70, 85, 60, and 47 is 46:
(20 + 15 + 25 + 70 + 85 + 60 + 47) ÷ 7 = 46
Emily is using the marketing scoring benchmark to look at her candidate Maria. The assessment consisted of 5 tests, and Maria scored the following:
- Test 1: 90th percentile
- Test 2: 60th percentile
- Test 3: 75th percentile
- Test 4: 40th percentile
- Test 5: 99th percentile
The Average score shown would then be 73% because:
(90% + 60% + 75% + 40% + 99%) ÷ 5 ≈ 73%
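In code, the same calculation looks like this. This is a minimal illustration using Maria's numbers from the example above, not TestGorilla's internal implementation:

```python
# Maria's per-test percentile ranks under the marketing benchmark
percentile_ranks = [90, 60, 75, 40, 99]

# Average score = statistical mean of the ranks, rounded to the nearest whole number
average_score = round(sum(percentile_ranks) / len(percentile_ranks))
print(average_score)  # 364 / 5 = 72.8, which rounds to 73
```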
Benefits of percentile ranking
We use percentile ranking for two key reasons:
- A percentile rank score normalizes for differences in the difficulty level of a test. So the scores of different tests in an assessment become comparable.
- Percentile rank scores give you insight into a candidate's performance on a test, even if you have only one candidate. While it's helpful to have scores from many candidates (it increases the odds that you have at least a few very good ones), a percentile rank score lets you interpret an individual candidate's performance on its own.
Establishment of norm groups
For all tests, we use a sample size of no less than 1,000 test-takers across all relevant norm groups. For most tests, the sample sizes are substantially higher across groups. For irrelevant combinations (such as the React test combined with the “Customer service” norm group), we use the All candidates default to express the results.
We make it possible for you to choose the benchmark group — or norm group — most relevant to your assessment. This is why we ask candidates to provide some demographic information at the end of every assessment.
The following norm groups can be selected:
- All candidates (this is the default group)
- Based on education level:
- Some high school
- High school diploma / GED
- Some college / associate degree
- Bachelor's degree
- Master's degree or higher
- Based on business function:
- Customer/IT support
- Engineering (other than software)
- Human Resources
- Quality Assurance
- Research & Development
- Sales/Account Management
- Software development
- Based on seniority:
- Junior (up to 3 years of experience)
- Senior (4 or more years of experience)
- % correct (technically not a norm group; this simply shows the raw score)
If you choose to allow extra time for your tests, you will only be able to view the % correct benchmark. We can’t make a fair comparison between candidates if they had different time limits for their tests.
It's not possible to cross-reference norm groups (for example, software developers with a Bachelor's degree).
Changing the benchmark
You can set the benchmark from two places in the platform:
- Candidates table. On the assessment overview page, you can choose your desired benchmark using the box labeled Scoring benchmark found on the top row of the candidates table.
- Candidate’s results page. On the results page, the scoring benchmark box is found below the candidate’s test results. Click on the box to choose the desired benchmark from the dropdown menu. The scores will immediately be updated to reflect your change.
Changing the benchmark in one place will also change it in the other. If you change to % correct on the candidate's results page, the candidates table will also change to reflect this.
I am only seeing the option of percentage of questions correct. Why is this?
If you have allowed extra time on your assessment, you will only be able to see the percentage of questions answered correctly. To provide a fair and accurate comparison between candidates, they must all have taken the tests under the same conditions, i.e., with the same time constraints. You can still compare your candidates against each other, as the percentage displayed is the actual score they achieved in the test.
Can I use two benchmarks at the same time, like senior level marketing professionals?
It isn't currently possible to cross-reference multiple benchmarks, but this feature will likely be added in the future.