Consider the data from the 2001 Motorola marathon. It shows completion times for 4,937 entrants in the Austin Marathon. The data allows us to compare not just individual rankings (1st, 2nd, 3rd, 4th) but also the distance between them. This is called scalar data. From it we can see that a very large percentage of entrants finished between 3 hours 30 minutes and 4 hours 30 minutes. This clustering of performance is characteristic of a normal distribution, and normal distributions are frequent when we measure human performance. So the normal distribution is important to understand.

If you took a large random sample of Irish adults and measured their height, you would notice that there were more people of about average height than there were very tall or very short people. If you graphed their heights, you would see that they were distributed in the form of a NORMAL DISTRIBUTION (see figure below), which is a bell-shaped curve. Other human characteristics, such as abilities, are distributed approximately normally throughout the population, just as height is.

Now imagine that a psychometric assessment can provide us with a quantitative measure (a number) for an individual’s performance on a performance-related variable. This scalar data is extremely useful, as we can then compare the individual’s performance with that of others in a normal distribution. However, too often we use a scoring system which is easily misunderstood in the context of the normal distribution. Although most people feel they understand percentiles, this scoring system can be a false friend.

Percentiles tell us the percentage of a sample that an individual has performed better than.

Percentiles are one of the most widely used statistics for comparing performance. They are used by public health nurses to communicate differences in how individual babies are growing (e.g. your child is at the 40th percentile in height for their age and at the 60th percentile in weight). If you were a worrying parent, you might think the child is heading for obesity. However, as we shall see, there is not much difference between the 40th and 60th percentiles. Percentiles have limitations because people don’t really understand the normal distribution.

**Human Error concerning the Normal Distribution and the Interpretation of Percentiles**

If you study the data in exercise 4.1 of your portfolio, you will see that the data approximates a normal distribution. All normal distributions show this characteristic peakiness. Put simply, more people score near the centre than at the extremes. Unfortunately, this fact is not always intuitive in people’s thinking. If people intuitively understood the normal distribution, they would appreciate how much easier it is in real performance terms to move from the 45th percentile to the 50th percentile than from the 95th percentile to the 99th. Measured in standard deviations, the first move is about 0.13 and the second about 0.68, so the improvement needed at the extreme is roughly five times as large even though it covers fewer percentile points.
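The comparison above can be checked with a short sketch. It assumes performance follows a perfectly normal distribution and converts percentiles to z-scores (distances from the mean in standard deviations) using Python's standard library. The function names `percentile_to_z` and `gap_in_sd` are illustrative only and do not come from the portfolio exercise.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def percentile_to_z(percentile: float) -> float:
    """Return the z-score at a given percentile (strictly between 0 and 100)."""
    return STANDARD_NORMAL.inv_cdf(percentile / 100.0)


def gap_in_sd(lower: float, upper: float) -> float:
    """Return the distance, in standard deviations, between two percentiles."""
    return percentile_to_z(upper) - percentile_to_z(lower)


# Percentile pairs mentioned in the text.
for lower, upper in [(45, 50), (95, 99), (40, 60)]:
    print(f"{lower}th -> {upper}th percentile: {gap_in_sd(lower, upper):.2f} SD")
```

Under this assumption the 45th to 50th gap comes out at roughly 0.13 standard deviations, the 95th to 99th gap at roughly 0.68, and the 40th to 60th gap at roughly 0.51. Real test data is only approximately normal, so the figures for an actual sample will differ somewhat.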

The “peakiness” in the distributions of human performance is a significant phenomenon and its underestimation is a significant human cognitive error. It leads to the potential for misunderstanding and error in the interpretation of percentile performance. This problem is underpinned by the tendency to assume linearity in all numerical scales. Percentiles are essentially a ranking scale. They tell us about a person’s position in a distribution but not the distance between their performance and that of other people. In short, percentiles can be misleading because they have unequal measurement intervals. Any effective interpretation of percentile differences requires an appreciation of the fact that they are not a linear scale.

The use of percentiles can frequently lead to:

- Underestimation of performance differences at the extremes of a distribution
- Overestimation of performance differences near the centre of the distribution

Misinterpretation of percentiles is just one of many errors that arise when people try to make comparisons of human performance. There is much scope for ambiguity and confusion in the use of everyday language to describe performance. Words like acceptable, good, or average are used differently by different people. A consistent pattern in research on the rating of interview performance is that interviewers cluster their ratings around the mean. They frequently ascribe ratings of “3 or 4” on a 5-point scale to over 80% of candidates. The appropriate percentage should be more like 60%. People professionals need to learn to be precise in their use of numbers and language when describing differences between people.

**The Empirical (68-95-99.7%) Rule**

Most psychological distributions display a *normal frequency distribution*. The normal distribution is a well-known phenomenon in science. All normal distribution curves satisfy the following property, which is often referred to as the *Empirical Rule*.

About **68%** of the observations fall within **1 standard deviation** of the **mean**.

About **95%** of the observations fall within **2 standard deviations** of the **mean**.

About **99.7%** of the observations fall within **3 standard deviations** of the **mean**.

When we are dealing with normally distributed data we can use powerful statistics such as the empirical rule. Remember that the 68-95-99.7 rule applies to **all** normal distributions. Also remember that it applies **only** to normal distributions.
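As a quick check of the rule, the sketch below computes the share of a normal distribution lying within 1, 2 and 3 standard deviations of the mean. The function name `share_within` and the example scale (mean 100, standard deviation 15) are assumptions chosen for illustration, not values from the text.

```python
from statistics import NormalDist


def share_within(k: float, mean: float = 0.0, sd: float = 1.0) -> float:
    """Return the proportion of a normal distribution within k SDs of its mean."""
    dist = NormalDist(mu=mean, sigma=sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)


for k in (1, 2, 3):
    # The result is the same for any mean and SD, e.g. a scale with
    # mean 100 and SD 15 (hypothetical values used only for illustration).
    print(f"within {k} SD: {share_within(k):.1%} "
          f"(mean 100, SD 15: {share_within(k, 100, 15):.1%})")
```

The proportions come out close to 68.3%, 95.4% and 99.7% whatever mean and standard deviation are used, which is why the rule holds for all normal distributions. It says nothing about skewed or otherwise non-normal data.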

Wiekie Bartlett

I am a registered industrial psychologist in Cape Town (South Africa). Between June 2009 and December 2013, I assessed the potential of 160 candidates of one organisation. Each time I compared the raw result of each candidate with a South African general working population norm group (n = 3000 people). I now want to compile a norm group for this specific company, using the raw scores of the 160 candidates. In future, I want to compare the results of candidates that I assess for this specific organisation with the norm group that consists of 160 candidates. What are the benefits of using the internal norm group of 160 compared with the general norm group that consists of 3000 individuals?

Kind regards

Wiekie Bartlett

admin (post author)

Hi Wiekie

Interesting question. The primary benefits of an internal norm group are perceived relevance and transparency within the organisation. In addition, it can also be instructive for the company to compare the means and standard deviations for their norm group and the external norm group. The local norm data will tell you (and your client) how their population compares to the wider population on the specific criterion measured. Going forward, they could even track how changes in their organisational practices (e.g. recruitment) affect the population they are attracting.

Overall, I am a believer in the value of creating local norms which can be compared with larger population samples. That said, I believe that the proliferation of norms (in test catalogues) based on highly specific samples for groups which are unknown to the reader is more of a hindrance than a help. The local norm group you create is likely to have limited value outside the local organisation as their circumstances may not apply elsewhere. In this respect there is a great deal to be said for larger population norms.

Declan