Evaluating Small Impairments with the Mean Opinion Scale - Reliable or Just a Guess?

101st AES Convention 1996, Preprint #4396 (E-1)

All recent listening tests have used the MOS scale (mean opinion score) to evaluate small impairments of audio quality. The quality of perceptual measurement schemes will be rated by their ability to predict the results of listening tests.
In this paper, the results of several large series of listening tests conducted at several test sites are evaluated to assess the reliability and repeatability of results based on the MOS scale.