Sensory Analysis

Site:	Plattform für Weiterbildung und Internationalisierung der Hochschule Weihenstephan-Triesdorf
Course:	Entrepreneurship in Food
Book:	Sensory Analysis

Printed by:	Guest
Date:	Saturday, 30 May 2026, 2:36 PM

1. Introduction
2. The Examiner
3. Sample Preparation
4. Do's and Don'ts
5. Thresholds
6. Types of sensory tests
7. Types of tests for recipe optimization

1. Introduction

Sensory analysis can be used in the product development stage, for quality control purposes (including storage stability), to measure consumer satisfaction, or when changes need to be made to the recipe. The latter can be the case if the product needs to be improved, if the process changes, or if the production cost needs to be reduced.

In the following chapters we will look at the procedures and methods of sensory analysis.

2. The Examiner

Sensory analysis or tasting can be carried out by a panel of laymen, e.g. typical consumers of the product, with or without prior experience.

It can also be carried out by trained examiners, e.g. laboratory personnel, who have a high sensory sensitivity and are trained in tasting.

A third type of food tester is the appraiser, a person who has received training in sensory analysis and also has expert knowledge of the product.

All examiners should have:

A neutral attitude towards the product to be tested
Linguistic competence
No allergies or intolerances to the test product
Good senses (no olfactory or gustatory disorders, etc.)

To prevent erroneous judgements, examiners must avoid substances that irritate the sensory cells in the mouth and nose, such as alcohol, coffee, tobacco, spicy foods, sweets, and strong perfumes, for 30 to 60 minutes before the tasting.

3. Sample Preparation

For successful sensory testing, sample selection, preparation, neutralization, coding and presentation are all critical to an objective evaluation.

Sample selection
The differences between the samples should be as small as possible, i.e. all testers should receive the same samples as far as possible. Ensure all samples are available in sufficient quantities to allow multiple tastings.
Sample preparation
Samples must all be at the same temperature. Unprepared samples are tasted at room temperature (except e.g. ice cream). Hot samples are heated shortly before tasting. Serving sizes and presentation should be as similar as possible. Pay extra attention to samples that depend on the solubility and mixing of ingredients.
Sample neutralization
The sample name or brand should not be identifiable. Therefore, the sample must be transferred to a neutral container. All containers must be identical.
Sample coding
The samples are neutrally coded before testing to make them anonymous. Samples should be presented in a randomized order, and the order should vary between repeated sessions.
Sample presentation
The samples are presented in vessels that are neutral in terms of smell, taste, and color. Crockery used in the tasting also needs to be tasteless and odorless.
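
The coding and randomization steps described above can be sketched in a few lines of Python. This is only an illustration: the sample names, the number of examiners, the three-digit code range and the fixed seed are assumptions and not part of the course material.

```python
import random

def assign_codes(sample_names, rng):
    """Give every sample a unique random three-digit code (100-999)."""
    codes = rng.sample(range(100, 1000), k=len(sample_names))
    return dict(zip(sample_names, codes))

def serving_orders(codes, n_examiners, rng):
    """Return one independently shuffled serving order per examiner."""
    orders = []
    for _ in range(n_examiners):
        order = list(codes.values())
        rng.shuffle(order)
        orders.append(order)
    return orders

# Hypothetical samples; the seed is fixed only so the plan can be reproduced.
rng = random.Random(2024)
codes = assign_codes(["recipe A", "recipe B", "recipe C"], rng)
plan = serving_orders(codes, n_examiners=4, rng=rng)

for examiner, order in enumerate(plan, start=1):
    print(f"examiner {examiner}: {order}")
```

Only the person preparing the samples keeps the mapping in `codes`; the examiners see nothing but the three-digit numbers.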

Room preparation

The ideal room for a tasting is neutral, offers no distractions and is free of noise; where necessary, areas are partitioned off. Lighting needs to be good, ideally daylight. To ensure undisturbed odor and taste perception the room needs to be well-ventilated.

A tasting series should not consist of too many samples, to avoid the risk of fatigue, whereby an examiner is no longer able to perceive sensory stimuli adequately.

4. Do's and Don'ts

To prepare for panel tasting it is important to formulate clear questions beforehand. Questions must not be ambiguous or ask for information that the examiner cannot give because she or he is not familiar with certain terms or concepts.

Provide a clear description of the task and explain the tasting process beforehand.
During the tasting always ask questions about appearance and smell first, as these are the first impressions that are perceived.
Always provide more than 2 or 3 possible answers in JAR or popularity tests.
Avoid colloquial language, like "I don't know" or "I guess". It is better to use terms like: "Neutral" or "Rather less".
Be aware that you cannot expect sensory expertise from untrained consumer-panel members. Therefore, avoid technical insider jargon.
Do not ask questions that require long, detailed answers, but instead ask more questions that can be checked quickly.

Examiners provide personal information on a voluntary basis only. However, differentiating the results by age group or gender can sometimes be useful, e.g. in hedonic and popularity testing.

5. Thresholds

1. Detection Threshold (stimulus threshold)

Lowest concentration or intensity at which a stimulus is perceived at all, e.g. as a change compared to another sample. The type of change is not yet recognized; only a difference has been noticed.

2. Recognition Threshold

Lowest concentration or intensity at which a characteristic or substance can be identified (e.g. sweet).

3. Difference Threshold

Lowest concentration or intensity that is necessary to perceive a difference between two stimuli.

4. Saturation Threshold

Concentration or intensity, above which a further increase does not produce a stronger impression. This is reached as soon as all receptors are occupied.

6. Types of sensory tests

We can distinguish different types of sensory tests depending on the information we require.

1. Analytical sensory testing

Requires trained examiners and is usually carried out with small panels (6-14 examiners).

This type of testing is used to evaluate specific characteristics of a product, e.g. during product development, after process changes, or for quality control.

2. Hedonic sensory testing

This type of testing evaluates the subjective perception of product characteristics and is usually done with large panels (more than 60 examiners). The examiners do not need to fulfil any particular requirements, which means they can be selected from the consumer group.

This type of testing can be used during product development, market research, or to study consumer behavior/preferences.

7. Types of tests for recipe optimization

difference test

Tests if a change made to the product results in a perceived difference.

e.g., triangle test, pairwise difference test

intensity test

Tests whether a change leads to a certain feature being perceived more or less intensively.

(1-5 point scale)

popularity test

Tests whether a change improves or worsens the taste of the product.

(like/dislike scale 1-9)

JAR test (Just About Right)

Tests the perception of a specific attribute of a product

(too high, too low, or just about right)

7.1. Example: Triangle test

A triangle test checks whether there is a perceptible difference between two slightly different products.

The panel should consist of at least 5 examiners, or more if untrained examiners are used.

Preparation:

The examiner is presented with 3 samples, of which 2 are identical and 1 is different. They must be presented in identical containers and there must be no visual difference between the samples (color, filling, etc.). Samples must have the same temperature. Samples must be labeled with randomized three-digit numbers.

The examiner is given written instructions and is requested to find the deviating sample. The examiner is also instructed to make a guess in case no difference can be definitely determined. In that case the examiner has to leave a remark that the decision was the result of a guess.

The test can be repeated several times with different, randomized setups (the order in which the samples are presented) to make the result statistically more reliable. Optionally, the order in which the examiner has to taste the samples can be prescribed. However, re-tasting of the samples against each other is usually permitted.

[Figure: possible test setups]
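
With two products A and B, three positions and exactly one deviating sample, a triangle can be served in six different orders. The following Python sketch lists them and distributes them over a panel. Using each order about equally often is an assumption based on common practice, and the function names are chosen for illustration only.

```python
import random
from itertools import product

# All arrangements of A and B over three positions, minus AAA and BBB,
# leave the six triangles in which exactly one sample differs.
SETUPS = ["".join(p) for p in product("AB", repeat=3) if len(set(p)) == 2]

def assign_setups(n_examiners, rng):
    """Hand out the six setups in shuffled blocks so each is used about equally often."""
    assigned = []
    while len(assigned) < n_examiners:
        block = SETUPS[:]
        rng.shuffle(block)
        assigned.extend(block)
    return assigned[:n_examiners]

def odd_position(setup):
    """1-based position of the deviating sample in a setup such as 'ABA'."""
    for index, sample in enumerate(setup):
        if setup.count(sample) == 1:
            return index + 1

print(SETUPS)
for examiner, setup in enumerate(assign_setups(12, random.Random(0)), start=1):
    print(f"examiner {examiner}: {setup} (deviating sample at position {odd_position(setup)})")
```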

7.2. Statistical Evaluation of triangle tests

There are two possible outcomes for each tested triangle:

NO, the examiner did not find a difference between the samples and had to make a guess. This corresponds to the hypothesis of "no difference", the null hypothesis (H₀).
YES, the examiner detected the difference and was able to correctly identify the sample. This is the opposite of the null hypothesis, also called the alternative hypothesis (H₁).

As there are 3 samples in each triangle, of which 1 is indeed different, the chance that the examiner guesses the correct sample by sheer luck is 1 out of 3 (1/3).

From this follows that the number of expected correct guesses (Ec) made by all examiners is:

The number of examiners (n) multiplied by the probability of lucky guesses (1/3) => Ec = n*(1/3)

Example: the panel consists of 42 examiners => n = 42

Ec = n*(1/3) => Ec = n/3 => Ec = 42*(1/3) => Ec = 14

That means that if 42 examiners identify the correct sample 14 times, all of these answers could be lucky guesses.

This needs to be compared against the number of correct answers actually observed (Oc). We need to determine how likely it is that the correct samples were selected not because of their actual difference but by sheer chance. To be sure that the difference was indeed perceived, the probability of this error should be small. We can set the significance level α, i.e. the error probability at which we still accept the result as significant, ourselves. In other words, we decide how sure of the result we need to be in order to reject the null hypothesis H₀. The conventionally accepted accuracy is 95% or even 99%.

The accuracy level x is related to the significance level α as follows:

x% = (1-α)*100

Which means:

0% accuracy translates to α = 1

95% accuracy translates to α = 0.05

99% accuracy translates to α = 0.01

100% accuracy translates to α = 0

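=> The smaller the value of α, the higher the accuracy level against which we compare our result.

To calculate the critical value (p), i.e. the minimum number of correct answers needed for a significant result, we use the normal approximation of the binomial distribution:

\( p = \frac{n}{3} + z \sqrt{\frac{2n}{9}} \)

whereby n is the number of examiners and z stands for the z-score of the standard normal distribution. These are fixed values for each accuracy level: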

at α = 0.01 the corresponding z = 2.33

at α = 0.05 the corresponding z = 1.64

Example:

\( p = \frac{42}{3} + 2.33 \sqrt{\frac{2 \cdot 42}{9}} \)

=> p = 21.12

This value is rounded up to the next whole number, in this case 22. This means that the deviating sample has to be identified correctly at least 22 times to conclude at a significance level of 0.01 (99% accuracy) that the products do indeed have a discernible difference.

So, if the observed number of correct answers (Oc) is at least as large as the critical value (p), we can reject the null hypothesis that the products do not differ.

=> \( O_c \geq p \Rightarrow \text{reject } H_0 \)

Otherwise, if the observed number of correct answers (Oc) is smaller than the critical value (p), we cannot reject the null hypothesis, and no discernible difference between the two products has been demonstrated.

=> \( O_c < p \Rightarrow \text{retain } H_0 \)
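
The calculation and the decision rule can be reproduced with a short Python sketch. It assumes the formula p = n/3 + z*sqrt(2n/9) implied by the worked example above; the function names are chosen for illustration only.

```python
import math

def critical_value(n, z):
    """Value p from the text: n/3 + z * sqrt(2n/9)."""
    return n / 3 + z * math.sqrt(2 * n / 9)

def min_correct_answers(n, z):
    """Critical value rounded up to the next whole number of answers."""
    return math.ceil(critical_value(n, z))

def difference_detected(observed_correct, n, z):
    """True if H0 ("no difference") can be rejected at the level belonging to z."""
    return observed_correct >= min_correct_answers(n, z)

# Worked example from the text: 42 examiners, alpha = 0.01 (z = 2.33)
print(round(critical_value(42, 2.33), 2))   # 21.12
print(min_correct_answers(42, 2.33))        # 22
print(difference_detected(25, 42, 2.33))    # True
print(difference_detected(18, 42, 2.33))    # False
```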

7.3. Sensory Evaluation table

The required numbers of correct answers for each number of participants at different significance levels were published in tables by Roessler et al. in 1978 and adapted by Lawless and Heymann in 2010.

n = the number of participants

α = the significance level (0.05 => 95% accuracy; 0.01 => 99% accuracy)

The numbers in the table indicate the minimum number of correct answers required to conclude that a difference between the samples exists. If the number of correct answers is smaller than the number indicated in the table, no statistically significant difference between the samples has been found.
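
When no table is at hand, the minimum number of correct answers can also be derived directly from the binomial distribution with a guessing probability of 1/3. The Python sketch below does this. It is assumed, not confirmed by the course material, that values computed this way correspond to the published tables; the function names are illustrative.

```python
from math import comb

def tail_probability(n, x, p_guess=1/3):
    """Probability of at least x correct answers out of n if every examiner only guesses."""
    return sum(
        comb(n, k) * p_guess**k * (1 - p_guess) ** (n - k)
        for k in range(x, n + 1)
    )

def min_correct_exact(n, alpha):
    """Smallest number of correct answers whose tail probability is <= alpha.

    Returns None if even n correct answers are not enough at this alpha.
    """
    for x in range(n + 1):
        if tail_probability(n, x) <= alpha:
            return x
    return None

# Print a small table for a few panel sizes
for n in (5, 6, 10, 20, 42):
    print(n, min_correct_exact(n, 0.05), min_correct_exact(n, 0.01))
```

For very small panels a significant result may be impossible. With 3 examiners, for example, even 3 correct answers have a guessing probability of 1/27 (about 0.037), which is above α = 0.01.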

7.4. Ranking Check

The ranking check tests how a number of products relate to one another with regard to a particular attribute. This test can be used to determine the influence of:

different raw materials
different treatment methods
different packing and storage methods

Advantage: More than 2 products can be compared with each other

Disadvantage: If the differences are only small, the results can be varied and unclear. In that case additional triangle tests can be conducted to narrow down the results.

Preparation:

Several samples are placed next to each other and the examiners are requested to rank them according to the intensity of an attribute, e.g. saltiness, sweetness, etc.

The samples are then ordered from weakest to strongest intensity of that attribute.

7.5. Statistical Evaluation of a ranking check

To determine the overall ranking, the ranks given to each sample are summed up, and the samples are then ordered from the smallest to the largest rank sum.

The table shows the ranking each examiner gave each sample. The total ranking is determined by adding all ranks per sample.

Rank Sum Calculation
	Sample 830	Sample 713	Sample 924	Sample 412	Sample 335
examiner 1	1	2	3	5	4
examiner 2	2	1	4	3	5
examiner 3	2	3	1	4	5
examiner 4	1	2	3	5	4
examiner 5	1	2	4	5	4
examiner 6	2	1	5	4	3
rank sum	9	11	20	26	25
Overall rank	1	2	3	5	4
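
The rank sum calculation from the table can be reproduced with a short Python sketch. The ranks are copied from the table as printed; the variable names are illustrative, and ties between rank sums are not handled in this minimal version.

```python
# One row per examiner, one column per sample (values taken from the table above)
samples = ["830", "713", "924", "412", "335"]
ranks = [
    [1, 2, 3, 5, 4],  # examiner 1
    [2, 1, 4, 3, 5],  # examiner 2
    [2, 3, 1, 4, 5],  # examiner 3
    [1, 2, 3, 5, 4],  # examiner 4
    [1, 2, 4, 5, 4],  # examiner 5
    [2, 1, 5, 4, 3],  # examiner 6
]

# Sum the ranks column by column
rank_sums = {s: sum(row[i] for row in ranks) for i, s in enumerate(samples)}

# Order the samples from the smallest to the largest rank sum
ordered = sorted(samples, key=lambda s: rank_sums[s])
overall_rank = {s: position for position, s in enumerate(ordered, start=1)}

for s in samples:
    print(f"Sample {s}: rank sum {rank_sums[s]}, overall rank {overall_rank[s]}")
```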

7.6. Example: Popularity Test

A popularity test is used to test whether a change has improved or worsened a product.

The test can be done for individual samples or for a range of samples, but they are all judged individually.

The judging can cover the overall liking of a product and/or certain attributes of the product, e.g. its fragrance.

Like or dislike is expressed on a scale of 1 - 9, whereby:

1 = dislike extremely

2 = dislike very much

3 = dislike moderately

4 = dislike slightly

5 = neither like nor dislike

6 = like slightly

7 = like moderately

8 = like very much

9 = like extremely
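
As an illustration of how answers on this scale might be summarized, the following Python sketch computes the mean score and the share of "like" answers (6 to 9) for one sample. The choice of these two summary figures and the example scores are assumptions for illustration, not part of the course material.

```python
from collections import Counter
from statistics import mean

HEDONIC_SCALE = {
    1: "dislike extremely", 2: "dislike very much", 3: "dislike moderately",
    4: "dislike slightly", 5: "neither like nor dislike", 6: "like slightly",
    7: "like moderately", 8: "like very much", 9: "like extremely",
}

def summarize_scores(scores):
    """Mean score, share of 'like' answers (6-9) and counts per scale label."""
    if not scores or any(s not in HEDONIC_SCALE for s in scores):
        raise ValueError("scores must be whole numbers from 1 to 9")
    counts = Counter(scores)
    return {
        "mean": mean(scores),
        "share_like": sum(1 for s in scores if s >= 6) / len(scores),
        "counts": {HEDONIC_SCALE[s]: counts[s] for s in sorted(counts)},
    }

# Hypothetical scores from eight examiners for one sample (not real data)
example_scores = [7, 8, 6, 5, 9, 7, 4, 8]
summary = summarize_scores(example_scores)
print(summary["mean"], summary["share_like"])
print(summary["counts"])
```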

7.7. JAR (Just About Right)

The JAR method is used to test the intensity or expression of a certain attribute of a product, e.g. saltiness or sweetness. If several attributes are classified through the JAR method, they can be linked to the overall popularity of a product.

An attribute can thereby be rated as too strong, strong, just about right, weak or too weak. The rating is given accordingly on a 1 - 5 scale.
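
A simple way to read JAR results is to look at the share of answers in each category. The Python sketch below tallies the ratings for one attribute. The direction of the scale (1 = too weak, 5 = too strong) is an assumption, since the text does not fix it, and the ratings are invented for illustration.

```python
from collections import Counter

# Assumed direction of the 1-5 scale; the text does not specify it
JAR_SCALE = {
    1: "too weak",
    2: "weak",
    3: "just about right",
    4: "strong",
    5: "too strong",
}

def jar_shares(ratings):
    """Share of answers per JAR category for one attribute."""
    if not ratings or any(r not in JAR_SCALE for r in ratings):
        raise ValueError("ratings must be whole numbers from 1 to 5")
    counts = Counter(ratings)
    return {label: counts[score] / len(ratings) for score, label in JAR_SCALE.items()}

# Hypothetical saltiness ratings from ten examiners (not real data)
saltiness = [3, 3, 4, 2, 3, 5, 3, 4, 3, 1]
for label, share in jar_shares(saltiness).items():
    print(f"{label}: {share:.0%}")
```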