
Consumer Reports’ food and sensory testing

Many of Consumer Reports' tests involve the use of sensitive instruments. A liquid chromatograph determines how much caffeine is in coffee, and an atomic absorption spectrophotometer determines the amount of heavy metals in plastics and toys. A digital photometer measures light and color of TV displays. To evaluate a food's nutrition, we use sophisticated laboratory instruments. But how do we evaluate its sensory quality—the characteristics of its ingredients, the balance of its flavors? To do this, we use a very sensitive instrument called the human palate.

Consumer Reports has a panel of people who have been carefully screened. During their initial interview, they let us know, among other things, whether they're willing to eat foods they dislike. (Visitors to our headquarters invariably volunteer to taste ice cream, but our testers are also required to taste buttery spreads straight up.) The people we hire have normal taste and odor acuity, but they've also shown the ability to recall and identify various flavors and textures, and to communicate, in precise terms, what their taste buds are telling them.

Our in-house sensory experts, who have studied food science, nutrition, statistics, and psychology, give the panelists basic training in evaluating foods. The tasters learn that personal preference must play no part in taste testing and that they should ignore irrelevant cues like color, which can make a bright red sauce seem more tomatoey than a dull orange sauce, even when it's not. Some food categories, such as wine, require specific expertise or knowledge, so we use experts in that category, who adhere to the same testing principles as our in-house panel.

Here's what happens when we taste anything from applesauce to ziti. The example we've used here is chicken noodle soup.

Preparing for the test

Taste-testing takes multidimensional concentration. In a few bites or sips, panelists have to identify flavor (Does the peanut butter taste roasted or burnt?) and texture (Is the cookie crisp or soggy?). Moreover, they have to gauge the intensity of flavor and texture (How meaty-tasting is the hot dog? Is it very juicy, like an orange, or merely moist, like a raisin?).

At the start of each project, our in-house experts spend several days preparing panelists to discern subtle and not-so-subtle differences that they're likely to encounter in the food to be tasted. First, panelists look at representative samples of the food and list important attributes. Next, panelists synchronize taste buds, sniffing and sampling ingredients they may find later. During training for soups, the smorgasbord included chicken broth made various ways: with roasted chicken, boiled chicken, and bouillon cubes. Tasters tried monosodium glutamate dissolved in water so that they could identify it in chicken noodle soup. And to appreciate the range of flavors in vegetable soup, they tasted canned vegetables next to their fresh, boiled counterparts. Finally, tasters learn to use standard point systems and descriptive terms to compare such attributes as a food's hardness or crispness. Soup tasters used the texture of pastas cooked to varying degrees of firmness as reference points when gauging the firmness of the soup noodles—from mushy to al dente. References such as water and milk were used to compare the viscosity or thickness of the soup.

In the kitchen

We bought each food product in several locations (sometimes we buy food samples from all across the country) to ensure a representative sampling. We prepared each soup according to the manufacturer's instructions and stirred it just before ladling so that each panelist received a typical amount of vegetables and noodles. We served the soup very hot; when it had cooled to 160° F, we told panelists to begin tasting. That way, all soups were tested at the same temperature.

With the same attention to detail, we test ice cream by removing it from the container and then shaving off all of the edges of the ice cream block so that no one gets a piece with "freezer burn." And we serve it in odor-free cups (we've sniffed them to make sure). To test bread, we discard the ends. For cereal, we pour the whole box into a bowl, mix lightly, and then serve individual portions.

We control the lighting, sound, and ventilation of the testing room to allow tasters to focus solely on the food in front of them. During soup tests, that meant keeping the area free of cooking smells. Each testing booth has a breadbox-like compartment that can be opened, via wooden shutters, from the booth and the kitchen. We distribute samples from the kitchen side, and then close the shutters. The testers then open their shutters and reach in. We further minimize the possibility that kitchen smells will escape to the testing area by pressurizing the air in the booths so that odors that waft in make a U-turn back into the kitchen.

In the tasting booth

Before every taste test, we develop a questionnaire about the appearance, texture, and flavor of the food being tested. When the soup panelists sat at a booth, for example, they found soup questionnaires, a computer for logging in answers, a cup of water for rinsing between samples, and a "spit" cup for expectorating each soup once it had been tasted and evaluated. When panelists heard the wooden shutter close on the kitchen side, they opened the compartment and took their first cup of soup. While awaiting word that the soup had cooled to the proper temperature, tasters answered questions about its appearance. Then they sipped, answering questions about flavor and texture. In a morning, each panelist evaluated 12 soups, with breaks after every four to refresh their palates.

Panelists tried every soup at least three times. According to a plan developed by our statisticians, we switched the order in which soups were tasted. That helped us to avoid the "context effect," the tendency to compare a food to one tasted just before. If, say, you drink fresh-squeezed orange juice, then juice from a carton, the carton juice won't taste very fresh. If the carton juice follows canned juice, it can taste much fresher. Rotating the order serves another purpose: people tend to pay more attention to the first product than to the ones that follow.
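The article does not say which ordering plan the statisticians use. As a minimal sketch, assuming a simple cyclic rotation (one common way to balance serving position), the idea can be illustrated as follows. The function name `rotated_orders` and the soup labels are hypothetical.

```python
def rotated_orders(samples, n_panelists):
    """Return one serving order per panelist by cyclically rotating the sample list.

    With n_panelists equal to len(samples), every sample appears exactly once
    in every serving position (a cyclic Latin square).
    """
    k = len(samples)
    orders = []
    for i in range(n_panelists):
        shift = i % k
        orders.append(samples[shift:] + samples[:shift])
    return orders


# Hypothetical blinded soup labels, not real products.
soups = ["A", "B", "C", "D"]
orders = rotated_orders(soups, 4)
for panelist, order in enumerate(orders, start=1):
    print(panelist, order)
# Panelist 2 is served B, C, D, A.
```

A plain rotation balances only the position of each sample. Soup A is still always tasted just before soup B, so a design that also varies which sample comes immediately before which (for example, a Williams design) would address the context effect more directly.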

We also kept the samples unidentifiable (except by taste) from test to test by serving them in uniform containers identified only by random code numbers. And in case someone had a favorite two-digit number—a birthday, a child's age—the codes were always three digits.
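The coding step can be sketched in a few lines. This is an illustration only, assuming codes are drawn without repeats from 100 to 999; the helper `assign_blind_codes` and the product names are hypothetical, and the article does not describe how the codes are actually generated.

```python
import random


def assign_blind_codes(products, rng=None):
    """Map each product to a distinct random three-digit code (100 to 999)."""
    rng = rng or random.Random()
    # sample() draws without replacement, so no two products share a code.
    codes = rng.sample(range(100, 1000), k=len(products))
    return dict(zip(products, codes))


# Hypothetical product names for illustration.
blind = assign_blind_codes(["Brand X soup", "Brand Y soup", "Brand Z soup"])
for product, code in blind.items():
    print(code, "->", product)
```

Drawing a fresh set of codes for each session would keep a sample from being recognized by its number from one test to the next.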

Depending on the product category, it is sometimes more effective to use a roundtable test format instead of the booths. In that format, the panelists are served the samples while seated around a large conference table. They taste each sample and complete their individual ballots, and then the Sensory Project Leader leads a discussion about the sample. The panel reaches a consensus, which is recorded for each sample.

Turning data into Ratings

Our ultimate goal is to answer two questions for each product: How does it differ from other brands? And how high is its quality?

Answering the first question is a matter of having a statistician take data from the panelists' questionnaires and rank each soup by the intensity of each attribute, or by how often an attribute description was selected. Then we can talk about differences by highlighting products at the extremes. We may call a vegetable soup at the high end of the saltiness scale "very salty" or say a tomato soup that ranks low on the viscosity scale has a "thinner broth than most."
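A rough sketch of that ranking step follows, assuming each panelist scores an attribute on a numeric intensity scale and that products are compared by their average score. The scores, the soup codes, and the function `rank_by_attribute` are made up for illustration; the article does not specify the scale or the statistic used.

```python
from statistics import mean


def rank_by_attribute(scores, attribute):
    """Rank products from highest to lowest mean intensity for one attribute.

    scores maps product -> {attribute: [one rating per panelist]}.
    """
    averages = {product: mean(attrs[attribute]) for product, attrs in scores.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


# Invented ratings on an assumed 0-to-10 intensity scale.
scores = {
    "Soup 417": {"saltiness": [7, 8, 8]},
    "Soup 902": {"saltiness": [4, 5, 3]},
    "Soup 135": {"saltiness": [5, 6, 5]},
}
ranking = rank_by_attribute(scores, "saltiness")
saltiest = ranking[0][0]      # candidate for a "very salty" comment
least_salty = ranking[-1][0]  # candidate for a "less salty than most" comment
print(saltiest, least_salty)
```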

To answer the second question, we develop standards for how an excellent product should taste. Our food experts develop these "criteria for high quality" based on how high-quality ingredients subjected to careful processing and handling would—and wouldn't—taste.

The criteria define a range of attributes acceptable for an excellent product. For example, an excellent chicken noodle soup may have long or short noodles, as long as they aren't mushy. An excellent chocolate chip cookie may taste buttery or not. A garlicky beef hot dog may be excellent, but so may a smoky pork or poultry one. We don't pretend to know our readers' particular likes and dislikes. Rather, we make clear the standard by which we're judging a food and provide, in the Ratings, comments describing each product or groups of products. That way, you can choose a highly rated product that suits your preferences.

Ratings result when our statisticians rank the products from those closest to the criteria for excellence to those farthest away. At that point, Consumer Reports' sensory experts step in again to decide where the best and worst foods fit on a 0-to-100, Poor-to-Excellent scale. The products are presented in rank order within quality groups—Excellent to Poor—and there are descriptions to help you make choices to meet your specific needs.
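The article does not define how closeness to the criteria is measured. One way to picture it, purely as an assumption, is to treat each criterion as an acceptable range and add up how far a product's scores fall outside those ranges. The attribute names, ranges, scores, and the function `distance_from_criteria` below are all hypothetical.

```python
def distance_from_criteria(profile, criteria):
    """Sum how far each attribute score falls outside its acceptable range.

    A score inside the range contributes 0, so a product that meets every
    criterion has a distance of 0.
    """
    total = 0.0
    for attribute, (low, high) in criteria.items():
        value = profile[attribute]
        if value < low:
            total += low - value
        elif value > high:
            total += value - high
    return total


# Assumed acceptable ranges on a 0-to-10 scale (illustrative only).
criteria = {"noodle_firmness": (5, 8), "saltiness": (3, 6)}
profiles = {
    "Soup 417": {"noodle_firmness": 6, "saltiness": 8},  # too salty
    "Soup 902": {"noodle_firmness": 2, "saltiness": 4},  # mushy noodles
    "Soup 135": {"noodle_firmness": 7, "saltiness": 5},  # within both ranges
}
ranked = sorted(profiles, key=lambda p: distance_from_criteria(profiles[p], criteria))
print(ranked)  # closest to the criteria first
```

Using ranges rather than single target values mirrors the point that an excellent soup may have long or short noodles as long as they aren't mushy. Placing the ranked products on the 0-to-100 scale remains an expert judgment and is not modeled here.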

Sensory testing for non-food products

Our in-house sensory panel also evaluates other products for sensory qualities. For example, our panel has been asked to evaluate the softness of sheets, and the color and clarity of printed photos, graphics, and text from printers.

For a textile project such as sheets, the sensory expert brought in a variety of fabrics. The panelists agreed on attributes such as pliability, fuzziness, and smoothness, and used those characteristics to determine softness scores. The panelists synchronized their scoring, and the fabrics then served as references for all of the test products that were evaluated.

Likewise, for printers, examples of photos, graphics, and text illustrating a range of quality in color and clarity are used as references. The panelists synchronize their evaluations for all of the printer products using the references.

Depending on what products the sensory panelists are asked to evaluate, any combination of their five senses (sight, touch, smell, hearing, and taste) may be used as sensitive instruments.