Guttman Scale: Definition, Example

In statistical surveys conducted using structured interviews or questionnaires, a subset of survey items with binary (e.g., YES or NO) answers forms a Guttman scale (named after Louis Guttman) if the items can be ranked in some order so that, for a rational respondent, the response pattern can be captured by a single index on that ordered scale.

In other words, on a Guttman scale, items are arranged in an order so that an individual who agrees with a particular item also agrees with items of lower rank-order.

For example, a series of items could be

  1. I am willing to be near ice cream;
  2. I am willing to smell ice cream;
  3. I am willing to eat ice cream; and
  4. I love to eat ice cream.

Agreement with any one item implies agreement with the lower-order items.
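The implication rule can be checked mechanically. The following is a minimal sketch, assuming responses are coded 1 for agree and 0 for disagree and listed from the lowest-order item to the highest; the function name `is_scale_pattern` is illustrative and does not come from any library.

```python
ITEMS = [
    "I am willing to be near ice cream",
    "I am willing to smell ice cream",
    "I am willing to eat ice cream",
    "I love to eat ice cream",
]

def is_scale_pattern(responses):
    """Return True if no agreement (1) follows a disagreement (0).

    `responses` is ordered from the lowest-order item to the highest.
    """
    seen_disagree = False
    for answer in responses:
        if answer == 0:
            seen_disagree = True
        elif seen_disagree:
            # Agreeing with a higher item after rejecting a lower one
            # breaks the cumulative ordering.
            return False
    return True

# Agrees with the first three items only: consistent with the scale.
print(is_scale_pattern([1, 1, 1, 0]))  # True
# Loves to eat ice cream but is unwilling to smell it: inconsistent.
print(is_scale_pattern([1, 0, 1, 1]))  # False
```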

This contrasts with topics studied using a Likert scale or a Thurstone scale. The method of scaling devised by Guttman is also called scalogram analysis.

A well-known example of a Guttman scale is the Bogardus Social Distance Scale, which is as follows:

  1. Are you willing to permit immigrants to live in your country?
  2. Are you willing to permit immigrants to live in your community?
  3. Are you willing to permit immigrants to live in your neighborhood?
  4. Are you willing to permit immigrants to live next door to you?
  5. Would you permit your child to marry an immigrant?

The concept of the Guttman scale likewise applies to a series of items in other kinds of tests, such as achievement tests that have binary outcomes.

For example, a test of math achievement might order questions based on their difficulty and instruct the examinee to begin in the middle.

The assumption is if the examinee can successfully answer items of that difficulty (e.g., summing two 3-digit numbers), s/he would be able to answer the earlier questions (e.g., summing two 2-digit numbers).

Some achievement tests are organized in a Guttman scale to reduce the duration of the test.

By designing surveys and tests so that they contain Guttman scales, researchers can simplify the analysis of survey outcomes and make the results more robust.

Guttman scales also make it possible to detect and discard randomized answer patterns, as may be given by uncooperative respondents.

A hypothetical, perfect Guttman scale consists of a unidimensional set of items that are ranked in order of difficulty from least extreme to most extreme positions.

For example, a person scoring a “7” on a ten-item Guttman scale will agree with items 1-7 and disagree with items 8, 9, and 10.

An important property of Guttman’s model is that a person’s entire set of responses to all items can be predicted from their cumulative score because the model is deterministic.
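Because the model is deterministic, the predicted response pattern is a simple function of the score. A minimal sketch, assuming items are ordered from least to most extreme and coded 1 for agree and 0 for disagree (the helper name `pattern_from_score` is hypothetical):

```python
def pattern_from_score(score, n_items):
    """Predict the full response pattern implied by a cumulative score."""
    if not 0 <= score <= n_items:
        raise ValueError("score must lie between 0 and n_items")
    # Agree with the first `score` items, disagree with the rest.
    return [1] * score + [0] * (n_items - score)

# A score of 7 on a ten-item scale: agree with items 1-7, disagree with 8-10.
print(pattern_from_score(7, 10))  # [1, 1, 1, 1, 1, 1, 1, 0, 0, 0]
```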

An important objective in Guttman scaling is to maximize the reproducibility of response patterns from a single score.

A good Guttman scale should have a coefficient of reproducibility (the percentage of original responses that could be reproduced by knowing the scale scores used to summarize them) above .85.

Other commonly used metrics for assessing the quality of a Guttman scale are Menzel’s coefficient of scalability and the coefficient of homogeneity (Loevinger, 1948; Cliff, 1977; Krus and Blackman, 1988).

To maximize unidimensionality, misfitting items are re-written or discarded.

A scale is said to be unidimensional if the responses fall into a perfect pattern in which endorsement of the item reflecting the extreme position also results in endorsing all items that are less extreme.

With the Guttman technique, a ‘perfect’ scale implies that a person who answers a given question favorably will have a higher score than a person who answers it unfavorably.

In this situation, the number of items endorsed by a respondent serves as his score and gives a complete picture of which items he agreed and disagreed with.

If some combination of responses other than the desired combination produces a particular scale score, it is considered to be in error. If there are many scaling errors or exceptions to the desired pattern, the scale is inadequate.

The attainment of a higher degree of unidimensionality is the major concern of the Guttman scale.

The Guttman scale belongs to the broad category of cumulative scaling.

It is cumulative in the sense that the combination of responses required to make a particular score includes the responses to all questions required to make the next lower score, plus the response to one additional question, in a stepwise fashion.

According to Guttman, a ‘universe of content’ can be considered to be unidimensional only if it yields a perfect or nearly perfect cumulative scale.

Suppose a survey is conducted among 100 respondents about their opinion of a new brand of shirt labeled ‘union.’ We may have developed a preference scale of 4 items as follows:

[Table: Guttman scale]

A respondent's score makes it possible to determine which items the respondent endorsed.

A person with a score of 3, for example, should disagree with item 4 but agree with all others. A score of 4 indicates all statements are agreed to and represent the most favorable attitude.

According to the scalogram theory, this pattern confirms that the universe of content (attitude toward an issue under study) is scalable.

How is a score assigned to an individual?

A commonly used general solution is to assign to each individual a score equal to the number of items he endorses.

The respondents who participated in the survey indicate whether they agree or disagree with each item.

If these items form a unidimensional scale, the response patterns will form a perfect cumulative scale of the following type:

[Table: scalogram response pattern, ideal scale structure]
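For a four-item scale, the perfect cumulative patterns can be enumerated directly. The sketch below is illustrative only: it assumes items are ordered from least to most extreme and coded 1 for agree and 0 for disagree, and it separates the scale patterns from the non-scale patterns among all possible answer combinations. It says nothing about how many respondents gave each pattern in the survey.

```python
from itertools import product

def is_cumulative(pattern):
    """A pattern is cumulative if all agreements precede all disagreements."""
    return list(pattern) == sorted(pattern, reverse=True)

n_items = 4
all_patterns = list(product([1, 0], repeat=n_items))  # 2**4 = 16 combinations
scale_patterns = [p for p in all_patterns if is_cumulative(p)]
non_scale_patterns = [p for p in all_patterns if not is_cumulative(p)]

for pattern in scale_patterns:
    print("score", sum(pattern), "pattern", pattern)

print(len(scale_patterns), "scale patterns,",
      len(non_scale_patterns), "non-scale patterns")
```

With n items there are n + 1 scale patterns (one per possible score) out of 2^n possible answer combinations, which is why a respondent's score alone identifies the whole pattern when the scale is perfect.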

In the case of a non-perfect pattern, some respondents will answer differently.

Such a pattern is designated a non-scale pattern. The number of respondents giving non-scale patterns is the total number of respondents minus those who followed the scale patterns in the table above.

If 100 persons participate in the study and 65 respond according to the scale pattern, then the remaining 35 respond differently, producing non-scale patterns, which we categorize as errors.

The responses of this pattern may be as follows:

[Table: scalogram response pattern, non-scale structure]
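How many errors a non-scale pattern contains depends on the counting convention. The sketch below assumes one common convention: the number of errors is the smallest number of answers that would have to change to turn the pattern into some perfect scale pattern. Other conventions, such as comparing each answer with the pattern predicted from the respondent's total score, can give higher counts. The function name `min_errors` is hypothetical, and the patterns shown are illustrative rather than taken from the survey.

```python
def min_errors(responses):
    """Smallest number of answers to flip to reach a perfect scale pattern.

    `responses` is ordered from least to most extreme item,
    coded 1 for agree and 0 for disagree.
    """
    n = len(responses)
    ideal_patterns = ([1] * k + [0] * (n - k) for k in range(n + 1))
    return min(
        sum(obs != exp for obs, exp in zip(responses, ideal))
        for ideal in ideal_patterns
    )

print(min_errors([1, 1, 1, 0]))  # 0: already a scale pattern
print(min_errors([1, 0, 1, 1]))  # 1: changing the second answer gives 1111
print(min_errors([0, 1, 0, 1]))  # 2
```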

In the Guttman technique, a perfect scale implies that a person who answers a given question favorably will have a higher total score than a person who answers it unfavorably. In the first table, the response patterns all follow the scale pattern.

If all the responses provided by the respondents form a scalar pattern (in which case there will be none to form the second table), the items are said to form a perfect scale, and the resulting scale has been called by Guttman a “perfectly reproducible” one.

In the second table, the four rows constitute the non-scale patterns, and each of these patterns is associated with error.

The Guttman scale is analytically complex, apart from the fact that there is no guarantee that the various items will scale, and even if they do so, the universe of the content may be narrow in coverage.

This method is more appropriate for scaling ordered behavior than less structured and broad-based attitudes (Ghosh, 2003:173).

The reproducibility of the Guttman score can be examined by what is known as the coefficient of reproducibility (CR). The higher the value of the coefficient, the higher the proportion of scores we can reproduce accurately.

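The coefficient is computed by subtracting the proportion of responses that are errors from 100 percent (all responses). The formula to be used is:

$$CR = 1 - \frac{E}{N \times n}$$

where E is the total number of errors, N is the number of respondents, and n is the number of items.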

Another way of expressing CR is to say that it is the proportion of responses that can be correctly predicted from the individuals’ total score.

A coefficient of reproducibility can also be calculated for each item individually as the proportion of responses on that item that are correctly predicted; CR is the simple average of these item coefficients.
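The relationship between the item coefficients and the overall CR can be verified on a small example. The sketch below uses a made-up response matrix of five respondents and four items (illustrative data only, not the survey data) and assumes each response is predicted from the respondent's total score. The function names are hypothetical.

```python
def predicted_pattern(score, n_items):
    """Pattern a perfect scale predicts for a given total score."""
    return [1] * score + [0] * (n_items - score)

def item_reproducibility(matrix):
    """Proportion of correctly predicted responses for each item (column)."""
    n_respondents = len(matrix)
    n_items = len(matrix[0])
    correct = [0] * n_items
    for row in matrix:
        expected = predicted_pattern(sum(row), n_items)
        for j, (obs, exp) in enumerate(zip(row, expected)):
            if obs == exp:
                correct[j] += 1
    return [c / n_respondents for c in correct]

# Toy data: rows are respondents, columns are items (least to most extreme).
toy_responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],  # non-scale: a total score of 2 predicts [1, 1, 0, 0]
    [0, 0, 0, 0],
]

item_cr = item_reproducibility(toy_responses)
overall_cr = sum(item_cr) / len(item_cr)
print(item_cr)
print(round(overall_cr, 4))
```

Here two of the twenty responses are mispredicted, so the overall value agrees with 1 - 2/(5 × 4).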

In the foregoing example, E=35, N=100, n=4, so that:

$$CR = 1 - \frac{35}{100 \times 4} = 1 - 0.0875 = 0.9125$$
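The same arithmetic can be written as a small helper. This is a minimal sketch, assuming CR is one minus the number of errors divided by the total number of responses (respondents times items); the function name is illustrative.

```python
def coefficient_of_reproducibility(errors, n_respondents, n_items):
    """CR = 1 - E / (N * n), the share of responses that are not errors."""
    total_responses = n_respondents * n_items
    if total_responses <= 0:
        raise ValueError("need at least one respondent and one item")
    return 1 - errors / total_responses

# Values from the example: E = 35 errors, N = 100 respondents, n = 4 items.
cr = coefficient_of_reproducibility(35, 100, 4)
print(round(cr, 4))  # 0.9125
```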

Any coefficient of reproducibility over 0.90 is supposedly adequate to indicate scalability and the ability to reproduce responses to the various items from the knowledge of the total score.

However, as Edwards (1957: 191) notes, a CR of 0.90 is not a sufficient condition for the scalability of a set of statements.
