Comparing Reliability and Validity
We have seen that reliability and construct validity are similar in that they are both assessed through examination of the correlations among measured variables. However, they are different in the sense that reliability refers to correlations among different variables that the researcher is planning to combine into the same measure of a single conceptual variable, whereas construct validity refers to correlations of a measure with different measures of other conceptual variables. In this sense, it is appropriate to say that reliability comes before validity because reliability is concerned with creating a measure that is then tested in relationship to other measures. If a measure is not reliable, then its construct validity cannot be determined. Tables 5.1 and 5.2 summarize the various types of reliability and validity that researchers must consider.
One important question that we have not yet considered is “How reliable and valid must a scale be in order to be useful?” Researchers do not always agree about the answer, except for the obvious fact that the higher the reliability and the construct validity, the better. One criterion that seems reasonable is that the reliability of a commonly used scale should be at least α = .70. However, many tests have reliabilities well above α = .80.
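To make the α = .70 criterion concrete, the following is a minimal sketch of how coefficient alpha can be computed from raw item scores, using the standard formula α = (k / (k − 1)) × (1 − Σ item variances / variance of total scores). The function name `cronbach_alpha` and the small data set are hypothetical and chosen only for illustration; they do not come from any study discussed in this chapter.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Coefficient alpha for a list of respondents, each a list of k item scores."""
    k = len(scores[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    # Transpose so that each tuple holds every respondent's score on one item.
    items = list(zip(*scores))
    sum_item_variances = sum(variance(item) for item in items)
    # Variance of each respondent's total (summed) scale score.
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Hypothetical data: five respondents answering a three-item scale.
responses = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
    [3, 3, 3],
]

alpha = cronbach_alpha(responses)
print(f"alpha = {alpha:.2f}")
print("meets the .70 criterion" if alpha >= 0.70 else "below the .70 criterion")
```

With real data the sample would of course be much larger; five respondents are used here only to keep the arithmetic easy to check by hand.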
In general, it is easier to demonstrate the reliability of a measured variable than it is to demonstrate a variable’s construct validity. This is so in part because demonstrating reliability involves only showing that the measured variables correlate with each other, whereas validity involves showing both convergent and discriminant validity. Also, because the items on a scale are all answered using the same response format and are presented sequentially, and because items that do not correlate highly with the total scale score can be deleted, high reliabilities are usually not difficult to achieve.
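The practice of deleting items that do not correlate highly with the total scale score can be sketched as follows. This example uses one common variant, the corrected item-total correlation, in which each item is correlated with the sum of the remaining items so that the item does not inflate its own correlation. The data, the helper names, and the cutoff of .30 are assumptions made for illustration, not values taken from the text.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def corrected_item_total(scores):
    """Correlate each item with the sum of all the other items."""
    items = list(zip(*scores))
    correlations = []
    for i, item in enumerate(items):
        # Total score with the current item removed.
        rest = [sum(row) - row[i] for row in scores]
        correlations.append(pearson(item, rest))
    return correlations


# Hypothetical data: five respondents, four items; the last item behaves oddly.
responses = [
    [4, 5, 4, 2],
    [2, 3, 2, 5],
    [5, 5, 4, 3],
    [1, 2, 1, 3],
    [3, 3, 3, 1],
]

CUTOFF = 0.30  # assumed threshold for illustration only
r_values = corrected_item_total(responses)
weak_items = [i for i, r in enumerate(r_values) if r < CUTOFF]

for i, r in enumerate(r_values):
    print(f"item {i}: r = {r:.2f}")
print("candidates for deletion (0-based index):", weak_items)
```

Items flagged this way are only candidates for deletion; whether dropping them is appropriate also depends on whether the remaining items still cover the conceptual variable.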
However, the relationships among different measures of the same conceptual variable that serve as the basis for demonstrating convergent validity are generally very low. For instance, the correlations observed by Snyder were only in the range of .40, and such correlations are not unusual. Although correlations of such size may seem low, they are still taken as evidence for convergent validity.
One of the greatest difficulties in developing a new scale is to demonstrate its discriminant validity. Although almost any new scale that you can imagine will be at least moderately correlated with at least some other existing scales, to be useful, the new scale must be demonstrably different from existing scales in at least some critical respects. Demonstrating this uniqueness is difficult and will generally require that a number of different studies be conducted.
Because there are many existing scales in common use within the behavioral sciences, carefully consider whether you really need to develop a new scale for your research project. Before you begin scale development, be sure to determine if a scale assessing the conceptual variable you are interested in, or at least a similar conceptual variable, might already exist. A good source for information about existing scales, in addition to PsycINFO®, is Robinson, Shaver, and Wrightsman (1991). Remember that it is always advantageous to use an existing measure rather than to develop your own: the reliability and validity of such measures are already established, saving you a lot of work.