You have probably noticed by now that although observational research and case studies can provide a detailed look at ongoing behavior, because they represent qualitative data, they may often not be as objective as one might like, especially when they are based on recordings by a single scientist. Because the observer has chosen which people to study, which behaviors to record or ignore, and how to interpret those behaviors, she or he may be more likely to see (or at least to report) those observations that confirm, rather than disconfirm, her or his expectations. Furthermore, the collected data may be relatively sketchy, in the form of “field notes” or brief reports, and thus not amenable to assessment of their reliability or validity. However, in many cases these problems can be overcome by using systematic observation to create quantitative measured variables (Bakeman & Gottman, 1986; Weick, 1985).
Deciding What to Observe
Systematic observation involves specifying ahead of time exactly which observations are to be made on which people and in which times and places. These decisions are made on the basis of theoretical expectations about the types of events that are going to be of interest. Specificity about the behaviors of interest has the advantage of both focusing the observers’ attention on these specific behaviors and reducing the masses of data that might be collected if the observers attempted to record everything they saw. Furthermore, in many cases more than one observer can make the observations, and, as we have discussed in Chapter 5, this will increase the reliability of the measures.
Consider, for instance, a research team interested in assessing how and when young children compare their own performance with that of their classmates (Pomerantz et al., 1995). In this study, one or two adult observers sat in chairs adjacent to work areas in the classrooms of elementary school children and recorded the children’s behaviors on laptop computers. Before beginning the project, the researchers had defined a specific set of behavioral categories for use by the observers. These categories were based on theoretical predictions of what would occur for these children and defined exactly what behaviors were to be coded, how to determine when those behaviors were occurring, and how to code them into the computer.
Deciding How to Record Observations
Before beginning to code the behaviors, the observers spent three or four days in the classroom learning, practicing, and revising the coding methods and letting the children get used to their presence. Because the coding categories were so well defined, there was good interrater reliability. And to be certain that the judges remained reliable, the experimenters frequently computed a reliability analysis on the codings over the time that the observations were being made. This is particularly important because there are some behaviors that occur infrequently, and it is important to be sure that they are being coded reliably.
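To make the idea of a reliability analysis concrete, the following is a minimal sketch of one common agreement statistic, Cohen's kappa, which corrects the raw agreement between two observers for the agreement expected by chance. The text does not say which statistic Pomerantz et al. used, so the choice of kappa, the function name `cohens_kappa`, the category labels, and the example codings are all assumptions made for illustration.

```python
from collections import Counter


def cohens_kappa(codes_a, codes_b):
    """Cohen's kappa for two observers who coded the same intervals.

    codes_a and codes_b are equal-length sequences of category labels,
    one label per observed interval.
    """
    if len(codes_a) != len(codes_b) or len(codes_a) == 0:
        raise ValueError("both observers must code the same non-empty set of intervals")

    n = len(codes_a)
    # Proportion of intervals on which the two observers agree.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n

    # Agreement expected by chance, from each observer's category frequencies.
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)

    if expected == 1.0:
        # Both observers used a single identical category throughout;
        # kappa is undefined here, and returning 1.0 is only a convention.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical codings of eight intervals by two observers.
observer_1 = ["compare", "none", "none", "compare", "none", "none", "none", "compare"]
observer_2 = ["compare", "none", "none", "none", "none", "none", "none", "compare"]

kappa = cohens_kappa(observer_1, observer_2)
print(round(kappa, 3))
```

In this made-up example the observers agree on seven of eight intervals, but because "none" is so common, a good deal of that agreement would be expected by chance. This is one reason infrequent behaviors deserve a separate reliability check.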
Over the course of each observation period, several types of data were collected. For one, the observers coded event frequencies (for instance, the number of verbal statements that indicated social comparison). These included both statements about one’s own performance (“My picture is the best.”) and questions about the performance of others (“How many did you get wrong?”). In addition, the observers coded event duration (for instance, the amount of time that the child was attending to the work of others). Finally, all the children were interviewed after the observation had ended.
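The difference between event frequency and event duration can be sketched with a small coded log. The record layout (child, category, start and end time in seconds), the category names, and the values below are hypothetical and are not taken from the original study; the point is only that frequency counts occurrences while duration sums elapsed time.

```python
from collections import defaultdict

# Hypothetical log: (child_id, category, start_seconds, end_seconds).
log = [
    ("child_01", "social_comparison_statement", 12.0, 14.5),
    ("child_01", "attending_to_others_work", 20.0, 47.0),
    ("child_01", "social_comparison_statement", 90.0, 92.0),
    ("child_02", "attending_to_others_work", 5.0, 11.0),
]


def summarize(records):
    """Return (event counts, total seconds) keyed by (child, category)."""
    counts = defaultdict(int)
    durations = defaultdict(float)
    for child, category, start, end in records:
        if end < start:
            raise ValueError("an event cannot end before it starts")
        counts[(child, category)] += 1              # event frequency
        durations[(child, category)] += end - start  # event duration
    return dict(counts), dict(durations)


frequencies, total_seconds = summarize(log)
for key in sorted(frequencies):
    print(key, frequencies[key], total_seconds[key])
```

The same log yields both measures, so an observer who records start and end times for each coded event does not need a separate pass to obtain frequencies.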
Choosing Sampling Strategies
One of the difficulties in coding ongoing behavior is that there is so much of it. Pomerantz et al. (1995) used three basic sampling strategies to reduce the amount of data they needed to record. First, as we have already seen, they used event sampling, focusing on specific behaviors that were theoretically related to social comparison. Second, they employed individual sampling. Rather than trying to record the behaviors of all of the children at the same time, the observers randomly selected one child to be the focus child for an observational period. The observers zeroed in on this child, while ignoring the behavior of others during the time period. Over the entire period of the study, however, each child was observed. Finally, Pomerantz and colleagues employed time sampling. Each observer focused on a single child for only four minutes before moving on to another child. In this case, the data were coded as they were observed, but in some cases the observer might use the time periods between observations to record the responses. Although sampling only some of the events of interest may lose some information, the events that are attended to can be more precisely recorded.
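Individual sampling and time sampling can be combined into a simple observation schedule. The sketch below is one possible way to do this and is not the procedure reported by Pomerantz et al.: it assumes that the focus child is chosen by shuffling the class list within each round (which guarantees that every child is observed), and the function name `build_schedule`, its parameters, and the children's names are hypothetical. Only the four-minute window comes from the text.

```python
import random


def build_schedule(children, window_minutes=4, rounds=1, seed=None):
    """Return a list of (start_minute, end_minute, focus_child) windows.

    Each round observes every child once, in random order, so that
    random selection never leaves a child unobserved.
    """
    rng = random.Random(seed)
    schedule = []
    start = 0
    for _ in range(rounds):
        order = list(children)
        rng.shuffle(order)  # individual sampling: random focus-child order
        for child in order:
            # Time sampling: a fixed-length window per focus child.
            schedule.append((start, start + window_minutes, child))
            start += window_minutes
    return schedule


schedule = build_schedule(["Ana", "Ben", "Cal", "Dee"], rounds=2, seed=7)
for start, end, child in schedule:
    print(f"minutes {start:2d}-{end:2d}: observe {child}")
```

Passing a fixed `seed` makes the schedule reproducible, which can help when two observers need to follow the same sequence of focus children for a reliability check.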
The data of the observers were then uploaded from laptop computers for analysis. Using these measures, Pomerantz et al. found, among other things, that older children used subtler social comparison strategies and increasingly saw such behavior as boastful or unfair. These data have high ecological validity, and yet their reliability and validity are well established. Another example of a coding scheme for naturalistic research, also using children, is shown in Figure 7.1.