6. Sampling

6.1. Units of Analysis

Man wearing a hat, orange jacket, and backpack looking out to the sea using binoculars.
Before developing a sampling strategy, researchers must first define who or what is the target (the unit of analysis) of their scientific study. Rory McKeever, via Unsplash

Learning Objectives

  1. Describe units of analysis.
  2. Discuss how we can study the same topic using different units of analysis.

Before you can decide on a sampling strategy, you must define the unit of analysis of your scientific study. The unit of analysis refers to the person, collective, or object that you are focusing on and want to learn about through your research. As depicted in Figure 6.1, your unit of analysis would be the type of entity (say, an individual) you’re interested in. Your sample would be a group of such entities (say, a group of individuals you survey)—which, collectively, stand in for the population you wish to study.

Cartoon depictions of a large group of people representing a population. Six of them are selected to represent the sample.
Figure 6.1. Units of Analysis, Populations and Samples. When conducting research, we focus on studying a particular unit of analysis: a person, collective, or object that we want to learn more about. All the units of that type, taken together, make up the larger population we are interested in making claims about. It is usually not possible to include every unit of the population in a research study, so we must select a small subset of the population (a sample) from which we actually gather data.
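To make the relationship in Figure 6.1 concrete, here is a minimal Python sketch. It assumes a made-up population of 100 individuals and draws six of them at random. The names, the population size, and the choice of a random draw are illustrative assumptions only; they are not a recommendation about how a sample should be selected.

```python
import random

# Hypothetical population: 100 individuals (the unit of analysis is the individual).
population = [f"person_{i}" for i in range(1, 101)]

# A fixed seed makes the illustration repeatable; the seed value is arbitrary.
rng = random.Random(42)

# Select six units, without replacement, to stand in for the population.
sample = rng.sample(population, k=6)

print(f"Population size: {len(population)}")
print(f"Sample size: {len(sample)}")
print(sample)
```

Every element of the sample is itself a unit of analysis (an individual), and any conclusions drawn from those six people are meant to say something about all 100.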

Typical units of analysis include individuals, groups, organizations, and countries. For instance, if we are interested in studying people’s shopping behavior, their learning outcomes, or their attitudes toward new technologies, then the unit of analysis is likely to be the individual. If we want to study characteristics of street gangs or teamwork in organizations, then the unit of analysis is probably the group. If our research is directed at understanding differences in national cultures, then our unit of analysis could be the country. In the latter two examples, even though specific individuals—the group or country’s leaders—may have a greater say over what these groups or countries do, for the sake of analysis, researchers typically think of those decisions as reflecting a collective decision rather than any one individual’s decision.

Even inanimate objects can serve as units of analysis. For instance, if we wish to study how two or more individuals engage with each other during social interactions, the unit of analysis might be each conversation, and not the individual speakers. If we wanted to track how depictions of people of color have changed in popular culture over time, we could focus on a film or television show as a unit of analysis.

Our choice of a particular unit of analysis will depend on our research question. For instance, if we wish to study why certain neighborhoods have high crime rates, then our unit of analysis becomes the neighborhood—not crimes or criminals committing such crimes—because the object of our inquiry is the neighborhood and not the people living in it. If, however, we wish to compare the prevalence of different types of crimes—homicide versus robbery versus assault, for example—across neighborhoods, our unit of analysis could very well be the crime. If we wish to study why criminals engage in illegal activities, then the unit of analysis becomes the individual (i.e., the criminal).

Now let’s consider a completely different kind of sociological study. If we want to examine why some business innovations are more successful than others, then our unit of analysis is an innovation—such as the invention of a new method for charging phones. If, however, we wish to study how some tech companies develop innovative products more consistently than others, then the unit of analysis is the organization. As you can see, two related research questions within the same study may have entirely different units of analysis.

Determining the appropriate unit of analysis is important because it influences what type of data you should collect for your study and whom you collect it from. If your unit of analysis is the organization, then you usually will want to collect organizational-level data—that is, data that has to do with the organization, such as its size, personnel structure, or revenues. Data may come from a variety of sources, such as financial records or surveys of directors or executives, who are presumed to be representing their organization when they answer your survey questions. Meanwhile, if your unit of analysis is a website, you will want to collect data about different sites, such as how one kind of site compares to others in terms of traffic. We could use the term “site-level” data—just like we’d use the term “individual-level” data when individuals are the unit of analysis. We could also talk about “lower” and “higher” levels of analysis—with individual-level data existing on a lower level than group-level data, which may, in turn, be on a lower level than national data (see the discussion of micro, meso, and macro levels of analysis in Chapter 3: The Role of Theory in Research). It is important to note that “higher” does not imply “better” in this case. We’re just talking about whether we’re looking at smaller or larger groupings of data.

Frequently, the unit of analysis is what we observe in our research—the source of our data—but that is not always the case. In fact, sometimes we want to make a distinction between units of analysis and units of observation. The unit of analysis is what we really want to study, but sometimes we have to get at it indirectly, by observing something else. For example, surveys often ask questions about families to understand their family structure, income, and various aspects of their well-being, but they need to get information about the family through individuals—specifically, the respondent who is answering survey questions on behalf of the family. In this case, the unit of analysis for the survey’s family-related questions would be the family, but the unit of observation would be the individual. Likewise, in our earlier examples, we talked about studying organizations and websites as our units of analysis, but doing so might involve talking to individuals—the directors of those organizations, or the users of those websites, respectively.

Analyzing multiple types of units of observation can give us a fuller picture of our unit of analysis. For example, if you are conducting research about what makes particular social media apps more addictive than others, then examining differences between the apps in terms of their functionality (app as the unit of observation) would tell you one thing, but surveying individuals about their usage of apps (user as the unit of observation) would clarify other aspects of that question. Furthermore, it is often a good idea to collect data from a lower level of analysis and sum up, or aggregate, that data, converting it into higher-level data. This can give you a bigger-picture perspective on your unit of analysis. For instance, to study teamwork in organizations, you can survey individuals in different teams and measure how much conflict or cohesion they perceive on their teams. You can then average their individual scores to create a “team-level” score on those particular ratings. Note, however, that issues can arise when we move in the opposite direction—from a higher to a lower level of analysis (see the sidebar Deeper Dive: Ecological Fallacies).
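The aggregation step described above can be sketched in a few lines of Python. The teams, respondents, and cohesion ratings below are invented for illustration, and a simple average is only one possible way to summarize individual responses at the team level.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical individual-level survey data: (team, respondent, cohesion rating from 1 to 5).
responses = [
    ("Team A", "r1", 4),
    ("Team A", "r2", 5),
    ("Team A", "r3", 3),
    ("Team B", "r4", 2),
    ("Team B", "r5", 3),
]

# Group the individual ratings by team.
ratings_by_team = defaultdict(list)
for team, _respondent, cohesion in responses:
    ratings_by_team[team].append(cohesion)

# Average each team's ratings to produce a "team-level" score.
team_scores = {team: mean(ratings) for team, ratings in ratings_by_team.items()}

print(team_scores)
```

Here the unit of observation is the individual respondent, while the resulting scores describe the team, which is the unit of analysis.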

Ultimately, the unit of analysis will help you determine both the population you are interested in and the sample that you will study to arrive at any conclusions about that population. So you need to choose it wisely. For example, let’s say you’re interested in the average pay of chief executive officers (CEOs) at companies across the nation. The unit of analysis would be the CEO, and the population would be all individuals in the country who work as company CEOs. But the unit of analysis would be different for a very similar research question: the average amount that U.S. companies pay their CEOs. In this case, the unit of analysis is actually the company because you are interested in how much companies pay their CEOs—not how much individuals are paid as CEOs. The difference is subtle, but the main point is that your unit of analysis is linked to whatever population you actually want to say something about—in this example, either individual CEOs, or companies that have CEOs.

Deeper Dive: Ecological Fallacies

Person using magnifying glass on a map.
When researchers confuse their units of analysis and observation, they may commit an ecological fallacy—that is, making possibly inaccurate claims about individuals based on aggregated data collected at a higher level. lil artsy, via Pexels

A mismatch between the unit of analysis and the unit of observation can create issues for researchers. Let’s say you want to compare the residents of different states (your unit of analysis is the individual), but you only have access to state-level data (your unit of observation is the state). This is a problem because you generally do not want to be making claims about a lower level of analysis based only on aggregated data at a higher level. In this example, that would mean drawing conclusions about individuals based on the states where they reside. For instance, the fact that the population of a state is, on average, wealthier than the rest of the country does not mean that residents of that state are more likely to be rich than the average American. It may be that a small contingent of superrich people have pulled up the average wealth of the state, but its many other residents actually tend to be poorer than the average American. (As you might know from your statistics classes, in this situation the mean wealth, which is the group’s average, differs dramatically from the median wealth, which is how much the person smack in the middle of the wealth distribution has.) This logical error of making claims about the nature of individuals based on data from the groups they belong to is called an ecological fallacy.
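A small numerical sketch in Python shows how a state can look wealthy on average even though most of its residents are not. All of the figures below, including the national benchmarks, are invented for illustration and do not describe any real state.

```python
from statistics import mean, median

# Hypothetical net worth, in thousands of dollars, of ten residents of one state.
state_wealth = [50, 50, 50, 50, 50, 50, 50, 50, 50, 5000]

# Hypothetical national benchmarks, also in thousands of dollars.
national_mean = 150
national_median = 100

state_mean = mean(state_wealth)      # pulled up by the single superrich resident
state_median = median(state_wealth)  # the wealth of the resident in the middle

# Share of state residents who are wealthier than the national median.
share_above = sum(w > national_median for w in state_wealth) / len(state_wealth)

print(f"State mean: {state_mean}, national mean: {national_mean}")
print(f"State median: {state_median}, national median: {national_median}")
print(f"Share of residents above the national median: {share_above:.0%}")
```

In this made-up case, the state-level average exceeds the national average, yet only one resident in ten is wealthier than the national median. Inferring that a given resident is rich from the state average alone would be an ecological fallacy.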

Émile Durkheim’s classic study of suicide is often mentioned as an example of an ecological fallacy. One of the pioneers of the field of sociology, Durkheim argued in his 1897 book Suicide that societies in which individuals struggled to feel they belonged—that is, populations with low levels of social integration—would experience more suicide. Ideally, the unit of analysis for such a study would be the individual. Specifically, we would want to study individuals and the factors that contributed to their deaths by suicide. But Durkheim did not have individual-level data. Instead, he had higher-level data about the number of suicides in each country. To test his theory that social integration safeguarded individuals against suicide, Durkheim compared countries that were mostly Protestant to those that were mostly Catholic. The idea was that Protestantism was a more individualistic and unstructured faith than Catholicism, and so the two varieties of religious belief could stand in for less and more social integration, respectively.

Durkheim’s analysis concluded that suicide was indeed higher in Protestant-majority countries. The problem was that his data only allowed him to say that Protestant countries were more likely to have higher suicide rates, not that Protestant individuals were more likely to die by suicide. To conclude the latter would have been an ecological fallacy, and yet that was the question that Durkheim truly wanted to answer. To his credit, Durkheim also tested his theory by studying suicide rates across localities within countries, which is another level of analysis (Selvin 1958). (Replicating your analysis across different types of data is a good way to check the robustness of your findings, as we will discuss in later chapters.) Durkheim found the same pattern of higher levels of Protestant belief correlating with higher suicide rates within countries, giving further credence to his theory. Although flawed, Durkheim’s analysis made creative use of the data that was available to him at the time, and his work continues to inspire researchers, including those studying the growing rates of suicide among less educated Americans since 2000 (Case and Deaton 2020).

Key Takeaways

  1. A unit of analysis is a member of the larger group you wish to be able to say something about at the end of your study. A unit of observation is the entity that you actually observe or collect data from, which may or may not be the same as your unit of analysis.
  2. When researchers confuse their units of analysis and observation, they may commit an ecological fallacy: making possibly inaccurate claims about the nature of individuals based on data from the groups they belong to.

License

The Craft of Sociological Research by Victor Tan Chen; Gabriela León-Pérez; Julie Honnold; and Volkan Aytar is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License, except where otherwise noted.
