Cluster Sampling: Definition, Method and Examples

By Julia Simkus, published Jan 03, 2022


Cluster sampling is a method of probability sampling where researchers divide a large population up into smaller groups known as clusters, and then select randomly among the clusters to form a sample.

Key Terms
  • A sample is the participants you select from a target population (the group you are interested in) to make generalizations about.
  • Representativeness is the extent to which a sample mirrors a researcher's target population and reflects its characteristics.
  • Generalisability is the extent to which a study's findings can be applied to the larger population from which the sample was drawn.

Cluster sampling is usually used when both the population and the desired sample size are particularly large.

The purpose of cluster sampling is to reduce the total number of participants in a study if the original population is too large to study as a whole. These clusters serve as a small-scale representation of the total population and taken together, the clusters should cover the characteristics of the entire population.

This method of sampling reduces the cost and time of a study by increasing efficiency. Researchers sometimes will use pre-existing groups such as schools, cities, or households as their clusters.

Types of cluster sampling

  1. Single-stage cluster sampling
    • Single-stage cluster sampling is a type of cluster sampling where every unit of the chosen clusters is sampled. Researchers will first divide the total population into a predetermined number of clusters based on how large they want each cluster to be.
    • Then, they randomly select and sample from the clusters and collect data from each individual unit in the selected clusters.
  2. Double-stage cluster sampling
    • In double-stage, or two-stage, cluster sampling, researchers will only collect data from a random subsample of individual units within each of the selected clusters to use as the sample.
    • This technique is less precise than single-stage sampling and should only be used when it is too challenging or expensive to test the entire cluster.
  3. Multi-stage cluster sampling
    • This type of cluster sampling involves the same process as double-stage sampling, except with additional stages of sampling.
    • In multi-stage sampling, researchers will continue to randomly sample smaller groups or elements from within the selected clusters until they reach a manageable sample size.
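
The difference between single-stage and double-stage sampling can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the population of 10 schools with 30 students each, the function names, and the fixed random seed are all hypothetical and do not come from any particular study.

```python
import random

def single_stage(clusters, n_clusters, rng):
    """Pick n_clusters at random and keep every unit inside them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for name in chosen for unit in clusters[name]]

def two_stage(clusters, n_clusters, n_per_cluster, rng):
    """Pick n_clusters at random, then randomly subsample units within each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    sample = []
    for name in chosen:
        units = clusters[name]
        k = min(n_per_cluster, len(units))  # never ask for more units than exist
        sample.extend(rng.sample(units, k))
    return sample

# Hypothetical population: 10 schools (clusters) with 30 students each.
clusters = {f"school_{i}": [f"s{i}_{j}" for j in range(30)] for i in range(10)}

rng = random.Random(0)  # fixed seed so the sketch is repeatable
full = single_stage(clusters, 3, rng)    # every student in 3 schools
sub = two_stage(clusters, 3, 10, rng)    # 10 students from each of 3 schools
print(len(full), len(sub))  # 90 30
```

Multi-stage sampling would repeat the subsampling step at further levels, for example classes within schools and then students within classes.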

Applications: When is it used

Cluster sampling is used when the target population is too large or spread out, and studying each subject would be costly, time-consuming, and impractical.

Cluster sampling allows researchers to create smaller, more manageable subsections of the population with similar characteristics. Cluster sampling is particularly useful in area or geographical sampling, when the populations are widely dispersed.

Researchers will form clusters based on geographical area by grouping individuals within a community, neighborhood, or local area into a single cluster.

Cluster sampling is also used in market research when researchers are unable to collect information about the population as a whole. Lastly, cluster sampling can be used to estimate mortality rates in high-mortality situations, such as wars, famines, or natural disasters.

How to cluster sample?

  1. First, choose the target population that you wish to study and determine your desired sample size.
  2. Then, divide your population into clusters. When forming the clusters, try to make sure each cluster’s population is diverse, has a similar distribution of characteristics to the population as a whole, and is about the same size as the other clusters. The goal is to form clusters that are each representative of the total population.
  3. Next, select clusters by a random selection process. It is important to randomly select from the clusters in order to preserve the validity of your results. The number of clusters you select depends on your desired sample size and on how large each cluster is.
  4. In single-stage sampling, collect data from each individual unit of the clusters you selected in Step 3.
  5. In the case of double-stage or multi-stage sampling, randomly select individual units from within the selected clusters to use as your sample. You will then collect your data from each of these individual units. Double-stage and multi-stage sampling tend to be easier to carry out than single-stage sampling because you will be working with a much smaller sample.
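
The steps above can be sketched in Python for the single-stage case. Everything specific here is an assumption made for illustration: the sampling frame of 2,000 people in 40 equally sized neighbourhoods, the desired sample size of 200, and the helper names `group_into_clusters` and `clusters_needed`. The rule of dividing the desired sample size by the mean cluster size is one simple way to apply Step 3, not a prescribed formula.

```python
import math
import random
from collections import defaultdict

def group_into_clusters(frame, key):
    """Step 2: group records into clusters using an existing label."""
    clusters = defaultdict(list)
    for record in frame:
        clusters[record[key]].append(record)
    return dict(clusters)

def clusters_needed(desired_n, clusters):
    """Step 3 helper: how many clusters give roughly desired_n units."""
    mean_size = sum(len(units) for units in clusters.values()) / len(clusters)
    return min(len(clusters), math.ceil(desired_n / mean_size))

# Step 1: hypothetical target population and desired sample size.
frame = [{"id": i, "neighbourhood": i % 40} for i in range(2000)]
desired_n = 200

# Step 2: form clusters (here, 40 neighbourhoods of 50 people each).
clusters = group_into_clusters(frame, "neighbourhood")

# Step 3: randomly select whole clusters.
k = clusters_needed(desired_n, clusters)
rng = random.Random(42)  # fixed seed so the sketch is repeatable
chosen = rng.sample(sorted(clusters), k)

# Step 4 (single-stage): collect data from every unit in the chosen clusters.
sample = [record for name in chosen for record in clusters[name]]
print(k, len(sample))  # 4 200
```

For Step 5, the last line would instead draw a random subsample from each chosen cluster rather than keeping every record.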

Advantages

Time and cost efficient

Cluster sampling is often cheaper and quicker than other probability sampling methods. For example, it reduces travel expenses when the population is geographically dispersed.

High external validity

If your population is clustered properly to represent every possible characteristic of the entire population, your clusters will accurately reflect the entire population.

Practicality and ease

This type of sampling process enables researchers to study large populations that would otherwise be too challenging or complicated to analyze.

Limitations

High sampling error

When the clusters do not mirror the population’s characteristics or serve as a mini-representation of the population as a whole, there will be less statistical certainty and accuracy. This error is even greater when you use more stages of clustering.

Complexity

Planning study designs for cluster sampling usually requires more attention because researchers need to determine how to divide up a larger population efficiently and properly.

Examples

Cluster sampling has been used to:

  • Assess immunization coverage (Henderson & Sundaresan, 1982).
  • Estimate the density of wintering waterfowl (Smith, Conroy, & Brakhage, 1995).
  • Conduct rapid assessment of health in communities affected by natural disasters (Malilay, Flanders, & Brogan, 1996).
  • Determine forest inventories (Roesch, 1993).
  • Assess the prevalence of irritable bowel syndrome in South China and its impact on health-related quality of life (Xiong et al., 2004).
  • Estimate the size of hidden and hard-to-access populations (Felix-Medina & Thompson, 2004).

Cluster Sampling vs Stratified Sampling

Stratified sampling is a method where researchers divide a population into smaller subpopulations known as strata. Strata are formed based on shared characteristics of the members, such as age, income, race, or education level.

Then, members of the strata are randomly selected to form a sample.

Because researchers using stratified sampling randomly choose members from every stratum, each subgroup is represented in the sample.

Alternatively, researchers using cluster sampling will use naturally divided groups to separate the population (e.g., city blocks or school districts) and then randomly select whole clusters, and in multi-stage designs elements within them, to be a part of the sample.
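
The contrast can be sketched in Python. This is a rough illustration, assuming a made-up population of 600 people, each with an age group (used as the stratum) and a district (used as the cluster); the function names and field names are hypothetical.

```python
import random

def group_by(population, key):
    """Group records by the value stored under key."""
    groups = {}
    for person in population:
        groups.setdefault(person[key], []).append(person)
    return groups

def stratified_sample(population, stratum_key, n_per_stratum, rng):
    """Draw a random subsample from EVERY stratum."""
    strata = group_by(population, stratum_key)
    sample = []
    for name in sorted(strata):
        members = strata[name]
        sample.extend(rng.sample(members, min(n_per_stratum, len(members))))
    return sample

def cluster_sample(population, cluster_key, n_clusters, rng):
    """Draw a random subset of clusters and keep everyone inside them."""
    clusters = group_by(population, cluster_key)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [person for name in chosen for person in clusters[name]]

# Hypothetical population: 600 people in 12 districts and 3 age groups.
ages = ["18-34", "35-54", "55+"]
population = [
    {"id": i, "age_group": ages[(i // 12) % 3], "district": i % 12}
    for i in range(600)
]

rng = random.Random(1)  # fixed seed so the sketch is repeatable
strat = stratified_sample(population, "age_group", 20, rng)
clus = cluster_sample(population, "district", 2, rng)
print(len(strat), len(clus))  # 60 100
```

In the stratified sample, all three age groups appear by construction. In the cluster sample, only 2 of the 12 districts appear, which is why cluster sampling relies on each cluster resembling the population as a whole.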

About the Author

Julia Simkus is an undergraduate student at Princeton University, majoring in Psychology. She plans to pursue a PhD in Clinical Psychology upon graduation from Princeton in 2023. Julia has co-authored two journal articles, one titled “Substance Use Disorders and Behavioral Addictions During the COVID-19 Pandemic and COVID-19-Related Restrictions,” which was published in Frontiers in Psychiatry in April 2021, and the other titled “Food Addiction: Latest Insights on the Clinical Implications,” to be published in Handbook of Substance Misuse and Addictions: From Biology to Public Health in early 2022.

How to reference this article:

Simkus, J. (2022, Jan 03). Cluster Sampling: Definition, Method and Examples. Simply Psychology. www.simplypsychology.org/cluster-sampling.html

Sources

Felix-Medina, M. H., & Thompson, S. K. (2004). Combining link-tracing sampling and cluster sampling to estimate the size of hidden populations. Journal of Official Statistics, 20(1), 19–38.
Henderson, R. H., & Sundaresan, T. (1982). Cluster sampling to assess immunization coverage: a review of experience with a simplified sampling method. Bulletin of the World Health Organization, 60(2), 253–260.
Malilay, J., Flanders, W. D., & Brogan, D. (1996). A modified cluster-sampling method for post-disaster rapid assessment of needs. Bulletin of the World Health Organization, 74(4), 399–405.
Roesch, F. A. (1993). Adaptive cluster sampling for forest inventories. Forest Science, 39(4), 655–669.
Smith, D. R., Conroy, M. J., & Brakhage, D. H. (1995). Efficiency of adaptive cluster sampling for estimating density of wintering waterfowl. Biometrics, 51(2), 777–788. https://doi.org/10.2307/2532964
Thompson, S. K. (1990). Adaptive cluster sampling. Journal of the American Statistical Association, 85(412), 1050–1059. https://doi.org/10.1080/01621459.1990.10474975
Xiong, L. S., Chen, M. H., Chen, H. X., Xu, A. G., Wang, W. A., & Hu, P. J. (2004). A population‐based epidemiologic study of irritable bowel syndrome in South China: stratified randomized study by cluster sampling. Alimentary Pharmacology & Therapeutics, 19(11), 1217–1224.

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Unported License.