Case-control studies : design, conduct, analysis


Uploaded by station13.cebu on June 13, 2019


Analysis of matched case-control studies

  • Neil Pearce, professor 1,2
  • 1 Department of Medical Statistics and Centre for Global NCDs, Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London WC1E 7HT, UK
  • 2 Centre for Public Health Research, Massey University, Wellington, New Zealand
  • neil.pearce{at}lshtm.ac.uk
  • Accepted 30 December 2015

There are two common misconceptions about case-control studies: that matching in itself eliminates (controls) confounding by the matching factors, and that if matching has been performed, then a “matched analysis” is required. However, matching in a case-control study does not control for confounding by the matching factors; in fact it can introduce confounding by the matching factors even when it did not exist in the source population. Thus, a matched design may require controlling for the matching factors in the analysis. However, it is not the case that a matched design requires a matched analysis. Provided that there are no problems of sparse data, control for the matching factors can be obtained, with no loss of validity and a possible increase in precision, using a “standard” (unconditional) analysis, and a “matched” (conditional) analysis may not be required or appropriate.

Summary points

Matching in a case-control study does not control for confounding by the matching factors

A matched design may require controlling for the matching factors in the analysis

However, it is not the case that a matched design requires a matched analysis

A “standard” (unconditional) analysis may be most valid and appropriate, and a “matched” (conditional) analysis may not be required or appropriate

Matching on factors such as age and sex is commonly used in case-control studies [1]. This can be done for convenience (eg, choosing a control admitted to hospital on the same day as the case), to improve study efficiency by improving precision (under certain conditions) when controlling for the matching factors (eg, age, sex) in the analysis, or to enable control in the analysis of unquantifiable factors such as neighbourhood characteristics (eg, by choosing neighbours as controls and then controlling for neighbourhood in the analysis). The increase in efficiency occurs because matching ensures similar numbers of cases and controls in confounder strata. For example, in a study of lung cancer, if controls are sampled at random from the source population, their age distribution will be much younger than that of the lung cancer cases. Thus, when age is controlled in the analysis, the young age stratum may contain mostly controls and few cases, whereas the old age stratum may contain mostly cases and fewer controls. Statistical precision may therefore be improved if controls are age matched to ensure roughly equal numbers of cases and controls in each age stratum.

There are two common misconceptions about case-control studies: that matching in itself eliminates confounding by the matching factors; and that if matching has been performed, then a “matched analysis” is required.

Matching in the design does not control for confounding by the matching factors. In fact, it can introduce confounding by the matching factors even when it did not exist in the source population [1]. The reasons for this are complex and will only be discussed briefly here. In essence, the matching process makes the controls more similar to the cases not only for the matching factor but also for the exposure itself. This introduces a bias that needs to be controlled in the analysis. For example, suppose we were conducting a case-control study of poverty and death (from any cause), and we chose siblings as controls (that is, for each person who died, we matched on family or residence by choosing a sibling who was still alive as a control). In this situation, since poverty runs in families we would tend to select a disadvantaged control for each disadvantaged person who had died and a wealthy control for each wealthy person who had died. We would find roughly equal percentages of disadvantaged people among the cases and controls, and we would find little association between poverty and mortality. The matching has introduced a bias, which fortunately (as we will illustrate) can be controlled by controlling for the matching factor in the analysis.

Thus, a matched design will (almost always) require controlling for the matching factors in the analysis. However, this does not necessarily mean that a matched analysis is required or appropriate, and it will often be sufficient to control for the matching factors using simpler methods. Although this is well recognised in both recent [2,3] and historical [4,5] texts, other texts [6-9] do not discuss this issue and present the matched analysis as the only option for analysing matched case-control studies. In fact, the more standard analysis may not only be valid but may be much easier in practice, and yield better statistical precision.

In this paper I explore and illustrate these problems using a hypothetical pair matched case-control study.

Options for analysing case-control studies

Unmatched case-control studies are typically analysed using the Mantel-Haenszel method [10] or unconditional logistic regression [4]. The former involves the familiar method of producing a 2×2 (exposure-disease) stratum for each level of the confounder (eg, if there are five age groups and two sex groups, then there will be ten 2×2 tables, each showing the association between exposure and disease within a particular stratum), and then producing a summary (average) effect across the strata. The Mantel-Haenszel estimates are robust and not affected by small numbers in specific strata (provided that the overall numbers of exposed or non-exposed cases or controls are adequate), although it can be difficult or impossible to control for factors other than the matching factors if some strata involve small numbers (eg, just one case and one control). Furthermore, the Mantel-Haenszel approach works well when there are only a few confounder strata, but will experience problems of small numbers (eg, strata with only cases and no controls) if there are too many confounders to adjust for. In this situation, logistic regression may be preferred, since this uses maximum likelihood methods, which enable the adjustment (given certain assumptions) of more confounders.
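For reference, the Mantel-Haenszel summary odds ratio described above can be written compactly. The notation here is an assumption introduced for illustration and does not come from the article: in stratum i, a_i and b_i are the exposed and unexposed cases, c_i and d_i are the exposed and unexposed controls, and n_i = a_i + b_i + c_i + d_i.

```latex
\[
\widehat{\mathrm{OR}}_{\mathrm{MH}}
  = \frac{\sum_{i} a_i d_i / n_i}{\sum_{i} b_i c_i / n_i}
\]
```

Written this way, a stratum that contains only cases (or only controls) adds zero to both sums, so it carries no information about the exposure effect.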

Suppose that for each case we have chosen a control who is in the same five year age group (eg, if the case is aged 47 years, then a control is chosen who is aged 45-49 years). We can then perform a standard analysis, which adjusts for the matching factor (age group) by grouping all cases and controls into five year age groups and using unconditional logistic regression [4] (or the Mantel-Haenszel method [10]); if there are eight age groups then this analysis will just have eight strata (represented by seven age group dummy variables), each with multiple cases and controls. Alternatively we can perform a matched analysis (that is, retaining the pair matching of one control for each case) using conditional logistic regression (or the matched data methods, which are equivalent to the Mantel-Haenszel method); if there are 100 case-control pairs, this analysis will then have 100 strata.

The main reason for using conditional (rather than unconditional) logistic regression is that when the analysis strata are very small (eg, with just one case and one control for each stratum), problems of sparse data will occur with unconditional methods [11]. For example, if there are 100 strata, this requires 99 dummy variables to represent them, even though there are only 200 study participants. In this extreme situation, unconditional logistic regression is biased and produces an odds ratio estimate that is the square of the conditional (true) estimate of the odds ratio [5,12].
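The squaring effect can be made concrete with a small sketch. It assumes 1:1 matching and a binary exposure, for which the conditional estimate is the standard ratio of discordant pairs. The function names and the 100 pairs of data are hypothetical, and the "sparse unconditional" value is obtained by squaring, as the text states, rather than by fitting a model.

```python
from typing import Iterable, Tuple

Pair = Tuple[bool, bool]  # (case_exposed, control_exposed), hypothetical layout


def discordant_pair_counts(pairs: Iterable[Pair]) -> Tuple[int, int]:
    """Return (n10, n01): pairs where only the case is exposed,
    and pairs where only the control is exposed."""
    n10 = n01 = 0
    for case_exposed, control_exposed in pairs:
        if case_exposed and not control_exposed:
            n10 += 1
        elif control_exposed and not case_exposed:
            n01 += 1
    return n10, n01


def conditional_odds_ratio(pairs: Iterable[Pair]) -> float:
    """Matched-pair (conditional) odds ratio: ratio of discordant pairs."""
    n10, n01 = discordant_pair_counts(pairs)
    if n01 == 0:
        raise ValueError("no pairs with only the control exposed")
    return n10 / n01


# Hypothetical data: 100 case-control pairs.
pairs = (
    [(True, False)] * 30    # only the case exposed
    + [(False, True)] * 15  # only the control exposed
    + [(True, True)] * 40   # concordant: contribute nothing
    + [(False, False)] * 15 # concordant: contribute nothing
)

matched_or = conditional_odds_ratio(pairs)
# With one dummy variable per pair, the unconditional estimate tends to
# the square of the conditional one (the result cited in the text).
sparse_unconditional_or = matched_or ** 2

print(f"conditional OR: {matched_or:.2f}")
print(f"unconditional OR with pair dummies: {sparse_unconditional_or:.2f}")
```

Only the 45 discordant pairs enter the conditional estimate; the 55 concordant pairs drop out entirely.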

Example of age matching

Table 1 gives an example of age matching in a population based case-control study, and shows the “true” findings for the total population, the findings for the corresponding unmatched case-control study, and the findings for an age matched case-control study using the standard analysis. Table 2 presents the findings for the same age matched case-control study using the matched analysis. All analyses were performed using the Mantel-Haenszel method, but this yields similar results to the corresponding (unconditional or conditional) logistic regression analyses.

Table 1. Hypothetical study population and case-control study with unmatched and matched standard analyses

Table 2. Hypothetical matched case-control study with matched analysis

Table 1 shows that the crude odds ratio in the total population is 0.86 (0.70 to 1.05), but this changes to 2.00 (1.59 to 2.51) when the analysis is adjusted for age (using the Mantel-Haenszel method). This occurs because there is strong confounding by age: the cases are mostly old, and old people have a lower exposure than young people. Overall, there are 390 cases, and when 390 controls are selected at random from the non-cases in the total population (which is half exposed and half not exposed), this yields the same crude (0.86) and adjusted (2.00) odds ratios, but with wider confidence intervals, reflecting the smaller numbers of non-cases (controls) in the case-control study.

Why matching factors need to be controlled in the analysis

Now suppose that we reconduct the case-control study, matching for age, using two very broad age groups: old and young (table 1). The number of cases and controls in each age group are now equal. However, the crude odds ratio (1.68, 1.25 to 2.24) is different from both the crude (0.86) and the adjusted (2.00) odds ratios in the total population. In contrast, the adjusted odds ratio (2.00) is the same as that in the total population and in the unmatched case-control study (both of these adjusted odds ratios were estimated using the standard approach). Thus, matching has not removed age confounding and it is still necessary to control for age (this occurs because the matching process in a case-control study changes the association between the matching factor and the outcome and can create an association even if there were none before the matching was conducted). However, there is a small increase in precision in the matched case-control study compared with the unmatched case-control studies (95% confidence intervals of 1.42 to 2.81 compared with 1.38 to 2.89) because there are now equal numbers of cases and controls in each age group (table 1).
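The gap between the crude and age-adjusted odds ratios in an age matched study can be sketched in a few lines. The counts below are illustrative assumptions chosen so that the odds ratio is 2 within each age group. They are not copied from table 1, and the function names are hypothetical.

```python
from typing import List, Tuple

# One stratum = (exposed cases, unexposed cases,
#                exposed controls, unexposed controls)
Stratum = Tuple[int, int, int, int]


def crude_odds_ratio(strata: List[Stratum]) -> float:
    """Collapse all strata into one 2x2 table and compute ad/bc."""
    a = sum(s[0] for s in strata)
    b = sum(s[1] for s in strata)
    c = sum(s[2] for s in strata)
    d = sum(s[3] for s in strata)
    return (a * d) / (b * c)


def mantel_haenszel_odds_ratio(strata: List[Stratum]) -> float:
    """Mantel-Haenszel summary odds ratio across strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Illustrative age matched study: equal cases and controls per age group.
matched_study: List[Stratum] = [
    (100, 200, 60, 240),  # old: 300 cases, 300 controls, stratum OR = 2
    (80, 10, 72, 18),     # young: 90 cases, 90 controls, stratum OR = 2
]

crude = crude_odds_ratio(matched_study)
adjusted = mantel_haenszel_odds_ratio(matched_study)

print(f"crude OR:    {crude:.2f}")     # about 1.68, biased towards the null
print(f"adjusted OR: {adjusted:.2f}")  # 2.00, the within-stratum value
```

Although cases and controls are balanced on age, collapsing over age still pulls the estimate away from the within-stratum value, which is why the matching factor has to be controlled in the analysis.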

A pair matched study does not necessarily require a pair matched analysis

However, control for simple matching factors such as age does not require a pair matched analysis. Table 2 gives the findings that would have been obtained from a pair matched analysis (this is created by assuming that in each age group, and for each case, the control was selected at random from all non-cases in the same age group). The standard adjusted (Mantel-Haenszel) analysis (table 1) yields an odds ratio of 2.00 (95% confidence interval 1.42 to 2.81); the matched analysis (table 2) yields the same odds ratio (2.00) but with a slightly wider confidence interval (1.40 to 2.89).

Advantages of the standard analysis

So for many matched case-control studies, we have a choice of doing a standard analysis or a matched analysis. In this situation, there are several possible advantages of using the standard approach.

The standard analysis can actually yield slightly better statistical precision [13]. This may apply, for example, if two or more cases and their matched controls all have identical values for their matching factors; then combining them into a single stratum produces an estimator with lower variance and no less validity [14] (as indicated by the slightly narrower confidence interval for the standard adjusted analysis (table 1) compared with the pair matched analysis (table 2)). This particularly occurs because combining strata with identical values for the matching factors (eg, if two case-control pairs all concern women aged 55-59 years) may mean that fewer data are discarded (that is, do not contribute to the analysis) because of strata where the case and control have the same exposure status. Further gains in precision may be obtained if combining strata means that cases with no corresponding control (or controls without a corresponding case) can be included in the analysis. When such strata are combined, a conditional analysis may still be required if the resulting strata are still “small” [13], but an unconditional analysis will be valid and yield similar findings if the resulting strata are sufficiently large. This may often be the case when matching has only been performed on standard factors such as sex and age group.
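A small sketch shows why pooling pairs that share the same matching-factor values discards fewer data. It assumes all six hypothetical pairs below fall in one matching stratum (for instance, women aged 55-59 years). The data and function names are invented for illustration.

```python
from typing import List, Tuple

Pair = Tuple[bool, bool]  # (case_exposed, control_exposed)


def informative_pairs(pairs: List[Pair]) -> int:
    """Pairs that contribute to a pair matched analysis (discordant only)."""
    return sum(1 for case_exposed, control_exposed in pairs
               if case_exposed != control_exposed)


def pooled_table(pairs: List[Pair]) -> Tuple[int, int, int, int]:
    """Single 2x2 table for the pooled stratum:
    (exposed cases, unexposed cases, exposed controls, unexposed controls)."""
    a = sum(1 for case_exposed, _ in pairs if case_exposed)
    c = sum(1 for _, control_exposed in pairs if control_exposed)
    return a, len(pairs) - a, c, len(pairs) - c


# Six hypothetical pairs with identical matching-factor values.
same_stratum_pairs: List[Pair] = [
    (True, False), (True, True), (False, False),
    (True, False), (False, True), (True, True),
]

n_used_pairwise = informative_pairs(same_stratum_pairs)
table = pooled_table(same_stratum_pairs)

print(f"pairs used by the pair matched analysis: {n_used_pairwise} of {len(same_stratum_pairs)}")
print(f"pooled 2x2 table (all {2 * len(same_stratum_pairs)} participants): {table}")
```

In the pair matched analysis the three concordant pairs drop out, whereas the pooled stratum keeps every participant in its 2×2 table. Pooling is only appropriate when the pairs really do share the same values of the matching factors.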

The standard analysis may also enhance the clarity of the presentation, particularly when analysing subgroups of cases and controls selected for variables on which they were not matched, since it involves standard 2×2 tables for each subgroup [15].

A further advantage of the standard analysis is that it makes it easier to combine different datasets that have involved matching on different factors (eg, if some have matched for age, some for age and sex, and some for nothing, then all can be combined in an analysis adjusting for age, sex, and study centre). In contrast, one multicentre study [16] (of which I happened to be a coauthor) attempted to (unnecessarily) perform a matched analysis across centres. Because not all centres had used pair matching, this involved retrospective pair matching in those centres that had not matched as part of the study design. This resulted in the unnecessary discarding of the unmatched controls, thus resulting in a likely loss of precision.

Conclusions

If matching is carried out on a particular factor such as age in a case-control study, then controlling for it in the analysis must be considered. This control should involve just as much precision as was used in the original matching [14] (eg, if exact age in years was used in the matching, then exact age in years should be controlled for in the analysis), although in practice such rigorous precision may not always be required (eg, five year age groups may suffice to control confounding by age, even if age matching was done more precisely than this). In some circumstances, this control may make no difference to the main exposure effect estimate (eg, if the matching factor is unrelated to exposure). However, if there is an association between the matching factor and the exposure, then matching will introduce confounding that needs to be controlled for in the analysis.

So when is a pair matched analysis required? The answer is, when the matching was genuinely at (or close to) the individual level. For example, if siblings have been chosen as controls, then each stratum would have just one case and the sibling control; in this situation, an unconditional logistic regression analysis would suffer from problems of sparse data, and conditional logistic regression would be required. Similar situations might arise if controls were neighbours or from the same general practice (if each general practice only had one or a few cases), or if matching was performed on many factors simultaneously so that most strata (in the standard analysis) had just one case and one control.

Provided, however, that there are no problems of sparse data, such control for the matching factors can be obtained using an unconditional analysis, with no loss of validity and a possible increase in precision.

Thus, a matched design will (nearly always) require controlling for the matching factors in the analysis. It is not the case, however, that a matched design requires a matched analysis.

I thank Simon Cousens, Deborah Lawlor, Lorenzo Richiardi, and Jan Vandenbroucke for their comments on the draft manuscript. The Centre for Global NCDs is supported by the Wellcome Trust Institutional Strategic Support Fund, 097834/Z/11/B.

Competing interests: I have read and understood the BMJ policy on declaration of interests and declare the following: none.

Provenance and peer review: Not commissioned; externally peer reviewed.

This is an Open Access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 3.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/3.0/ .

  1. Rothman KJ, Greenland S, Lash TL, eds. Design strategies to improve study accuracy. Modern epidemiology. 3rd ed. Lippincott Williams & Wilkins, 2008.
  2. Rothman KJ. Epidemiology: an introduction. Oxford University Press, 2012.
  3. Rothman KJ, Greenland S, Lash TL, eds. Modern epidemiology. 3rd ed. Lippincott Williams & Wilkins, 2008.
  4. Breslow NE, Day NE. Statistical methods in cancer research. Vol I: the analysis of case-control studies. IARC, 1980.
  5. Kleinbaum DG, Kupper LL, Morgenstern H. Epidemiologic research: principles and quantitative methods. Lifetime Learning Publications, 1982.
  6. Dos Santos Silva I. Cancer epidemiology: principles and methods. IARC, 1999.
  7. Keogh RH, Cox DR. Case-control studies. Cambridge University Press, 2014. doi:10.1017/CBO9781139094757.
  8. Lilienfeld DE, Stolley PD. Foundations of epidemiology. 3rd ed. Oxford University Press, 1994.
  9. MacMahon B, Trichopoulos D. Epidemiology: principles and methods. 2nd ed. Little Brown, 1996.
  10. Mantel N, Haenszel W. Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst 1959;22:719-48. pmid:13655060.
  11. Robins J, Greenland S, Breslow NE. A general estimator for the variance of the Mantel-Haenszel odds ratio. Am J Epidemiol 1986;124:719-23. pmid:3766505.
  12. Pike MC, Hill AP, Smith PG. Bias and efficiency in logistic analyses of stratified case-control studies. Int J Epidemiol 1980;9:89-95. doi:10.1093/ije/9.1.89. pmid:7419334.
  13. Brookmeyer R, Liang KY, Linet M. Matched case-control designs and overmatched analyses. Am J Epidemiol 1986;124:693-701. pmid:3752063.
  14. Greenland S. Applications of stratified analysis methods. In: Rothman KJ, Greenland S, Lash TL, eds. Modern epidemiology. 3rd ed. Lippincott Williams & Wilkins, 2008.
  15. Vandenbroucke JP, Koster T, Briët E, Reitsma PH, Bertina RM, Rosendaal FR. Increased risk of venous thrombosis in oral-contraceptive users who are carriers of factor V Leiden mutation. Lancet 1994;344:1453-7. doi:10.1016/S0140-6736(94)90286-0. pmid:7968118.
  16. Cardis E, Richardson L, Deltour I, et al. The INTERPHONE study: design, epidemiological methods, and description of the study population. Eur J Epidemiol 2007;22:647-64. doi:10.1007/s10654-007-9152-z. pmid:17636416.
  17. Mansournia MA, Hernán MA, Greenland S. Matched designs and causal diagrams. Int J Epidemiol 2013;42:860-9. doi:10.1093/ije/dyt083. pmid:23918854.
  18. Hernán MA, Hernández-Díaz S, Robins JM. A structural approach to selection bias. Epidemiology 2004;15:615-25. doi:10.1097/01.ede.0000135174.63482.43. pmid:15308962.


What Is a Case-Control Study? | Definition & Examples

Published on February 4, 2023 by Tegan George. Revised on June 22, 2023.

A case-control study is an observational study design that compares a group of participants possessing a condition of interest to a very similar group lacking that condition. Here, the participants possessing the attribute of study, such as a disease, are called the “case,” and those without it are the “control.”

It’s important to remember that the case group is chosen because they already possess the attribute of interest. The point of the control group is to facilitate investigation, e.g., studying whether the case group systematically exhibits that attribute more than the control group does.

Table of contents

  • When to use a case-control study
  • Examples of case-control studies
  • Advantages and disadvantages of case-control studies
  • Other interesting articles
  • Frequently asked questions

Case-control studies are a type of observational study often used in fields like medical research, environmental health, or epidemiology. While most observational studies are qualitative in nature, case-control studies can also be quantitative, and they often are in healthcare settings. Case-control studies can be used for both exploratory and explanatory research, and they are a good choice for studying research topics like disease exposure and health outcomes.

A case-control study may be a good fit for your research if it meets the following criteria.

  • Data on exposure (e.g., to a chemical or a pesticide) are difficult to obtain or expensive.
  • The disease associated with the exposure you’re studying has a long incubation period or is rare or under-studied (e.g., AIDS in the early 1980s).
  • The population you are studying is difficult to contact for follow-up questions (e.g., asylum seekers).

Retrospective cohort studies use existing secondary research data, such as medical records or databases, to identify a group of people with a common exposure or risk factor and to observe their outcomes over time. Case-control studies conduct primary research, comparing a group of participants possessing a condition of interest to a very similar group lacking that condition in real time.


Case-control studies are common in fields like epidemiology, healthcare, and psychology.

You would then collect data on your participants’ exposure to contaminated drinking water, focusing on variables such as the source of said water and the duration of exposure, for both groups. You could then compare the two to determine if there is a relationship between drinking water contamination and the risk of developing a gastrointestinal illness.

Example: Healthcare case-control study

You are interested in the relationship between the dietary intake of a particular vitamin (e.g., vitamin D) and the risk of developing osteoporosis later in life. Here, the case group would be individuals who have been diagnosed with osteoporosis, while the control group would be individuals without osteoporosis.

You would then collect information on dietary intake of vitamin D for both the cases and controls and compare the two groups to determine if there is a relationship between vitamin D intake and the risk of developing osteoporosis.

Example: Psychology case-control study

You are studying the relationship between early-childhood stress and the likelihood of later developing post-traumatic stress disorder (PTSD). Here, the case group would be individuals who have been diagnosed with PTSD, while the control group would be individuals without PTSD.

Case-control studies are a solid research method choice, but they come with distinct advantages and disadvantages.

Advantages of case-control studies

  • Case-control studies are a great choice if you have any ethical considerations about your participants that could preclude you from using a traditional experimental design.
  • Case-control studies are time efficient and fairly inexpensive to conduct because they require fewer subjects than other research methods.
  • If there were multiple exposures leading to a single outcome, case-control studies can incorporate that. As such, they truly shine when used to study rare outcomes or outbreaks of a particular disease.

Disadvantages of case-control studies

  • Case-control studies, like other observational studies, run a high risk of research biases. They are particularly susceptible to observer bias, recall bias, and interviewer bias.
  • When the exposure of interest is very rare, conducting a case-control study can be very time consuming and inefficient.
  • Case-control studies in general have low internal validity and are not always credible.

Case-control studies by design focus on one singular outcome. This makes them very rigid and not generalizable, as no extrapolation can be made about other outcomes like risk recurrence or future exposure threat. This leads to less satisfying results than other methodological choices.


A case-control study differs from a cohort study because cohort studies are more longitudinal in nature and do not necessarily require a control group.

While one may be added if the investigator so chooses, members of the cohort are primarily selected because of a shared characteristic among them. In particular, retrospective cohort studies are designed to follow a group of people with a common exposure or risk factor over time and observe their outcomes.

Case-control studies, in contrast, require both a case group and a control group, as suggested by their name, and usually are used to identify risk factors for a disease by comparing cases and controls.

A case-control study differs from a cross-sectional study because case-control studies are naturally retrospective in nature, looking backward in time to identify exposures that may have occurred before the development of the disease.

On the other hand, cross-sectional studies collect data on a population at a single point in time. The goal here is to describe the characteristics of the population, such as their age, gender identity, or health status, and understand the distribution and relationships of these characteristics.

Cases and controls are selected for a case-control study based on their inherent characteristics. Participants already possessing the condition of interest form the “case,” while those without form the “control.”

Keep in mind that by definition the case group is chosen because they already possess the attribute of interest. The point of the control group is to facilitate investigation, e.g., studying whether the case group systematically exhibits that attribute more than the control group does.

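The strength of the association between an exposure and a disease in a case-control study is most commonly measured with the odds ratio (OR). Because participants are sampled according to their outcome status, the relative risk (RR) generally cannot be computed directly, although the OR approximates it when the disease is rare.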
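As a minimal sketch of how an odds ratio is obtained from a single 2×2 table, the example below uses hypothetical counts and attaches an approximate 95% confidence interval using the log (Woolf) method. The choice of interval method and the function name are assumptions made for illustration, not something the article specifies.

```python
import math
from typing import Tuple


def odds_ratio_with_ci(exposed_cases: int, unexposed_cases: int,
                       exposed_controls: int, unexposed_controls: int,
                       z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio and approximate confidence interval (Woolf log method).

    All four cell counts must be greater than zero.
    """
    a, b, c, d = (exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls)
    if min(a, b, c, d) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical study: 100 cases and 100 controls.
or_estimate, ci_lower, ci_upper = odds_ratio_with_ci(40, 60, 20, 80)
print(f"OR = {or_estimate:.2f} (95% CI {ci_lower:.2f} to {ci_upper:.2f})")
```

An odds ratio above 1 with an interval that excludes 1 suggests that exposure is more common among cases than controls, but it does not by itself establish causation.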

No, case-control studies cannot establish causality as a standalone measure.

As observational studies, they can suggest associations between an exposure and a disease, but they cannot prove without a doubt that the exposure causes the disease. In particular, issues arising from timing, research biases like recall bias, and the selection of variables lead to low internal validity and the inability to determine causality.

Sources in this article

We strongly encourage students to use sources in their work. You can cite our article (APA Style) or take a deep dive into the articles below.

George, T. (2023, June 22). What Is a Case-Control Study? | Definition & Examples. Scribbr. Retrieved February 15, 2024, from https://www.scribbr.com/methodology/case-control-study/
Schlesselman, J. J. (1982). Case-Control Studies: Design, Conduct, Analysis (Monographs in Epidemiology and Biostatistics, 2) (Illustrated). Oxford University Press.


Case-control studies: design, conduct, analysis

by James J. Schlesselman

  • ★ ★ ★ ★ ★ 5.00 ·
  • 26 Want to read
  • 1 Currently reading
  • 2 Have read


Book Details

Edition notes

Bibliography: p. [325]-343. Includes index.


THE CDC FIELD EPIDEMIOLOGY MANUAL

Designing and Conducting Analytic Studies in the Field

Brendan R. Jackson and Patricia M. Griffin

Analytic studies can be a key component of field investigations, but beware of an impulse to begin one too quickly. Studies can be time- and resource-intensive, and a hastily constructed study might not answer the correct questions. For example, in a foodborne disease outbreak investigation, if the culprit food is not on your study’s questionnaire, you probably will not be able to implicate it. Analytic studies typically should be used to test hypotheses, not generate them. However, in certain situations, collecting data quickly about patients and a comparison group can be a way to explore multiple hypotheses. In almost all situations, generating hypotheses before designing a study will help you clarify your study objectives and ask better questions.

  • Generating Hypotheses
  • Study Designs for Testing Hypotheses
  • Types of Observational Studies for Testing Hypotheses
  • Selection of Controls in Case–Control Studies
  • Matching in Case–Control Studies
  • Example: Using an Analytic Study to Solve an Outbreak at a Church Potluck Dinner (But Not That Church Potluck)
  • Outbreaks with Universal Exposure

The initial steps of an investigation, described in previous chapters, are some of your best sources of hypotheses. Key activities include the following:

  • By examining the sex distribution among persons in outbreaks, US enteric disease investigators have learned to suspect a vegetable as the source when most patients are women. (Of course, generalizations do not always hold true!)
  • In an outbreak of bloodstream infections caused by Serratia marcescens among patients receiving parenteral nutrition (food administered through an intravenous catheter), investigators had a difficult time finding the source until they noted that none of the 19 cases were among children. Further investigation of the parenteral nutrition administered to adults but not children in that hospital identified contaminated amino acid solution as the source (1).
  • Focus on outliers. Give extra attention to the earliest and latest cases on an epidemic curve and to persons who recently visited the neighborhood where the outbreak is occurring. Interviews with these patients can yield important clues (e.g., by identifying the index case, secondary case, or a narrowed list of common exposures).
  • Determine sources of similar outbreaks. Consult health department records, review the literature, and consult experts to learn about previous sources. Be mindful that new sources frequently occur, given ever-changing social, behavioral, and commercial trends.
  • Conduct a small number of in-depth, open-ended interviews. When a likely source is not quickly evident, conducting in-depth (often >1 hour), open-ended interviews with a subset of patients (usually 5 to 10) or their caregivers can be the best way to identify possible sources. It helps to begin with a semistructured list of questions designed to help the patient recall the events and exposures of every day during the incubation period. The interview can end with a “shotgun” questionnaire (see activity 6) ( Box 7.1 ). A key component of this technique is that one investigator ideally conducts, or at least participates in, as many interviews as possible (five or more) because reading notes from others’ interviews is no substitute for soliciting and hearing the information first-hand. For example, in a 2009 Escherichia coli O157 outbreak, investigators were initially unable to find the source through general and targeted questionnaires. During open-ended interviews with five patients, the interviewer noted that most reported having eaten strawberries, a particular type of candy, and uncooked prepackaged cookie dough. An analytic study was then conducted that included questions about these exposures; it confirmed cookie dough as the source ( 3 ).
  • Ask patients what they think. Patients can have helpful thoughts about the source of their illness. However, be aware that patients often associate their most recent food exposure (e.g., a meal) with illness, whereas the inciting exposure might have been long before.
  • Consider administering a shotgun questionnaire. Such questionnaires, which typically ask about hundreds of possible exposures, are best used on a limited number of patients as part of hypothesis-generating interviews. After generating hypotheses, investigators can create a questionnaire targeted to that investigation. Although not an ideal method, shotgun questionnaires can be used by multiple interviewers to obtain data about large numbers of patients ( Box 7.1 ).

In November 2014, a US surveillance system for foodborne diseases (PulseNet) detected a cluster (i.e., a possible outbreak) of listeriosis cases based on similar-appearing Listeria monocytogenes isolates by pulsed-field gel electrophoresis of the isolates. No suspected foods were identified through routine patient interviews by using a Listeria -specific questionnaire with approximately 40 common food sources of listeriosis (e.g., soft cheese and deli meat). The outbreak’s descriptive epidemiology offered no clear leads: the sex distribution was nearly even, the age spectrum was wide, and the case-fatality rate of approximately 20% was typical. Notably, however, 3 of the 35 cases occurred among previously healthy school-aged children, which is highly unusual for listeriosis. Most cases occurred during late October and early November.

Investigators began reinterviewing patients by using a hypothesis-generating shotgun questionnaire with more than 500 foods, but it did not include caramel apples. By comparing the first nine patient responses with data from a published survey of food consumption, strawberries and ice cream emerged as hypotheses. However, several interviewed patients denied having eaten these foods during the month before illness. An investigator then conducted lengthy, open-ended interviews with patients and their family members. During one interview, he asked about special foods eaten during recent holidays, and the patient’s wife replied that her husband had eaten prepackaged caramel apples around Halloween. Although produce items had been implicated in past listeriosis outbreaks, caramel apples seemed an unlikely source. However, the interviewer took note of this connection because he had previously interviewed another patient who reported having eaten caramel apples. This event underscores the importance of one person conducting multiple interviews because that person might make subtle mental connections that may be missed when reviewing other interviewers’ notes. In fact, several other investigators listening to the interview noted this exposure—among hundreds of others—but thought little of it.

In this investigation, the finding of high strawberry and ice cream consumption among patients, coupled with the timing of the outbreak during a holiday period, helped make a sweet food (i.e., caramel apples) seem more plausible as the possible source.

To explore the caramel apple hypothesis, investigators asked five other patients about this exposure, and four reported having eaten them. On the basis of these initial results, investigators designed and administered a targeted questionnaire to patients involved in the outbreak, as well as to patients infected with unrelated strains of L. monocytogenes (i.e., a case–case study). This study, combined with testing of apples and the apple packing facility, confirmed that caramel apples were the source (2). Had a single interviewer performed multiple open-ended interviews to generate hypotheses before the shotgun questionnaire, the outbreak might have been solved sooner.

As evident in public health and clinical guidelines, randomized controlled trials (e.g., trials of drugs, vaccines, and community-level interventions) are the reference standard for epidemiology, providing the highest level of evidence. However, such studies are not possible in certain situations, including outbreak investigations. Instead, investigators must rely on observational studies, which can provide sufficient evidence for public health action. In observational studies, the epidemiologist documents rather than determines the exposures, quantifying the statistical association between exposure and disease. Here again, the key when designing such studies is to obtain a relevant comparison group for the patients ( Box 7.2 ).

Because field analytic studies are used to quantify the association between exposure and disease, defining what is meant by exposure and disease is essential. Exposure is used broadly, meaning demographic characteristics, genetic or immunologic makeup, behaviors, environmental exposures, and other factors that might influence a person’s risk for disease. Because precise information can help accurately estimate an exposure’s effect on disease, exposure measures should be as objective and standard as possible. Developing a measure of exposure can be conceptually straightforward for an exposure that is a relatively discrete event or characteristic—for example, whether a person received a spinal injection with steroid medication compounded at a specific pharmacy or whether a person received a typhoid vaccination during the year before international travel. Although these exposures might be straightforward in theory, they can be subject to interpretation in practice. Should a patient injected with a medication from an unknown pharmacy be considered exposed? Whatever decision is made should be documented and applied consistently.

Additionally, exposures often are subject to the whims of memory. Memory aids (e.g., restaurant menus, vaccination cards, credit card receipts, and shopper cards) can be helpful. More than just a binary yes or no, the dose of an exposure can also be enlightening. For example, in an outbreak of fungal bloodstream infections linked to contaminated intravenous saline flushes administered at an oncology clinic, affected patients had received a greater number of flushes than unaffected patients ( 4 ). Similarly, in an outbreak of Listeria monocytogenes infections, the association with deli meat became apparent only when the exposure evaluated was consumption of deli meat more than twice a week ( 5 ).

Defining disease (e.g., does a person have botulism?) might sound simple, but often it is not; read more about making and applying disease case definitions in Chapter 3.

Three types of observational studies are commonly used in the field. All are best performed by using a standard questionnaire specific for that investigation, developed on the basis of hypothesis-generating interviews.

Observational Study Type 1: Cohort

In concept, a cohort study, like an experimental study, begins with a group of persons without the disease under study, but with different exposure experiences, and follows them over time to find out whether they experience the disease or health condition of interest. However, in a cohort study, each person’s exposure is merely recorded rather than assigned randomly by the investigator. Then the occurrence of disease among persons with different exposures is compared to assess whether the exposures are associated with increased risk for disease. Cohort studies can be prospective or retrospective.

Prospective Cohort Studies

A prospective cohort study enrolls participants before they experience the disease or condition of interest. The enrollees are then followed over time for occurrence of the disease or condition. The unexposed or lowest exposure group serves as the comparison group, providing an estimate of the baseline or expected amount of disease. An example of a prospective cohort study is the Framingham Heart Study. By assessing the exposures of an original cohort of more than 5,000 adults without cardiovascular disease (CVD), beginning in 1948 and following them over time, the study was the first to identify common CVD risk factors (6). Each case of CVD identified after enrollment was counted as an incident case. Incidence was then quantified either as the number of cases divided by the sum of time that each person was followed (incidence rate) or as the number of cases divided by the number of participants being followed (attack rate, risk, or incidence proportion). In field epidemiology, prospective cohort studies also often involve a group of persons who have had a known exposure (e.g., survived the World Trade Center attack on September 11, 2001 [7]) and who are then followed to examine the risk for subsequent illnesses with long incubation or latency periods.
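
The two ways of quantifying incidence described above differ only in the denominator. The following is a minimal sketch of that arithmetic; the follow-up records are hypothetical and are not taken from the Framingham Heart Study or any other study.

```python
# Hypothetical follow-up records: (years followed, developed disease?) per person.
follow_up = [
    (10.0, False),
    (4.5, True),
    (10.0, False),
    (7.0, True),
    (8.5, False),
]

cases = sum(1 for _, ill in follow_up if ill)
person_years = sum(years for years, _ in follow_up)

# Incidence rate: cases divided by the total time that persons were followed.
incidence_rate = cases / person_years

# Incidence proportion (attack rate, risk): cases divided by persons followed.
incidence_proportion = cases / len(follow_up)

print(f"Incidence rate: {incidence_rate:.3f} cases per person-year")
print(f"Incidence proportion: {incidence_proportion:.0%}")
```

The rate accounts for unequal follow-up time, whereas the proportion treats every participant as followed for the same period.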

Retrospective Cohort Studies

A retrospective cohort study enrolls a defined participant group after the disease or condition of interest has occurred. In field epidemiology, these studies are more common than prospective studies. The population affected is often well-defined (e.g., banquet attendees, a particular school’s students, or workers in a certain industry). Investigators elicit exposure histories and compare disease incidence among persons with different exposures or exposure levels.

Observational Study Type 2: Case–Control

In a case–control study, the investigator must identify a comparison group of control persons who have had similar opportunities for exposure as the case-patients. Case–control studies are commonly performed in field epidemiology when a cohort study is impractical (e.g., no defined cohort or too many non-ill persons in the group to interview). Whereas a cohort study proceeds conceptually from exposure to disease or condition, a case–control study begins conceptually with the disease or condition and looks backward at exposures. Excluding controls by symptoms alone might not guarantee that they do not have mild cases of the illness under investigation. Table 7.1 presents selected key differences between a case–control and retrospective cohort study.

Observational Study Type 3: Case–Case

In case–case studies, a group of patients with the same or similar disease serve as a comparison group (8). This method might require molecular subtyping of the suspected pathogen to distinguish outbreak-associated cases from other cases and is especially useful when relevant controls are difficult to identify. For example, controls for an investigation of Listeria illnesses typically are patients with immunocompromising conditions (e.g., cancer or corticosteroid use) who might be difficult to identify among the general population. Patients with Listeria isolates of a different subtype than the outbreak strain can serve as comparisons to help reduce bias when comparing food exposures. However, patients with similar illnesses can have similar exposures, which can introduce a bias, making identifying the source more difficult. Moreover, other considerations should influence the choice of a comparison group. If most outbreak-associated case-patients are from a single neighborhood or are of a certain race/ethnicity, other patients with listeriosis from across the country will serve as an inadequate comparison group.

Considerations for Selecting Controls

Selecting relevant controls is one of the most important considerations when designing a case–control study. Several key considerations are presented here; consult other resources for in-depth discussion (9,10). Ideally, controls should:

  • Thoroughly reflect the source population from which case-patients arose, and
  • Provide a good estimate of the level of exposure one would expect from that population. Sometimes the source population is not so obvious, and a case–control study using controls from the general population might be needed to implicate a general exposure (e.g., visiting a specific clinic, restaurant, or fair). The investigation can then focus on specific exposures among persons with the general exposure (see also next section).

Controls should be chosen independently of any specific exposure under evaluation. If you select controls on the basis of lack of exposure, you are likely to find an association between illness and that exposure regardless of whether one exists. Also important is selecting controls from a source population in a way that minimizes confounding (see Chapter 8 ), which is the existence of a factor (e.g., annual income) that, by being associated with both exposure and disease, can affect the associations you are trying to examine.

When trying to enroll controls who reflect the source population, try to avoid overmatching (i.e., enrolling controls who are too similar to case-patients, resulting in fewer differences among case-patients and controls than ought to exist and decreased ability to identify exposure–disease associations). When conducting case–control studies in hospitals and other healthcare settings, ensure that controls do not have other diseases linked to the exposure under study.

Commonly Used Control Selection Methods

When an outbreak does not affect a defined population (e.g., potluck dinner attendees) but rather the community at large, a range of options can be used to determine how to select controls from a large group of persons.

  • Random-digit dialing. This method, which involves selecting controls by using a system that randomly selects telephone numbers from a directory, has been a staple of US outbreak investigations. In recent years, however, declining response rates because of increasing use of caller identification and cellular phones and lack of readily available directory listings of cellular phone numbers by geographic area have made this method increasingly difficult. Even when this method was most useful, often 50 or more numbers needed to be dialed to reach one household or person who both answered and provided a usable match for the case-patient. Commercial databases that include cellular phone numbers have been used successfully to partially address this problem, but the method remains time-consuming (11).
  • Random or systematic sampling from a list. For investigations in settings where a roster is available (e.g., attendees at a resort on certain dates), controls can be selected by either random or systematic sampling. Government records (e.g., motor vehicle, voter, or tax records) can provide lists of possible controls, but they might not be representative of the population being studied (11). For random sampling, a table or computer-generated list of random numbers can be used to select persons to contact; for systematic sampling, every nth person on the list is selected (e.g., every 12th or 13th).
  • Neighborhood. Recruiting controls from the same neighborhood as case-patients (i.e., neighborhood matching) has commonly been used during case–control studies, particularly in low- and middle-income countries. For example, during an outbreak of typhoid fever in Tajikistan (12), investigators recruited controls by going door-to-door down a street, starting at a case-patient’s house; a study of cholera in Haiti used a similar method (13). Typically, the immediately neighboring households are skipped to prevent overmatching.
  • Patients’ friends or relatives. Using friends and relatives as controls can be an effective technique when the characteristics of case-patients (e.g., very young children) make finding controls by a random method difficult. Typically, the investigator interviews a patient or his or her parent, then asks for the names and contact information for more friends or relatives who are needed as controls. One advantage is that the friends of an ill person are usually willing to participate, knowing their cooperation can help solve the puzzle. However, because they can have similar personal habits and preferences as patients, their exposures might be similar. Such overmatching can decrease the likelihood of finding the source of the illness or condition.
  • Databases of persons with exposure information. Sources of data on persons with exposure information include survey data (e.g., FoodNet Population Survey [14]), public health databases of patients with other illnesses or a different subtype of the same illness, and previous studies. (Chapter 4 describes additional sources.)
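
For the roster-based method in the list above, the difference between random and systematic sampling can be sketched in a few lines. This is only an illustration: the roster, the function names, and the choice of a random starting point for systematic sampling are assumptions, not procedures specified in the chapter.

```python
import random


def simple_random_sample(roster, k, seed=None):
    """Pick k distinct persons at random from the roster."""
    rng = random.Random(seed)
    return rng.sample(roster, k)


def systematic_sample(roster, k, seed=None):
    """Pick every nth person after a random start, with n = len(roster) // k."""
    step = len(roster) // k
    if step < 1:
        raise ValueError("roster is smaller than the number of controls requested")
    rng = random.Random(seed)
    start = rng.randrange(step)  # random starting position within the first interval
    return roster[start::step][:k]


# Hypothetical roster of 120 resort attendees.
roster = [f"attendee_{i:03d}" for i in range(1, 121)]

random_controls = simple_random_sample(roster, 10, seed=1)
systematic_controls = systematic_sample(roster, 10, seed=1)

print(random_controls)
print(systematic_controls)  # every 12th attendee from a random start
```

Systematic sampling is easy to carry out by hand on a paper roster, but it can mislead if the list order follows a pattern related to exposure (e.g., seating by table), in which case random sampling is the safer choice.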

When considering outside data sources, investigators must determine whether those data provide an appropriate comparison group. For example, persons in surveys might differ from case-patients in ways that are impossible to determine. Other patients might be so similar to case-patients that risky exposures are unidentifiable, or they might be so different that exposures identified as risks are not true risks.

To help control for confounding, controls can be matched to case-patients on characteristics specified by investigators, including age group, sex, race/ethnicity, and neighborhood. Such matching does not itself reduce confounding, but it enables greater efficiency when matched analyses are performed that do ( 15 ). When deciding to match, however, be judicious. Matching on too many characteristics can make controls difficult to find (making a tough process even harder). Imagine calling hundreds of random telephone numbers trying to find a man of a particular ethnicity aged 50–54 years who is then willing to answer your questions. Also, remember not to match on the exposure of interest or on any other characteristic you wish to examine. Matched case–control study data typically necessitate a matched analysis (e.g., conditional logistic regression) ( 15 ).

Matching Types

The two main types of matching are pair matching and frequency matching.

Pair Matching

In pair matching, each control is matched to a specific case-patient. This method can be helpful logistically because it allows matching by friends or relatives, neighborhood, or telephone exchange, but finding controls who meet specific criteria can be burdensome.
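
Pair-matched data call for a matched analysis. For the simplest situation (1:1 pairs and a single yes/no exposure), the conditional estimate of the odds ratio reduces to the ratio of discordant pairs, which the sketch below computes. The pair counts are hypothetical, and a real investigation with several exposures or confounders would more likely use conditional logistic regression in a statistical package.

```python
def matched_pair_odds_ratio(pairs):
    """Estimate the odds ratio from 1:1 matched pairs.

    Each pair is (case_exposed, control_exposed). Only discordant pairs
    contribute: pairs where exactly one member was exposed.
    """
    case_only = sum(1 for case, control in pairs if case and not control)
    control_only = sum(1 for case, control in pairs if control and not case)
    if control_only == 0:
        raise ValueError("no pairs with only the control exposed; estimate undefined")
    return case_only / control_only


# Hypothetical data for 20 matched pairs.
pairs = (
    [(True, False)] * 8     # only the case-patient was exposed
    + [(False, True)] * 2   # only the control was exposed
    + [(True, True)] * 5    # both exposed (concordant, ignored)
    + [(False, False)] * 5  # neither exposed (concordant, ignored)
)

print(matched_pair_odds_ratio(pairs))
```

Concordant pairs carry no information about the association in this analysis, which is one reason overmatching reduces the ability to detect a source.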

Frequency Matching

In frequency matching, also called category matching, controls are matched to case-patients in proportion to the distribution of a characteristic among case-patients. For example, if 20% of case-patients are children aged 5–18 years, 50% are adults aged 19–49 years, and 30% are adults 50 years or older, controls should be enrolled in similar proportions. This method works best when most case-patients have been identified before control selection begins. It is more efficient than pair matching because a person identified as a possible control who might not meet the criteria for matching a particular case-patient might meet criteria for one of the case-patient groups.
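
The age-group example above translates into enrollment targets for each category. The following sketch assumes hypothetical case counts that follow the 20%/50%/30% split; the function name and the rounding approach (largest remainder) are illustrative choices, not part of the chapter.

```python
def control_quotas(case_counts, n_controls):
    """Split n_controls across categories in proportion to the case-patients."""
    n_cases = sum(case_counts.values())
    raw = {group: n_controls * count / n_cases for group, count in case_counts.items()}
    quotas = {group: int(value) for group, value in raw.items()}
    # Hand out any controls lost to rounding down, largest remainder first.
    leftover = n_controls - sum(quotas.values())
    by_remainder = sorted(raw, key=lambda g: raw[g] - quotas[g], reverse=True)
    for group in by_remainder[:leftover]:
        quotas[group] += 1
    return quotas


# Hypothetical outbreak with 40 case-patients: 20%, 50%, and 30% by age group.
case_counts = {"5-18 years": 8, "19-49 years": 20, "50+ years": 12}

print(control_quotas(case_counts, n_controls=80))
```

With 80 controls the targets are 16, 40, and 24, mirroring the distribution among case-patients.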

Number of Controls

Most field case–control studies use control-to-case-patient ratios of 1:1, 2:1, or 3:1. Enrolling more than one control per case-patient can increase study power, which might be needed to detect a statistically significant difference in exposure between case-patients and controls, particularly when an outbreak involves a limited number of cases. The incremental gain of adding more controls beyond three or four is small because study power begins to plateau. Note that not all case-patients need to have the same number of controls. Sample size calculations can help in estimating a target number of controls to enroll, although sample sizes in certain field investigations are limited more by time and resource constraints. Still, estimating study power under a range of scenarios is wise because an analytic study might not be worth doing if you have little chance of detecting a statistically significant association. Sample size calculators for unmatched case–control studies are available at http://www.openepi.com and in the StatCalc function of Epi Info ( https://www.cdc.gov/epiinfo ).
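
The plateau in study power can be illustrated with a widely used rule of thumb that is not stated in this chapter and is introduced here as an assumption: when the association is modest, a design with m controls per case-patient has roughly m/(m + 1) of the statistical efficiency of a design with unlimited controls. The sketch below only tabulates that approximation; it is not a substitute for a sample size calculation.

```python
def relative_efficiency(controls_per_case):
    """Approximate efficiency relative to unlimited controls: m / (m + 1)."""
    m = controls_per_case
    return m / (m + 1)


previous = 0.0
for m in range(1, 7):
    efficiency = relative_efficiency(m)
    gain = efficiency - previous
    print(f"{m}:1 ratio -> efficiency {efficiency:.2f} (gain {gain:.2f})")
    previous = efficiency
```

Under this approximation, moving from one to two controls per case-patient adds far more than moving from four to five, which is consistent with the small incremental gain beyond three or four controls described above.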

More than One Control Group

Sometimes the choice of a control group is so vexing that investigators decide to use more than one type of control group (e.g., a hospital-based group and a community group). If the two control groups provide similar results and conclusions about risk factors for disease, the credibility of the findings is increased. In contrast, if the two control groups yield conflicting results, interpretation becomes more difficult.

Since the 1940s, field epidemiology students have studied a classic outbreak of gastrointestinal illness at a church potluck dinner in Oswego, New York ( 16 ). However, the case study presented here, used to illustrate study designs, is a different potluck dinner.

In April 2015, an astute neurologist in Lancaster, Ohio, contacted the local health department about a patient in the emergency department with a suspected case of botulism. Within 2 hours, four more patients arrived with similar symptoms, including blurred vision and shortness of breath. Health officials immediately recognized this as a botulism outbreak.

  • If the source is a widely distributed commercial product, then the population to study is persons across the United States and possibly abroad.
  • If the source is airborne, then the population to study is residents of a single city or area.
  • If the source is food from a restaurant, then the population to study is predominantly local residents and some travelers.
  • If the source is a meal at a workplace or social setting, then the population to study is meal attendees.
  • If the source is a meal at home, then the population to study is household members and any guests.

Descriptive epidemiology and questioning of the case-patients revealed that all had eaten at the same church potluck dinner and had no other common exposures, making the potluck the likely exposure site and attendees the likely source population. Thus, an analytic study would be targeted at potluck attendees, although investigators must remain alert to case-patients among nonattendees. As initial interviews were conducted, more cases of botulism were being diagnosed, quickly increasing to more than 25. The source of the outbreak needed to be identified rapidly to halt further exposure and illness.

  • List of foods served at the potluck.
  • Approximate number of attendees.
  • A case definition.
  • Information from 5–10 hypothesis-generating interviews with a few case-patients or their family members.
  • A cohort study would be a reasonable option because a defined group exists (i.e., a cohort) of exposed persons who could be interviewed in a reasonable amount of time. The study would be retrospective because the outcome (i.e., botulism) has already occurred, and investigators could assess exposures retrospectively (i.e., foods eaten at the potluck) by interviewing attendees.
  • In a cohort study, investigators can calculate the attack rate for botulism among potluck attendees who reported having eaten each food and for those who had not. For example, if 20 of the 30 attendees who had eaten a particular food (e.g., potato salad) had botulism, you would calculate the attack rate by dividing 20 (corresponding to cell a in Handout 7.1) by 30 (total exposed, or a + b), yielding approximately 67%. If 5 of the 45 attendees who had not eaten potato salad had botulism, the attack rate among the unexposed (5/45, corresponding to c/(c + d)) would be approximately 11%. The risk ratio would be 6, which is calculated by dividing the attack rate among the exposed (67%) by the attack rate among the unexposed (11%).
  • A case–control study would be the most feasible option because the entire cohort could not be identified and because the large number of attendees could make interviewing them all difficult. Rather than interview all non-ill persons, a subset could be interviewed as control subjects.
  • The method of control subject selection should be considered carefully. If not all attendees are interviewed, determining the risk for botulism among the exposed and unexposed is impossible because investigators would not know the exposures for all non-ill attendees. Instead of risk, investigators calculate the odds of exposure, which can approximate risk. For example, if 20 (80%) of 25 case-patients had eaten potato salad, the odds of potato salad exposure among case-patients would be 20/5 = 4 (exposed/unexposed, or a/c in Handout 7.2). If 10 (20%) of 50 selected controls had eaten potato salad, the odds of exposure among control subjects would be 10/40 = 0.25 (or b/d in Handout 7.2). Dividing the odds of exposure among the case-patients (a/c) by the odds of exposure among control subjects (b/d) yields an odds ratio of 16 (4/0.25). The odds ratio is not a true measure of risk, but it can be used to implicate a food. An odds ratio can approximate a risk ratio when the outcome or disease is rare (e.g., roughly <5% of a population). In such cases, a/b is similar to a/(a + b). The odds ratio is typically higher than the risk ratio when >5% of exposed persons in the analysis have the illness.
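
The two worked potato salad examples above can be checked with a short sketch. The function names are illustrative; the cell labels a, b, c, and d follow the handouts as described in the text (a and c are ill or case-patients, b and d are non-ill or controls; a and b are exposed, c and d are unexposed).

```python
def risk_ratio(a, b, c, d):
    """Cohort study: attack rate in exposed divided by attack rate in unexposed."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Case-control study: (a/c) / (b/d), which simplifies to ad / bc."""
    return (a * d) / (b * c)


# Cohort example: 20 of 30 exposed ill, 5 of 45 unexposed ill.
cohort_rr = risk_ratio(a=20, b=10, c=5, d=40)

# Case-control example: 20 of 25 case-patients exposed, 10 of 50 controls exposed.
case_control_or = odds_ratio(a=20, b=10, c=5, d=40)

# Hypothetical rare outcome: 2 of 100 exposed ill, 1 of 100 unexposed ill.
rare_rr = risk_ratio(a=2, b=98, c=1, d=99)
rare_or = odds_ratio(a=2, b=98, c=1, d=99)

print(f"Cohort risk ratio: {cohort_rr:.1f}")
print(f"Case-control odds ratio: {case_control_or:.1f}")
print(f"Rare outcome: risk ratio {rare_rr:.2f}, odds ratio {rare_or:.2f}")
```

The two examples happen to fill the four cells with the same counts, so they also show how far the odds ratio (16) can sit from the risk ratio (6) when illness is common. In the hypothetical rare-outcome table the two measures nearly coincide.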

In the actual outbreak, 29 (38%) of 77 potluck attendees had botulism. The investigators performed a cohort study, interviewing 75 of the 77 attendees about 52 foods served ( 17 ). The attack rate among persons who had eaten potato salad was significantly and substantially higher than the attack rate among those who had not, with a risk ratio of 14 (95% confidence interval 5–42). One of the potato salads served was made with incorrectly home-canned potatoes (a known source of botulinum toxin), and samples of discarded potato salad tested positive for botulinum toxin, supporting the findings of the analytic study. (Of note, persons often blame potato salad for causing illness when, in fact, it rarely is a source. This outbreak was a notable exception.)
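
The chapter reports a 95% confidence interval for the risk ratio but does not say how it was computed. One common approximation, assumed here purely for illustration, is a Wald interval on the log scale. The counts below are the hypothetical potato salad numbers from the earlier worked example, not the actual outbreak data, so the result is not expected to reproduce the published interval.

```python
import math


def risk_ratio_ci(a, b, c, d, z=1.96):
    """Risk ratio with an approximate 95% CI from a log-scale Wald interval.

    a, b: ill and non-ill among the exposed; c, d: ill and non-ill among the unexposed.
    """
    rr = (a / (a + b)) / (c / (c + d))
    se_log_rr = math.sqrt(1 / a - 1 / (a + b) + 1 / c - 1 / (c + d))
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


rr, lower, upper = risk_ratio_ci(a=20, b=10, c=5, d=40)
print(f"Risk ratio {rr:.1f} (95% CI {lower:.1f} to {upper:.1f})")
```

This approximation becomes unreliable when any cell count is very small or zero; exact or other adjusted methods are generally preferred in that situation.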

In field epidemiology, the link between exposure and illness is often so strong that it is evident despite such inherent study limitations as small sample size and exposure misclassification. In this outbreak, a few of the patients with botulism reported not having eaten potato salad, and some of the attendees without botulism reported having eaten it. In epidemiologic studies, you rarely find 100% concordance between exposure and outcome for various reasons, including incomplete or erroneous recall because remembering everything eaten is difficult. Here, cross-contamination of potato salad with other foods might have helped explain cases among patients who had not eaten potato salad because only a small amount of botulinum toxin is needed to produce illness.

Two-by-Two Table to Calculate the Relative Risk, or Risk Ratio, in Cohort Studies

Two-by-two tables are covered in more detail in Chapter 8.

                 Ill    Not ill    Total
Exposed          a      b          a + b
Unexposed        c      d          c + d

$$\text{Risk ratio} = \frac{\text{Incidence in exposed}}{\text{Incidence in unexposed}} = \frac{a/(a+b)}{c/(c+d)}$$

Two-by-Two Table to Calculate the Odds Ratio in Case–Control Studies

A risk ratio cannot be calculated from a case–control study because true attack rates cannot be calculated.

                 Case-patients    Controls
Exposed          a                b
Unexposed        c                d

$$\text{Odds ratio} = \frac{\text{Odds of exposure in cases}}{\text{Odds of exposure in controls}} = \frac{a/c}{b/d} = \frac{ad}{bc}$$

What kind of study would you design if your hypothesis-generating interviews lead you to believe that everyone, or nearly everyone, was exposed to the same suspected infection source? How would you test hypotheses if all barbecue attendees, ill and non-ill, had eaten the chicken or if all town residents had drunk municipal tap water, and no unexposed group exists for comparison? A few factors that might be of help are the exposure timing (e.g., a particularly undercooked batch of barbecue), the exposure place (e.g., a section of the water system more contaminated than others), and the exposure dose (e.g., number of chicken pieces eaten or glasses of water drunk). Including questions about the time, place, and frequency of highly suspected exposures in a questionnaire can improve the chances of detecting a difference (18).
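
When everyone was exposed, comparing attack rates across dose levels is one way to use the dose questions described above. The sketch below tabulates attack rates by dose; the questionnaire responses and the function name are hypothetical and serve only to show the calculation.

```python
from collections import defaultdict


def attack_rate_by_dose(responses):
    """Return {dose: attack rate} from (dose, became_ill) questionnaire responses."""
    totals = defaultdict(int)
    ill = defaultdict(int)
    for dose, became_ill in responses:
        totals[dose] += 1
        ill[dose] += int(became_ill)
    return {dose: ill[dose] / totals[dose] for dose in sorted(totals)}


# Hypothetical responses: (pieces of chicken eaten, became ill?)
responses = [
    (1, False), (1, False), (1, True), (1, False),
    (2, True), (2, False), (2, True), (2, False),
    (3, True), (3, True), (3, True), (3, False),
]

for dose, rate in attack_rate_by_dose(responses).items():
    print(f"{dose} piece(s): attack rate {rate:.0%}")
```

An attack rate that rises with dose supports the suspected source even though no unexposed group exists. The lowest-dose group can then serve as the comparison group for a risk ratio.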

Cohort, case–control, and case–case studies are the types of analytic studies that field epidemiologists use most often. They are best used as mechanisms for evaluating—quantifying and testing—hypotheses identified in earlier phases of the investigation. Cohort studies, which are oriented conceptually from exposure to disease, are appropriate in settings in which an entire population is well-defined and available for enrollment (e.g., guests at a wedding reception). Cohort studies are also appropriate when well-defined groups can be enrolled by exposure status (e.g., employees working in different parts of a manufacturing plant). Case–control studies, in contrast, are useful when the population is less clearly defined. Case–control studies, oriented from disease to exposure, identify persons with disease and a comparable group of persons without disease (controls). Then the exposure experiences of the two groups are compared. Case–case studies are similar to case–control studies, except that controls have an illness not linked to the outbreak. Case–control studies are probably the type most often appropriate for field investigations. Although conceptually straightforward, the design of an effective epidemiologic study requires many careful decisions. Taking the time needed to develop good hypotheses can result in a questionnaire that is useful for identifying risk factors. The choice of an appropriate comparison group, how many controls per case-patient to enroll, whether to match, and how best to avoid potential biases are all crucial decisions for a successful study.

This chapter relies heavily on the work of Richard C. Dicker, who authored this chapter in the previous edition.

  • Gupta N, Hocevar SN, Moulton-Meissner HA, et al. Outbreak of Serratia marcescens bloodstream infections in patients receiving parenteral nutrition prepared by a compounding pharmacy. Clin Infect Dis. 2014;59:1–8.
  • Angelo K, Conrad A, Saupe A, et al. Multistate outbreak of Listeria monocytogenes infections linked to whole apples used in commercially produced, prepackaged caramel apples: United States, 2014–2015. Epidemiol Infect. 2017;145:848–56.
  • Neil KP, Biggerstaff G, MacDonald JK, et al. A novel vehicle for transmission of Escherichia coli O157:H7 to humans: multistate outbreak of E. coli O157:H7 infections associated with consumption of ready-to-bake commercial prepackaged cookie dough—United States, 2009. Clin Infect Dis. 2012;54:511–8.
  • Vasquez AM, Lake J, Ngai S, et al. Notes from the field: fungal bloodstream infections associated with a compounded intravenous medication at an outpatient oncology clinic—New York City, 2016. MMWR. 2016;65:1274–5.
  • Gottlieb SL, Newbern EC, Griffin PM, et al. Multistate outbreak of listeriosis linked to turkey deli meat and subsequent changes in US regulatory policy. Clin Infect Dis. 2006;42:29–36.
  • Framingham Heart Study: A Project of the National Heart, Lung, and Blood Institute and Boston University. Framingham, MA: Framingham Heart Study; 2017. https://www.framinghamheartstudy.org/
  • Jordan HT, Brackbill RM, Cone JE, et al. Mortality among survivors of the Sept 11, 2001, World Trade Center disaster: results from the World Trade Center Health Registry cohort. Lancet. 2011;378:879–87.
  • McCarthy N, Giesecke J. Case–case comparisons to study causation of common infectious diseases. Int J Epidemiol. 1999;28:764–8.
  • Rothman KJ, Greenland S. Modern epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins; 2008.
  • Wacholder S, McLaughlin JK, Silverman DT, Mandel JS. Selection of controls in case–control studies. I. Principles. Am J Epidemiol. 1992;135:1019–28.
  • Chintapalli S, Goodman M, Allen M, et al. Assessment of a commercial searchable population directory as a means of selecting controls for case–control studies. Public Health Rep. 2009;124:378–83.
  • Centers for Disease Control and Prevention. Epidemiologic case studies: typhoid in Tajikistan. http://www.cdc.gov/epicasestudies/classroom_typhoid.html
  • Dunkle SE, Mba-Jonas A, Loharikar A, Fouche B, Peck M, Ayers T. Epidemic cholera in a crowded urban environment, Port-au-Prince, Haiti. Emerg Infect Dis. 2011;17:2143–6.
  • Centers for Disease Control and Prevention. Foodborne Diseases Active Surveillance Network (FoodNet): population survey. http://www.cdc.gov/foodnet/surveys/population.html
  • Pearce N. Analysis of matched case–control studies. BMJ. 2016;352:1969.
  • Centers for Disease Control and Prevention. Case studies in applied epidemiology: Oswego: an outbreak of gastrointestinal illness following a church supper. http://www.cdc.gov/eis/casestudies.html
  • McCarty CL, Angelo K, Beer KD, et al. Notes from the field: large outbreak of botulism associated with a church potluck meal—Ohio, 2015. MMWR. 2015;64:802–3.
  • Tostmann A, Bousema JT, Oliver I. Investigation of outbreaks complicated by universal exposure. Emerg Infect Dis. 2012;18:1717–22.

  • Ann Indian Acad Neurol
  • v.16(4); Oct-Dec 2013

Design and data analysis case-controlled study in clinical research

Sanjeev V. Thomas

Department of Neurology, Sree Chitra Tirunal Institute for Medical Sciences and Technology, Trivandrum, Kerala, India

Karthik Suresh

1 Department of Pulmonary and Critical Care Medicine, Johns Hopkins University School of Medicine, Louiseville, USA

Geetha Suresh

2 Department of Justice Administration, University of Louisville, Louisville, USA

Clinicians, during their training and practice, are often called upon to conduct studies that explore the association between certain exposures and disease states or between interventions and outcomes. More often, they need to interpret research results published in the medical literature. Case-control studies are one of the most frequently used study designs for these purposes. This paper explains the basic features of case-control studies, the rationale for applying the case-control design (with appropriate examples), and the limitations of this design. The analysis of sensitivity and specificity, along with templates for calculating various ratios, is explained with user-friendly tables and calculations. Interpreting some laboratory results requires sound knowledge of the various risk ratios and of positive and negative predictive values for correct, unbiased analysis. A major advantage of case-control studies is that they are small and retrospective, and so they are more economical than cohort studies and randomized controlled trials.

Introduction

Clinicians think of a case-control study when they want to ascertain the association between a clinical condition and an exposure, or when a researcher wants to compare patients with a disease who were exposed to risk factors with a non-exposed control group. In other words, a case-control study compares subjects who have the disease or outcome (cases) with subjects who do not have the disease or outcome (controls). Historically, case-control studies came into fashion in the early 20th century, when great interest arose in the role of environmental factors (such as pipe smoke) in the pathogenesis of disease. In the 1950s, case-control studies were used to link cigarette smoke and lung cancer. Case-control studies look back in time to compare “what happened” in each group and so determine the relationship between the risk factor and the disease. The case-control study has important advantages, including cost and ease of deployment. However, it is important to note that a positive relationship between exposure and disease does not imply causality.

At the center of the case-control study is a collection of cases [Figure 1]. This explains why this type of study is often used to study rare diseases, where the prevalence of the disease may not be high enough to permit a cohort study. A cohort study identifies patients with and without an exposure and then “looks forward” to see whether a greater number of patients with the exposure develop the disease.

Figure 1: Comparison of cohort and case-control studies

For instance, Yang et al. studied antiepileptic drug (AED)-associated rashes in Asians in a case-control study.[1] They collected cases of confirmed AED-induced severe cutaneous reactions (such as Stevens-Johnson syndrome) and then, using appropriate controls, analyzed various exposures (including the type of AED used) to look for risk factors for developing AED-induced skin disease.

Choosing controls is a very important aspect of case-control study design. The investigator must weigh the need for the controls to be relevant against the tendency to overmatch controls, which can mute potential differences. In general, one may consider three populations: the cases, the relevant control population, and the population at large. For the study above, the cases are patients with AED-induced skin disease, and the relevant control population is a group of Asian patients without skin disease. It is important for controls to be relevant: in the AED study, it would not be appropriate to choose a population across ethnicities, since one of the premises of the paper is a particular susceptibility to AED rashes in Asian populations.

One popular method of choosing controls is to draw them from a geographic population at large. In studying the relationship between non-steroidal anti-inflammatory drugs and Parkinson's disease (PD), Wahner et al. chose a control population from several rural California counties.[2] Other methods of choosing controls include using patients without the disease admitted to the hospital during the study period, neighbors of disease-positive cases, and mail routes to identify disease-negative persons. However, one must be careful not to introduce bias into control selection. For instance, a study that enrolls cases from a clinic population should not use a hospital population as controls. Studies of a geography-specific population (e.g., neurocysticercosis in India) cannot use controls from large studies done in other populations (e.g., registries of patients from countries where disease prevalence may be drastically different from that in India). In general, geographic clustering is probably the easiest way to choose controls for case-control studies.

Two popular sources of controls are hospitalized patients and the general population. Choosing hospitalized, disease-negative patients offers several advantages. Response rates are good: patients admitted to the hospital are already being examined and evaluated and tend to be available for further questioning, whereas response rates in the general population may be much lower. Amnestic bias is possibly lower as well: patients who are already in the hospital are, by default, being asked to remember details of their presenting illnesses and so may more reliably remember details of exposures. However, using hospitalized patients has one large disadvantage: these patients have a higher severity of disease, since they required hospitalization in the first place. In addition, patients may be hospitalized for disease processes that share features with the disease under study, thus confounding results.

Using the general population offers the advantage of a true control group, randomly chosen and without common features that may confound associations. However, disadvantages include poor response rates and bias based on geography. Administering long histories and questions about exposures is often hard in the general population because few people are willing to undergo testing. In addition, choosing controls from the general population of a particular geographic area may bias the group toward certain characteristics (such as socio-economic status) of that area. Consider a study that uses cases from a referral clinic that draws patients from across socio-economic strata. Using a control group selected from a very affluent or very impoverished area may be problematic unless socio-economic status is included in the final analysis.

In case-control studies, cases are usually available before controls. When studying specific diseases, cases are often collected from specialty clinics that see large numbers of patients with a specific disease. Consider, for example, the study by Garwood et al.,[3] which examined patients with established PD for associations between prior amphetamine use and the subsequent development of various neurologic disorders. Patients in this study were chosen from specialty clinics that see large numbers of patients with certain neurologic disorders. Case definitions are very important when planning to choose cases. For instance, in a hypothetical study of peripheral neuropathy, will all patients who carry a diagnosis of peripheral neuropathy be included? Or will only patients with definite electromyographic evidence of neuropathy be included? If a disease process with known histopathology is being studied, will tissue diagnosis be required for all cases? More stringent case definitions that require multiple pieces of data may limit the number of cases that can be used in the study. Less stringent criteria (for instance, counting all patients with the diagnosis of “peripheral neuropathy” listed in the chart) may inadvertently select a group of cases that is too heterogeneous.

The disease history status of the chosen cases must also be decided. Will the cases have newly diagnosed disease, or will cases of ongoing or longstanding disease also be included? Will decedent cases be included? This matters for the following reason. Consider an exposure X that is associated with disease Y, and suppose that X worsens Y, so that patients who are X+ have more severe disease. A case-control study that used only patients with longstanding or ongoing disease might miss a potential association between X and Y because X+ patients, owing to their more aggressive course of disease, are no longer alive and therefore were not included in the analysis. If this particular confounding effect is of concern, it can be circumvented by using incident cases only.

Selection bias occurs when the exposure of interest results in more careful screening of a population, thus mimicking an association. The classic example of this phenomenon was noted in the 1970s, when certain studies reported a relationship between estrogen use and endometrial cancer. However, on close analysis, it was noted that patients who used estrogen were more likely to experience vaginal bleeding, which in turn often prompts close examination by physicians to rule out endometrial cancer. This is often seen with certain drug exposures as well. A drug may produce various symptoms that lead to closer physician evaluation and thus to more disease-positive cases. When analyzed retrospectively, more of the cases may have a particular exposure only because that exposure led to evaluations that resulted in a diagnosis, without any direct association or causality between the exposure and the disease.

One advantage of case-control studies is the ability to study multiple exposures and other risk factors within one study. In addition, the “exposure” being studied can be biochemical in nature. Consider the study that looked at a genetic variant of a kinase enzyme as a risk factor for the development of Alzheimer's disease.[4] In that study, the authors drew blood from cases and controls to assess their polymorphism status. Compare this with the study by Garwood et al. mentioned earlier,[3] where exposure data were collected by surveys and questionnaires. Indeed, more than one exposure can be assessed in the same study, and with planning, a researcher may look at several variables, including biochemical ones, in a single case-control study.

Matching is one of three ways (along with exclusion and statistical adjustment) to adjust for differences. Matching attempts to make sure that the control group is sufficiently similar to the case group with respect to variables such as age and sex. Cases and controls should not be matched on variables that will be analyzed for possible associations with disease. Not only should exposure variables not be used for matching, but neither should variables that are closely related to them. Lastly, overmatching should be avoided. If the control group is too similar to the case group, the study may fail to detect a difference even if one exists. In addition, adding matching categories increases the expense of the study.

Sensitivity and specificity are among the measures that can be derived from case-control studies. These measures help a researcher understand whether subjects are classified correctly. A good understanding of sensitivity and specificity is essential for understanding the receiver operating characteristic curve and for distinguishing the correct classification of positive exposure with disease from that of negative exposure with no disease. Table 1 presents a hypothetical example and the method of calculating sensitivity and specificity.

Table 1: Hypothetical example of sensitivity, specificity and predictive values

Interpretation of sensitivity, specificity and predictive values

Sensitivity and specificity are statistical measures of the performance of a two-by-two classification of cases and controls (sick or healthy) against positives and negatives (exposed or non-exposed).[5] Sensitivity is the proportion of actual positives correctly identified, that is, the percentage of sick people who are correctly identified as sick. Specificity is the proportion of actual negatives correctly identified, that is, the percentage of healthy people who are correctly identified as healthy. Theoretically, optimum prediction aims at 100% sensitivity and 100% specificity with a minimal margin of error. Table 1 also shows the false positive rate, which is referred to as Type I error and commonly denoted α (alpha). It is calculated as 100 − specificity, which equals 100 − 90.80 = 9.20% for the Table 1 example. A Type I error, also known as a false positive error or false alarm, indicates that a condition is present when it is actually not present. In the above example, the false positive error is the percentage of healthy people falsely identified as sick. We want Type I error to be as small as possible because healthy people should not receive treatment.

The false negative rate, which is referred to as Type II error and commonly denoted β (beta), is calculated as 100 − sensitivity, which equals 100 − 73.30 = 26.70% for the Table 1 example. A Type II error, also known as a false negative error, indicates that a condition is not present when it actually is present. In the above example, the false negative error is the percentage of sick people falsely identified as healthy. A Type I error leads to unnecessary treatment of healthy people, which increases the budget, and a Type II error puts sick people at risk, which would act against the study objectives. A researcher wants to minimize both errors, which is not a simple matter because an effort to decrease one type of error increases the other. The only way to reduce both types of error statistically is to increase the sample size, which may be difficult, expensive, or sometimes not feasible. If the sample is too small, the study lacks precision; if it is too large, time and resources are wasted. Hence, the question is how large the sample should be so that the study has the power to generalize its results. In designing the study, the researcher has to decide how large a sample is needed for the study to have enough power to support reliable judgments about the population.
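The four rates discussed above follow directly from the cell counts of a two-by-two classification. The sketch below shows the arithmetic. Because Table 1 is not reproduced in this text, the counts are hypothetical and will not give the 73.30% and 90.80% quoted above. The function name `classification_rates` is also an assumption.

```python
def classification_rates(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Compute sensitivity, specificity and the two error rates.

    tp: sick and test positive      fn: sick and test negative
    fp: healthy and test positive   tn: healthy and test negative
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "false_positive_rate": 1 - specificity,  # called Type I error above
        "false_negative_rate": 1 - sensitivity,  # called Type II error above
    }


# Hypothetical counts, for illustration only.
rates = classification_rates(tp=80, fn=20, fp=30, tn=270)
for name, value in rates.items():
    print(f"{name}: {value:.1%}")
```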

In this framing, statistical power (1 − β) corresponds to sensitivity (73.30%). In this example, the large number of false positives and the few false negatives indicate that the test, used alone, is not the best test to confirm the disease. Higher statistical power reduces the chance of a Type II error, that is, of missing an association that really exists, and a larger sample both raises power and narrows the confidence interval. In other words, the larger the power, the more accurately the study can mirror the behavior of the study population.

The positive predictive value (PPV), or precision rate, is the proportion of positive test results that are correct diagnoses (true positives). If every positive result were a true positive, the PPV would be 100%; likewise, the negative predictive value (NPV) would be 100% if every negative result were a true negative. The calculated PPV in Table 1 is 11.8%, which is not large enough to predict cases with the test alone. However, the NPV of 99.9% indicates that the test correctly identifies negative conditions.

Clinical interpretation of a test

In a sample, there are two groups: those who have the disease and those who do not. A test designed to detect that disease can have two results: a positive result, which states that the disease is present, and a negative result, which states that the disease is absent. In an ideal situation, we would want the test to be positive for all persons who have the disease and negative for all persons who do not. Unfortunately, reality is often far from ideal. The clinician who ordered the test has a positive or a negative result in hand. What conclusion can he or she draw about the patient's disease status? The first step is to examine the reliability of the test in statistical terms. (1) What is the sensitivity of the test? (2) What is the specificity of the test? The second step is to examine its applicability to the patient. (3) What is the PPV of the test? (4) What is the NPV of the test?

Suppose the test result is positive. In this example, the test has a sensitivity of 73.3% and a specificity of 90.8%. The test can detect the disease in only 73% of cases, and it has a false positive rate of 9.2%. The PPV of the test is 11.8%. In other words, there is a good possibility that a positive result is a false positive and the person does not have the disease. We need to look at other test results and the clinical situation. If the PPV of this test were close to 80% or 90%, one could conclude that the person most likely has the disease when the test result is positive.

Suppose the test result is negative. The NPV of this test is 99.9%, which means that the test only very rarely gives a negative result in a patient who has the disease. Hence, there is only a 0.1% possibility that a person who tested negative in fact has the disease. Probably no further tests are required unless the clinical suspicion is very high.
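A test with 90.8% specificity can still have a low PPV because predictive values depend on how common the disease is in the tested group. The sketch below applies the standard relationship between sensitivity, specificity, prevalence and predictive values, using the sensitivity and specificity quoted above. The prevalence values are assumptions chosen for illustration, since the Table 1 counts are not reproduced in this text.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) for a test applied at a given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity from the text; prevalences are hypothetical.
results = {p: predictive_values(0.733, 0.908, p) for p in (0.01, 0.10, 0.50)}
for prevalence, (ppv, npv) in results.items():
    print(f"prevalence {prevalence:.0%}: PPV {ppv:.1%}, NPV {npv:.1%}")
```

When the disease is rare, most positive results come from the much larger healthy group, so the PPV is low while the NPV stays high. This is the pattern described in the two paragraphs above.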

How the clinician interprets the result of a test is very important. The usefulness of a positive or negative result depends on the PPV or NPV of the test, respectively. A screening test should have high sensitivity and a high NPV, so that a negative result reliably rules out the disease. A confirmatory test should have high specificity and a high PPV, so that a positive result reliably confirms it.

The case-control method is efficient for the study of rare diseases and of most common diseases. Other measures of association from case-control studies are the odds ratio (OR) and the risk ratio, whose calculation is presented in Table 2.

Table 2: Different ratio calculation templates with sample calculation

Absolute risk is the probability of an event occurring, without comparison with any other risk. It is expressed as a ratio or a percentage. In the example, the absolute risk reduction indicates a 27.37% decline in risk. Relative risk (RR), on the other hand, compares the risk among the exposed with the risk among the non-exposed. In the example provided in Table 2, the non-exposed control group is 69.93% less likely to experience the event than the exposed cases. The reader should keep in mind that RR does not by itself mean an increase in risk. Here, taking the risk among the exposed cases as 100%, the risk among the unexposed controls is lower by 69.93%. RR does not describe the actual (absolute) risk; it expresses the relative increase or decrease in risk of the exposed compared with the non-exposed.
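The distinction between absolute and relative measures is easier to see in a worked calculation. The sketch below uses hypothetical counts (Table 2 is not reproduced in this text, so the 27.37% and 69.93% figures are not recomputed here). The function name `risk_measures` and the dictionary keys are assumptions.

```python
def risk_measures(events_exposed: int, n_exposed: int,
                  events_unexposed: int, n_unexposed: int) -> dict:
    """Absolute and relative risk measures for two groups."""
    risk_exposed = events_exposed / n_exposed        # absolute risk, exposed
    risk_unexposed = events_unexposed / n_unexposed  # absolute risk, unexposed
    return {
        "risk_exposed": risk_exposed,
        "risk_unexposed": risk_unexposed,
        # Difference in absolute risk between the two groups.
        "absolute_risk_difference": risk_exposed - risk_unexposed,
        # Risk in the exposed relative to the unexposed.
        "relative_risk": risk_exposed / risk_unexposed,
        # How much lower the unexposed risk is, as a fraction of the exposed risk.
        "relative_reduction_in_unexposed": 1 - risk_unexposed / risk_exposed,
    }


# Hypothetical counts: 40 of 100 exposed and 10 of 100 unexposed had the event.
measures = risk_measures(40, 100, 10, 100)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

In this hypothetical case the absolute difference is 30 percentage points, while the unexposed group's risk is 75% lower in relative terms. The two numbers answer different questions and should not be read interchangeably.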

The OR helps the researcher conclude whether the odds of a certain event or outcome are the same for two groups. It compares the odds of a health outcome among the exposed with the odds among the non-exposed. In our example, an OR of 0.207 can be interpreted to mean that the non-exposed group is less likely to experience the event than the exposed group. If the OR is greater than 1 (for example, 1.11), the odds of the outcome among the exposed are 1.11 times the odds among the non-exposed.

The event rate for cases (E) and controls (C) measures how often a particular exposure results in the occurrence of disease within a group. For the experimental group (cases), this value is 11.76% in our example. This percentage describes the extent of risk to exposed patients compared with the non-exposed.

The statistical tests that can be used to ascertain an association also depend on the characteristics of the variables. If the researcher wants to find the association between two categorical variables (e.g., a positive versus negative test result and a disease state expressed as present or absent), the Cochran-Armitage test, which for a two-by-two table is equivalent to the Pearson chi-squared test, can be used. When the objective is to find the association between two interval- or ratio-level (continuous) variables, correlation and regression analysis can be performed. To evaluate a statistically significant difference between the means of cases and controls, a test of group difference can be performed. If the researcher wants to find a statistically significant difference among the means of more than two groups, analysis of variance can be performed. A detailed explanation of the various statistical tests and how to calculate them will be published in later issues. The success of the research depends, directly and indirectly, on how the following biases, or systematic errors, are controlled.
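For two categorical variables in a two-by-two table, the Pearson chi-squared statistic can be computed by hand. The sketch below assumes the cell layout a, b in the first row and c, d in the second, and uses hypothetical counts. It omits the continuity correction and relies on the fact that, for one degree of freedom, the upper-tail probability equals `erfc(sqrt(chi2 / 2))`. In practice a statistics package would normally be used instead.

```python
import math


def pearson_chi_squared(a: int, b: int, c: int, d: int):
    """Pearson chi-squared test (no continuity correction) for a 2x2 table.

    Returns (chi-squared statistic, p-value) with one degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of the chi-squared distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts, for illustration only.
chi2, p_value = pearson_chi_squared(30, 10, 5, 55)
print(f"chi-squared = {chi2:.2f}, p = {p_value:.2g}")
```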

Because information on exposure is collected retrospectively, it depends on the ability of subjects to recall it, and this forms the basis for recall bias. Recall bias is a methodological issue. Its sources are the limits of human recall and the tendency of cases to remember their exposures more accurately than controls. Another possible bias is selection bias. In case-control studies, cases and controls should be selected from populations with the same underlying characteristics. For instance, cases collected from referral clinics are often subject to selection bias. If selection bias is not controlled, an observed association may well be due to chance or to the study design. A further possible bias is information bias, which arises from misclassification of the level of exposure or of the disease or outcome itself.

Case-control studies are good for studying rare diseases, but they are not generally used to study rare exposures. As Kaelin and Bayona explain,[6] if a researcher wants to study the risk of asthma from working in a nuclear submarine shipyard, a case-control study may not be the best option because only a very small proportion of people with asthma might be exposed. Similarly, case-control studies may not be the best option for studying multiple diseases or conditions because the control group may not be comparable across the diseases or conditions selected. The major advantage of case-control studies is that they are small and retrospective, and so they are more economical than cohort studies and randomized controlled trials.

Source of Support: Nil

Conflict of Interest: Nil

Clinical Epidemiology and Biostatistics, pp 93–112

Case-Control Studies

  • Michael S. Kramer M.D. 2  

In cohort studies (and clinical trials), subjects are followed in a forward direction from exposure to outcome. Inferential reasoning is from cause to effect.

  • Endometrial Cancer
  • Target Population
  • Renal Cancer
  • Estimate Relative Risk
  • Exposure Odds

Feinstein AR (1973) Clinical biostatistics. XX. The epidemiologic trohoc, the ablative risk ratio, and “retrospective” research. Clin Pharmacol Ther 14: 291–307

PubMed   CAS   Google Scholar  

Miettinen OS (1985) The “case-control” study: valid selection of subjects. J Chronic Dis 38: 543–548

Article   PubMed   CAS   Google Scholar  

Horwitz RI, Feinstein AR (1978) Alternative analytic methods for case-control studies of estrogens and endometrial cancer. N Engl J Med 299: 1089–1094

Jick H, Walker AM, Rothman KJ (1980) The epidemic of endometrial cancer: a commentary. Am J Public Health 70: 264–267

Shapiro S, Kaufman DW, Slone D, Rosenberg C, Miettinen OS, Stolley PD, Rosenshein NB, Watring WG, Leavitt T, Knapp RC (1980) Recent and past use of conjugated estrogen in relation to adenocarcinoma of the endometrium. N Engl J Med 303: 485–489

Kramer MS (1981) Do breast-feeding and delayed introduction of solid foods protect against subsequent obesity? J Pediatr 98: 883–887

Kleinbaum DG, Kupper LL, Morgenstern H (1982) Epidemiologic research: principles and quantitative methods. Lifetime Learning Publications, Belmont, CA, p 146

Google Scholar  

Miettinen OS (1976) Estimability and estimation in case-referent studies. Am J Epidemiol 103: 226–235

Berkson J (1946) Limitations of the application of fourfold table analysis to hospital data. Biometr Bull 2: 47–53

Walter SD (1980) Berkson’s bias and its control in epidemiologic studies. J Chronic Dis 33: 721–725

Lippman A, Mackenzie SG (1985) What is “recall bias” and does it exist? In: Marois M (ed) Prevention of physical and mental congenital defects. Part C. Basic and medical science, education, and future strategies. Alan R. Liss, New York, pp 205–209

Breslow NE, Day NE (1980) Statistical methods in cancer research, vol 1. The analysis of case-control studies. International Agency for Research on Cancer, Lyon, pp 162–189

Schlesselman JJ (1982) Case-control studies: design, conduct, analysis. Oxford University Press, New York, pp 213–219

Mantel N, Haenszel W (1959) Statistical aspects of the analysis of data from retrospective studies of disease. JNCI 22: 719–748

Feinstein AR (1985) Experimental requirements and scientific principles in case-control studies. J Chronic Dis 38: 127–133

Ibrahim MA (ed) (1979) The case-control study: consensus and controversy. Pergamon, Oxford

Horwitz RI, Feinstein AR (1979) Methodological standards and contradictory results in case-control research. Am J Med 66: 556–564

Hayden GF, Kramer MS, Horwitz RI (1982) The case-control study: a practical review for the clinician. JAMA 247: 326–331

Download references

Author information

Authors and affiliations.

Faculty of Medicine, McGill University, 1020 Pine Avenue West, Montreal, Quebec, H3A 1A2, Canada

Michael S. Kramer, M.D. (Professor of Pediatrics and of Epidemiology and Biostatistics)

Copyright information

© 1988 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter.

Kramer, M.S. (1988). Case-Control Studies. In: Clinical Epidemiology and Biostatistics. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-61372-2_8

DOI: https://doi.org/10.1007/978-3-642-61372-2_8

Publisher Name: Springer, Berlin, Heidelberg

Print ISBN: 978-3-642-64814-4

Online ISBN: 978-3-642-61372-2

eBook Packages: Springer Book Archive

COMMENTS

  1. Case-control studies : design, conduct, analysis

    Case-control studies : design, conduct, analysis, by Schlesselman, James J; Stolley, Paul D. Publication date: 1982. Topics: Epidemiology -- Statistical methods, Experimental design, Epidemiology ... EPUB and PDF access not available for this item.

  2. Case-Control Studies : Design, Conduct, Analysis

    Case-control Studies: Design, Conduct, Analysis. James J. Schlesselman, Paul D. Stolley. 1982.

  3. Case-Control Studies: Design, Conduct, Analysis

    J. Schlesselman. Published 21 January 1982. Medicine. DOI: 10.2307/2288156. Corpus ID: 61692197. Case-control studies, often called 'retrospective' studies, provide a research method for investigating factors that may prevent or cause disease. Basically the method involves comparison of patients…

  4. PDF Design and Analysis of Case-Control Studies

    A case-control study is usually conducted before a cohort or an experimental study to identify the possible etiology of the disease. It costs relatively less and can be conducted in a shorter time. For a given disease, a case-control study can investigate multiple exposures (when the real exposure is not known).

  5. A Practical Overview of Case-Control Studies in Clinical Practice

    Case-control studies are one of the major observational study designs for performing clinical research. The advantages of these study designs over other study designs are that they are relatively quick to perform, economical, and easy to design and implement.

  6. Case-Control Studies: Design, Conduct, Analysis

    Shapiro S (1982) Case-Control Studies: Design, Conduct, Analysis. JAMA 248: 2055. Corpus ID: 71203590. https://api.semanticscholar.org/CorpusID:71203590

  7. Case Control Studies: Design, Conduct, Analysis : Journal of ...

    Case Control Studies: Design, Conduct, Analysis. Schlesselman, James J.; Schneiderman, Marvin A., Ph.D. Journal of Occupational Medicine 24(11): p 879, November 1982. ©1982 The American College of Occupational and Environmental ...

  8. PDF Case-control studies: an efficient study design

    Case-control studies are particularly useful for studying the cause of an outcome that is rare and for studying the effects of prolonged exposure. For example, a case-control study could be used ...

  9. Handbook of Statistical Methods for Case-Control Studies

    analyze the nested case-control data. The latter approach essentially involves analyzing the whole set of cohort data and using multiple imputation for those variables that were only collected in the case-control subset. There are also excellent chapters on the self-controlled case series method, and various methods for case-control studies of ...

  10. Case-control study—design, measures, and classic examples

    Abstract. Case-control studies are a type of observational epidemiological study that involve comparing two groups of individuals; one group with a defined outcome and the other without (normal). By doing this, one can look back in time to analyze the possible factors that may have contributed to the development of that outcome.

  11. Case-control study—design, measures, and classic examples

    Handbook for Designing and Conducting Clinical and Translational Research. 2023, Pages 211-214. Chapter 35 - Case-control study—design, measures, and classic examples. ... In this chapter, the uses, methods of analysis, benefits, and limitations of case-control studies will be discussed, with tips on how to ...

  12. Case-control studies : design, conduct, analysis

    Case-control studies : design, conduct, analysis ... External identifier: urn:oclc:record:1148213968 ... 14 day loan required to access EPUB and PDF files.

  13. Analysis of matched case-control studies

    There are two common misconceptions about case-control studies: that matching in itself eliminates (controls) confounding by the matching factors, and that if matching has been performed, then a "matched analysis" is required.

  14. What Is a Case-Control Study?

    Revised on June 22, 2023. A case-control study is an observational design that compares a group of participants possessing a condition of interest to a very similar group lacking that condition. Here, the participants possessing the attribute of study, such as a disease, are called the "cases," and those without it are the "controls."

  15. Methodology Series Module 2: Case-control Studies

    Abstract. Case-Control study design is a type of observational study. In this design, participants are selected for the study based on their outcome status. Thus, some participants have the outcome of interest (referred to as cases), whereas others do not have the outcome of interest (referred to as controls). The investigator then assesses the ...

  16. Case-control studies by James J. Schlesselman

    Case-control studies: design, conduct, analysis, by James J. Schlesselman. 1982, Oxford University Press, in English. ISBN 019502933X, 9780195029338.

  17. (PDF) Research Design: Case-Control Studies

    Abstract. Case-control studies are observational studies in which cases are subjects who have a characteristic of interest, such as a clinical diagnosis, and controls are (usually) matched ...

  18. Case-control study: Design, measures, classic example

    Case-control studies provide important contributions to medical research as they are efficient and have a comparison group to perform comparative statistical analysis against. The primary outcome of a case-control study is an odds ratio, a statistical measure of the association between a given exposure and an outcome of interest.

  19. Research Design: Case-Control Studies

    A case-control study is one in which cases are compared with controls to identify historical exposures that are significantly associated with a current state or, stated in different words, variables that are significantly associated with caseness. In case-control studies, cases are subjects with a particular characteristic.

  20. (PDF) Case-control studies

    ... Both populations are screened for alcohol and drugs. Using these data, odds ratios (OR) are calculated to estimate the accident risk associated with a specific substance or combination of substances [2, 4, 6, 8].

  21. Designing and Conducting Analytic Studies in the Field

    Most field case-control studies use control-to-case-patient ratios of 1:1, 2:1, or 3:1. Enrolling more than one control per case-patient can increase study power, which might be needed to detect a statistically significant difference in exposure between case-patients and controls, particularly when an outbreak involves a limited number of cases.

  22. Design and data analysis case-controlled study in clinical research

    Introduction. Clinicians turn to a case-control study when they want to ascertain an association between a clinical condition and an exposure, that is, when a researcher wants to compare how often patients with the disease and controls without it were exposed to a risk factor. In other words, a case-control study compares subjects who have the disease or outcome (cases ...

  23. Case-Control Studies

    Schlesselman JJ (1982) Case-control studies: design, conduct, analysis. Oxford University Press, New York, pp 213-219. Mantel N, Haenszel W (1959) Statistical aspects of the analysis of data from retrospective studies of disease. JNCI 22: 719-748.
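
Several of the snippets above name the odds ratio as the primary measure of association in a case-control study. As a rough illustration only, the sketch below computes an odds ratio from a 2x2 table together with an approximate 95% confidence interval based on the standard error of the log odds ratio (Woolf's method). The function name `odds_ratio`, the cell labels `a`, `b`, `c`, `d`, and the counts in the usage lines are assumptions made for this example; none of them come from the sources listed here.

```python
import math


def odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and approximate confidence interval for a 2x2 table.

    Hypothetical cell layout (an assumption for this sketch):
        a = exposed cases,    b = exposed controls
        c = unexposed cases,  d = unexposed controls
    z = 1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        # The simple estimator and its log-scale standard error
        # are undefined when any cell is zero.
        raise ValueError("all four cells must be positive")

    # Cross-product ratio: odds of exposure among cases
    # divided by odds of exposure among controls.
    or_hat = (a * d) / (b * c)

    # Standard error of log(OR), Woolf's method.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

    lower = math.exp(math.log(or_hat) - z * se_log)
    upper = math.exp(math.log(or_hat) + z * se_log)
    return or_hat, lower, upper


# Made-up counts, purely for illustration.
estimate, lower, upper = odds_ratio(a=40, b=20, c=60, d=80)
print(f"OR = {estimate:.2f}, approx. 95% CI {lower:.2f} to {upper:.2f}")
```

This unstratified estimate ignores confounding and matching. A stratified or matched analysis, such as the Mantel-Haenszel approach cited in the last snippet, would need a different estimator.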