  • BMC Med Res Methodol


Advantages of the nested case-control design in diagnostic research

Cornelis J Biesheuvel

1 Julius Center for Health Sciences and Primary Care, University Medical Center, Utrecht, The Netherlands

2 The Children's Hospital at Westmead, Sydney, Australia

Yvonne Vergouwe

Ruud Oudega, Arno W Hoes, Diederick E Grobbee, Karel GM Moons

This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Despite its benefits, it is uncommon to apply the nested case-control design in diagnostic research. We aim to show advantages of this design for diagnostic accuracy studies.

We used data from a full cross-sectional diagnostic study comprising a cohort of 1295 consecutive patients who were selected on their suspicion of having deep vein thrombosis (DVT). We drew nested case-control samples from the full study population with case:control ratios of 1:1, 1:2, 1:3 and 1:4 (100 samples per ratio). We calculated diagnostic accuracy estimates for two tests that are used to detect DVT in clinical practice.

Estimates of diagnostic accuracy in the nested case-control samples were very similar to those in the full study population. For example, for each case:control ratio, the positive predictive value of the D-dimer test was 0.30 in the full study population and 0.30 in the nested case-control samples (median of the 100 samples). As expected, variability of the estimates decreased with increasing sample size.

Our findings support the view that the nested case-control study is a valid and efficient design for diagnostic studies and should also be (re)appraised in current guidelines on diagnostic accuracy research.

In diagnostic research it is essential to determine the accuracy of a test to evaluate its value for medical practice [ 1 ]. Diagnostic test accuracy is assessed by comparing the results of the index test with the results of the reference standard in the same patients. Given the cross-sectional nature of a diagnostic accuracy question, the design may be referred to as a cross-sectional cohort design. The (cohort) characteristic by which the study subjects (cohort members) are selected is 'the suspicion of the target disease', defined by the presence of particular symptoms or signs [ 2 ]. The collected study data allow for calculation of all diagnostic accuracy parameters of the index test, such as sensitivity, specificity, odds ratio, receiver operating characteristic (ROC) curve and predictive values, i.e. the probabilities of presence and absence of the disease given the index test result(s).

Subjects are not always selected on their initial suspicion of having the disease but often on the true presence or absence of the disease among those who underwent the reference test in routine care practice, which merely reflects a cross-sectional case-control design [ 3 , 4 ]. Appraisal of such conventional case-control design in diagnostic accuracy research has been limited due to its problems related to the incorrect sampling of cases and controls [ 3 - 7 ]. These problems may be overcome by applying a nested (cross-sectional) case-control study design, which may be advantageous over a full (cross-sectional) cohort design. The rationale, strengths and limitations of a nested case-control approach in epidemiology studies have widely been discussed in the literature [ 8 - 11 ], but not so much in the context of diagnostic accuracy research [ 6 ].

We therefore aim to show advantages of the nested case-control design for addressing diagnostic accuracy questions and discuss its pros and cons in relation to a conventional case-control design and to the full (cross sectional) cohort design in this domain. We will illustrate this with data from a recently conducted diagnostic accuracy study.

Case-control versus nested case-control design

The essence of a case-control study is that cases with the condition under study arise in a source population and controls are a representative sample of this same source population. The entire population is not studied, which would be a full cohort study or census approach; rather, a random sample from the source population is studied [ 12 ]. A major flaw inherent to case-control studies, described as early as 1959 [ 13 ], is the difficulty of ensuring that cases and controls are a representative sample of the same source population. In a nested case-control study the cases emerge from a well-defined source population and the controls are sampled from that same population. The main difference between a case-control and a nested case-control study is that in the former the cases and controls are sampled from a source population of unknown size, whereas the latter is 'nested' in an existing predefined source population of known size. This source population can be a group or cohort of subjects that is followed over time or not.

The term 'cohort' commonly refers to a group of subjects followed over time in etiologic or prognostic research. But in essence, time is no prerequisite for the definition of a cohort. A cohort is a group of subjects that is defined by the same characteristic. This characteristic can be a particular birth year, a particular living area, or the presence of a particular sign or symptom that makes them suspected of having a particular disease, as in diagnostic research. Accordingly, a cross-sectional study can either be a cross-sectional case-control study or a cross-sectional cohort study.

Case-control and nested case-control design in diagnostic accuracy research

In diagnostic accuracy research the case-control design is incorrectly applied when subjects are selected from routine care databases. First, this design commonly leads to biased estimates of diagnostic accuracy of the index test due to referral or (partial) verification bias [ 4 , 14 - 18 ]. In routine care, physicians selectively refer patients for additional tests, including the reference test, based on previous test results. This is good clinical practice but a bad starting point for diagnostic research. As said, for diagnostic research purposes all subjects suspected of the target disease preferably undergo the index test(s) plus reference test irrespective of previous test results. Second, selection of patients with a negative reference test result as 'controls' may lead to inclusion of controls that correspond to a different clinical domain, i.e. patients who underwent the reference test but not necessarily because they were similarly suspected of the target condition [ 16 , 17 ]. A third disadvantage of such case-control design is that absolute probabilities of disease presence given the index test results, i.e. the predictive values or post-test probabilities, that are the desired parameters for patient care, cannot be obtained. Cases and controls are sampled from a source population of unknown size. The total number of patients that were initially suspected of the target disease based on the presence of symptoms or signs, i.e. the true source population, is commonly unknown as in routine care patients are hardly classified by their symptoms and signs at presentation [ 18 ]. Hence, the sampling fraction of cases and controls is unknown and valid estimates of the absolute probabilities of disease presence cannot be calculated [ 12 ].

A nested case-control study in diagnostic research includes the full population or cohort of patients suspected of the target disease. The 'true' disease status is obtained for all these patients with the reference standard. Hence, there is no referral or partial verification bias. The results of the index tests can then be obtained for all subjects with the target condition but only for a sample of the subjects without the target condition. Usually all patients with the target disease are included, but this could as well be a sample of the cases. Besides the absence of bias, all measures of diagnostic accuracy, including the positive and negative predictive values, can simply be obtained by weighting the controls with the inverse of the case-control sampling fraction, as explained in Figure 1.


Figure 1. Theoretical example of a full study population and a nested case-control sample. The index test result and the outcome are obtained for all patients of the study population. The case-control ratio was 1:4 (sampling fraction (SF) = 160/400 = 0.40). Valid diagnostic accuracy measures can be obtained from the nested case-control sample by multiplying the controls by 1/sampling fraction. For example, the positive predictive value (PPV) of a full study population can be calculated with a/(a + b), in this example 30/(30 + 100) = 0.23. In a nested case-control sample the PPV is calculated with a/(a + (1/SF)*b), in this example: 30/(30 + 2.5*40) = 0.23. In a case-control sample, however, the controls are sampled from a source population of unknown size. Therefore, the sampling fraction is unknown and a valid estimate of the PPV cannot be calculated.
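The arithmetic in the figure legend can be checked with a few lines of Python. This is only a sketch: the function names `ppv_full` and `ppv_nested` are hypothetical, and the counts are the ones quoted in the legend (a = 30 test-positive cases, b = 100 test-positive controls in the full population, of which 40 appear in the sample).

```python
def ppv_full(a, b):
    """PPV in the full study population: a / (a + b)."""
    return a / (a + b)


def ppv_nested(a, b_sampled, sampling_fraction):
    """PPV in a nested case-control sample.

    Sampled test-positive controls are scaled up by 1 / sampling_fraction
    to recover their expected number in the full population.
    """
    return a / (a + b_sampled / sampling_fraction)


# Counts from the theoretical example in the legend.
full = ppv_full(a=30, b=100)
nested = ppv_nested(a=30, b_sampled=40, sampling_fraction=0.40)

print(round(full, 2), round(nested, 2))
```

Both calls give the same value because 40 sampled controls weighted by 1/0.40 stand in for the 100 test-positive controls of the full population.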

Potential advantages of a nested case-control design in diagnostic research

The nested case-control study design can be advantageous over a full cross-sectional cohort design when actual disease prevalence in subjects suspected of a target condition is low, the index test is costly to perform, or if the index test is invasive and may lead to side effects. Under these conditions, one limits patient burden and saves time and money as the index test is performed in only a sample of the control subjects.

Furthermore, the nested case-control design is of particular value when stored data (serum, images etc.) of an existing study population are re-analysed for diagnostic research purposes. Using a nested case-control design, only data of a sample of the full study population need to be retrieved and analysed without having to perform a new diagnostic study from the start. This may for example apply to evaluation of tumour markers to detect cancer, but also for imaging or electrophysiology tests.

Diagnostic accuracy estimates derived from a nested case-control study should be virtually identical to those from a full cohort analysis. However, the variability of the accuracy estimates will increase with decreasing sample size. We illustrate this with data from a diagnostic study on a cohort of patients who were suspected of DVT.

A cross-sectional study was performed among a cohort of adult patients suspected of deep vein thrombosis (DVT) in primary care. This suspicion was primarily defined by the presence of a painful and swollen or red leg that had existed for no longer than 30 days. Details on the setting, data collection and main results have been described previously [ 19 , 20 ]. In brief, the full study population included 1295 consecutive patients who visited one of the participating primary care physicians with the above symptoms and signs of DVT. Patients were excluded if pulmonary embolism was suspected. The general practitioner systematically documented information on patient history and physical examination. Patient history included information such as age, gender, history of malignancy, and recent surgery. Physical examination included swelling of the affected limb and difference in circumference of the calves, calculated as the circumference (in centimetres) of the affected limb minus the circumference of the unaffected limb, further referred to as the calf difference test. Subsequently, all patients were referred to undergo D-dimer testing. In line with available guidelines and previous studies, the D-dimer test result was considered abnormal if the test yielded a D-dimer level ≥ 500 ng/ml [ 21 , 22 ]. Finally, they all underwent the reference test, i.e. repeated compression ultrasonography (CUS) of the lower extremities. In patients with a normal first CUS measurement, the CUS was repeated after seven days. DVT was considered present if one CUS measurement was abnormal. The echographist was blinded to the results of patient history, physical examination, and the D-dimer assay.

Nested case-control samples

Nested case-control samples were drawn from the full study population (n = 1295). In all samples, we always included all 289 cases with DVT. Controls were randomly sampled from the 1006 subjects without DVT. We applied four different and frequently used case-control ratios, i.e. one control for each case (1:1), two controls for each case (1:2), three controls for each case (1:3) and four controls for each case (1:4). For example, a sample with a case-control ratio of 1:1 contained 289 cases and 289 random subjects out of the 1006 controls (sampling fraction 289/1006 = 0.287). In the 1:4 approach, which requires more controls (1156) than the 1006 available, we sampled with replacement. For each case-control ratio, 100 nested case-control samples were drawn.
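The sampling scheme described above can be sketched in Python as follows. This is an illustration, not the authors' code (they used SPSS and S-plus). The helper `draw_controls`, the seed, and the integer control identifiers are hypothetical. The sketch assumes sampling without replacement whenever enough controls are available, and with replacement only when more controls are requested than exist, as in the 1:4 case.

```python
import random

N_CASES = 289        # patients with DVT, always all included
N_CONTROLS = 1006    # patients without DVT
N_SAMPLES = 100      # nested case-control samples per ratio


def draw_controls(control_ids, n_cases, ratio, rng):
    """Draw n_cases * ratio controls for one nested case-control sample."""
    n_needed = n_cases * ratio
    if n_needed <= len(control_ids):
        # Assumption: sample without replacement when possible.
        return rng.sample(control_ids, n_needed)
    # More controls requested than available (1:4): sample with replacement.
    return rng.choices(control_ids, k=n_needed)


rng = random.Random(2008)               # arbitrary seed for reproducibility
control_ids = list(range(N_CONTROLS))   # placeholder identifiers

samples = {
    ratio: [draw_controls(control_ids, N_CASES, ratio, rng)
            for _ in range(N_SAMPLES)]
    for ratio in (1, 2, 3, 4)
}

for ratio, draws in samples.items():
    print(f"1:{ratio} -> {len(draws)} samples of {len(draws[0])} controls")
```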

Statistical analysis

We focussed on two important diagnostic tests for DVT, i.e. the dichotomous D-dimer test and the continuous calf difference test. The latter was specifically chosen as it allowed for the estimation and thus comparison of the area under the ROC curve (ROC area). Diagnostic accuracy measures of both tests were estimated for the four case-control ratios and compared with those obtained from the full study population. Measures of diagnostic accuracy included sensitivity and specificity, positive and negative predictive values and the odds ratio (OR) for the D-dimer test, and the OR and the ROC area for the calf difference test.
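For readers less familiar with these measures, the sketch below computes them from a 2×2 table for a dichotomous test. The cell counts are assumptions: they were back-calculated for illustration from the rounded figures reported in this paper (289 cases, 1006 non-cases, 892 abnormal D-dimer results, sensitivity about 0.94) and are not taken from the paper's tables. The function name `accuracy_measures` is hypothetical.

```python
def accuracy_measures(tp, fp, fn, tn):
    """Standard accuracy measures for a dichotomous index test.

    tp/fn: diseased patients with an abnormal/normal test result
    fp/tn: non-diseased patients with an abnormal/normal test result
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "odds_ratio": (tp * tn) / (fp * fn),
    }


# Illustrative counts only (assumed, back-calculated from rounded values).
measures = accuracy_measures(tp=271, fp=621, fn=18, tn=385)

for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```

With these assumed counts the low specificity and high sensitivity pattern of the D-dimer test is visible directly in the table margins: most non-diseased patients also have an abnormal result.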

In the analysis of the nested case-control samples, we multiplied control samples by [1/sample fraction] corresponding to the case-control ratio (1:1 = 3.48; 1:2 = 1.74; 1:3 = 1.16; 1:4 = 0.87). For each case-control ratio, the point estimates and variability were determined. The median estimate of the 100 samples was considered as the point estimate. Analyses were performed using SPSS version 12.0 and S-plus version 6.0.
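The control weights quoted above follow directly from the cohort sizes, as the sketch below shows. It also illustrates, under stated assumptions, how a weighted PPV per sample and a median point estimate could be computed. The function names are hypothetical, and the true-positive count and the three per-sample false-positive counts are invented for illustration. They are not study data.

```python
from statistics import median

N_CASES, N_CONTROLS = 289, 1006


def control_weight(ratio):
    """Weight per sampled control: 1 / sampling fraction for a 1:ratio design."""
    sampling_fraction = (N_CASES * ratio) / N_CONTROLS
    return 1 / sampling_fraction


def weighted_ppv(tp, fp_sampled, weight):
    """PPV with sampled false positives scaled back to the full cohort."""
    return tp / (tp + weight * fp_sampled)


weights = {f"1:{r}": round(control_weight(r), 2) for r in (1, 2, 3, 4)}
print(weights)

# Hypothetical illustration for the 1:1 design: all cases are kept, so the
# true-positive count is fixed, while the number of test-positive controls
# varies from sample to sample.
tp_assumed = 271
fp_per_sample = [176, 178, 181]   # invented counts for three samples
estimates = [weighted_ppv(tp_assumed, fp, control_weight(1))
             for fp in fp_per_sample]
point_estimate = median(estimates)   # median over samples, as in the text
print(round(point_estimate, 2))
```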

In the full study population, the prevalence of DVT was 22% (n = 289), the D-dimer test was abnormal in 69% of the patients (n = 892) and the mean difference in calf circumference was 2.3 cm (Table 1). The prevalence of DVT was 50%, 33%, 25% and 20% in the nested case-control samples as a result of the sampling ratios (1:1, 1:2, 1:3 and 1:4, respectively). The distributions of the test characteristics in the control samples were similar to those of the patients from the full study population without DVT (Table 1).

Distribution of test results in the full study population and the nested case-control samples with various case-control ratios

For each case-control ratio, 100 nested case-control samples were drawn. The statistics of the control samples are the average values. All values represent absolute patient numbers (%) unless stated otherwise.

DVT+ = deep vein thrombosis present; DVT- = deep vein thrombosis absent; *mean (standard deviation)

In the full study population the sensitivity and negative predictive value were high for the D-dimer test, 0.94 and 0.96, respectively (Table 2), whereas the specificity and positive predictive value were relatively low. The OR for the calf difference test was 1.44 and the ROC area was 0.69.

Estimates of diagnostic accuracy with 95% confidence intervals for the D-dimer and calf difference tests obtained in the full study population

- = not applicable; PPV = positive predictive value; NPV = negative predictive value; ROC area = area under the receiver operating characteristic curve

The average estimates of diagnostic accuracy for each of the four case-control ratios were similar to the corresponding estimates of the full study population (Figure 2). For example, the negative predictive value of the D-dimer test was 0.955 both in the full study population and for each of the four case-control ratios. The OR of the calf difference test was 1.44 in the full study population, and the ORs derived from the nested case-control samples were on average also 1.44.


Figure 2. Estimates of diagnostic accuracy of the D-dimer test and calf difference test for the 100 nested case-control samples with case-control ratios ranging from 1:1 to 1:4. The boxes indicate mean values and corresponding interquartile ranges (25th and 75th percentiles). Whiskers indicate the 2.5th and 97.5th percentiles. The dotted lines represent the values estimated in the full study population.

The use of (conventional) case-control studies in diagnostic research has often been associated with biased estimates of diagnostic accuracy, due to the incorrect sampling of subjects [ 3 - 6 , 18 ]. Moreover, this study design does not allow for the estimation of the desired absolute disease probabilities. We discussed and showed that a case-control study nested within a well-defined cohort of known size, consisting of subjects suspected of a particular target disease, can yield valid estimates of diagnostic accuracy of an index test, including the absolute probabilities of disease presence or absence. Diagnostic accuracy parameters derived from a full (cross-sectional) cohort of patients suspected of DVT were similar to the estimates derived from various nested case-control samples averaged over 100 simulations. As expected, the variability decreased with increasing number of controls, making the measures estimated in the larger case-control samples more precise.

As discussed, the number of subjects for whom the index test results need to be retrieved can be reduced substantially with a nested case-control design. Hence, the nested case-control design is particularly advantageous when the prevalence of the target condition in the cohort of patients suspected of the target disease is low, when the index test results are costly or difficult to collect, and for re-analysing stored images or specimens. However, precision of the diagnostic accuracy measures will be hampered by increased variability when too few control patients are included.

Rutjes et al discussed the limitations of different study designs in diagnostic research [ 6 ]. They proposed the 'two-gate design with representative sampling' (which resembles the nested case-control design in this paper) as a valid design. We confirmed their proposition with a quantitative analysis of a diagnostic study. Rutjes et al suggested not using the term 'nested case-control' to prevent confusion with etiologic studies, where this design is commonly applied. Indeed, diagnostic and etiologic research differ fundamentally, first and foremost on the concept of time. Diagnostic accuracy studies are, in contrast to etiologic studies, typically cross-sectional in nature. Furthermore, diagnostic associations between index and reference tests are purely descriptive, whereas in etiologic studies causal associations and potential confounding are involved. Despite these major differences we believe there is no reason not to use the term nested case-control study in diagnostic research as well. The term inherently refers to the method of sampling of study subjects, which can be the same in a diagnostic or etiologic setting, and has no direct bearing on the other issues typically related to etiologic case-control studies.

Our findings support the view that the nested case-control study is a valid and efficient design for diagnostic studies. We believe that the nested case-control approach should be applied more often in diagnostic research, and also be (re)appraised in current guidelines on diagnostic methodology.

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

All authors commented on the draft and the interpretation of the findings, read and approved the final manuscript. CJB was responsible for the design, statistical analysis and wrote the original manuscript. YV was responsible for the design and statistical analysis. RO was responsible for the data collection. AWH was responsible for expertise in case-control design. DEG and KGMM were responsible for conception and design of the study and coordination.

Pre-publication history

The pre-publication history for this paper can be accessed here:

http://www.biomedcentral.com/1471-2288/8/48/prepub

Acknowledgements

For this research project we received financial support from the Netherlands Organization for Scientific Research, grant number: ZON-MW904-66-112. The funding source had no influence on the design, data analysis and report of this study.

  • Knottnerus JA, van Weel C, Muris JW. Evaluation of diagnostic procedures. BMJ. 2002;324:477–480. doi: 10.1136/bmj.324.7335.477.
  • Knottnerus JA, Muris JW. Assessment of the accuracy of diagnostic tests: the cross-sectional study. J Clin Epidemiol. 2003;56:1118–1128. doi: 10.1016/S0895-4356(03)00206-3.
  • Lijmer JG, Mol BW, Heisterkamp S, Bonsel GJ, Prins MH, van der Meulen JHP, Bossuyt PMM. Empirical evidence of design-related bias in studies of diagnostic tests. JAMA. 1999;282:1061–1066. doi: 10.1001/jama.282.11.1061.
  • Rutjes AW, Reitsma JB, Di Nisio M, Smidt N, van Rijn JC, Bossuyt PM. Evidence of bias and variation in diagnostic accuracy studies. CMAJ. 2006;174:469–476.
  • Whiting P, Rutjes AW, Reitsma JB, Glas AS, Bossuyt PM, Kleijnen J. Sources of variation and bias in studies of diagnostic accuracy: a systematic review. Ann Intern Med. 2004;140:189–202.
  • Rutjes AW, Reitsma JB, Vandenbroucke JP, Glas AS, Bossuyt PM. Case-control and two-gate designs in diagnostic accuracy studies. Clin Chem. 2005;51:1335–1341. doi: 10.1373/clinchem.2005.048595.
  • Kraemer H. Evaluating Medical Tests. London, UK: Sage Publications; 1992.
  • Mantel N. Synthetic retrospective studies and related topics. Biometrics. 1973;29:479–486. doi: 10.2307/2529171.
  • Essebag V, Genest J Jr, Suissa S, Pilote L. The nested case-control study in cardiology. Am Heart J. 2003;146:581–590. doi: 10.1016/S0002-8703(03)00512-X.
  • Ernster VL. Nested case-control studies. Prev Med. 1994;23:587–590. doi: 10.1006/pmed.1994.1093.
  • Langholz B. Case-Control Study, Nested. In: Armitage P, Colton T, editors. Encyclopedia of Biostatistics. 2nd ed. New York: John Wiley & Sons; 2005. pp. 646–665.
  • Rothman KJ, Greenland S. Modern epidemiology. 2nd ed. Philadelphia: Lippincott-Raven Publishers; 1998.
  • Mantel N, Haenszel W. Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst. 1959;22:719–748.
  • Ransohoff DF, Feinstein AR. Problems of spectrum and bias in evaluating the efficacy of diagnostic tests. N Engl J Med. 1978;299:926–930.
  • Begg CB, Greenes RA. Assessment of diagnostic tests when disease verification is subject to selection bias. Biometrics. 1983;39:207–215. doi: 10.2307/2530820.
  • Knottnerus JA, Leffers JP. The influence of referral patterns on the characteristics of diagnostic tests. J Clin Epidemiol. 1992;45:1143–1154. doi: 10.1016/0895-4356(92)90155-G.
  • van der Schouw YT, van Dijk R, Verbeek ALM. Problems in selecting the adequate patient population from existing data files for assessment studies of new diagnostic tests. J Clin Epidemiol. 1995;48:417–422. doi: 10.1016/0895-4356(94)00144-F.
  • Oostenbrink R, Moons KG, Bleeker SE, Moll HA, Grobbee DE. Diagnostic research on routine care data: prospects and problems. J Clin Epidemiol. 2003;56:501–506. doi: 10.1016/S0895-4356(03)00080-5.
  • Oudega R, Hoes AW, Moons KG. The Wells rule does not adequately rule out deep venous thrombosis in primary care patients. Ann Intern Med. 2005;143:100–107.
  • Oudega R, Moons KG, Hoes AW. Limited value of patient history and physical examination in diagnosing deep vein thrombosis in primary care. Fam Pract. 2005;22:86–91. doi: 10.1093/fampra/cmh718.
  • Perrier A, Desmarais S, Miron M, de Moerloose P, Lepage R, Slosman D, Didier D, Unger P, Patenaude J, Bounameaux H. Non-invasive diagnosis of venous thromboembolism in outpatients. Lancet. 1999;353:190–195. doi: 10.1016/S0140-6736(98)05248-9.
  • Schutgens RE, Ackermark P, Haas FJ, Nieuwenhuis HK, Peltenburg HG, Pijlman AH, Pruijm M, Oltmans R, Kelder JC, Biesma DH. Combination of a normal D-dimer concentration and a non-high pretest clinical probability score is a safe strategy to exclude deep venous thrombosis. Circulation. 2003;107:593–597. doi: 10.1161/01.CIR.0000045670.12988.1E.

Nested case-control studies: advantages and disadvantages

  • Philip Sedgwick , reader in medical statistics and medical education 1
  • 1 Centre for Medical and Healthcare Education, St George’s, University of London, London, UK
  • p.sedgwick{at}sgul.ac.uk

Researchers investigated whether antipsychotic drugs were associated with venous thromboembolism. A population based nested case-control study design was used. Data were taken from the UK QResearch primary care database consisting of 7 267 673 patients. Cases were adult patients with a first ever record of venous thromboembolism between 1 January 1996 and 1 July 2007. For each case, up to four controls were identified, matched by age, calendar time, sex, and practice. Exposure to antipsychotic drugs was assessed on the basis of prescriptions on, or during the 24 months before, the index date. 1

There were 25 532 eligible cases (15 975 with deep vein thrombosis and 9557 with pulmonary embolism) and 89 491 matched controls. The primary outcome was the odds ratios for venous thromboembolism associated with antipsychotic drugs adjusted for comorbidity and concomitant drug exposure. When adjusted using logistic regression to control for potential confounding, prescription of antipsychotic drugs in the previous 24 months was significantly associated with an increased occurrence of venous thromboembolism compared with non-use (odds ratio 1.32, 95% confidence interval 1.23 to 1.42). The researchers concluded that prescription of antipsychotic drugs was associated with venous thromboembolism in a large primary care population.

Which of the following statements, if any, are true?

a) The nested case-control study is a retrospective design

b) The study design minimised selection bias compared with a case-control study

c) Recall bias was minimised compared with a case-control study

d) Causality could be inferred from the association between prescription of antipsychotic drugs and venous thromboembolism

Statements a, b, and c are true, whereas d is false.

The aim of the study was to investigate whether prescription of antipsychotic drugs was associated with venous thromboembolism. A nested case-control study design was used. The study design was an observational one that incorporated the concept of the traditional case-control study within an established cohort. This design overcomes some of the disadvantages associated with case-control studies, 2 while incorporating some of the advantages of cohort studies. 3 4

Data for the study above were extracted from the UK QResearch primary care database, a computerised register of anonymised longitudinal medical records for patients registered at more than 500 UK general practices. Patient data were recorded prospectively, the database having been updated regularly as patients visited their GP. Cases were all adult patients in the register with a first ever record of venous thromboembolism between 1 January 1996 and 1 July 2007. There were 25 532 cases in total. For each case, up to four controls were identified from the register, matched by age, calendar time, sex, and practice. In total, 89 491 matched controls were obtained. Data relating to prescriptions for antipsychotic drugs on, or during the 24 months before, the index date were extracted for the cases and controls. The index date was the date in the register when venous thromboembolism was recorded for the case. The cases and controls were compared to ascertain whether exposure to prescription of antipsychotic drugs was more common in one group than in the other. Despite the data for the cases and controls being collected prospectively, the nested case-control study is described as retrospective ( a is true) because it involved looking back at events that had already taken place and been recorded in the register.

Selection bias is of particular concern in the traditional case-control study. Described in a previous question, 5 selection bias is the systematic difference between the study participants and the population they are meant to represent with respect to their characteristics, including demographics and morbidity. Cases and controls are often selected through convenience sampling. Cases are typically recruited from hospitals or general practices because they are convenient and easily accessible to researchers. Controls are often recruited from the same hospital clinics or general practices as the cases. Therefore, the selected cases may not be representative of the population of all cases. Equally, the controls might not be representative of otherwise healthy members of the population. The above nested case-control study was population based, with the QResearch primary care database incorporating a large proportion of the UK population. The cases and controls were selected from the database and therefore should be more representative of the population than those in a traditional case-control study. Hence, selection bias was minimised by using the nested case-control study design ( b is true).

The traditional case-control study involves participants recalling information about past exposure to risk factors after identification as a case or control. The study design is prone to recall bias, as described in a previous question. 6 Recall bias is the systematic difference between cases and controls in the accuracy of information recalled. Recall bias will exist if participants have selective preconceptions about the association between the disease and past exposure to the risk factor(s). Cases may, for example, recall information more accurately than controls, possibly because of an association with the disease or outcome. Although in the study above the cases and controls were identified retrospectively, the data for the QResearch primary care database were collected prospectively. Therefore, there was no reason for any systematic differences between groups of study participants in the accuracy of the information collected. Therefore, recall bias was minimised compared with a traditional case-control study ( c is true).

Not all of the patient records in the UK QResearch primary care database were used to explore the association between prescription of antipsychotic drugs and development of venous thromboembolism. A nested case-control study was used instead, with cases and controls matched on age, calendar time, sex, and practice. This was because it was statistically more efficient to control for the effects of age, calendar time, sex, and practice by matching cases and controls on these variables at the design stage, rather than controlling for their potential confounding effects when the data were analysed. The matching variables were considered to be important factors that could potentially confound the association between prescription of antipsychotic drugs and venous thromboembolism, but they were not of interest as potential risk factors in themselves. Matching in case-control studies has been described in a previous question. 7

Unlike a traditional case-control study, the data in the example above were recorded prospectively. Therefore, it was possible to determine whether prescription of antipsychotic drugs preceded the occurrence of venous thromboembolism. Nonetheless, only association, and not causation, can be inferred from the results of the above nested case-control study ( d is false)—that is, those people who were exposed to prescribed antipsychotic drugs were more likely to have developed venous thromboembolism. This is because the observed association between prescribed antipsychotic drugs and occurrence of venous thromboembolism may have been due to confounding. In particular, it was not possible to measure and then control for, through statistical analysis, all factors that may have affected the occurrence of venous thromboembolism.

The example above is typical of a nested case-control study; the health records for a group of patients that have already been collected and stored in an electronic database are used to explore the association between one or more risk factors and a disease or condition. The management of such databases means it is possible for a variety of studies to be undertaken, each investigating the risk factors associated with different diseases or outcomes. Nested case-control studies are therefore relatively inexpensive to perform. However, the major disadvantage of nested case-control studies is that not all pertinent risk factors are likely to have been recorded. Furthermore, because many different healthcare professionals will be involved in patient care, risk factors and outcome(s) will probably not have been measured with the same accuracy and consistency throughout. It may also be problematic if the diagnosis of the disease or outcome changes with time.

Cite this as: BMJ 2014;348:g1532

Competing interests: None declared.

  1. Parker C, Coupland C, Hippisley-Cox J. Antipsychotic drugs and risk of venous thromboembolism: nested case-control study. BMJ 2010;341:c4245.
  2. Sedgwick P. Case-control studies: advantages and disadvantages. BMJ 2014;348:f7707.
  3. Sedgwick P. Prospective cohort studies: advantages and disadvantages. BMJ 2013;347:f6726.
  4. Sedgwick P. Retrospective cohort studies: advantages and disadvantages. BMJ 2014;348:g1072.
  5. Sedgwick P. Selection bias versus allocation bias. BMJ 2013;346:f3345.
  6. Sedgwick P. What is recall bias? BMJ 2012;344:e3519.
  7. Sedgwick P. Why match in case-control studies? BMJ 2012;344:e691.

  • Research article
  • Open access
  • Published: 21 July 2008

Advantages of the nested case-control design in diagnostic research

  • Cornelis J Biesheuvel 1 , 2 ,
  • Yvonne Vergouwe 1 ,
  • Ruud Oudega 1 ,
  • Arno W Hoes 1 ,
  • Diederick E Grobbee 1 &
  • Karel GM Moons 1  

BMC Medical Research Methodology volume 8, Article number: 48 (2008)

Despite its benefits, it is uncommon to apply the nested case-control design in diagnostic research. We aim to show advantages of this design for diagnostic accuracy studies.

We used data from a full cross-sectional diagnostic study comprising a cohort of 1295 consecutive patients who were selected on their suspicion of having deep vein thrombosis (DVT). We drew nested case-control samples from the full study population with case:control ratios of 1:1, 1:2, 1:3 and 1:4 (100 samples were taken per ratio). We calculated diagnostic accuracy estimates for two tests that are used to detect DVT in clinical practice.

Estimates of diagnostic accuracy in the nested case-control samples were very similar to those in the full study population. For example, for each case:control ratio, the positive predictive value of the D-dimer test was 0.30 in the full study population and 0.30 in the nested case-control samples (median of the 100 samples). As expected, variability of the estimates decreased with increasing sample size.

Our findings support the view that the nested case-control study is a valid and efficient design for diagnostic studies and should also be (re)appraised in current guidelines on diagnostic accuracy research.

In diagnostic research it is essential to determine the accuracy of a test to evaluate its value for medical practice [ 1 ]. Diagnostic test accuracy is assessed by comparing the results of the index test with the results of the reference standard in the same patients. Given the cross-sectional nature of a diagnostic accuracy question, the design may be referred to as a cross-sectional cohort design. The (cohort) characteristic by which the study subjects (cohort members) are selected is 'the suspicion of the target disease', defined by the presence of particular symptoms or signs [ 2 ]. The collected study data allow for calculation of all diagnostic accuracy parameters of the index test, such as sensitivity, specificity, odds ratio, receiver operating characteristic (ROC) curve and predictive values, i.e. the probabilities of presence and absence of the disease given the index test result(s).

Subjects are not always selected on their initial suspicion of having the disease but often on the true presence or absence of the disease among those who underwent the reference test in routine care practice, which merely reflects a cross-sectional case-control design [ 3 , 4 ]. Appraisal of such a conventional case-control design in diagnostic accuracy research has been limited due to its problems related to the incorrect sampling of cases and controls [ 3 – 7 ]. These problems may be overcome by applying a nested (cross-sectional) case-control study design, which may be advantageous over a full (cross-sectional) cohort design. The rationale, strengths and limitations of a nested case-control approach in epidemiological studies have been widely discussed in the literature [ 8 – 11 ], but not so much in the context of diagnostic accuracy research [ 6 ].

We therefore aim to show advantages of the nested case-control design for addressing diagnostic accuracy questions and discuss its pros and cons in relation to a conventional case-control design and to the full (cross sectional) cohort design in this domain. We will illustrate this with data from a recently conducted diagnostic accuracy study.

Case-control versus nested case-control design

The essence of a case-control study is that cases with the condition under study arise in a source population and controls are a representative sample of this same source population. The entire population is not studied, which would be a full cohort study or census approach; instead, a random sample from the source population is studied [ 12 ]. A major flaw inherent to case-control studies, described as early as 1959 [ 13 ], is the difficulty of ensuring that cases and controls are representative samples of the same source population. In a nested case-control study the cases emerge from a well-defined source population and the controls are sampled from that same population. The main difference between a case-control and a nested case-control study is that in the former the cases and controls are sampled from a source population of unknown size, whereas the latter is 'nested' in an existing predefined source population of known size. This source population can be a group or cohort of subjects that is followed over time or not.

The term 'cohort' commonly refers to a group of subjects followed over time in etiologic or prognostic research. But in essence, time is no prerequisite for the definition of a cohort. A cohort is a group of subjects that is defined by the same characteristic. This characteristic can be a particular birth year, a particular living area, or the presence of a particular sign or symptom that makes them suspected of having a particular disease, as in diagnostic research. Accordingly, a cross-sectional study can either be a cross-sectional case-control study or a cross-sectional cohort study.

Case-control and nested case-control design in diagnostic accuracy research

In diagnostic accuracy research the case-control design is incorrectly applied when subjects are selected from routine care databases. First, this design commonly leads to biased estimates of diagnostic accuracy of the index test due to referral or (partial) verification bias [ 4 , 14 – 18 ]. In routine care, physicians selectively refer patients for additional tests, including the reference test, based on previous test results. This is good clinical practice but a bad starting point for diagnostic research. For diagnostic research purposes, all subjects suspected of the target disease should preferably undergo the index test(s) plus the reference test irrespective of previous test results. Second, selection of patients with a negative reference test result as 'controls' may lead to inclusion of controls that correspond to a different clinical domain, i.e. patients who underwent the reference test but not necessarily because they were similarly suspected of the target condition [ 16 , 17 ]. A third disadvantage of such a case-control design is that absolute probabilities of disease presence given the index test results, i.e. the predictive values or post-test probabilities, which are the desired parameters for patient care, cannot be obtained. Cases and controls are sampled from a source population of unknown size. The total number of patients that were initially suspected of the target disease based on the presence of symptoms or signs, i.e. the true source population, is commonly unknown because in routine care patients are rarely classified by their symptoms and signs at presentation [ 18 ]. Hence, the sampling fraction of cases and controls is unknown and valid estimates of the absolute probabilities of disease presence cannot be calculated [ 12 ].

A nested case-control study in diagnostic research includes the full population or cohort of patients suspected of the target disease. The 'true' disease status is obtained for all these patients with the reference standard. Hence, there is no referral or partial verification bias. The results of the index tests can then be obtained for all subjects with the target condition but only for a sample of the subjects without the target condition. Usually all patients with the target disease are included, but this could equally be a sample of the cases. Besides the absence of bias, all measures of diagnostic accuracy, including the positive and negative predictive values, can simply be obtained by weighting the controls with the inverse of the case-control sampling fraction, as explained in Figure 1 .

Figure 1

Theoretical example of a full study population and a nested case-control sample. The index test result and the outcome are obtained for all patients of the study population. The case-control ratio was 1:4 (sampling fraction (SF) = 160/400 = 0.40). Valid diagnostic accuracy measures can be obtained from the nested case-control sample by multiplying the controls by 1/sampling fraction. For example, the positive predictive value (PPV) of a full study population can be calculated with a/(a + b), in this example 30/(30 + 100) = 0.23. In a nested case-control sample the PPV is calculated with a/(a + (1/SF)*b), in this example: 30/(30 + 2.5*40) = 0.23. In a conventional case-control sample, however, the controls are sampled from a source population of unknown size. Therefore, the sampling fraction is unknown and a valid estimate of the PPV cannot be calculated.
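The weighting in Figure 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function name `weighted_predictive_values` is hypothetical, and the false-negative and true-negative counts (10 and 120) are not given in the caption but are derived here from the stated 1:4 ratio (40 cases, 160 sampled controls) and the caption's a = 30 and b = 40.

```python
def weighted_predictive_values(tp, fn, fp_sample, tn_sample, sampling_fraction):
    """Return (PPV, NPV) after re-weighting sampled controls by 1/SF.

    tp, fn               : test-positive and test-negative cases (all cases included)
    fp_sample, tn_sample : test-positive and test-negative controls in the sample
    sampling_fraction    : sampled controls / all non-diseased subjects in the cohort
    """
    if sampling_fraction <= 0:
        raise ValueError("sampling_fraction must be positive")
    weight = 1.0 / sampling_fraction
    fp = weight * fp_sample   # estimated false positives in the full cohort
    tn = weight * tn_sample   # estimated true negatives in the full cohort
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Figure 1 numbers: SF = 160/400; a = 30 and b = 40 come from the caption.
# fn = 10 and tn_sample = 120 are derived assumptions (40 cases, 160 controls).
sf = 160 / 400
ppv, npv = weighted_predictive_values(
    tp=30, fn=10, fp_sample=40, tn_sample=120, sampling_fraction=sf
)
print(round(ppv, 2))  # 0.23, matching 30/(30 + 2.5*40) in the caption
```

Without a known sampling fraction the `weight` term cannot be computed, which is why the same calculation is not available in a conventional case-control sample.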

Potential advantages of a nested case-control design in diagnostic research

The nested case-control study design can be advantageous over a full cross-sectional cohort design when actual disease prevalence in subjects suspected of a target condition is low, the index test is costly to perform, or if the index test is invasive and may lead to side effects. Under these conditions, one limits patient burden and saves time and money as the index test is performed in only a sample of the control subjects.

Furthermore, the nested case-control design is of particular value when stored data (serum, images etc.) of an existing study population are re-analysed for diagnostic research purposes. Using a nested case-control design, only data of a sample of the full study population need to be retrieved and analysed without having to perform a new diagnostic study from the start. This may for example apply to evaluation of tumour markers to detect cancer, but also for imaging or electrophysiology tests.

Diagnostic accuracy estimates derived from a nested case-control study should be virtually identical to those from a full cohort analysis. However, the variability of the accuracy estimates will increase with decreasing sample size. We illustrate this with data from a diagnostic study on a cohort of patients who were suspected of DVT.

A cross-sectional study was performed among a cohort of adult patients suspected of deep vein thrombosis (DVT) in primary care. This suspicion was primarily defined by the presence of a painful and swollen or red leg that had existed for no longer than 30 days. Details on the setting, data collection and main results have been described previously [ 19 , 20 ]. In brief, the full study population included 1295 consecutive patients who visited one of the participating primary care physicians with the above symptoms and signs of DVT. Patients were excluded if pulmonary embolism was suspected. The general practitioner systematically documented information on patient history and physical examination. Patient history included information such as age, gender, history of malignancy, and recent surgery. Physical examination included swelling of the affected limb and difference in circumference of the calves, calculated as the circumference (in centimetres) of the affected limb minus the circumference of the unaffected limb, further referred to as the calf difference test. Subsequently, all patients were referred to undergo D-dimer testing. In line with available guidelines and previous studies, the D-dimer test result was considered abnormal if the test yielded a D-dimer level ≥ 500 ng/ml [ 21 , 22 ]. Finally, they all underwent the reference test, i.e. repeated compression ultrasonography (CUS) of the lower extremities. In patients with a normal first CUS measurement, the CUS was repeated after seven days. DVT was considered present if one CUS measurement was abnormal. The echographist was blinded to the results of patient history, physical examination, and the D-dimer assay.

Nested case-control samples

Nested case-control samples were drawn from the full study population (n = 1295). In all samples, we always included all 289 cases with DVT. Controls were randomly sampled from the 1006 subjects without DVT. We applied four different and frequently used case-control ratios, i.e. one control for each case (1:1), two controls for each case (1:2), three controls for each case (1:3) and four controls for each case (1:4). For example, a sample with a case-control ratio of 1:1 contained 289 cases and 289 random subjects out of 1006 controls (sampling fraction 289/1006 = 0.287). In the 1:4 approach, we sampled with replacement. For each case-control ratio, 100 nested case-control samples were drawn.
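The sampling scheme described above can be sketched as follows. This is a minimal illustration under assumptions, not the authors' SPSS or S-plus code: subjects are represented by made-up integer identifiers, the helper `draw_nested_sample` and the seed are hypothetical, and the rule "sample without replacement unless more controls are needed than exist" is an assumption that reproduces the stated with-replacement sampling for the 1:4 ratio (4 × 289 = 1156 > 1006).

```python
import random

N_CASES = 289       # subjects with DVT, all included in every sample
N_CONTROLS = 1006   # subjects without DVT, from which controls are drawn


def draw_nested_sample(case_ids, control_ids, controls_per_case, rng):
    """Return (cases, sampled_controls) for one nested case-control sample."""
    n_needed = controls_per_case * len(case_ids)
    if n_needed <= len(control_ids):
        # Assumed: simple random sampling without replacement (1:1 to 1:3).
        sampled = rng.sample(control_ids, n_needed)
    else:
        # More controls needed than available (1:4): sample with replacement.
        sampled = rng.choices(control_ids, k=n_needed)
    return list(case_ids), sampled


rng = random.Random(2008)  # arbitrary seed so the sketch is reproducible
case_ids = list(range(N_CASES))
control_ids = list(range(N_CASES, N_CASES + N_CONTROLS))

# 100 samples for each of the four case-control ratios.
samples = {
    k: [draw_nested_sample(case_ids, control_ids, k, rng) for _ in range(100)]
    for k in (1, 2, 3, 4)
}
```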

Statistical analysis

We focussed on two important diagnostic tests for DVT, i.e. the dichotomous D-dimer test and the continuous calf difference test. The latter was specifically chosen as it allowed for the estimation and thus comparison of the area under the ROC curve (ROC area). Diagnostic accuracy measures of both tests were estimated for the four case-control ratios and compared with those obtained from the full study population. Measures of diagnostic accuracy included sensitivity and specificity, positive and negative predictive values and the odds ratio (OR) for the D-dimer test, and the OR and the ROC area for the calf difference test.

In the analysis of the nested case-control samples, we multiplied control samples by [1/sample fraction] corresponding to the case-control ratio (1:1 = 3.48; 1:2 = 1.74; 1:3 = 1.16; 1:4 = 0.87). For each case-control ratio, the point estimates and variability were determined. The median estimate of the 100 samples was considered as the point estimate. Analyses were performed using SPSS version 12.0 and S-plus version 6.0.
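The control weights quoted above follow directly from the cohort counts, as the short sketch below shows. The function names are hypothetical and the code is only an arithmetic check under the stated counts (289 cases, 1006 non-diseased subjects); it is not the authors' analysis code.

```python
N_CASES = 289
N_CONTROLS = 1006


def control_weight(controls_per_case):
    """Inverse sampling fraction for a 1:k nested case-control sample."""
    sampling_fraction = controls_per_case * N_CASES / N_CONTROLS
    return 1.0 / sampling_fraction


def reweighted_prevalence(controls_per_case):
    """Disease prevalence after weighting the sampled controls."""
    n_sampled_controls = controls_per_case * N_CASES
    weighted_controls = control_weight(controls_per_case) * n_sampled_controls
    return N_CASES / (N_CASES + weighted_controls)


weights = {k: round(control_weight(k), 2) for k in (1, 2, 3, 4)}
print(weights)  # {1: 3.48, 2: 1.74, 3: 1.16, 4: 0.87}
```

Whatever the ratio, the weighted controls add up to the 1006 non-diseased subjects of the cohort, so `reweighted_prevalence(k)` returns 289/1295 for every ratio. The same mechanism is what allows predictive values to be estimated from the samples.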

In the full study population, the prevalence of DVT was 22% (n = 289), the D-dimer test was abnormal in 69% of the patients (n = 892) and the mean difference in calf circumference was 2.3 cm (Table 1 ). The prevalence of DVT was 50%, 33%, 25% and 20% in the nested case-control samples as a result of the sampling ratios (1:1, 1:2, 1:3 and 1:4, respectively). The distributions of the test characteristics in the control samples were similar to those of the patients without DVT in the full study population (Table 1 ).

In the full study population the sensitivity and negative predictive value were high for the D-dimer test, 0.94 and 0.96, respectively (Table 2 ), whereas the specificity and positive predictive value were relatively low. The OR for the calf difference test was 1.44 and the ROC area was 0.69.

The average estimates of diagnostic accuracy for each of the four case-control ratios were similar to the corresponding estimates of the full study population (Figure 2 ). For example, the negative predictive value of the D-dimer test was 0.955 both in the full study population and for each of the four case-control ratios. The OR of the calf difference test was 1.44 in the full study population, and the ORs derived from the nested case-control samples were on average also 1.44.

Figure 2

Estimates of diagnostic accuracy of the D-dimer test and calf difference test for the 100 nested case-control samples with case-control ratios ranging from 1:1 to 1:4. The boxes indicate mean values and corresponding interquartile ranges (25th and 75th percentiles). Whiskers indicate 2.5th and 97.5th percentiles. The dotted lines represent the values estimated in the full study population.

The use of (conventional) case-control studies in diagnostic research has often been associated with biased estimates of diagnostic accuracy, due to the incorrect sampling of subjects [ 3 – 6 , 18 ]. Moreover, this study design does not allow for the estimation of the desired absolute disease probabilities. We discussed and showed that a case-control study nested within a well defined cohort of subjects suspected of a particular target disease with known sample size can yield valid estimates of diagnostic accuracy of an index test, including the absolute probabilities of disease presence or absence. Diagnostic accuracy parameters derived from a full (cross-sectional) cohort of patients suspected of DVT were similar to the estimates derived from various nested case-control samples averaged over 100 simulations. Expectedly, the variability decreased with increasing number of controls, making the measures estimated in the larger case-control samples more precise.

As discussed, the number of subjects from which the index test results need to be retrieved can be substantially reduced with a nested case-control design. Hence, the nested case-control design is particularly advantageous when the prevalence of the target condition in the cohort of patients suspected of the target disease is low, when the index test results are costly or difficult to collect, and for re-analysing stored images or specimens. However, precision of the diagnostic accuracy measures will be hampered by increased variability when too few control patients are included.

Rutjes et al discussed the limitations of different study designs in diagnostic research [ 6 ]. They proposed the 'two-gate design with representative sampling' (which resembles the nested case-control design in this paper) as a valid design. We confirmed their proposition with a quantitative analysis of a diagnostic study. Rutjes et al suggested not using the term 'nested case-control' to prevent confusion with etiologic studies, where this design is commonly applied. Indeed, diagnostic and etiologic research differ fundamentally, first and foremost on the concept of time. Diagnostic accuracy studies are, in contrast to etiologic studies, typically cross-sectional in nature. Furthermore, diagnostic associations between index and reference tests are purely descriptive, whereas in etiologic studies causal associations and potential confounding are involved. Despite these major differences, we believe there is no reason not to use the term nested case-control study in diagnostic research as well. The term inherently refers to the method of sampling study subjects, which can be the same in a diagnostic or etiologic setting, and has no direct bearing on the other issues typically related to etiologic case-control studies.

Our findings support the view that the nested case-control study is a valid and efficient design for diagnostic studies. We believe that the nested case-control approach should be applied more often in diagnostic research, and also be (re)appraised in current guidelines on diagnostic methodology.

References

1. Knottnerus JA, van Weel C, Muris JW: Evaluation of diagnostic procedures. BMJ. 2002, 324 (7335): 477-480. 10.1136/bmj.324.7335.477.

2. Knottnerus JA, Muris JW: Assessment of the accuracy of diagnostic tests: the cross-sectional study. J Clin Epidemiol. 2003, 56 (11): 1118-1128. 10.1016/S0895-4356(03)00206-3.

3. Lijmer JG, Mol BW, Heisterkamp S, Bonsel GJ, Prins MH, van der Meulen JHP, Bossuyt PMM: Empirical evidence of design-related bias in studies of diagnostic tests. JAMA. 1999, 282: 1061-1066. 10.1001/jama.282.11.1061.

4. Rutjes AW, Reitsma JB, Di Nisio M, Smidt N, van Rijn JC, Bossuyt PM: Evidence of bias and variation in diagnostic accuracy studies. CMAJ. 2006, 174 (4): 469-476.

5. Whiting P, Rutjes AW, Reitsma JB, Glas AS, Bossuyt PM, Kleijnen J: Sources of variation and bias in studies of diagnostic accuracy: a systematic review. Ann Intern Med. 2004, 140 (3): 189-202.

6. Rutjes AW, Reitsma JB, Vandenbroucke JP, Glas AS, Bossuyt PM: Case-control and two-gate designs in diagnostic accuracy studies. Clin Chem. 2005, 51 (8): 1335-1341. 10.1373/clinchem.2005.048595.

7. Kraemer H: Evaluating Medical Tests. 1992, London, UK, Sage Publications.

8. Mantel N: Synthetic retrospective studies and related topics. Biometrics. 1973, 29 (3): 479-486. 10.2307/2529171.

9. Essebag V, Genest J, Suissa S, Pilote L: The nested case-control study in cardiology. Am Heart J. 2003, 146 (4): 581-590. 10.1016/S0002-8703(03)00512-X.

10. Ernster VL: Nested case-control studies. Prev Med. 1994, 23 (5): 587-590. 10.1006/pmed.1994.1093.

11. Langholz B: Case-Control Study, Nested. Encyclopedia of Biostatistics. Edited by: Armitage P, Colton T. 2005, New York, John Wiley & Sons, 646-665. 2nd edition.

12. Rothman KJ, Greenland S: Modern epidemiology. 1998, Philadelphia, Lippincott-Raven Publishers, 2nd edition.

13. Mantel N, Haenszel W: Statistical aspects of the analysis of data from retrospective studies of disease. J Natl Cancer Inst. 1959, 22 (4): 719-748.

14. Ransohoff DF, Feinstein AR: Problems of spectrum and bias in evaluating the efficacy of diagnostic tests. N Engl J Med. 1978, 299 (17): 926-930.

15. Begg CB, Greenes RA: Assessment of diagnostic tests when disease verification is subject to selection bias. Biometrics. 1983, 39: 207-215. 10.2307/2530820.

16. Knottnerus JA, Leffers JP: The influence of referral patterns on the characteristics of diagnostic tests. J Clin Epidemiol. 1992, 45: 1143-1154. 10.1016/0895-4356(92)90155-G.

17. van der Schouw YT, van Dijk R, Verbeek ALM: Problems in selecting the adequate patient population from existing data files for assessment studies of new diagnostic tests. J Clin Epidemiol. 1995, 48: 417-422. 10.1016/0895-4356(94)00144-F.

18. Oostenbrink R, Moons KG, Bleeker SE, Moll HA, Grobbee DE: Diagnostic research on routine care data: prospects and problems. J Clin Epidemiol. 2003, 56 (6): 501-506. 10.1016/S0895-4356(03)00080-5.

19. Oudega R, Hoes AW, Moons KG: The Wells rule does not adequately rule out deep venous thrombosis in primary care patients. Ann Intern Med. 2005, 143 (2): 100-107.

20. Oudega R, Moons KG, Hoes AW: Limited value of patient history and physical examination in diagnosing deep vein thrombosis in primary care. Fam Pract. 2005, 22 (1): 86-91. 10.1093/fampra/cmh718.

21. Perrier A, Desmarais S, Miron M, de Moerloose P, Lepage R, Slosman D, Didier D, Unger P, Patenaude J, Bounameaux H: Non-invasive diagnosis of venous thromboembolism in outpatients. Lancet. 1999, 353: 190-195. 10.1016/S0140-6736(98)05248-9.

22. Schutgens RE, Ackermark P, Haas FJ, Nieuwenhuis HK, Peltenburg HG, Pijlman AH, Pruijm M, Oltmans R, Kelder JC, Biesma DH: Combination of a normal D-dimer concentration and a non-high pretest clinical probability score is a safe strategy to exclude deep venous thrombosis. Circulation. 2003, 107 (4): 593-597. 10.1161/01.CIR.0000045670.12988.1E.

Pre-publication history

The pre-publication history for this paper can be accessed here: http://www.biomedcentral.com/1471-2288/8/48/prepub

Acknowledgements

For this research project we received financial support from the Netherlands Organization for Scientific Research, grant number: ZON-MW904-66-112. The funding source had no influence on the design, data analysis and report of this study.

Author information

Authors and affiliations.

Julius Center for Health Sciences and Primary Care, University Medical Center, Utrecht, The Netherlands

Cornelis J Biesheuvel, Yvonne Vergouwe, Ruud Oudega, Arno W Hoes, Diederick E Grobbee & Karel GM Moons

The Children's Hospital at Westmead, Sydney, Australia

Cornelis J Biesheuvel

Corresponding author

Correspondence to Karel GM Moons .

Additional information

Competing interests.

The authors declare that they have no competing interests.

Authors' contributions

All authors commented on the draft and the interpretation of the findings, read and approved the final manuscript. CJB was responsible for the design, statistical analysis and wrote the original manuscript. YV was responsible for the design and statistical analysis. RO was responsible for the data collection. AWH was responsible for expertise in case-control design. DEG and KGMM were responsible for conception and design of the study and coordination.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article.

Biesheuvel, C.J., Vergouwe, Y., Oudega, R. et al. Advantages of the nested case-control design in diagnostic research. BMC Med Res Methodol 8 , 48 (2008). https://doi.org/10.1186/1471-2288-8-48

Received : 07 March 2008

Accepted : 21 July 2008

Published : 21 July 2008

DOI : https://doi.org/10.1186/1471-2288-8-48

  • Diagnostic Accuracy
  • Deep Vein Thrombosis
  • Target Disease
  • Diagnostic Accuracy Study

BMC Medical Research Methodology

ISSN: 1471-2288

Protocol for a nested case-control study design for omics investigations in the Environmental Determinants of Islet Autoimmunity cohort

Affiliations.

  • 1 Adelaide Medical School, Robinson Research Institute, University of Adelaide, Adelaide, South Australia, Australia.
  • 2 School of Public Health, The University of Adelaide, Adelaide, South Australia, Australia.
  • 3 Melbourne Integrative Genomics, School of Mathematics and Statistics, University of Melbourne, Melbourne, Victoria, Australia.
  • 4 School of Women's and Children's Health, Faculty of Medicine, University of New South Wales, Sydney, New South Wales, Australia.
  • 5 Institute of Endocrinology and Diabetes, The Children's Hospital at Westmead, Sydney, New South Wales, Australia.
  • 6 Department of Diabetes and Endocrinology, Royal Melbourne Hospital, Melbourne, Victoria, Australia.
  • 7 Department of Medicine, University of Melbourne, Melbourne, Victoria, Australia.
  • 8 Telethon Kids Institute Centre for Child Health Research, The University of Western Australia, Perth, Western Australia, Australia.
  • 9 Faculty of Medicine, Frazer Institute, The University of Queensland Translational Research Institute, Brisbane, Queensland, Australia.
  • 10 Population Health and Immunity Division, Walter and Eliza Hall Institute of Medical Research, Melbourne, Victoria, Australia.
  • 11 Department of Medical Biology, University of Melbourne, Melbourne, Victoria, Australia.
  • 12 Telethon Kids Institute, The University of Western Australia, Perth, Western Australia, Australia.
  • 13 Faculty of Medicine and Health, Sydney School of Public Health, The University of Sydney, Sydney, New South Wales, Australia.
  • 14 School of Clinical Medicine - Psychiatry and Mental Health, University of New South Wales, Sydney, New South Wales, Australia.
  • 15 Centre for Diabetes Research, Harry Perkins Institute of Medical Research, The University of Western Australia, Perth, Western Australia, Australia.
  • 16 Virology Research Laboratory, Serology and Virology Division, South Eastern Area Laboratory Services Microbiology, Prince of Wales Hospital, Sydney, New South Wales, Australia.
  • 17 Melbourne eResearch Group, School of Computing and Information Services, University of Melbourne, Melbourne, Victoria, Australia.
  • 18 Diabetes and Vascular Medicine Unit, Monash Health, Melbourne, Victoria, Australia.
  • 19 Endocrinology and Diabetes Centre, Women's and Children's Hospital, Adelaide, South Australia, Australia.
  • PMID: 37043275
  • PMCID: PMC10101668
  • DOI: 10.1080/07853890.2023.2198255

Background: The Environmental Determinants of Islet Autoimmunity (ENDIA) pregnancy-birth cohort investigates the developmental origins of type 1 diabetes (T1D), with recruitment between 2013 and 2019. ENDIA is the first study in the world with comprehensive data and biospecimen collection during pregnancy, at birth and through childhood from at-risk children who have a first-degree relative with T1D. Environmental exposures are thought to drive the progression to clinical T1D, with pancreatic islet autoimmunity (IA) developing in genetically susceptible individuals. The exposures and key molecular mechanisms driving this progression are unknown. Persistent IA, the primary outcome of ENDIA, is defined as a positive antibody for at least one of IAA, GAD, ZnT8 or IA2 on two consecutive occasions and signifies high risk of clinical T1D.

Method: A nested case-control (NCC) study design with 54 cases and 161 matched controls aims to investigate associations between persistent IA and longitudinal omics exposures in ENDIA. The NCC study will analyse samples obtained from ENDIA children who have either developed persistent IA or progressed to clinical T1D (cases) and matched control children at risk of developing persistent IA. Control children were matched on sex and age, with all four autoantibodies absent within a defined window of the case's onset date. Cases seroconverted at a median of 1.37 years (IQR 0.95, 2.56). Longitudinal omics data generated from approximately 16,000 samples of different biospecimen types will enable evaluation of changes from pregnancy through childhood.

Conclusions: This paper describes the ENDIA NCC study, omics platform design considerations and planned univariate and multivariate analyses for its longitudinal data. Methodologies for multivariate omics analysis with longitudinal data are discovery-focused and data-driven. There is currently no single multivariate method tailored specifically for the longitudinal omics data that the ENDIA NCC study will generate, and therefore omics analysis results will require either cross-validation or independent validation.

KEY MESSAGES

The ENDIA nested case-control study will utilize longitudinal omics data on approximately 16,000 samples from 190 unique children at risk of type 1 diabetes (T1D), including 54 who have developed islet autoimmunity (IA), followed during pregnancy, at birth and during early childhood, enabling the developmental origins of T1D to be explored.
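The control-matching rule in the Method (same sex, similar age, antibody-negative around the case's onset date) can be illustrated with a minimal Python sketch. It is not the ENDIA selection procedure: the `Child` record, its field names, the 60-day age window, and the default of three controls per case (roughly the 54:161 ratio) are all assumptions made for illustration, and "antibody-negative within a defined window" is simplified here to "not yet seroconverted at the case's onset day".

```python
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Child:
    # All field names are hypothetical; days are counted from an arbitrary study origin.
    child_id: str
    sex: str                           # "F" or "M"
    birth_day: int
    seroconversion_day: Optional[int]  # None if persistent IA was never detected


def select_controls(
    case: Child,
    cohort: List[Child],
    n_controls: int = 3,          # assumed ratio, roughly 161 controls / 54 cases
    age_window_days: int = 60,    # assumed matching tolerance on age
    rng: Optional[random.Random] = None,
) -> List[Child]:
    """Sample up to n_controls children who match the case on sex and age
    and who had not seroconverted by the case's onset day."""
    if case.seroconversion_day is None:
        raise ValueError("case must have a seroconversion day")
    rng = rng or random.Random(0)
    onset = case.seroconversion_day
    eligible = [
        c for c in cohort
        if c.child_id != case.child_id
        and c.sex == case.sex
        and abs(c.birth_day - case.birth_day) <= age_window_days
        # still at risk at the case's onset: never converted, or converted later
        and (c.seroconversion_day is None or c.seroconversion_day > onset)
    ]
    return rng.sample(eligible, min(n_controls, len(eligible)))
```

In this sketch a child who seroconverts later can still serve as a control for an earlier case, which is how risk-set sampling usually works in general; whether ENDIA permits this is not stated in the abstract.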

Publication types

  • Research Support, Non-U.S. Gov't
  • Autoantibodies
  • Autoimmunity / genetics
  • Case-Control Studies
  • Child, Preschool
  • Diabetes Mellitus, Type 1* / etiology
  • Diabetes Mellitus, Type 1* / genetics
  • Genetic Predisposition to Disease
  • Infant, Newborn
  • Islets of Langerhans*

Associated data

  • ANZCTR/ACTRN12613000794707

COMMENTS

  1. Methodologic considerations in the design and analysis of

    The nested case-control study (NCC) design within a prospective cohort study is used when outcome data are available for all subjects, but the exposure of interest has not been collected, and is difficult or prohibitively expensive to obtain for all subjects.

  2. Advantages of the nested case-control design in diagnostic

    Our findings support the view that the nested case-control study is a valid and efficient design for diagnostic studies. We believe that the nested case-control approach should be applied more often in diagnostic research, and also be (re)appraised in current guidelines on diagnostic methodology.

  3. Nested case-control studies: advantages and disadvantages

    a) The nested case-control study is a retrospective design. b) The study design minimised selection bias compared with a case-control study. c) Recall bias was minimised compared with a case-control study. d) Causality could be inferred from the association between prescription of antipsychotic drugs and venous thromboembolism

  4. Advantages of the nested case-control design in diagnostic

    Our findings support the view that the nested case-control study is a valid and efficient design for diagnostic studies and should also be (re)appraised in current guidelines on diagnostic accuracy research. Peer Review reports Background
