Evidence synthesis – Accuracy of administrative database algorithms for autism spectrum disorder, attention-deficit/hyperactivity disorder and fetal alcohol spectrum disorder case ascertainment: a systematic review

Published by: The Public Health Agency of Canada
Date published: September 2022
ISSN: 2368-738X
Siobhan O'Donnell, MSc; Sarah Palmeter, MPH; Meghan Laverty, MSc; Claudia Lagacé, MSc
https://doi.org/10.24095/hpcdp.42.9.01
This article has been peer reviewed.
Author reference
Public Health Agency of Canada, Ottawa, Ontario, Canada
Correspondence
Siobhan O'Donnell, Centre for Surveillance and Applied Research, Health Promotion and Chronic Disease Prevention Branch, Public Health Agency of Canada, 785 Carling Avenue, AL 6806A, Ottawa, ON K1A 0K9; Tel: 613-301-7325; Email: siobhan.odonnell@phac-aspc.gc.ca
Suggested citation
O'Donnell S, Palmeter S, Laverty M, Lagacé C. Accuracy of administrative database algorithms for autism spectrum disorder, attention-deficit/hyperactivity disorder and fetal alcohol spectrum disorder case ascertainment: a systematic review. Health Promot Chronic Dis Prev Can. 2022;42(9):355-83. https://doi.org/10.24095/hpcdp.42.9.01
Abstract
Introduction: The purpose of this study was to perform a systematic review to assess the validity of administrative database algorithms used to identify cases of autism spectrum disorder (ASD), attention-deficit/hyperactivity disorder (ADHD) and fetal alcohol spectrum disorder (FASD).
Methods: MEDLINE, Embase, Global Health and PsycInfo were searched for studies that validated algorithms for the identification of ASD, ADHD and FASD in administrative databases published between 1995 and 2021 in English or French. The grey literature and reference lists of included studies were also searched. Two reviewers independently screened the literature, extracted relevant information, conducted reporting quality, risk of bias and applicability assessments, and synthesized the evidence qualitatively. PROSPERO CRD42019146941.
Results: Of the 48 articles assessed at the full-text level, 14 were included in the review. No studies were found for FASD. Despite potential sources of bias and significant between-study heterogeneity, results suggested that increasing the number of ASD diagnostic codes required from a single data source increased specificity and positive predictive value at the expense of sensitivity. The best-performing algorithms for the identification of ASD were based on a combination of data sources, with the physician claims database being the single best source. One study found that education data might improve the identification of ASD (i.e. higher sensitivity) in school-aged children when combined with physician claims data; however, additional studies including cases without ASD are required to fully evaluate the diagnostic accuracy of such algorithms. For ADHD, there was not enough information to assess the impact of the number of diagnostic codes required or of additional data sources on algorithm accuracy.
Conclusion: There is some evidence to suggest that cases of ASD and ADHD can be identified using administrative data; however, studies that assessed the ability of algorithms to discriminate reliably between cases with and without the condition of interest were lacking. No evidence exists for FASD. Methodologically higher-quality studies are needed to understand the full potential of using administrative data for the identification of these conditions.
Keywords: autism spectrum disorder, attention deficit disorder with hyperactivity, fetal alcohol spectrum disorders, algorithms, validation study, administrative data, public health surveillance
Highlights
- Few studies have validated administrative database algorithms for the identification of ASD and ADHD. No validation studies were found for FASD.
- Extensive heterogeneity in study design and conduct across the included studies precluded a quantitative synthesis of the results.
- There is evidence to suggest that ASD and ADHD can be identified using administrative data; however, studies that assessed the ability of algorithms to discriminate reliably between cases with and without the condition of interest were lacking.
- The best-performing algorithms used to identify ASD are based on a combination of administrative data sources, with physician claims data being the single best source.
- Higher-quality studies are essential to fully leverage administrative data for surveillance and research on these conditions.
Introduction
Neurodevelopmental disorders, a group of conditions with onset early in life, are characterized by impairments in physical development, learning, language and/or behaviour.Footnote 1 Despite the wide-ranging personal and societal impacts that these disorders have, early detection and interventions have been shown to improve outcomes in those with certain types of neurodevelopmental disorders, including autism spectrum disorder (ASD), attention-deficit/hyperactivity disorder (ADHD) and fetal alcohol spectrum disorder (FASD).Footnote 2Footnote 3Footnote 4Footnote 5 In light of this, a better understanding of the epidemiological burden of these disorders in Canada is essential in the implementation of public policy, including the establishment of programs and services.
Population-based administrative databases, designed for health system management and physician remuneration, offer an efficient and inexpensive way of providing longitudinal epidemiological data. As a result, these data are increasingly used to conduct chronic disease surveillance,Footnote 6Footnote 7Footnote 8 disease and treatment outcome researchFootnote 9 and quality of care studies.Footnote 10Footnote 11 However, along with these advantages, health administrative databases have limitations, including the potential for misclassification.Footnote 12
The accuracy of the diagnostic codes or their combination (algorithm) for surveillance or research purposesFootnote 13 depends on multiple factors, including database quality, the specific condition being identified and the validity of the diagnostic codes within the patient group.Footnote 12 Therefore, validation studies are necessary to evaluate the accuracy of algorithms used for case ascertainment.Footnote 14 Validation involves quantifying the number of instances in which the algorithm matches a reference standard, such as a medical record diagnosis.Footnote 12 In this way, the algorithm can be treated like a diagnostic test, and measures of diagnostic accuracy can be calculated. The results of these validation studies are typically reported as estimates of the sensitivity and specificity of the algorithm, which express how good the algorithm is at correctly identifying individuals with and without the target condition, respectively.Footnote 15Footnote 16 Other diagnostic accuracy statistics can be used, including positive predictive value (PPV) and negative predictive value (NPV).
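To make these measures concrete, the short Python sketch below computes sensitivity, specificity, PPV and NPV from the four cells of a two-by-two table that cross-classifies the algorithm result against the reference standard. The counts are purely illustrative and are not taken from any study discussed in this review.

```python
# Illustrative sketch: diagnostic accuracy measures derived from a 2 x 2 table
# comparing an administrative database algorithm against a reference standard.
# The counts below are hypothetical and not taken from any included study.

true_pos = 80     # algorithm positive, reference standard positive
false_pos = 20    # algorithm positive, reference standard negative
false_neg = 40    # algorithm negative, reference standard positive
true_neg = 860    # algorithm negative, reference standard negative

sensitivity = true_pos / (true_pos + false_neg)  # proportion of true cases detected
specificity = true_neg / (true_neg + false_pos)  # proportion of non-cases correctly excluded
ppv = true_pos / (true_pos + false_pos)          # positive predictive value
npv = true_neg / (true_neg + false_neg)          # negative predictive value

print(f"Sensitivity {sensitivity:.1%}, specificity {specificity:.1%}, "
      f"PPV {ppv:.1%}, NPV {npv:.1%}")
```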
To our knowledge, there are no published reviews that have evaluated the validity of health administrative database algorithms for the surveillance or research of neurodevelopmental disorders, specifically ASD, ADHD and FASD. Thus, the primary objective of this systematic review was to address this shortcoming. The secondary objective was to examine the impact of linking health to non-health (i.e. education or social services) administrative data on the accuracy of these algorithms.
Methods
This systematic review is reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement.Footnote 17 Ethics approval was not required, as primary data were not collected.
Protocol and registration
The protocol for this systematic review has been registered in the International Prospective Register of Systematic Reviews (PROSPERO 2019 CRD42019146941), was published on 16 December 2019 and is available online.
Search strategy
A systematic search of MEDLINE, Embase, Global Health and PsycInfo was conducted to identify all validation studies using administrative data to ascertain cases of ASD, ADHD or FASD published in English or French from January 1995 to March 2021. A start year of 1995 was chosen for the database searches in order to align with the start year for data collection in the Canadian Chronic Disease Surveillance System, a collaborative network of provincial and territorial health administrative surveillance systems supported by the Public Health Agency of Canada. A reference librarian developed the search strategy using medical subject headings and keywords related to the target conditions (e.g. "exp autism spectrum disorder/"), administrative data (e.g. "exp insurance, health/"), reference standard (e.g. "exp medical records/") and validation testing (e.g. "sensitivity and specificity/"). The initial search strategy was developed in MEDLINE and was peer reviewed before being adopted for the other databases (Appendix A). Additionally, the grey literature was searched via two mechanisms: an advanced Google search and searching websites of relevant agencies and organizations. Furthermore, reference lists of relevant surveillance reports found in the grey literature as well as articles that met the eligibility criteria of the review were manually searched for additional studies.
Eligibility criteria
To be included, studies of any design type had to report
- the assessment or validation of one or more health administrative database algorithms against a reference standard (i.e. established clinical criteria, medical record diagnosis, electronic medical record or patient self-report measure) for identifying a case with ASD, ADHD or FASD; and
- at least one measure of diagnostic accuracy (i.e. sensitivity, specificity, PPV, NPV, area under the receiver operating characteristic curve [or C-statistic], Youden's index, kappa statistic or likelihood ratio).
An administrative database algorithm was defined as a set of rules for identifying disease cases from administrative data, with elements including type of data source, number of years of administrative data, diagnostic or medication code(s) and number of administrative data records (i.e. contacts) with diagnostic or medication code(s).Footnote 13 While the administrative database algorithm had to include health administrative data, it could also include other types of administrative data, such as education or social services data.
These algorithms could be based on administrative data from either a health administrative database or a clinical or health information system. A health administrative database was defined as information that is routinely or passively collected solely for administrative purposes in managing the health care of patients,Footnote 18 and a clinical/health information system was defined as administrative data supplemented with detailed clinical information by way of electronic health records.Footnote 19
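As an illustration of the kind of rule such a definition describes, the Python sketch below implements a hypothetical algorithm of the form "at least one hospital discharge abstract, or at least two physician claims within a two-calendar-year window, carrying an ASD diagnostic code". The record fields, code prefixes and thresholds are assumptions made for this example and are not drawn from any particular included study.

```python
# Hypothetical sketch of an administrative database algorithm of the kind defined
# above. Field names, code prefixes and thresholds are illustrative assumptions.

ASD_CODE_PREFIXES = ("299", "F84")  # illustrative ICD-9 / ICD-10-CA code prefixes

def has_asd_code(record):
    """True if the record's diagnostic code starts with an ASD-related prefix."""
    return record["code"].startswith(ASD_CODE_PREFIXES)

def meets_algorithm(records):
    """records: iterable of dicts with 'source', 'code' and 'year' keys."""
    coded = [r for r in records if has_asd_code(r)]
    if any(r["source"] == "hospital_discharge" for r in coded):
        return True  # a single coded hospital discharge abstract suffices
    claim_years = sorted(r["year"] for r in coded if r["source"] == "physician_claim")
    # Otherwise require at least two coded physician claims within two calendar years
    return any(later - earlier <= 1 for earlier, later in zip(claim_years, claim_years[1:]))

# Example: two coded physician claims one calendar year apart satisfy the rule
example = [{"source": "physician_claim", "code": "299.0", "year": 2010},
           {"source": "physician_claim", "code": "F84.0", "year": 2011}]
print(meets_algorithm(example))  # True
```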
Abstracts, editorials and commentaries were excluded from the review, as well as studies published before 1995 or in a language other than English or French.
Study selection and data extraction
Two reviewers (CL and SO) independently screened the titles and abstracts of all bibliographic records and articles identified through electronic database searches, grey literature and reference lists of surveillance reports for eligibility. When consensus could not be reached on a given study, it was retained for the next stage of screening. For every study that passed the title and abstract level of screening, full-text articles were assessed for eligibility by two reviewers (ML and SO) independently and the reason for exclusion was recorded. When reviewers did not agree on the inclusion or exclusion of an article, a third reviewer (CL) was consulted. The reference lists of all articles that passed full-text review were manually searched using the same two-level screening process conducted by two reviewers (SO and SP).
Relevant information was extracted from the included articles using a template that was developed for this systematic review and piloted before use. Extracted information included author, year, geographic location, study cohort, type of administrative data source(s), administrative database algorithm(s) and related elements, reference standard, reference diagnostic criteria and measures of diagnostic accuracy. One reviewer (ML) completed the extraction, which was verified by a second (SO). Any disagreement was resolved by consensus, or when required, by a third party (CL).
Reporting quality, risk of bias and applicability assessments
Included studies underwent a reporting quality assessment using the 40-point, modified Standards for the Reporting of Diagnostic Accuracy Studies (STARD) checklistFootnote 12 (Appendix B) and risk of bias and applicability assessments using the revised Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) toolFootnote 20 (Appendix C). These assessments were completed by one reviewer (ML) and verified by a second (SO). Any disagreements were resolved by consensus, or if necessary, by a third reviewer (CL).
Data synthesis and analysis
Extensive heterogeneity in study design and conduct across the included studies precluded a quantitative synthesis; therefore, results were synthesized narratively using text and tables for the conditions of interest. Findings both within and between studies were explored as per guidance from the Centre for Reviews and Dissemination.Footnote 21 While the diagnostic accuracy of the administrative database algorithms in all included studies was considered, our final recommendations also took the reporting quality, risk of bias and applicability concerns of each study into account.
Results
Search results
The PRISMA flow diagram in Figure 1 documents the study screening process.Footnote 17 A total of 5918 records were identified through database searching and 11 additional records through other sources (grey literature and surveillance report reference lists). After duplicates were removed, 4133 records identified from database searching were screened (title and abstract), of which 4085 were deemed ineligible and excluded. Of the remaining 48 records that underwent a full-text review for eligibility, 34 were excluded (17 did not use a health administrative database, 6 used a health administrative database, but were not validation studies, 8 did not validate a health administrative database algorithm, and 3 were excluded for other reasons) and 14 records (studies) were included in the review. None of the 11 records identified through grey literature and surveillance report reference lists were included in the review. No additional studies were found from manually searching reference lists of included articles.

Figure 1 - Text description
This figure illustrates the PRISMA flow diagram for the identification, screening and inclusion of studies in the evidence synthesis.
Studies were identified via databases and registers as follows:
- n = 5918 records were identified from databases, of which n = 1785 were duplicate records that were removed before screening.
- n = 4133 records were screened, of which n = 4085 records were excluded.
- n = 48 reports were sought for retrieval, of which n = 0 reports were not retrieved.
- n = 48 reports were assessed for eligibility, of which the following reports were excluded:
- Did not use health admin database (n = 17)
- Used health admin database, but not a validation study (n = 6)
- Did not validate health admin database algorithm(s) (n = 8)
- Other (e.g. abstract, editorial) (n = 3)
Studies were identified via other methods as follows:
- n = 11 records were identified from grey literature and surveillance report reference lists
- n = 11 reports were sought for retrieval, of which n = 0 reports were not retrieved
- n = 11 reports were assessed for eligibility, of which the following reports were excluded:
- Duplicate of database search results (n = 5)
- Did not include conditions of interest i.e. ASD, ADHD and/or FASD (n = 1)
- Did not use health admin database (n = 2)
- Did not validate health admin database algorithm(s) (n = 3)
This resulted in n = 14 studies being included in the review.
Note: PRISMA template from Page MJ et al.Footnote 17
Characteristics of included studies
The characteristics of the 14 included studiesFootnote 22Footnote 23Footnote 24Footnote 25Footnote 26Footnote 27Footnote 28Footnote 29Footnote 30Footnote 31Footnote 32Footnote 33Footnote 34Footnote 35 are provided in Table 1. Ten studies focussed on ASDFootnote 22Footnote 23Footnote 24Footnote 25Footnote 26Footnote 27Footnote 28Footnote 29Footnote 30Footnote 31 and the remaining four on ADHD.Footnote 32Footnote 33Footnote 34Footnote 35 There were no studies identified for FASD.
First author, year, country | Validation population and age | Sample size | Administrative data source(s) | Years of administrative data | Diagnostic codes included in algorithm(s) |
---|---|---|---|---|---|
ASD | | | | | |
Bickford, 2020Footnote 22 Canada | Children aged 1 to 14 years, born in British Columbia between 1 April 2000 and 31 December 2009, assessed in one of the British Columbia Autism Assessment Network centres between 1 April 2004 and 31 December 2014 or with a Ministry of Education designation of ASD between 1 September 2004 and 30 June 2015. 1-14 years. | 8670 (cases and non-cases) | Health administrative databasesFootnote a: hospital discharge abstracts and physician claims | 2000-2014 | Hospital discharge abstracts: ICD-9 299.x; ICD-10-CA F84.x. Physician claims: ICD-9 299.x |
Brooks, 2021Footnote 23 Canada | Children and youth aged 1 to 24 as of 31 December 2011, within the Electronic Medical Record Primary Care database (from over 350 Ontario family physicians) with a valid date of birth, registered with an active/practising physician who has used EMR for more than 2 years, alive as of load date, and present in EMR for at least 1 year. 1-24 years. | 10 000 (cases and non-cases) | Health administrative databasesFootnote a: hospital discharge abstracts, emergency department visits, outpatient surgery, physician claims | NR | Hospital discharge abstracts/emergency department visits/outpatient surgery: ICD-9 299.x; ICD-10-CA F84.x. Physician claims: OHIP physician billing code 299 |
Brooks, 2021Footnote 24Footnote b Canada | Children and youth aged 1 to 24 as of 31 December 2011, within the Electronic Medical Record Primary Care database (from over 350 Ontario family physicians) with a valid date of birth, registered with an active/practising physician who has used EMR for more than 2 years, alive as of load date, and present in EMR for at least 1 year. 1-24 years. | 10 000 (cases and non-cases) | Clinical/health information systemFootnote c: electronic health records | NR | OHIP physician billing codes 299, 315 |
Burke, 2014Footnote 25 USA | Children and youth aged 2 to 20 years at time of first ASD or ASD-associated claim, insured through a large national private health plan. Eligible cases had to have at least 6 months of continuous enrolment pre- and post-first ASD or ASD-associated claim and not have any claims with a diagnosis of childhood disintegrative disorder or Rett's syndrome. 2-20 years. | 432 (cases and non-cases) | Health administrative databaseFootnote a: private medical, pharmacy, and behavioural insurance claims | 2001-2009 | ASD: ICD-9 299.00-299.01, 299.80-299.81, 299.9 |
Coleman, 2015Footnote 26 USA | Children and youth aged < 18 years with current membership in one of the participating health care plans as of December 2010, with at least one ASD diagnostic code, that were not diagnosed in a specialty ASD centre. < 18 years. | 1272 (cases only) | Health administrative databaseFootnote a and clinical/health information systemFootnote c: insurance claims and electronic health records | 1995-2010 | ICD-9 299.0, 299.9, 299.8 |
Coo, 2017Footnote 27 Canada | Children aged 2-14 years, born between 1997 and 2009 with an administrative diagnosis of ASD and/or who were confirmed as a case by a Manitoba child/youth behavioural or disability service provider on or before 31 December 2011. 2-14 years. | 2610 (cases only) | Health administrative databasesFootnote a and education database: hospital discharge abstracts, physician claims, mental health services and education data | 1997-2011 | Hospital discharge abstracts: ICD-9-CM 299.0, 299.8, 299.9; ICD-10-CA F84.0, F84.1, F84.5, F84.8, F84.9 (in any diagnostic field). Education data: CATEGORYN=ASD (child received funding under special needs category for ASD) |
Dodds, 2009Footnote 28 Canada | Children born between 1989 and 2002, and assessed for ASD by a team of specialists between 2001 and 2005. Age not provided. | 264 (cases and non-cases) | Health administrative databasesFootnote a: hospital discharge abstracts, physician claims and mental health outpatient data | 1989-2005 | ICD-9 299.x or ICD-10 F84.x (primary or secondary diagnostic field) |
Hagberg, 2017Footnote 29 United Kingdom | Singleton children born between 1990 and 2011, with at least three years of follow-up from birth. Age not provided. | 37 (cases only) | Clinical/health information systemFootnote c: electronic health records | 1990-2014 | Read codes: E140.00, E140000, E140100, E140.12, E140.13, E140z00, Eu84000, Eu84011, Eu84012, Eu84100, Eu84z11, Eu84500, Eu84.00, Eu84y00, Eu84z00 |
Lauritsen, 2010Footnote 30 Denmark | Children born between 1990 and 1999, whose parent(s) or legal guardian(s) resided in Denmark, with a reported diagnosis of childhood autism. Age not provided. | 499 (cases only) | Health administrative databaseFootnote a: psychiatric inpatient and outpatient data | 1990-2001 | ICD-8 299.00 or ICD-10 F84.0 (main or subsidiary diagnosis) |
Surén, 2019Footnote 31 Norway | Children born 1999-2009, enrolled in the Norwegian Mother, Father and Child Cohort Study, with a reported autism diagnosis in Norwegian Patient Registry between 2008 and 2014, aged 5-15 years at end of follow-up, with patient records available and who did not undergo a clinical assessment as part of the Autism Study. 5-15 years. | 553 (cases only) | Health administrative databaseFootnote a: mental health care provider, somatic hospital, and specialist private consultant data | 2008-2014 | ICD-10 F84.x |
ADHD | | | | | |
Daley, 2014Footnote 32 USA | Children aged 3-9 years at time of first diagnosis, insured at one of eight managed care organizations or who sought care at one of two community health sites between 2004 and 2010, who met the case definition for incident ADHD and were without a diagnosis of mental retardation or pervasive developmental disorder. 3-9 years. | 500 (cases only) | Clinical/health information systemFootnote c: electronic health records | 2004-2010 | ICD-9-CM 314.0x |
Gruschow, 2016Footnote 33 USA | Patients of the Children's Hospital of Philadelphia health care network, born between 1987 and 1995 (median age 17.9 years) with ≥ 2 visits and who were New Jersey residents at the time of their last visit, that were not identified as having an intellectual disability and had their last visit at ≥ 12 years of age. Children with a recorded ADHD diagnosis in their electronic health record vs. children without were identified. Median age (IQR): 17.9 (15.9-19.1) years. | 2030 (cases); 807 (non-cases) | Clinical/health information systemFootnote c: electronic health records | 2001+ | ICD-9-CM 314.x |
Mohr-Jensen, 2016Footnote 34 Denmark | Children and youth aged 4-15 years with a reported diagnosis of hyperkinetic disorder, diagnosed for the first time in 1995-2005. 4-15 years. | 372 (cases only) | Health administrative databaseFootnote a: psychiatric hospital data | 1995-2005 | ICD-10 F90.x |
Morkem, 2020Footnote 35 Canada | Children and adults aged 4 and older identified from a single clinic, with a valid entry for year of birth and gender, and a primary care encounter in the year of study or previous year (from 2008-2015). Patients with certain medical conditions were excluded. ≥ 4 years. | 246 (cases); 246 (non-cases) | Clinical/health information systemFootnote c: electronic health records | NR | ICD-9 314.x |
ASD studies
Studies that validated algorithms to identify ASD were published between 2009Footnote 28 and 2021.Footnote 23Footnote 24 Five studies were performed in Canada,Footnote 22Footnote 23Footnote 24Footnote 27Footnote 28 two in the United States,Footnote 25Footnote 26 one in the United Kingdom,Footnote 29 one in DenmarkFootnote 30 and one in Norway.Footnote 31 All 10 studies included children and youth as their study population,Footnote 22Footnote 23Footnote 24Footnote 25Footnote 26Footnote 27Footnote 28Footnote 29Footnote 30Footnote 31 although only seven reported the age range.Footnote 22Footnote 23Footnote 24Footnote 25Footnote 26Footnote 27Footnote 31
Validation cohort sample sizes ranged from 37Footnote 29 to 10 000.Footnote 23Footnote 24 Patients were initially selected from diagnostic codes in the administrative database for five studiesFootnote 25Footnote 26Footnote 29Footnote 30Footnote 31 and for one of the two samples used in one study.Footnote 27 Only five studies included a comparator group without ASD.Footnote 22Footnote 23Footnote 24Footnote 25Footnote 28 The prevalence of ASD in the validation cohort ranged from 1.1%Footnote 23Footnote 24 to 67.9%.Footnote 22
Six studies used health administrative databases,Footnote 22Footnote 23Footnote 25Footnote 28Footnote 30Footnote 31 two used a clinical/health information system,Footnote 24Footnote 29 one used both health administrative databases and a clinical/health information systemFootnote 26 and one used health administrative databases combined with an education data source.Footnote 27 The most commonly used data source was a combination of outpatient and inpatient data.Footnote 22Footnote 23Footnote 28Footnote 30Footnote 31
A variety of diagnostic codes were used: International Classification of Diseases, Eighth Revision (ICD-8),Footnote 30 International Classification of Diseases, Ninth Revision (ICD-9),Footnote 22Footnote 23Footnote 25Footnote 26Footnote 27Footnote 28 International Classification of Diseases, Tenth Revision (ICD-10),Footnote 22Footnote 23Footnote 27Footnote 28Footnote 30Footnote 31 Ontario Health Insurance Plan physician billing codes,Footnote 23Footnote 24 Read codesFootnote 29 and unique codes for education and mental health services.Footnote 27 The number of algorithms validated within each study ranged from 1Footnote 29Footnote 30Footnote 31 to 153.Footnote 23
Several reference standards were used, with the most common being a medical chart diagnosis.Footnote 23Footnote 24Footnote 27Footnote 29
The PPV was the most commonly reported measure and was reported in 9 of the 10 studies.Footnote 22Footnote 23Footnote 24Footnote 25Footnote 26Footnote 27Footnote 29Footnote 30Footnote 31 Only three studies reported at least four measures of diagnostic accuracy.Footnote 22Footnote 23Footnote 24
ADHD studies
Studies that validated algorithms to identify ADHD were published between 2014Footnote 32 and 2020.Footnote 35 Two studies were performed in the USA,Footnote 32Footnote 33 one in CanadaFootnote 35 and one in Denmark.Footnote 34 Three studies included children and youth as their study population,Footnote 32Footnote 33Footnote 34 two of which reported the age range.Footnote 32Footnote 34 One study included adults and children aged four years and older.Footnote 35
Validation cohort sample sizes ranged from 372Footnote 34 to 2837.Footnote 33 Patients were initially selected from diagnostic codes in the administrative data source for three studiesFootnote 32Footnote 33Footnote 34 and from diagnostic codes and medication prescriptions in the administrative data source for one study.Footnote 35 Only two studies included a comparator group without ADHD.Footnote 33Footnote 35 The prevalence of ADHD in the validation cohort ranged from 50.0%Footnote 35 to 56.7%.Footnote 33
One study used a health administrative database, including inpatient and outpatient psychiatric hospital dataFootnote 34 and three studies used a clinical/health information system, specifically, electronic health records.Footnote 32Footnote 33Footnote 35
Two studies used ICD-9 codes,Footnote 32Footnote 33 one used ICD-10 codes,Footnote 34 and one used ICD-9 codes and medication prescriptions.Footnote 35 Each study validated one algorithm only.Footnote 32Footnote 33Footnote 34Footnote 35 One study captured incident, rather than prevalent, cases of ADHD.Footnote 32
Various reference standards were used. One study used clinical classification criteria documented in the medical chart.Footnote 34 One used a medical chart ADHD diagnosis.Footnote 35 One used a clinical case definition that required a combination of evidence from the electronic health record, and in the absence of this evidence, a manual review of the electronic health record.Footnote 33 Lastly, one used a combination of clinical classification criteria, medical record diagnosis and standardized screening checklist documented in the medical chart.Footnote 32
The PPV was reported in all four studies,Footnote 32Footnote 33Footnote 34Footnote 35 while only one reported at least four measures of diagnostic accuracy.Footnote 33
Reporting quality of included studies
The number and percentage of included studies meeting reporting criteria using the modified STARD checklist for validating health administrative data are summarized in Table 2.Footnote 12 The quality of reporting was variable. Highlighted below are areas where the reporting quality was especially suboptimal, that is, where less than half of the studies met the criterion. For full details of the reporting quality results for each included study, see Appendix B.
Section, topic and item | Frequency (%): ASD studiesFootnote b | Frequency (%): ADHD studiesFootnote c |
---|---|---|
Title, keywords, abstract | ||
1. Identifies article as study of assessing diagnostic accuracy? | 10 (100) | 4 (100) |
2. Identifies article as study of administrative data? | 8 (80) | 3 (75) |
Introduction | ||
3. States disease identification and validation as one of goals of study? | 10 (100) | 4 (100) |
Methods | ||
Participants in validation cohort | ||
4. Describes validation cohort (cohort of patients to which reference standard was applied)? | 10 (100) | 4 (100) |
4a. Age? | 10 (100) | 4 (100) |
4b. Disease? | 10 (100) | 4 (100) |
4c. Severity? | 0 (0) | 0 (0) |
4d. Location/jurisdiction? | 7 (70) | 2 (50) |
5. Describes recruitment procedure of validation cohort? | 10 (100) | 4 (100) |
5a. Inclusion criteria? | 10 (100) | 4 (100) |
5b. Exclusion criteria? | 5 (50) | 4 (100) |
6. Describes patient sampling (random, consecutive, all, etc.)? | 9 (90) | 4 (100) |
7. Describes data collection? (n = 8 ASD studies) | 8 (100) | 4 (100) |
7a. Who identified patients and ensured selection adhered to patient recruitment criteria? (n = 8 ASD studies) | 8 (100) | 4 (100) |
7b. Who collected data? (n = 8 ASD studies) | 8 (100) | 4 (100) |
7c. A priori data collection form? (n = 8 ASD studies) | 5 (62.5) | 2 (50) |
7d. How was disease classified? | 10 (100) | 3 (75) |
8. Was there a split sample (i.e. re-validation using a separate cohort)? | 1 (10) | 0 (0) |
Test methods | ||
9. Describe number, training and expertise of persons reading reference standard? (n = 8 ASD studies) | 6 (75) | 3 (75) |
10. If > 1 person reading reference standard, measure of consistency is reported (e.g. kappa)? (n = 6 ASD studies; n = 3 ADHD studies) | 3 (50) | 2 (66.7) |
11. Were the readers of the reference (validation) test blinded to the results of the classification by administrative data for that patient? (e.g. Was the reviewer of the charts blinded to how that chart was billed?) (n = 8 ASD studies) | 3 (37.5) | 1 (25) |
Statistical methods | ||
12. Describe methods of calculating/comparing diagnostic accuracy? | 10 (100) | 3 (75) |
Results | ||
Participants | ||
13. Report when study done, start/end dates of enrolment? | 8 (80) | 2 (50) |
14. Describe number of people who satisfied inclusion/exclusion criteria? | 10 (100) | 4 (100) |
15. Study flow diagram? | 4 (40) | 3 (75) |
Test results | ||
16. Report distribution of disease severity? | 0 (0) | 0 (0) |
17. Report cross-tabulation of index tests by results of reference standard? | 9 (90) | 3 (75) |
Estimates | ||
18. Reports at least 4 estimates of diagnostic accuracy? (Estimates reported in included studies) | 3 (30) | 1 (25) |
18a. Sensitivity | 5 (50) | 1 (25) |
18b. Specificity | 4 (40) | 1 (25) |
18c. PPV | 9 (90) | 4 (100) |
18d. NPV | 4 (40) | 2 (50) |
18e. Likelihood ratios | 0 (0) | 0 (0) |
18f. Kappa | 1 (10) | 1 (25) |
18g. Area under the ROC curve / C-statistic | 2 (20) | 0 (0) |
18h. Accuracy/agreement | 0 (0) | 1 (25) |
19. Was the accuracy reported for any subgroups (e.g. age, geography, different sex etc.)? | 4 (40) | 1 (25) |
20. If PPV/NPV reported, does ratio of cases/controls of validation cohort approximate prevalence of condition in the population? (n = 9 ASD studies) | 2 (22.2) | 0 (0) |
21. Reports 95% CIs for each diagnostic accuracy measure? | 6 (60) | 3 (75) |
Discussion | ||
22. Discusses the applicability of the findings? | 10 (100) | 4 (100) |
ASD studies
Concerning the methods used, none of the ASD studies described the severity of the patients, only one re-validated the algorithms using a separate cohortFootnote 23 and just three of eight that included reviewers of the reference standard reported that the reviewers were blinded to the patient classification by administrative data.Footnote 23Footnote 24Footnote 27 In terms of the results, only four included a study flow diagram,Footnote 23Footnote 24Footnote 25Footnote 31 none reported test results by disease severity, just three reported at least four measures of diagnostic accuracy,Footnote 22Footnote 23Footnote 24 only four reported the diagnostic accuracy by subgroup of interestFootnote 25Footnote 26Footnote 27Footnote 28 and just two of nine that reported the PPV and/or NPV reported a ratio of cases to controls in the validation cohort that approximated the prevalence of ASD in the population.Footnote 23Footnote 24
ADHD studies
With respect to the methods used, none of the ADHD studies described the severity of the patients, none re-validated the algorithms using a separate cohort and only one reported that the reviewers of the reference standard were blinded to the administrative data classification.Footnote 35 Concerning the results, none reported test results by disease severity, just one reported at least four measures of diagnostic accuracy,Footnote 33 only one stated the diagnostic accuracy by subgroups of interest,Footnote 32 and none reported a ratio of cases to controls in the validation cohort that approximated the prevalence of ADHD in the population.
Risk of bias and applicability concerns of included studies
An overview of the risk of bias and applicability concerns of the included studies by QUADAS-2 domain is shown in Figure 2.Footnote 20 Assessments revealed either "high" or "unclear" risk of bias in patient selection, reference standard and flow and timing domains in 5 or more of the 14 studies. All studies had a low risk of bias on the index test domain because of the objectivity of administrative database algorithms. There were no applicability concerns with respect to the patient selection, index test or reference standard differing from the review question. For complete risk of bias and applicability assessments for each included study, see Appendix C.

Figure 2 - Text description
Study | Risk of bias: Patient selection | Risk of bias: Index test | Risk of bias: Reference standard | Risk of bias: Flow and timing | Applicability concerns: Patient selection | Applicability concerns: Index test | Applicability concerns: Reference standard |
---|---|---|---|---|---|---|---|
ASD | |||||||
Bickford, 2020Footnote 22 | X | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Brooks, 2021Footnote 23 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Brooks, 2021Footnote 24 | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Burke, 2014Footnote 25 | X | ✓ | ? | ✓ | ✓ | ✓ | ✓ |
Coleman, 2015Footnote 26 | ? | ✓ | ✓ | X | ✓ | ✓ | ✓ |
Coo, 2017Footnote 27 | XFootnote b / ✓Footnote c | ✓ | ?Footnote b / XFootnote c | X | ✓ | ✓ | ✓ |
Dodds, 2009Footnote 28 | ✓ | ✓ | ✓ | X | ✓ | ✓ | ✓ |
Hagberg, 2017Footnote 29 | ✓ | ✓ | ? | ✓ | ✓ | ✓ | ✓ |
Lauritsen, 2010Footnote 30 | ✓ | ✓ | ✓ | X | ✓ | ✓ | ✓ |
Surén, 2019Footnote 31 | X | ✓ | ✓ | X | ✓ | ✓ | ✓ |
ADHD | |||||||
Daley, 2014Footnote 32 | X | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
Gruschow, 2016Footnote 33 | X | ✓ | X | X | ✓ | ✓ | ✓ |
Mohr-Jensen, 2016Footnote 34 | ✓ | ✓ | ✓ | X | ✓ | ✓ | ✓ |
Morkem, 2020Footnote 35 | X | ✓ | ? | ✓ | ✓ | ✓ | ✓ |
Footnote b: The risk of bias for the portion of the study evaluating sensitivity using the "sensitivity cohort" from the study.
Footnote c: The risk of bias for the portion of the study evaluating positive predictive value (PPV) using children with an administrative diagnosis of ASD.
ASD studies
Patient selection
Three studies had a high risk of bias,Footnote 22Footnote 25Footnote 31 one had a high risk in one of the two samples within the study,Footnote 27 and one had an unclear risk.Footnote 26 These evaluations were due to the sampling approach,Footnote 25Footnote 27 insufficient information,Footnote 26 the use of a case-control designFootnote 22 or inappropriate exclusions.Footnote 31
Reference standard
Two studies had an unclear risk of bias,Footnote 25Footnote 29 and one study had an unclear risk in one of its two samples.Footnote 27 These judgments were due to insufficient information about the rigour of the reference standard,Footnote 27Footnote 29 a lack of information as to whether the reviewers were blinded to the results of the algorithmFootnote 25 or the reference standard being partly based on parent-reported diagnosis.Footnote 27
Flow and timing
Five studies had a high risk of bias,Footnote 26Footnote 27Footnote 28Footnote 30Footnote 31 as not all patients were included in the analysis,Footnote 26Footnote 28Footnote 30Footnote 31 or not all patients were evaluated using the same reference standard.Footnote 27
ADHD studies
Patient selection
Three studies had a high risk of bias due to inappropriate exclusions.Footnote 32Footnote 33Footnote 35
Reference standard
One study had a high risk of bias, as the reference standard was not likely to classify cases correctly and reviewers were not blinded to the algorithm results,Footnote 33 and one study had an unclear risk of bias due to insufficient information.Footnote 35
Flow and timing
Two studies had a high risk of bias,Footnote 33Footnote 34 as not all patients were included in the analysisFootnote 34 or not all patients were evaluated using the same reference standard.Footnote 33
Diagnostic accuracy of administrative database algorithms
Given the heterogeneity found in study design and conduct across the included studies, the following synthesis highlights findings on the diagnostic accuracy of algorithms tested within, rather than between, studies. The diagnostic accuracy estimates of the algorithms varied substantially between studies, likely due to the observed between-study heterogeneity. Sources of this heterogeneity included differences in how cases were initially selected, administrative data sources, reference standards and algorithms tested. For example, two studiesFootnote 23Footnote 24 with the same validation cohort and similar algorithms but different administrative data sources (health administrative database vs. clinical/health information system) observed very different performance metrics, namely sensitivity and PPV. For the diagnostic accuracy of the algorithms validated in each included study, refer to Table 3.
First author, year, country | Reference standard | Administrative database algorithm(s) | Measures of diagnostic accuracy (95% CI)Footnote a |
---|---|---|---|
ASD | |||
Bickford, 2020Footnote 22 Canada |
Clinical diagnosis: Clinical data on ASD status from either the British Columbia Autism Assessment Network or the Ministry of Education. All diagnoses made using a standard approach based on DSM criteria, utilizing direct assessment of the child, information provided by family, and any other relevant information. Diagnoses were made by clinicians using the Autism Diagnostic Observation Schedule (ADOS) and the Autism Diagnostic Interview-Revised (ADI-R). |
≥ 1 hospital discharge or ≥ 1 physician claim | SENS: 75% (74%-76%), SPEC: 67% (65%-69%), PPV: 82.7%, NPV: 55.6%, C-stat: 0.71 (0.70-0.72), Kappa: 0.40 (0.38-0.42) |
≥ 1 physician claim | SENS: 74% (72%-75%), SPEC: 68% (66%-69%), PPV: 82.7%, NPV: 54.7%, C-stat: 0.71 (0.70-0.72), Kappa: 0.39 (0.37-0.41) | ||
≥ 1 hospital discharge or ≥ 2 general practitioner claims or ≥ 1 pediatrician claim or ≥ 1 psychiatrist/neurologist claim or ≥ 1 other specialist claim | SENS: 67% (66%-69%), SPEC: 71% (69%-73%), PPV: 83.1%, NPV: 50.8%, C-stat: 0.69 (0.68-0.70), Kappa: 0.35 (0.33-0.37) | ||
≥ 1 hospital discharge or ≥ 3 general practitioner claims or ≥ 1 pediatrician claim or ≥ 1 psychiatrist/neurologist claim or ≥ 1 other specialist claim | SENS: 64% (63%-66%), SPEC: 73% (71%-74%), PPV: 83.2%, NPV: 49.1%, C-stat: 0.68 (0.67-0.69), Kappa: 0.33 (0.31-0.35) | ||
≥ 1 hospital discharge or ≥ 2 physician claims | SENS: 57% (56%-58%), SPEC: 84% (83%-86%), PPV: 88.4%, NPV: 48.2%, C-stat: 0.71 (0.70-0.72), Kappa: 0.35 (0.33-0.36) | ||
≥ 2 physician claims | SENS: 55% (54%-57%), SPEC: 85% (83%-86%), PPV: 88.5%, NPV: 47.3%, C-stat: 0.70 (0.69-0.71), Kappa: 0.33 (0.32-0.35) | ||
≥ 1 pediatrician claim | SENS: 54% (53%-56%), SPEC: 76% (75%-78%), PPV: 82.8%, NPV: 44.2%, C-stat: 0.65 (0.64-0.66), Kappa: 0.26 (0.24-0.28) | ||
≥ 1 hospital discharge or ≥ 2 general practitioner claims or ≥ 2 pediatrician claims or ≥ 2 psychiatrist/neurologist claims or ≥ 2 other specialist claims | SENS: 52% (51%-54%), SPEC: 86% (85%-87%), PPV: 88.7%, NPV: 46.1%, C-stat: 0.69 (0.68-0.70), Kappa: 0.31 (0.30-0.33) | ||
≥ 2 general practitioner claims or ≥ 2 pediatrician claims or ≥ 2 psychiatrist/neurologist claims or ≥ 2 other specialist claims | SENS: 50% (49%-52%), SPEC: 87% (85%-88%), PPV: 88.8%, NPV: 45.2%, C-stat: 0.68 (0.68-0.69), Kappa: 0.30 (0.28-0.31) | ||
≥ 1 general practitioner claim | SENS: 44% (42%-45%), SPEC: 89% (88%-90%), PPV: 89.5%, NPV: 42.8%, C-stat: 0.66 (0.66-0.67), Kappa: 0.26 (0.24-0.27) | ||
≥ 1 psychiatrist/neurologist claim | SENS: 14% (13%-15%), SPEC: 97% (96%-97%), PPV: 89.6%, NPV: 34.8%, C-stat: 0.55 (0.55-0.56), Kappa: 0.07 (0.07-0.08) | ||
Brooks, 2021Footnote 23 Canada |
Medical chart review-ASD diagnosis: |
≥ 1 hospital discharge or ≥ 1 emergency department visit or ≥ 1 outpatient surgery or ≥ 1 physician claim | SENS: 75.9% (68.0%-83.8%), SPEC: 98.9% (98.6%-99.1%), PPV: 42.9% (36.0%-49.8%), NPV: 99.7% (99.6%-99.8%) |
≥ 1 physician claim | SENS: 74.1% (66.0%-82.2%), SPEC: 98.9% (98.7%-99.1%), PPV: 42.6% (35.6%-49.5%), NPV: 99.7% (99.6%-99.8%) | ||
≥ 1 hospital discharge or ≥ 1 emergency department visit or ≥ 1 outpatient surgery or ≥ 1 physician claim by any specialist | SENS: 67.9% (59.2%-76.5%), SPEC: 99.0% (98.9%-99.2%), PPV: 44.7% (37.2%-52.2%), NPV: 99.6% (99.5%-99.8%) | ||
≥ 1 physician claim by any specialist | SENS: 66.1% (57.3%-74.8%), SPEC: 99.1% (98.9%-99.2%), PPV: 44.3% (36.8%-51.8%), NPV: 99.6% (99.5%-99.7%) | ||
≥ 1 hospital discharge or ≥ 1 emergency department visit or ≥ 1 outpatient surgery; or ≥ 2 physician claims in 3 years | SENS: 59.8% (50.7%-68.9%), SPEC: 99.3% (99.1%-99.5%), PPV: 49.3% (40.9%-57.7%), NPV: 99.5% (99.4%-99.7%) | ||
≥ 1 hospital discharge or ≥ 1 emergency department visit or ≥ 1 outpatient surgery; or ≥ 2 physician claims in 2 years | SENS: 57.1% (48.0%-66.3%), SPEC: 99.3% (99.1%-99.5%), PPV: 48.5% (40.0%-57.0%), NPV: 99.5% (99.4%-99.7%) | ||
≥ 1 hospital discharge or ≥ 1 emergency department visit or ≥ 1 outpatient surgery; or ≥ 2 physician claims in 3 years with ≥ 1 from any specialist | SENS: 53.6% (44.3%-62.8%), SPEC: 99.4% (99.2%-99.5%), PPV: 48.4% (39.6%-57.2%), NPV: 99.5% (99.3%-99.6%) | ||
≥ 1 hospital discharge or ≥ 1 emergency department visit or ≥ 1 outpatient surgery; or ≥ 2 physician claims in 2 years with ≥ 1 from any specialist | SENS: 52.7% (43.4%-61.9%), SPEC: 99.4% (99.2%-99.5%), PPV: 48.4% (39.5%-57.2%), NPV: 99.5% (99.3%-99.6%) | ||
≥ 1 hospital discharge or ≥ 1 emergency department visit or ≥ 1 outpatient surgery; or ≥ 3 physician claims in 3 years | SENS: 50.0% (40.7%-59.3%), SPEC: 99.6% (99.4%-99.7%), PPV: 56.6% (46.8%-66.3%), NPV: 99.4% (99.3%-99.6%) | ||
≥ 1 hospital discharge or ≥ 1 emergency department visit or ≥ 1 outpatient surgery; or ≥ 3 physician claims in 3 years with ≥ 1 from any specialist | SENS: 49.1% (39.8%-58.4%), SPEC: 99.6% (99.5%-99.7%), PPV: 57.9% (48.0%-67.8%), NPV: 99.4% (99.3%-99.6%) | ||
≥ 1 hospital discharge or ≥ 1 emergency department visit or ≥ 1 outpatient surgery; or ≥ 3 physician claims in 2 years | SENS: 45.5% (36.3%-54.8%), SPEC: 99.6% (99.4%-99.7%), PPV: 54.3% (44.2%-64.3%), NPV: 99.4% (99.2%-99.5%) | ||
≥ 1 hospital discharge or ≥ 1 emergency department visit or ≥ 1 outpatient surgery; or ≥ 3 physician claims in 2 years with ≥ 1 from any specialist | SENS: 45.5% (36.3%-54.8%), SPEC: 99.6% (99.5%-99.7%), PPV: 56.0% (45.8%-66.2%), NPV: 99.4% (99.2%-99.5%) | ||
Brooks, 2021Footnote 24Footnote b Canada |
Medical chart review-ASD diagnosis: |
≥ 1 physician claim (299 or 315) | SENS: 33.0% (24.4%-42.6%), SPEC: 98.8% (98.5%-99.0%), PPV: 23.4% (17.1%-30.8%), NPV: 99.2% (99.0%-99.4%) |
≥ 2 physician claims (299 or 315) in 3 years | SENS: 14.3% (8.4%-22.2%), SPEC: 99.8% (99.7%-99.9%), PPV: 44.4% (27.9%-61.9%), NPV: 99.0% (98.8%-99.2%) | ||
≥ 2 physician claims (299 or 315) in 2 years | SENS: 13.4% (7.7%-21.1%), SPEC: 99.8% (99.7%-99.9%), PPV: 45.5% (28.1%-63.6%), NPV: 99.0% (98.8%-99.2%) | ||
≥ 2 physician claims (299 or 315) in 1 year | SENS: 11.6% (6.3%-19.0%), SPEC: 99.9% (99.8%-99.9%), PPV: 52.0% (31.3%-72.2%), NPV: 99.0% (98.8%-99.2%) | ||
≥ 3 physician claims (299 or 315) in 3 years | SENS: 2.7% (0.6%-7.6%), SPEC: 99.9% (99.9%-100%), PPV: 33.3% (7.5%-70.1%), NPV: 98.9% (98.7%-99.1%) | ||
≥ 3 physician claims (299 or 315) in 2 years | SENS: 2.7% (0.6%-7.6%), SPEC: 99.9% (99.9%-100%), PPV: 33.3% (7.5%-70.1%), NPV: 98.9% (98.7%-99.1%) | ||
≥ 3 physician claims (299 or 315) in 1 year | SENS: 1.8% (0.2%-6.3%), SPEC: 100% (99.9%-100%), PPV: 33.3% (4.3%-77.7%), NPV: 98.9% (98.7%-99.1%) | ||
≥ 1 physician claim (299 only) | SENS: 28.6% (20.4%-37.9%), SPEC: 99.9% (99.9%-100%), PPV: 86.5% (71.2%-95.5%), NPV: 99.2% (99.0%-99.4%) | ||
≥ 2 physician claims (299 only) in 3 years | SENS: 12.5% (7.0%-20.1%), SPEC: 100% (99.9%-100%), PPV: 93.3% (68.1%-99.8%), NPV: 99.0% (98.8%-99.2%) | ||
≥ 2 physician claims (299 only) in 2 years | SENS: 11.6% (6.3%-19.0%), SPEC: 100% (99.9%-100%), PPV: 92.9% (66.1%-99.8%), NPV: 99.0% (98.8%-99.2%) | ||
≥ 2 physician claims (299 only) in 1 year | SENS: 9.8% (5.0%-16.9%), SPEC: 100% (99.9%-100%), PPV: 91.7% (61.5%-99.8%), NPV: 99.0% (98.8%-99.2%) | ||
≥ 3 physician claims (299 only) in 3 years | SENS: 1.8% (0.2%-6.3%), SPEC: 100% (99.9%-100%), PPV: 66.7% (9.4%-99.2%), NPV: 98.9% (98.7%-99.1%) | ||
≥ 3 physician claims (299 only) in 2 years | SENS: 1.8% (0.2%-6.3%), SPEC: 100% (99.9%-100%), PPV: 66.7% (9.4%-99.2%), NPV: 98.9% (98.7%-99.1%) | ||
≥ 3 physician claims (299 only) in 1 year | SENS: 0.9% (0.0%-4.9%), SPEC: 100% (100%-100%), PPV: 100% (2.5%-100%), NPV: 98.9% (98.7%-99.1%) | ||
Burke, 2014Footnote 25 USA |
Medical chart review-clinical classification criteria, ASD diagnosis: |
≥ 1 ASD-associated condition (no ASD insurance claim) | NPV (level 1 or 2): > 98% |
≥ 1 ASD insurance claim | PPV (level 1): 43.3% (38.2%-48.5%) PPV (level 1 or 2): 74.2% (69.4%-78.6%) |
≥ 2 ASD insurance claims | PPV (level 1): 60.9% (53.5%-68.1%) PPV (level 1 or 2): 87.4% (81.6%-91.8%) |
Coleman, 2015Footnote 26 USA |
Medical chart review-clinical classification criteria, ASD diagnosis: |
1 insurance claim or outpatient diagnosis | PPVFootnote c (confirmed): 27% PPVFootnote c (confirmed, probable and possible): 72% |
≥ 2 insurance claims or outpatient diagnoses, at least one day apart | PPVFootnote c (confirmed): 36% PPVFootnote c (confirmed, probable and possible): 87% |
Coo, 2017Footnote 27 Canada |
Medical chart review-ASD diagnosis: |
Ages 2-5: | Ages 2-5: |
≥ 1 physician claim or ≥ 1 "ASD" education code or ≥ 1 hospital discharge or ≥ 1 adolescent treatment centre diagnosis | SENS: 88% (84%-91%), minimum PPVFootnote d: 73% (68%-77%) | ||
≥ 1 physician claim or ≥ 1 "ASD" education code | SENS: 88% (83%-91%), minimum PPVFootnote d: 73% (69%-78%) | ||
≥ 1 physician claim | SENS: 85% (80%-88%), minimum PPVFootnote d: 73% (68%-77%) | ||
≥ 2 physician claims or ≥ 1 "ASD" education code | SENS: 57% (51%-62%), minimum PPVFootnote d: 89% (84%-93%) | ||
≥ 2 physician claims | SENS: 50% (45%-56%), minimum PPVFootnote d: 89% (83%-93%) | ||
Ages 6-9: | Ages 6-9: | ||
≥ 1 physician claim or ≥ 1 "ASD" education code or ≥ 1 hospital discharge or ≥ 1 adolescent treatment centre diagnosis | SENS: 90% (88%-93%), minimum PPVFootnote d: 65% (61%-68%) | ||
≥ 1 physician claim or ≥ 1 "ASD" education code | SENS: 89% (86%-92%), minimum PPVFootnote d: 65% (61%-68%) | ||
≥ 1 physician claim or ≥ 2 "ASD" education codes | SENS: 88% (85%-90%), minimum PPVFootnote d: 65% (62%-69%) | ||
≥ 2 physician claims or ≥ 1 "ASD" education code | SENS: 84% (81%-87%), minimum PPVFootnote d: 78% (75%-81%) | ||
≥ 2 physician claims or ≥ 2 "ASD" education codes | SENS: 81% (78%-84%), minimum PPVFootnote d: 80% (77%-84%) | ||
≥ 1 physician claim | SENS: 77% (73%-80%), minimum PPVFootnote d: 66% (62%-69%) | ||
≥ 1 "ASD" education code | SENS: 68% (64%-72%), minimum PPVFootnote d: 87% (84%-90%) | ||
≥ 2 "ASD" education codes | SENS: 66% (62%-70%), minimum PPVFootnote d: 88% (85%-91%) | ||
≥ 2 physician claims | SENS: 58% (54%-62%), minimum PPVFootnote d: 83% (79%-86%) | ||
Ages 10-14: | Ages 10-14: | ||
≥ 1 physician claim or ≥ 1 "ASD" education code or ≥ 1 hospital discharge or ≥ 1 adolescent treatment centre diagnosis | SENS: 88% (85%-90%), minimum PPVFootnote d: 60% (57%-63%) | ||
≥ 1 physician claim or ≥ 1 "ASD" education code | SENS: 86% (83%-88%), minimum PPVFootnote d: 61% (58%-63%) | ||
≥ 1 physician claim or ≥ 2 "ASD" education codes | SENS: 84% (82%-87%), minimum PPVFootnote d: 62% (59%-64%) | ||
≥ 2 physician claims or ≥ 1 "ASD"' education code | SENS: 80% (77%-83%), minimum PPVFootnote d: 70% (67%-73%) | ||
≥ 2 physician claims or ≥ 2 "ASD" education codes | SENS: 78% (75%-81%), minimum PPVFootnote d: 72% (69%-75%) | ||
≥ 1 physician claim | SENS: 73% (69%-76%), minimum PPVFootnote d: 64% (60%-70%) | ||
≥ 1 "ASD" education code | SENS: 73% (70%-76%), minimum PPVFootnote d: 75% (72%-78%) | ||
≥ 2 "ASD" education codes | SENS: 69% (66%-72%), minimum PPVFootnote d: 78% (74%-81%) | ||
≥ 2 physician claims | SENS: 56% (52%-59%), minimum PPVFootnote d: 78% (75%-82%) | ||
Dodds, 2009Footnote 28 Canada |
Clinical diagnosis: Clinical diagnosis by a team of ASD specialists; based on the Autism Diagnostic Interview-Revised, the Autism Diagnostic Observation Schedule and clinical judgment using DSM-IV-TR. |
≥ 1 hospital discharge or ≥ 1 physician claim or ≥ 1 mental health outpatient diagnosis | SENS: 69.3%, SPEC: 77.3%, C-stat: 0.76 |
≥ 1 hospital discharge or ≥ 1 physician claim | SENS: 62.5%, SPEC: 83.0%, C-stat: 0.74 | ||
≥ 1 physician claim | SENS: 59.7%, SPEC: 85.2%, C-stat: 0.72 | ||
≥ 1 hospital discharge or ≥ 2 physician claims or ≥ 2 mental health outpatient diagnoses | SENS: 42.6%, SPEC: 88.6%, C-stat: 0.67 | ||
≥ 1 hospital discharge or ≥ 2 physician claims | SENS: 36.9%, SPEC: 93.2%, C-stat: 0.65 | ||
≥ 1 mental health outpatient diagnosis | SENS: 16.5%, SPEC: 92.0%, C-stat: 0.54 | ||
≥ 1 hospital discharge diagnosis | SENS: 11.9%, SPEC: 97.7%, C-stat: 0.55 | ||
Hagberg, 2017Footnote 29 United Kingdom |
Medical chart review-ASD diagnosis: |
≥ 1 Read code | PPV: 91.9% |
Lauritsen, 2010Footnote 30 Denmark |
Medical chart review-clinical classification criteria: |
1 psychiatric inpatient or outpatient diagnosis | PPV: 97% (96%-99%) |
Surén, 2019Footnote 31 Norway |
Medical chart review-clinical classification criteria: |
≥ 1 diagnosis | PPV: 86% (83%-89%) |
ADHD | |||
Daley, 2014Footnote 32 USA |
Medical chart review-clinical classification criteria, ADHD diagnosis, standardized screening checklist: |
2 outpatient diagnoses, between 7 and 365 days apart (incident ADHD) | Ages 3-5 at diagnosis: |
Gruschow, 2016Footnote 33 USA |
Clinical case definition:Footnote f |
≥ 1 inpatient or outpatient diagnosis or problem listFootnote g diagnosis | SENS: 96%-97% (95%-97%), SPEC: 98%-99% (97%-99%), PPV: 83%-98% (81%-99%), NPV: 99% (99%-99%),Footnote h Kappa: 0.87 (0.75-0.99) |
Mohr-Jensen, 2016Footnote 34 Denmark |
Medical chart review-clinical classification criteria: |
1 psychiatric inpatient or outpatient diagnosis | PPV: 86.8% |
Morkem, 2020Footnote 35 Canada |
Medical chart review-ADHD diagnosis: |
≥ 1 medical visit (ICD code) and ≥ 1 prescription of ADHD-related medications or ≥ 2 medical visits (ICD code) | PPV: 95.9% (92.6%-98.0%), NPV: 96.3% (93.2%-98.3%) |
ASD studies
For studies on ASD, the diagnostic accuracy of the algorithms tested was summarized in three different ways.
By health administrative database algorithm
Seven studies tested and compared multiple algorithms, each requiring more or fewer diagnostic codes from a specific health administrative data source (i.e. physician claims) over a comparable time frame.Footnote 22Footnote 23Footnote 24Footnote 25Footnote 26Footnote 27Footnote 28 In general, these studies found that increasing the number of ASD diagnoses required from physician claims increased the specificity and PPV of the algorithm, at the expense of sensitivity. For example, one study found a sensitivity of 62.5% and specificity of 83.0% when using an algorithm that required at least one ASD code from either the hospital or physician claims database.Footnote 28 However, when the same algorithm required at least two ASD codes from the physician claims database, the specificity improved (93.2%) at the cost of a dramatic reduction in sensitivity (36.9%).
Three studies tested the value of additional health administrative data sources in their algorithms.Footnote 22Footnote 23Footnote 28 Two of these studies did not find a significant improvement in the diagnostic accuracy of those algorithms that required diagnostic codes from a combination of hospital discharge abstracts or physician claims (with or without emergency department visits or outpatient surgery) compared to physician claims alone.Footnote 22Footnote 23 One study that required at least one diagnostic code from one of three data sources (hospital discharge abstracts, mental health outpatient data or physician claims) increased the sensitivity of the algorithm by 9.6%, at the expense of specificity (7.9% decrease), compared to physician claims only.Footnote 28 Additionally, upon testing the accuracy of algorithms based on these three data sources separately (i.e. physician claims only, hospital discharge abstract only, mental health outpatient data only), the same study found that ASD diagnostic codes from physician claims led to the best-performing algorithm.
Two studies varied the number of years in which ASD diagnostic codes from physician claims were required in their algorithms (e.g. two or more codes in two years vs. two or more codes in three years).Footnote 23Footnote 24 Both of these studies found that increasing the number of years in which the codes could be found did not result in significant improvement in the diagnostic accuracy.
By reference standard
Of the 10 included studies, two varied the diagnostic criteria required for ASD case confirmation from more to less stringent.Footnote 25Footnote 26 Both of these studies found that when the evidence of ASD required in the medical chart was less stringent, the PPV increased substantially. For example, in one study the PPV increased from 27% to 72% for an algorithm requiring at least one ASD code, and from 36% to 87% for an algorithm requiring at least two ASD codes.Footnote 26
By combining education and health administrative data
Only one study validated algorithms using education and health administrative data to identify ASD.Footnote 27 In general, the algorithms that combined education and physician claims data demonstrated an improvement in sensitivity, but the PPV either remained the same or decreased slightly compared to algorithms based on physician claims data alone. For example, in the group aged 6 to 9 years, requiring at least one code from physician claims or education data, rather than at least one physician claim alone, substantially increased sensitivity (from 77% to 89%) with only a nominal decrease in PPV (from 66% to 65%). A similar pattern was observed in the oldest age group, those aged 10 to 14 years.
ADHD studies
For the ADHD studies, the diagnostic accuracy of the algorithms tested was summarized in the same ways; however, because none of these studies used education data in addition to health administrative data, any benefit of education data could not be assessed.
By health administrative database algorithm
All four studies tested and reported results for a single algorithm, and each algorithm included diagnostic codes from a single administrative data source.Footnote 32Footnote 33Footnote 34Footnote 35 As a result, there was not enough information to assess the impact of requiring more or fewer diagnostic codes, or of using additional data sources, on the identification of ADHD cases.
By reference standard
One of the four studies varied the diagnostic criteria required for the reference standard and presented results for more and less stringent requirements for incident ADHD case confirmation.Footnote 32 As more documented evidence of incident ADHD was required, the PPV decreased from 71.5% to 32.8% in children aged 3 to 5 years at diagnosis and from 73.6% to 30.9% in children aged 6 to 9 years.
Discussion
A total of 14 studies met our eligibility criteria,Footnote 22Footnote 23Footnote 24Footnote 25Footnote 26Footnote 27Footnote 28Footnote 29Footnote 30Footnote 31Footnote 32Footnote 33Footnote 34Footnote 35 of which 10 focussed on the validation of administrative database algorithms to identify ASDFootnote 22Footnote 23Footnote 24Footnote 25Footnote 26Footnote 27Footnote 28Footnote 29Footnote 30Footnote 31 and four on ADHD.Footnote 32Footnote 33Footnote 34Footnote 35 Six of the 14 studies were conducted in Canada and generally had higher reporting quality and lower risk of bias than studies from other countries.Footnote 22Footnote 23Footnote 24Footnote 27Footnote 28Footnote 35 No studies that met the eligibility criteria for our review were identified for FASD. Other important gaps included the lack of validation studies in adults and the lack of studies identifying incident, rather than prevalent, cases.
While there have been efforts to use health administrative data to estimate the prevalence of FASD in Canada,Footnote 36 this work has been done in the absence of any validated administrative database algorithms. The lack of published validation studies for FASD may be connected to several fundamental issues related to assigning an FASD diagnosis, including:
- the need for a multidisciplinary assessment and knowledge of prenatal alcohol exposure;Footnote 37Footnote 38
- the lack of diagnostic criteria or detailed description for the diagnosis "neurodevelopmental disorder associated with prenatal alcohol exposure" in the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition;Footnote 1Footnote 39 and
- the non-specific nature of the International Classification of Diseases diagnostic codes that can be used for a primary FASD diagnosis,Footnote 39 and the level of coding specificity required to capture the diagnostic entities associated with FASD, which is not always available within health administrative databases.
Given the evolving nature of FASD diagnostic practices and coding, the use of administrative database algorithms to identify FASD cases currently poses some unique challenges.
Results from the quality assessments revealed "high" or "unclear" risk of bias in at least one domain of the QUADAS-2 tool for 12 of the 14 studies,Footnote 22Footnote 25Footnote 26Footnote 27Footnote 28Footnote 29Footnote 30Footnote 31Footnote 32Footnote 33Footnote 34Footnote 35 indicating the measures of diagnostic accuracy should be interpreted with caution. Furthermore, significant heterogeneity in terms of study design and conduct across the included studies prohibited a quantitative synthesis of the results on the accuracy of the algorithms tested. Of particular importance were the differences in how cases were initially selected (i.e. either by diagnostic codes in the administrative database or the reference standard) and the inclusion or exclusion of cases without the condition of interest. To ensure unbiased estimates of diagnostic accuracy, the disease prevalence in the validation cohort must approximate the prevalence in the population.Footnote 40 This is achieved when an appropriate reference standard is applied (which accurately classifies cases with and without the condition of interest) and patients are randomly sampled, ideally from the general population. In addition, to compute the key diagnostic accuracy estimates necessary to evaluate the characteristics of a diagnostic test, both cases with and without the condition of interest are needed to populate all four cells of a two-way contingency table.
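For reference, the four key diagnostic accuracy measures referred to throughout this review are derived from the four cells of that two-way contingency table using the standard definitions below (textbook formulas, not specific to any included study):

```latex
% TP, FP, FN and TN denote the true-positive, false-positive, false-negative
% and true-negative cells of the two-way (2x2) contingency table.
\begin{align*}
\text{Sensitivity} &= \frac{TP}{TP + FN} \\
\text{Specificity} &= \frac{TN}{TN + FP} \\
\text{PPV}         &= \frac{TP}{TP + FP} \\
\text{NPV}         &= \frac{TN}{TN + FN}
\end{align*}
```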
Unfortunately, most studies in our review (8 out of 14) used diagnostic codes in the administrative database to initially select only cases with the condition of interest.Footnote 26Footnote 27Footnote 29Footnote 30Footnote 31Footnote 32Footnote 34Footnote 35 This approach can generate biased estimates of diagnostic accuracy, because the underlying prevalence of the condition of interest is unknown, and it limits the diagnostic accuracy measures that can be computed to PPV only.Footnote 40 Although the PPV reflects the likelihood of false-positive test results, on its own it provides no information on the likelihood of false-negative results or on how many cases the algorithm misses.
Furthermore, while seven of the 14 studies included cases without the condition of interest,Footnote 22Footnote 23Footnote 24Footnote 25Footnote 28Footnote 33Footnote 35 two drew their samples from specialty clinics or service providers,Footnote 22Footnote 28 and two oversampled children more likely to have the disorder;Footnote 25Footnote 33 both of these sampling methods can generate falsely elevated PPV due to the high prevalence of cases. In addition, only four of these studies reported the four key measures of diagnostic accuracy that can be computed using this approach.Footnote 22Footnote 23Footnote 24Footnote 33
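Because PPV (and NPV) depend directly on the prevalence of the condition in the validation cohort, even modest case enrichment can inflate PPV substantially. The short sketch below illustrates this with purely hypothetical sensitivity, specificity and prevalence values (not taken from any included study), using the standard Bayes' theorem relationship:

```python
# Purely illustrative values (not from any included study): PPV computed
# from sensitivity, specificity and prevalence via Bayes' theorem, showing
# how case-enriched samples (e.g. specialty clinics, oversampled likely
# cases) inflate PPV relative to a general-population cohort.

def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value for a given disease prevalence."""
    true_positives = sensitivity * prevalence
    false_positives = (1 - specificity) * (1 - prevalence)
    return true_positives / (true_positives + false_positives)

sens, spec = 0.80, 0.95  # hypothetical algorithm characteristics
for prev in (0.02, 0.10, 0.50):  # population-like vs. increasingly enriched samples
    print(f"prevalence={prev:.0%}  PPV={ppv(sens, spec, prev):.1%}")
```

Under these illustrative values, the same algorithm yields a PPV of roughly 25% at 2% prevalence but more than 90% at 50% prevalence, which is why samples drawn from specialty clinics or enriched with likely cases can overstate algorithm performance.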
Despite these limitations, findings from our review suggest that increasing the number of ASD diagnostic codes required from the physician claims database increases the specificity and PPV of an algorithm, at the expense of sensitivity. In addition, the use of multiple sources of health administrative data in an algorithm designed to identify ASD cases (i.e. hospital, physician claims and mental health services data) may increase sensitivity with only a slight cost to specificity and PPV, with the physician claims database being the best single source.
Furthermore, the findings from one study showed that the addition of education data, in combination with physician claims data, might improve case capture (sensitivity) in school-aged children and youth at a slight cost to precision (PPV).Footnote 27 However, the lack of cases without ASD in this study limited the diagnostic accuracy measures that could be computed. Therefore, additional studies are required to evaluate the full impact of including education data in combination with physician claims data in administrative database algorithms for ASD case ascertainment purposes.
Due to the nature of the ADHD studies included, there was not enough information to assess the impact of number of diagnostic codes or additional data sources on algorithm accuracy. However, based on the performance measures reported, there was some evidence that ADHD could be identified through health administrative data sources.
To address the gaps uncovered by this review, as well as the reporting quality and risk of bias issues found, additional high-quality studies validating the use of administrative database algorithms to identify cases of the selected neurodevelopmental disorders are required. These issues are not unique to this area of study, and guidance on how to conduct and report the findings from such validation studies has been published previously.Footnote 12Footnote 20Footnote 40 In light of this, we recommend that authors follow published recommendations on study methodsFootnote 20Footnote 40 as well as reporting guidelinesFootnote 12 when validating administrative database algorithms for case identification purposes.
Another challenge associated with the use of specific diagnostic codes for neurodevelopmental disorders such as ASD, ADHD and FASD is that the boundaries between these disorders are often unclear and comorbid disorders are common. While historically neurodevelopmental disorders have been categorically diagnosed based on a constellation of signs and symptoms, there is an evolving body of literature on the need for new approaches to their diagnosis that conceptualize these disorders as lying on a neurodevelopmental continuum.Footnote 41 This shift will have important implications for the classification of these disorders, clinical practice, research and surveillance.
Strengths and limitations
The strengths of this review include:
- its prospective registration with the Prospective Register of Systematic Reviews (PROSPERO), which helps to reduce the potential for bias in the conduct and reporting of systematic reviews;Footnote 42
- the development of a literature search strategy by an experienced reference librarian that included a systematic search of multiple databases, the grey literature and reference lists of included articles;
- a rigorous assessment of the reporting quality as well as the risk of bias and applicability of each included study using the modified STARD checklistFootnote 12 and the QUADAS-2 tool,Footnote 20 respectively; and
- the use of the PRISMA standards to ensure full reporting and transparency.Footnote 17
However, there are some limitations worth noting, such as:
- challenges in conducting a comprehensive search for studies focussing on administrative database algorithms, given that they are not well catalogued in the databases we searched (i.e. no medical subject heading for "administrative database" exists);
- the potential for language bias, as studies published in a language other than English or French were not considered, as well as publication bias, since validation studies with poor results may be less likely to be published; and
- the significant heterogeneity between included studies did not permit the conduct of quantitative analyses such as a meta-regression or meta-analysis.
Conclusion
To our knowledge, this is the first review to systematically appraise and examine the empirical evidence on the validity of administrative database algorithms used to identify ASD, ADHD and FASD. While a few studies have validated algorithms for ASD and ADHD case ascertainment purposes, none have been performed for FASD to date. Significant heterogeneity across the included studies limited our ability to carry out quantitative analyses. Such analyses would help to further strengthen the evidence around the best-performing algorithms for neurodevelopmental disorder surveillance and research, should the quality of available studies allow.
Nevertheless, there is some evidence to suggest that ASD and ADHD cases can be identified using administrative data, although information about the ability of algorithms to discriminate reliably between individuals with and without the disorder of interest is limited. Given the variations in reporting quality and the risk of bias issues found, additional high-quality validation studies are needed. To optimize the usefulness of future studies, we recommend that authors follow published recommendations on study design and conductFootnote 20Footnote 40 and reporting guidelines for validation studies involving administrative data.Footnote 12
Acknowledgements
The authors wish to acknowledge Katherine Merucci, a reference librarian from the Health Library within the Corporate Services Branch of Health Canada and Public Health Agency of Canada, who developed and ran the database and grey literature search strategies.
No funding, including grants or other research support, was obtained for this study.
Conflicts of interest
None.
Authors' contributions and statement
CL and SO conceptualized and designed the study; SO and SP helped with developing the search strategy and retrieving articles; CL, ML and SO screened the literature; ML and SO were responsible for data extraction, reporting quality, risk of bias and applicability assessments; all authors analyzed and/or interpreted the data; SO and SP drafted the manuscript; and all authors contributed to the initial draft and revisions of the manuscript.
The content and views expressed in this article are those of the authors and do not necessarily reflect those of the Government of Canada.
# | Searches | Results |
---|---|---|
1 | exp autism spectrum disorder/ or (autism or autistic or (asperger* adj (syndrome or disorder or disease)) or kanner* syndrome or childhood disintegrative disorder or (pervasive adj2 developmental disorder*) or heller* syndrome or disintegrative psychosis).tw,kf,kw. | 48476 |
2 | Attention Deficit Disorder with Hyperactivity/ or "Attention Deficit and Disruptive Behavior Disorders"/ or ((attention deficit adj4 disorder*) or (hyperkinetic adj2 (disorder or syndrome)) or minimal brain dysfunction or adhd or addh).tw,kf,kw. | 40083 |
3 | Fetal Alcohol Spectrum Disorders/ or (f?etal alcohol or (alcohol related adj3 (neurodevelopment or birth)) or "Neurobehavioral disorder associated with prenatal alcohol exposure" or "Growth Retardation, Facial Abnormalities, and Central Nervous System Dysfunction").tw,kw,kf. | 5699 |
4 | or/1-3 | 90368 |
5 | diagnosis/ or incidence/ or prevalence/ or (diagnos* or incidence* or prevalence* or new case?).tw,kw,kf. | 3507353 |
6 | 4 and 5 | 27896 |
7 | exp autism spectrum disorder/di, dg, ep or Attention Deficit Disorder with Hyperactivity/di, dg, ep or "Attention Deficit and Disruptive Behavior Disorders"/di, dg, ep or Fetal Alcohol Spectrum Disorders/di, dg, ep | 22499 |
8 | 6 or 7 [Developmental disorders] | 39288 |
9 | exp medical records/ or international classification of diseases/ or exp diagnostic techniques, neurological/ or exp clinical laboratory techniques/ or "diagnostic techniques and procedures"/ | 3030824 |
10 | (((patient* or medical or health) adj3 record*) or ((diagnos* or defin* or classificat*) adj2 disease list) or ((self report* or standard*) adj2 measure*) or (icd adj (cod* or classification*)) or (international adj3 classification)).tw,kw,kf. | 248712 |
11 | medicaid/ or birth certificates/ or death certificates/ or hospital records/ or insurance claim reporting/ or exp insurance, health/ or databases, factual/ or information systems/ or databases as topic/ or database management systems/ or software/ or insurance claim review/ or patient discharge/ or exp registries/ or utilization review/ | 472574 |
12 | (((administrat* or physician* or inpatient* or emergency* or hospital* or clinic or clinics or pharmac* or insurance) adj4 (admission* or data or dataset* or database* or data base? or data bank? or claim* or billing* or record* or utilizat* or utilisat*)) or ((claim* or discharg*) adj2 data*) or ((database? or databank? or data base? or data bank?) adj4 (factual or administrat* or claim? or register* or registr* or topic? or system?)) or (claim? adj2 (analy* or review? or physician? or pharmac* or drug?)) or (insurance adj2 (claim* or audit*)) or medicaid or (health* adj2 plan) or ((death* or birth*) adj1 (certificate* or record*))).tw,kw,kf. or (database? or databank? or data base? or data bank?).ti. | 290378 |
13 | or/9-12 [Diagnosis Methods] | 3827029 |
14 | 8 and 13 | 5662 |
15 | exp Diagnostic Errors/ or Diagnosis, Differential/ or "Predictive Value of Tests"/ or "Sensitivity and Specificity"/ or ROC Curve/ or Area under Curve/ or Bayes Theorem/ or algorithms/ or validation studies as topic/ | 1272706 |
16 | ((diagnos* adj3 (schedul* or clinical* or technique* or procedur* or assess* or standard* or error* or false or incorrect* or wrong* or correct*)) or misdiagnos* or (clinical* adj (standard* or criteri* or measure* or classifi* or technique* or assess*)) or ((positive or negative) adj2 (predict* or false)) or sensitiv* or specif* or accura* or valid* or reliab* or agree* or concord* or misclass* or ((case or cases) adj2 ascertain*) or algorithm? or (bayes* adj (theorem or analysis or approach or forecast or method or prediction)) or (roc adj (curve or analysis)) or receiver operating characteristic).tw,kw,kf. | 5708365 |
17 | 15 or 16 | 6369226 |
18 | 14 and 17 | 2476 |
19 | limit 18 to (yr="1995-2019" and (english or french)) | 2196 |
# | Searches | Results |
---|---|---|
1 | exp autism spectrum disorder/ or (autism or autistic or (asperger* adj (syndrome or disorder or disease)) or kanner* syndrome or childhood disintegrative disorder or (pervasive adj2 developmental disorder*) or heller* syndrome or disintegrative psychosis).tw,kf,kw. | 53121 |
2 | Attention Deficit Disorder with Hyperactivity/ or "Attention Deficit and Disruptive Behavior Disorders"/ or ((attention deficit adj4 disorder*) or (hyperkinetic adj2 (disorder or syndrome)) or minimal brain dysfunction or adhd or addh).tw,kf,kw. | 42517 |
3 | Fetal Alcohol Spectrum Disorders/ or (f?etal alcohol or (alcohol related adj3 (neurodevelopment or birth)) or "Neurobehavioral disorder associated with prenatal alcohol exposure" or "Growth Retardation, Facial Abnormalities, and Central Nervous System Dysfunction").tw,kw,kf. | 5917 |
4 | or/1-3 | 97183 |
5 | diagnosis/ or incidence/ or prevalence/ or (diagnos* or incidence* or prevalence* or new case?).tw,kw,kf. | 3713126 |
6 | 4 and 5 | 30299 |
7 | exp autism spectrum disorder/di, dg, ep or Attention Deficit Disorder with Hyperactivity/di, dg, ep or "Attention Deficit and Disruptive Behavior Disorders"/di, dg, ep or Fetal Alcohol Spectrum Disorders/di, dg, ep | 23876 |
8 | 6 or 7 [Developmental disorders] | 42246 |
9 | exp medical records/ or international classification of diseases/ or exp diagnostic techniques, neurological/ or exp clinical laboratory techniques/ or "diagnostic techniques and procedures"/ | 3106548 |
10 | (((patient* or medical or health) adj3 record*) or ((diagnos* or defin* or classificat*) adj2 disease list) or ((self report* or standard*) adj2 measure*) or (icd adj (cod* or classification*)) or (international adj3 classification)).tw,kw,kf. | 268618 |
11 | medicaid/ or birth certificates/ or death certificates/ or hospital records/ or insurance claim reporting/ or exp insurance, health/ or databases, factual/ or information systems/ or databases as topic/ or database management systems/ or software/ or insurance claim review/ or patient discharge/ or exp registries/ or utilization review/ | 498782 |
12 | (((administrat* or physician* or inpatient* or emergency* or hospital* or clinic or clinics or pharmac* or insurance) adj4 (admission* or data or dataset* or database* or data base? or data bank? or claim* or billing* or record* or utilizat* or utilisat*)) or ((claim* or discharg*) adj2 data*) or ((database? or databank? or data base? or data bank?) adj4 (factual or administrat* or claim? or register* or registr* or topic? or system?)) or (claim? adj2 (analy* or review? or physician? or pharmac* or drug?)) or (insurance adj2 (claim* or audit*)) or medicaid or (health* adj2 plan) or ((death* or birth*) adj1 (certificate* or record*))).tw,kw,kf. or (database? or databank? or data base? or data bank?).ti. | 312830 |
13 | or/9-12 [Diagnosis] | 3955207 |
14 | 8 and 13 | 6161 |
15 | exp Diagnostic Errors/ or Diagnosis, Differential/ or "Predictive Value of Tests"/ or "Sensitivity and Specificity"/ or ROC Curve/ or Area under Curve/ or Bayes Theorem/ or algorithms/ or validation studies as topic/ | 1317678 |
16 | ((diagnos* adj3 (schedul* or clinical* or technique* or procedur* or assess* or standard* or error* or false or incorrect* or wrong* or correct*)) or misdiagnos* or (clinical* adj (standard* or criteri* or measure* or classifi* or technique* or assess*)) or ((positive or negative) adj2 (predict* or false)) or sensitiv* or specif* or accura* or valid* or reliab* or agree* or concord* or misclass* or ((case or cases) adj2 ascertain*) or algorithm? or (bayes* adj (theorem or analysis or approach or forecast or method or prediction)) or (roc adj (curve or analysis)) or receiver operating characteristic).tw,kw,kf. | 6045219 |
17 | 15 or 16 | 6721820 |
18 | 14 and 17 | 2712 |
19 | limit 18 to (yr="1995-Current" and (english or french)) | 2429 |
20 | (201908* or 201909* or 201910* or 201911* or 201912* or 202*).ez. | 1099141 |
21 | 19 and 20 | 99 |
# | Searches | Results |
---|---|---|
1 | exp autism spectrum disorder/ or (autism or autistic or (asperger* adj (syndrome or disorder or disease)) or kanner* syndrome or childhood disintegrative disorder or (pervasive adj2 developmental disorder*) or heller* syndrome or disintegrative psychosis).tw,kf,kw. | 57062 |
2 | Attention Deficit Disorder with Hyperactivity/ or "Attention Deficit and Disruptive Behavior Disorders"/ or ((attention deficit adj4 disorder*) or (hyperkinetic adj2 (disorder or syndrome)) or minimal brain dysfunction or adhd or addh).tw,kf,kw. | 44499 |
3 | Fetal Alcohol Spectrum Disorders/ or (f?etal alcohol or (alcohol related adj3 (neurodevelopment or birth)) or "Neurobehavioral disorder associated with prenatal alcohol exposure" or "Growth Retardation, Facial Abnormalities, and Central Nervous System Dysfunction").tw,kw,kf. | 6077 |
4 | or/1-3 | 102829 |
5 | diagnosis/ or incidence/ or prevalence/ or (diagnos* or incidence* or prevalence* or new case?).tw,kw,kf. | 3904190 |
6 | 4 and 5 | 32263 |
7 | exp autism spectrum disorder/di, dg, ep or Attention Deficit Disorder with Hyperactivity/di, dg, ep or "Attention Deficit and Disruptive Behavior Disorders"/di, dg, ep or Fetal Alcohol Spectrum Disorders/di, dg, ep | 25017 |
8 | 6 or 7 [Developmental disorders] | 44704 |
9 | exp medical records/ or international classification of diseases/ or exp diagnostic techniques, neurological/ or exp clinical laboratory techniques/ or "diagnostic techniques and procedures"/ | 3171646 |
10 | (((patient* or medical or health) adj3 record*) or ((diagnos* or defin* or classificat*) adj2 disease list) or ((self report* or standard*) adj2 measure*) or (icd adj (cod* or classification*)) or (international adj3 classification)).tw,kw,kf. | 287276 |
11 | medicaid/ or birth certificates/ or death certificates/ or hospital records/ or insurance claim reporting/ or exp insurance, health/ or databases, factual/ or information systems/ or databases as topic/ or database management systems/ or software/ or insurance claim review/ or patient discharge/ or exp registries/ or utilization review/ | 519079 |
12 | (((administrat* or physician* or inpatient* or emergency* or hospital* or clinic or clinics or pharmac* or insurance) adj4 (admission* or data or dataset* or database* or data base? or data bank? or claim* or billing* or record* or utilizat* or utilisat*)) or ((claim* or discharg*) adj2 data*) or ((database? or databank? or data base? or data bank?) adj4 (factual or administrat* or claim? or register* or registr* or topic? or system?)) or (claim? adj2 (analy* or review? or physician? or pharmac* or drug?)) or (insurance adj2 (claim* or audit*)) or medicaid or (health* adj2 plan) or ((death* or birth*) adj1 (certificate* or record*))).tw,kw,kf. or (database? or databank? or data base? or data bank?).ti. | 334466 |
13 | or/9-12 [Diagnosis] | 4068131 |
14 | 8 and 13 | 6575 |
15 | exp Diagnostic Errors/ or Diagnosis, Differential/ or "Predictive Value of Tests"/ or "Sensitivity and Specificity"/ or ROC Curve/ or Area under Curve/ or Bayes Theorem/ or algorithms/ or validation studies as topic/ | 1352991 |
16 | ((diagnos* adj3 (schedul* or clinical* or technique* or procedur* or assess* or standard* or error* or false or incorrect* or wrong* or correct*)) or misdiagnos* or (clinical* adj (standard* or criteri* or measure* or classifi* or technique* or assess*)) or ((positive or negative) adj2 (predict* or false)) or sensitiv* or specif* or accura* or valid* or reliab* or agree* or concord* or misclass* or ((case or cases) adj2 ascertain*) or algorithm? or (bayes* adj (theorem or analysis or approach or forecast or method or prediction)) or (roc adj (curve or analysis)) or receiver operating characteristic).tw,kw,kf. | 6345372 |
17 | 15 or 16 | 7034188 |
18 | 14 and 17 | 2895 |
19 | limit 18 to (yr="1995-Current" and (english or french)) | 2611 |
20 | (202007* or 202008* or 202009* or 20201* or 202*).ez. | 1831745 |
21 | 19 and 20 | 182 |
Section, Topic and Item | Bickford, 2020Footnote 22 | Brooks, 2021Footnote 23 | Brooks, 2021Footnote 24 | Burke, 2014Footnote 25 | Coleman, 2015Footnote 26 | Coo, 2017Footnote 27 | Dodds, 2009Footnote 28 | Hagberg, 2017Footnote 29 | Lauritsen, 2010Footnote 30 | Surén, 2019Footnote 31 | Daley, 2014Footnote 32 | Gruschow, 2016Footnote 33 | Mohr-Jensen, 2016Footnote 34 | Morkem, 2020Footnote 35 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Title, keywords, abstract | ||||||||||||||
1. Identifies article as study of assessing diagnostic accuracy? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
2. Identifies article as study of administrative data? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No | Yes | Yes | No | Yes |
Introduction | ||||||||||||||
3. States disease identification and validation as one of goals of study? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Methods | ||||||||||||||
Participants in validation cohort | ||||||||||||||
4. Describes validation cohort (cohort of patients to which reference standard was applied)? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
4a. Age? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
4b. Disease? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
4c. Severity? | No | No | No | No | No | No | No | No | No | No | No | No | No | No |
4d. Location/jurisdiction? | Yes | Yes | Yes | No | No | Yes | Yes | No | Yes | Yes | No | Yes | Yes | No |
5. Describes recruitment procedure of validation cohort? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
5a. Inclusion criteria? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
5b. Exclusion criteria? | Yes | No | No | Yes | Yes | Yes | No | No | No | Yes | Yes | Yes | Yes | Yes |
6. Describes patient sampling (random, consecutive, all, etc.)? | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
7. Describes data collection? | n/a | Yes | Yes | Yes | Yes | Yes | n/a | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
7a. Who identified patients and ensured selection adhered to patient recruitment criteria? | n/a | Yes | Yes | Yes | Yes | Yes | n/a | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
7b. Who collected data? | n/a | Yes | Yes | Yes | Yes | Yes | n/a | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
7c. A priori data collection form? | n/a | No | Yes | Yes | Yes | Yes | n/a | No | Uncertain | Yes | Yes | No | Yes | No |
7d. How was disease classified? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No |
8. Was there a split sample (i.e. re-validation using a separate cohort)? | No | Yes | NoFootnote b | No | No | No | No | No | No | No | No | No | No | No |
Test methods | ||||||||||||||
9. Describe number, training and expertise of persons reading reference standard? | n/a | Yes | Yes | Yes | Yes | No | n/a | No | Yes | Yes | Yes | Yes | Yes | No |
10. If >1 person reading reference standard, measure of consistency is reported (e.g. kappa)? | n/a | No | No | Yes | Yes | n/a | n/a | No | Yes | n/a | No | Yes | Yes | n/a |
11. Were the readers of the reference (validation) test blinded to the results of the classification by administrative data for that patient? (e.g. Was the reviewer of the charts blinded to how that chart was billed?) | n/a | Yes | Yes | Uncertain | No | Yes | n/a | No | No | No | No | No | No | Yes |
Statistical methods | ||||||||||||||
12. Describe methods of calculating/comparing diagnostic accuracy? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes |
Results | ||||||||||||||
Participants | ||||||||||||||
13. Report when study done, start/end dates of enrolment? | Yes | No | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | No |
14. Describe number of people who satisfied inclusion/exclusion criteria? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
15. Study flow diagram? | No | Yes | Yes | Yes | No | No | No | No | No | Yes | Yes | Yes | Yes | No |
Test results | ||||||||||||||
16. Reports distribution of disease severity? | No | No | No | No | No | No | No | No | No | No | No | No | No | No |
17. Report cross-tabulation of index tests by results of reference standard? | Yes | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes |
Estimates | ||||||||||||||
18. Reports at least 4 estimates of diagnostic accuracy? (Estimates reported in included studies) | Yes | Yes | Yes | No | No | No | No | No | No | No | No | Yes | No | No |
18a. Sensitivity | Yes | Yes | Yes | No | No | Yes | Yes | No | No | No | No | Yes | No | No |
18b. Specificity | Yes | Yes | Yes | No | No | No | Yes | No | No | No | No | Yes | No | No |
18c. PPV | Yes | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
18d. NPV | Yes | Yes | Yes | Yes | No | No | No | No | No | No | No | Yes | No | Yes |
18e. Likelihood ratios | No | No | No | No | No | No | No | No | No | No | No | No | No | No |
18f. Kappa | Yes | No | No | No | No | No | No | No | No | No | No | Yes | No | No |
18g. Area under the ROC curve / c-statistic | Yes | No | No | No | No | No | Yes | No | No | No | No | No | No | No |
18h. Accuracy/agreement | No | No | No | No | No | No | No | No | No | No | No | Yes | No | No |
19. Was the accuracy reported for any subgroups (e.g. age, geography, different sex etc.)? | No | No | No | Yes | Yes | Yes | Yes | No | No | No | Yes | No | No | No |
20. If PPV/NPV reported, does ratio of cases/controls of validation cohort approximate prevalence of condition in the population? | No | Yes | Yes | No | No | No | n/a | No | No | No | No | No | No | No |
21. Reports 95% CIs for each diagnostic accuracy measure? | Yes | Yes | Yes | No | No | Yes | No | No | Yes | Yes | Yes | Yes | No | Yes |
Discussion | ||||||||||||||
22. Discusses the applicability of the findings? | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Study Domains | Bickford, 2020Footnote 22 | Brooks, 2021Footnote 23 | Brooks, 2021Footnote 24 | Burke, 2014Footnote 25 | Coleman, 2015Footnote 26 | Coo, 2017Footnote 27 | Dodds, 2009Footnote 28 | Hagberg, 2017Footnote 29 | Lauritsen, 2010Footnote 30 | Surén, 2019Footnote 31 | Daley, 2014Footnote 32 | Gruschow, 2016Footnote 33 | Mohr-Jensen, 2016Footnote 34 | Morkem, 2020Footnote 35 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
1. Patient selection | ||||||||||||||
A. Risk of Bias | ||||||||||||||
Q1 | Yes | Yes | Yes | No | Unclear | NoFootnote b YesFootnote c | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No |
Q2 | No | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Q3 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes | No | No | No | Yes | No |
Risk | HIGH | LOW | LOW | HIGH | UNCLEAR | HIGHFootnote b LOWFootnote c | LOW | LOW | LOW | HIGH | HIGH | HIGH | LOW | HIGH |
B. Applicability Concerns | ||||||||||||||
Concern | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW |
2. Index test | ||||||||||||||
A. Risk of Bias | ||||||||||||||
Q1 | Unclear | Unclear | Unclear | Yes | Yes | No | Unclear | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
Risk | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW |
B. Applicability Concerns | ||||||||||||||
Concern | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW |
3. Reference standard | ||||||||||||||
A. Risk of Bias | ||||||||||||||
Q1 | Yes | Yes | Yes | Yes | Yes | UnclearFootnote b NoFootnote c | Yes | Unclear | Yes | Yes | Yes | No | Yes | Unclear |
Q2 | Yes | Yes | Yes | Unclear | No | YesFootnote b NoFootnote c | Yes | No | No | No | No | No | No | Yes |
Risk | LOW | LOW | LOW | UNCLEAR | LOW | UNCLEARFootnote b HIGHFootnote c | LOW | UNCLEAR | LOW | LOW | LOW | HIGH | LOW | UNCLEAR |
B. Applicability Concerns | ||||||||||||||
Concern | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW | LOW |
4. Flow and timing | ||||||||||||||
Risk of Bias | ||||||||||||||
Q1 | Yes | Unclear | Unclear | Yes | Unclear | Yes | Yes | Yes | Unclear | Yes | Yes | Yes | Yes | Unclear |
Q2 | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes | Yes | Yes | Yes | No | Yes | Yes |
Q3 | Yes | Yes | Yes | Yes | No | Yes | No | Yes | No | No | Yes | Yes | No | Yes |
Risk | LOW | LOW | LOW | LOW | HIGH | HIGH | HIGH | LOW | HIGH | HIGH | LOW | HIGH | HIGH | LOW |
Risk of bias and applicability concerns: Signalling questions and scoring guidelines
Study Domains
- Patient Selection
  - Risk of Bias
    - Q1 - Was a consecutive or random sample of patients enrolled? (Yes/No/Unclear)
      - Select 'Yes' if consecutive or random sampling was used to select patients for the validation cohort.
      - Select 'No' if non-consecutive or convenience sampling was used.
      - Select 'Unclear' if insufficient information is reported.
    - Q2 - Was a case-control design avoided? (Yes/No/Unclear)
      - Select 'Yes' if a case-control design was avoided.
      - Select 'No' if patients were selected based on known disease (i.e. confirmed as opposed to suspected cases) and non-disease status.
      - Select 'Unclear' if insufficient information is reported.
    - Q3 - Did the study avoid inappropriate exclusions? (Yes/No/Unclear)
      - Select 'Yes' if the study avoided inappropriate exclusions.
      - Select 'No' if the study excluded patients inappropriately, such as excluding difficult-to-diagnose patients or suspected but unconfirmed diagnoses.
      - Select 'Unclear' if insufficient information is reported.
    - Risk - Could the selection of patients have introduced bias? (LOW/HIGH/UNCLEAR; see Footnote a)
  - Applicability Concerns
    - Concern - Is there concern that the included patients do not match the review question? (LOW/HIGH/UNCLEAR)
- Index Test
  - Risk of Bias
    - Q1 - Were the administrative database algorithm(s) results interpreted without knowledge of the results of the reference standard? (Yes/No/Unclear)
      - Select 'Yes' if the algorithm(s) results were interpreted without knowledge of the reference standard diagnosis.
      - Select 'No' if it was reported that the algorithm(s) results were interpreted with knowledge of the results of the reference standard diagnosis.
      - Select 'Unclear' if insufficient information is reported.
    - Risk - Could the conduct or interpretation of the algorithm(s) have introduced bias? (LOW/HIGH/UNCLEAR; see Footnote a)
  - Applicability Concerns
    - Concern - Is there concern that the algorithm(s), its/their conduct or interpretation differ from the review question? (LOW/HIGH/UNCLEAR)
- Reference Standard
  - Risk of Bias
    - Q1 - Is the reference standard likely to correctly classify the target condition? (Yes/No/Unclear)
      - Select 'Yes' if established clinical classification criteria, clinical case definitions derived from medical records or a medical record diagnosis was used; if experienced or trained personnel carried out the record review/abstraction (where applicable); and if agreement was calculated to be high when more than one person reviewed/abstracted data.
      - Select 'No' if the reference standard was patient self-report; the personnel reviewing/abstracting information from the reference standard had insufficient experience or training (where applicable); or, in cases where more than one person reviewed or abstracted data, agreement between personnel was low.
      - Select 'Unclear' if insufficient information is reported (e.g. no information was reported on interrater agreement when more than one person reviewed).
    - Q2 - Were the reference standard results interpreted without knowledge of the results of the algorithm(s)? (Yes/No/Unclear)
      - Select 'Yes' if the reference standard results were interpreted without knowledge of the algorithm(s) results.
      - Select 'No' if the reference standard was applied with knowledge of the algorithm(s) results, including when only patients flagged by the algorithm(s) received the reference standard.
      - Select 'Unclear' if insufficient information is reported.
    - Risk - Could the reference standard, its conduct or its interpretation have introduced bias? (LOW/HIGH/UNCLEAR; see Footnote a)
  - Applicability Concerns
    - Concern - Is there concern that the target condition as defined by the reference standard does not match the review question? (LOW/HIGH/UNCLEAR)
- Flow and Timing
  - Risk of Bias
    - Q1 - Was there an appropriate interval between ascertaining cases from the algorithm(s) and the reference standard? (Yes/No/Unclear)
      - Select 'Yes' if there was an appropriate time interval between the algorithm(s) and reference standard.
      - Select 'No' if the time period between the reference standard diagnosis and algorithm(s) diagnosis was not appropriate.
      - Select 'Unclear' if insufficient information is reported.
    - Q2 - Did patients receive the same reference standard? (Yes/No/Unclear)
      - Select 'Yes' if patients received the same reference standard.
      - Select 'No' if different reference standards were used.
      - Select 'Unclear' if insufficient information is reported.
    - Q3 - Were all patients included in the analysis? (Yes/No/Unclear)
      - Select 'Yes' if the number of patients enrolled (i.e. after exclusions) is the same as the number of patients included in the 2x2 table of results.
      - Select 'No' if the number of patients enrolled differs from the number of patients included in the 2x2 table of results.
      - Select 'Unclear' if insufficient information is reported (e.g. no information on how the final validation study population was achieved).
    - Risk - Could the patient flow have introduced bias? (LOW/HIGH/UNCLEAR; see Footnote a)
Footnote a
Scoring guidelines:
- If answers to all signalling questions within a domain were "yes" then risk of bias was judged as "LOW".
- If answers to all signalling questions within a domain were "no" then risk of bias was judged as "HIGH".
- If answers to all signalling questions within a domain were "unclear" then risk of bias was judged as "UNCLEAR".
- If any one signalling question was "no" this flagged the potential for bias and the review authors decided on what basis a judgment of high risk of bias might be made under such circumstances.
- The signalling question for "Index Test" (Q1) was considered a less important source of bias for this review. The second signalling question for "Reference Standard" (Q2) was also considered a less important source of bias but a judgment was made on a study-by-study basis. For all other signalling questions, one "no" response was sufficient for a judgment of high risk of bias.
- If any one signalling question was "unclear" this flagged the potential for bias and the review authors decided on what basis a judgment of unclear risk of bias might be made under such circumstances.