Appendix II: Automated Categorization of ASD Publications (1980-2009): IACC/OARC ASD Research Publications Analysis  

IACC/OARC Autism Spectrum Disorder Publications Analysis: The Global Landscape of Autism Research, July 2012

Appendix II: Automated Categorization of ASD Publications (1980-2009)

Manually assigning the more than 20,000 ASD-related publications from 1980 to 2009 to the seven IACC Strategic Plan Critical Question areas would have required significant effort and resources. As a proxy for this manual review, publications were assigned using a semi-automated approach based on the k-nearest neighbor (k-NN) algorithm.

As a first step in this approach, subject matter experts reviewed all of the 2010 ASD publications and manually assigned each one to a research area and subcategory based on its title and abstract. ASD publications from 1980 to 2009 ("prior publications") were then categorized based on their similarity to the previously classified 2010 publications. Each prior publication was compared to the 2010 publications and given similarity scores based on a modified Okapi BM25 algorithm. It was then assigned to the most common category (the mode) among the 25 publications from 2010 with the highest similarity scores. This approach to categorizing publications was chosen for its performance, accuracy, and ease of use. Initial tests of automated categorization using other algorithms, including Naïve Bayesian, supervised Latent Dirichlet, Boosting, and Tree algorithms, all showed accuracy of much less than 70%.
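
The categorization step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used for the analysis: the report does not describe how BM25 was modified, so the sketch uses the standard Okapi BM25 formula with commonly used parameter values (k1 = 1.5, b = 0.75) and simple whitespace tokenization. The class name `BM25KNN`, its methods, and the category labels in the usage note are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization; the report does not
    # describe its text preprocessing.
    return text.lower().split()


class BM25KNN:
    """k-NN categorization using standard Okapi BM25 as the similarity score.

    The report used a *modified* BM25 whose details are not given, so this
    class is a stand-in, not a reproduction.
    """

    def __init__(self, labeled_docs, k1=1.5, b=0.75):
        # labeled_docs: list of (text, category) pairs, playing the role of
        # the manually categorized 2010 publications.
        self.docs = [Counter(tokenize(text)) for text, _ in labeled_docs]
        self.labels = [category for _, category in labeled_docs]
        self.lengths = [sum(doc.values()) for doc in self.docs]
        self.avg_len = sum(self.lengths) / len(self.docs)
        self.k1 = k1
        self.b = b
        # Document frequency of each term, then a smoothed, always-positive IDF.
        n = len(self.docs)
        df = Counter(term for doc in self.docs for term in doc)
        self.idf = {
            term: math.log(1 + (n - freq + 0.5) / (freq + 0.5))
            for term, freq in df.items()
        }

    def score(self, query_terms, index):
        # BM25 score of labeled document `index` for the query terms.
        doc = self.docs[index]
        norm = self.k1 * (1 - self.b + self.b * self.lengths[index] / self.avg_len)
        total = 0.0
        for term in set(query_terms):
            tf = doc.get(term, 0)
            if tf:
                total += self.idf[term] * tf * (self.k1 + 1) / (tf + norm)
        return total

    def classify(self, text, k=25):
        # Rank labeled documents by similarity, keep the top k, and return
        # the most common category (the mode) among them.
        query_terms = tokenize(text)
        ranked = sorted(
            range(len(self.docs)),
            key=lambda i: self.score(query_terms, i),
            reverse=True,
        )
        votes = Counter(self.labels[i] for i in ranked[:k])
        return votes.most_common(1)[0][0]
```

A classifier would be built once from the labeled 2010 set, for example `clf = BM25KNN([(abstract_text, "diagnosis"), ...])`, and then applied to each prior publication with `clf.classify(title_and_abstract)`, which uses k = 25 by default to match the text. When two categories tie among the top k, this sketch returns the one whose first supporting publication ranks higher; the report does not say how ties were resolved.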

While some publications naturally span multiple research categories, each publication was assigned to only one category for ease of tracking and trend analysis. This approach errs on the side of underestimating the volume of research in some categories.


U.S. Department of Health & Human Services • 200 Independence Avenue, S.W. • Washington, D.C. 20201