In autism, early-age biomarkers are scarce. Research is urgently needed to identify markers that precede symptom onset, convey prognostic information, or indicate disorder subtypes. Our proposed functional genomics study of early development in ASD addresses many of these biomarker goals and is an essential early step in this discovery process. Robust biomarkers have been elusive, presumably because ASD is a heterogeneous developmental disorder with thousands of speculated risk genes and potential non-genetic immune factors. We hypothesize that pathway-based transcriptomic biomarkers may be informative, as shown by our recent proof-of-concept study in which leukocyte-based gene expression provided an early diagnostic ASD classifier. Our findings are reasonable since many high-confidence ASD genes (e.g., transcription factors and signaling genes) and networks are as strongly expressed in leukocytes as in brain. Furthermore, hypothesized immune disruptions in ASD should also be reflected in leukocytes, especially since microglia are a type of leukocyte and are implicated in established molecular and cellular brain pathology in ASD. In our proposed study, we will use 1,500 RNA-Seq datasets from 1,000 ASD and typically and atypically developing toddlers to identify biomolecular pathway biomarkers for early detection, prognosis, clinical progression and clinical subtyping. We will further study biomarker relationships to ASD gene defects and expression patterns in early neural development. Aim 1 will analyze RNA-Seq data from 1,000 1-2 year olds using data-driven and knowledge-based network approaches to identify early diagnostic biomarkers that distinguish ASD (n=390) from non-ASD (n=610) groups at ages 1-2 years. Diagnostic biomarkers will include pathways and co-expression networks to address the heterogeneity across ASD subjects.
Aim 2 will identify prognostic RNA-Seq expression patterns in the 390 ASD 1-2 year olds by analyzing gene expression levels to reveal pathways that predict good/poor social and language outcome at ages 3-4 years. Aim 2 will also look longitudinally at ASD (n=300) and typically developing (n=200) expression data to identify transcriptomic trajectories that underlie clinical progression from 1-2 years to 3-4 years in these different clinical outcome subgroups. Aim 3 will examine how variation in developmental functional genomic patterns relates to variation in social and language abilities across diagnostic categories (n=1,000) and within ASD (n=390) using dimensionality reduction and feature-selecting regression. Multicollinear regressions will be used to combine multivariate trend observations of dimensionality reduction with the predictive power of regressions. Aim 4 will link key transcriptomic effects in Aims 1 to 3 to genetic variants in high-confidence and probable ASD genes that are linked to disrupted cellular pathways in our ASD subjects. Deleterious variants in those genes will be tested in hematopoietic and neural stem cells using CRISPR-Cas9 to introduce loss-of-function mutations in these genes. RNA-Seq will be used to assay the impact on ASD-relevant cellular pathways.