Fig. 6.
(A) SFARI genes and can4Dn, but not can4Up, share very similar enrichment profiles for upstream regulators. The value in each cell represents the significance score (Materials and Methods). In row labels, c (ChEA) and e (ENCODE) indicate the source of the ChIP-seq data. (B) UES analysis and UES-blast workflow. ORA identifies significantly enriched upstream regulators for a test gene set using reference gene sets from the ChIP-X library. A significance score is calculated for each upstream regulator (Materials and Methods), creating a signature for each test gene set, which we named the UES. A BLAST-style querying algorithm (UES-blast) then identifies the test gene sets whose UES is most similar to that of a query gene set, e.g., SFARI genes. UR, upstream regulator. (C) UES-blast (Materials and Methods) shows that can4Dn, can4Dn-SFARI, and several independently curated autism risk gene sets rank high among control gene sets when SFARI genes are used as the query gene set, indicating that they share a very similar UES with the SFARI genes. (D) t-SNE clustering of can4Dn, can4Dn-SFARI, autism risk gene sets, and DisGeNET gene sets (only those containing >500 genes). Letters label risk gene sets for autism comorbidities as follows: I, intellectual disability; D, depression; S, schizophrenia; B, bipolar disorder. (E) Genes in the SFARI and can4Dn gene sets, but not in can4Up, are specifically enriched for the H3K27me3 mark. (F) can4Dn, can4Dn-SFARI, and autism risk gene sets have the highest H3K27me3 scores compared with the control gene sets.
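
The workflow in (B) can be illustrated with a minimal sketch. It is not the UES-blast implementation: the actual significance score and similarity measure are defined in Materials and Methods. Here we assume, purely for illustration, a hypergeometric over-representation test, a score of -log10(p), and cosine similarity between signatures. All function names, gene identifiers, and the background size are hypothetical.

```python
from math import comb, log10, sqrt

def hypergeom_sf(k, M, n, N):
    """P(X >= k) for X ~ Hypergeometric(population M, n marked, N drawn)."""
    total = comb(M, N)
    upper = min(n, N)
    return sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1)) / total

def signature(test_genes, regulator_targets, background_size):
    """Assumed score: -log10 of the ORA p-value for each upstream regulator."""
    test = set(test_genes)
    sig = {}
    for reg, targets in regulator_targets.items():
        overlap = len(test & set(targets))
        p = hypergeom_sf(overlap, background_size, len(targets), len(test))
        sig[reg] = -log10(max(p, 1e-300))  # floor avoids log10(0)
    return sig

def cosine(a, b):
    """Cosine similarity between two signatures (assumed similarity measure)."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    na = sqrt(sum(a.get(k, 0.0) ** 2 for k in keys))
    nb = sqrt(sum(b.get(k, 0.0) ** 2 for k in keys))
    return dot / (na * nb) if na and nb else 0.0

def rank_by_similarity(query_sig, candidate_sigs):
    """Rank candidate gene sets by signature similarity to the query, best first."""
    scored = [(name, cosine(query_sig, sig)) for name, sig in candidate_sigs.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)

# Toy data: two hypothetical regulators and a background of 100 genes.
regulator_targets = {
    "R1": {"g1", "g2", "g3", "g4", "g5"},
    "R2": {"g6", "g7", "g8", "g9", "g10"},
}
query = signature({"g1", "g2", "g3", "g11"}, regulator_targets, 100)
candidates = {
    "setA": signature({"g1", "g2", "g4", "g12"}, regulator_targets, 100),
    "setB": signature({"g6", "g7", "g8", "g13"}, regulator_targets, 100),
}
ranking = rank_by_similarity(query, candidates)
```

In this toy case, setA shares its enriched regulator with the query and therefore ranks above setB, which mirrors how gene sets with a similar signature rank high in (C).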