Zhang, Michael Q.
Permanent URI for this collection: https://hdl.handle.net/10735.1/3154
Professor Michael Q. Zhang holds the Cecil H. and Ida Green Chair of Systems Biology Science. He also serves as director of the Center for Systems Biology. He is considered a leading scientist in computational biology and genomic research. Computational biology bridges the life sciences and quantitative sciences – mathematics, statistics and computer science – to understand living systems. His long-term research goal has been to use mathematical and statistical methods to identify functional elements in eucaryotic genomes, especially the genes and their control and regulatory elements.
Browsing Zhang, Michael Q. by Issue Date
Item
A Highly Efficient and Effective Motif Discovery Method for ChIP-Seq/ChIP-Chip Data using Positional Information (2012-01-06)
Ma, Xiaotu; Kulkarni, Ashwinikumar; Zhang, Zhihua; Xuan, Zhenyu; Serfling, Robert J. (Robert Joseph); Zhang, Michael Q.; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ); Zhang, Michael Q.
Identification of DNA motifs from ChIP-seq/ChIP-chip [chromatin immunoprecipitation (ChIP)] data is a powerful method for understanding the transcriptional regulatory network. However, most established methods are designed for small sample sizes and are inefficient for ChIP data. Here we propose a new k-mer occurrence model to reflect the fact that functional DNA k-mers often cluster around ChIP peak summits. With this model, we introduced a new measure to discover functional k-mers. Using simulation, we demonstrated that our method is more robust against noise in ChIP data than available methods. A novel word clustering method is also implemented to group similar k-mers into position weight matrices (PWMs). Our method was applied to a diverse set of ChIP experiments to demonstrate its high sensitivity and specificity. Importantly, our method is much faster than several other methods for large sample sizes. Thus, we have developed an efficient and effective motif discovery method for ChIP experiments.

Item
New Fusion Transcripts Identified in Normal Karyotype Acute Myeloid Leukemia (2012-12-12)
Wen, H.; Li, Yongjin; Malek, S. N.; Kim, Y. C.; Xu, J.; Chen, P.; Xiao, F.; Huang, X.; Xuan, Zhenyu; Mankala, Shiva; Zhang, Michael Q.; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ); Zhang, Michael Q.
Genetic aberrations contribute to acute myeloid leukemia (AML). However, half of AML cases do not contain the well-known aberrations detectable mostly by cytogenetic analysis, and these cases are classified as normal karyotype AML. Different outcomes of normal karyotype AML suggest that this subgroup of AML could be genetically heterogeneous.
But the lack of genetic markers makes it difficult to further study this subgroup of AML. Using a paired-end RNA-seq method, we performed a transcriptome analysis in 45 AML cases including 29 normal karyotype AML, 8 abnormal karyotype AML and 8 AML without karyotype information. Our study identified 134 fusion transcripts, all of which were formed between the partner genes adjacent in the same chromosome and distributed at different frequencies in the AML cases. Seven fusions are exclusively present in normal karyotype AML, and the remaining fusions are shared between the normal karyotype AML and abnormal karyotype AML. CIITA, a master regulator of MHC class II gene expression and truncated in B-cell lymphoma and Hodgkin disease, is found to fuse with DEXI in 48% of normal karyotype AML cases. The fusion transcripts formed between adjacent genes highlight the possibility that certain such fusions could be involved in the oncological process in AML, and provide a new source to identify genetic markers for normal karyotype AML.

Item
OLego: Fast and Sensitive Mapping of Spliced mRNA-Seq Reads Using Small Seeds (Oxford University Press, 2013-04)
Wu, Jie; Anczuków, Olga; Krainer, Adrian R.; Zhang, Michael Q.; Zhang, Chaolin; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ); Zhang, Michael Q.
A crucial step in analyzing mRNA-Seq data is to accurately and efficiently map hundreds of millions of reads to the reference genome and exon junctions. Here we present OLego, an algorithm specifically designed for de novo mapping of spliced mRNA-Seq reads. OLego adopts a multiple-seed-and-extend scheme, and does not rely on a separate external aligner. It achieves high sensitivity of junction detection by strategic searches with small seeds (∼14 nt for mammalian genomes). To improve accuracy and resolve ambiguous mapping at junctions, OLego uses a built-in statistical model to score exon junctions by splice-site strength and intron size.
The Burrows-Wheeler transform is used in multiple steps of the algorithm to efficiently map seeds, locate junctions and identify small exons. OLego is implemented in C++ with fully multithreaded execution, and allows fast processing of large-scale data. We systematically evaluated the performance of OLego in comparison with published tools using both simulated and real data. OLego demonstrated better sensitivity, higher or comparable accuracy and substantially improved speed. OLego also identified hundreds of novel micro-exons (<30 nt) in the mouse transcriptome, many of which are phylogenetically conserved and can be validated experimentally in vivo. OLego is freely available at http://zhanglab.c2b2.columbia.edu/index.php/OLego.

Item
FastDMA: An Infinium Humanmethylation450 Beadchip Analyzer (2013-09-05)
Wu, D.; Gu, J.; Zhang, Michael Q.; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ); Zhang, Michael Q.
DNA methylation is vital for many essential biological processes and human diseases. The Illumina Infinium HumanMethylation450 Beadchip is a recently developed platform for studying genome-wide DNA methylation state on more than 480,000 CpG sites and a few CHG sites with high data quality. To analyze the data of this promising platform, we developed FastDMA, which can be used to identify significantly differentially methylated probes. Besides single probe analysis, FastDMA can also do region-based analysis for identifying differentially methylated regions (DMRs). A unified statistical model, analysis of covariance (ANCOVA), is used to achieve all the analyses in FastDMA. We apply FastDMA on three large-scale DNA methylation datasets from The Cancer Genome Atlas (TCGA) and find many differentially methylated genomic sites in different types of cancer. On the testing datasets, FastDMA shows much higher computational efficiency than current tools.
FastDMA can benefit the data analyses of large-scale DNA methylation studies with an integrative pipeline and a high computational efficiency. The software is freely available via http://bioinfo.au.tsinghua.edu.cn/software/fastdma/.

Item
Integrated Omics Study Delineates the Dynamics of Lipid Droplets in Rhodococcus Opacus PD630 (Oxford University Press, 2013-10-22)
Chen, Yong; Ding, Yunfeng; Yang, Li; Yu, Jinhai; Liu, Guiming; Wang, Xumin; Zhang, Shuyan; Zhang, Michael Q.; Li, Yanda; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ); Zhang, Michael Q.
Rhodococcus opacus strain PD630 (R. opacus PD630) is an oleaginous bacterium and one of the few prokaryotic organisms that contain lipid droplets (LDs). The LD is an important organelle not only for lipid storage but also for intercellular communication regarding energy metabolism, yet it remains poorly understood. To understand the dynamics of LD using a simple model organism, we conducted a series of comprehensive omics studies of R. opacus PD630 including complete genome, transcriptome and proteome analysis. The genome of R. opacus PD630 encodes 8947 genes that are significantly enriched in lipid transport, synthesis and metabolism, indicating a super ability of carbon source biosynthesis and catabolism. The comparative transcriptome analysis from three culture conditions revealed the landscape of gene-altered expressions responsible for lipid accumulation. The LD proteomes further identified the proteins that mediate lipid synthesis, storage and other biological functions. Integrating these three omics uncovered 177 proteins that may be involved in lipid metabolism and LD dynamics. An LD structure-like protein, LPD06283, was further verified to affect the LD morphology.
Our omics studies provide not only a first integrated omics study of the prokaryotic LD organelle, but also a systematic platform for facilitating further prokaryotic LD research and biofuel development.

Item
Characterizing the strand-specific distribution of non-CpG methylation in human pluripotent cells (Oxford University Press, 2013-12-16)
Guo, Weilong; Chung, Wen-Yu; Qian, Minping; Pellegrini, Matteo; Zhang, Michael Q.; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ); Zhang, Michael Q.
DNA methylation is an important defense and regulatory mechanism. In mammals, most DNA methylation occurs at CpG sites, and asymmetric non-CpG methylation has only been detected at appreciable levels in a few cell types. We are the first to systematically study the strand-specific distribution of non-CpG methylation. With the divide-and-compare strategy, we show that CHG and CHH methylation are not intrinsically different in human embryonic stem cells (ESCs) and induced pluripotent stem cells (iPSCs). We also find that non-CpG methylation is skewed between the two strands in introns, especially at intron boundaries and in highly expressed genes. Controlling for the proximal sequences of non-CpG sites, we show that the skew of non-CpG methylation in introns is mainly guided by sequence skew. By studying subgroups of transposable elements, we also found that non-CpG methylation is distributed in a strand-specific manner in both short interspersed nuclear elements (SINE) and long interspersed nuclear elements (LINE), but not in long terminal repeats (LTR). Finally, we show that on the antisense strand of Alus, a non-CpG site just downstream of the A-box is highly methylated.
Together, the divide-and-compare strategy leads us to identify regions with strand-specific distributions of non-CpG methylation in humans.

Item
HITS-CLIP and Integrative Modeling Define the Rbfox Splicing-Regulatory Network Linked to Brain Development and Autism (Cell Press, 2014-03)
Weyn-Vanhentenryck, Sebastien; Mele, Aldo; Yan, Qinghong; Sun, Shuying; Farny, Natalie; Zhang, Zuo; Xue, Chenghai; Herre, Margaret; Silver, Pamela A.; Zhang, Michael Q.; Krainer, Adrian R.; Darnell, Robert B.; Zhang, Chaolin; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ); Zhang, Michael Q.
The RNA binding proteins Rbfox1/2/3 regulate alternative splicing in the nervous system, and disruption of Rbfox1 has been implicated in autism. However, comprehensive identification of functional Rbfox targets has been challenging. Here, we perform HITS-CLIP for all three Rbfox family members in order to globally map, at a single-nucleotide resolution, their in vivo RNA interaction sites in the mouse brain. We find that the two guanines in the Rbfox binding motif UGCAUG are critical for protein-RNA interactions and crosslinking. Using integrative modeling, these interaction sites, combined with additional datasets, define 1,059 direct Rbfox target alternative splicing events. Over half of the quantifiable targets show dynamic changes during brain development. Of particular interest are 111 events from 48 candidate autism-susceptibility genes, including syndromic autism genes Shank3, Cacna1c, and Tsc2. Alteration of Rbfox targets in some autistic brains is correlated with downregulation of all three Rbfox proteins, supporting the potential clinical relevance of the splicing-regulatory network.

Item
Miror: A Method for Cell-Type Specific MicroRNA Occupancy Rate Prediction (Royal Soc Chemistry, 2014-03-13)
Xie, Peng; Liu, Yu; Li, Yanda; Zhang, Michael Q.; Wang, Xiaowo; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ); Zhang, Michael Q.
MicroRNA (miRNA) regulation is highly cell-type specific.
It is sensitive to both the miRNA-mRNA relative abundance and the competitive endogenous RNA (ceRNA) effect. However, almost all existing miRNA target prediction methods neglected the influence of the cellular environment when analyzing miRNA regulation effects. In this study, we proposed a method, MIROR (miRNA Occupancy Rate predictor), to predict miRNA regulation intensity in a given cell type. The major considerations were the miRNA-mRNA relative abundance and the endogenous competition between different mRNA species. The output of MIROR is the predicted miRNA occupancy rates of each target site. The predicted results significantly correlated with Ago HITS-CLIP experiment that indicated miRNA binding intensities. When applied to the analysis of the breast invasive carcinoma dataset, MIROR identified a number of differentially regulated miRNA-mRNA pairs with significant miRNA occupancy rate changes between tumor and normal tissues. Many of the predictions were supported by previous research studies, including the ones without a significant change in the mRNA expression level. These results indicate that MIROR provides a novel strategy to study the miRNA differential regulation in different cell types.

Item
Nucleosome Eviction and Multiple Co-Factor Binding Predict Estrogen-Receptor-Alpha-Associated Long-Range Interactions (Oxford University Press, 2014-04-29)
He, C.; Wang, X.; Zhang, Michael Q.; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ); Zhang, Michael Q.
Many enhancers regulate their target genes via long-distance interactions. High-throughput experiments like ChIA-PET have been developed to map such largely cell-type-specific interactions between cis-regulatory elements genome-wide. In this study, we integrated multiple types of data in order to reveal the general hidden patterns embedded in the ChIA-PET data. We found characteristic distance features related to promoter-promoter, enhancer-enhancer and insulator-insulator interactions.
Although a protein may have many binding sites along the genome, our hypothesis is that those sites that share certain open chromatin structure can accommodate a relatively larger protein complex consisting of specific regulatory and 'bridging' factors, and may be more likely to form robust long-range deoxyribonucleic acid (DNA) loops. This hypothesis was validated in the estrogen receptor alpha (ERa) ChIA-PET data. An efficient classifier was built to predict ERa-associated long-range interactions solely from the related ChIP-seq data, hence linking distal ERa-dependent enhancers to their target genes. We further applied the classifier to generate additional novel interactions, which were undetected in the original ChIA-PET paper but were validated by other independent experiments. Our work provides a new insight into the long-range chromatin interactions through deeper and integrative ChIA-PET data analysis and demonstrates DNA looping predictability from ordinary ChIP-seq data.

Item
ModuleRole: A Tool for Modulization, Role Determination and Visualization in Protein-Protein Interaction Networks (Public Library of Science, 2014-05-01)
Li, GuiPeng; Li, Ming; Zhang, YiWei; Wang, Dong; Li, Rong; Guimera, Roger; Gao, Juntao Tony; Zhang, Michael Q.; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ); Zhang, Michael Q.
Rapidly increasing amounts of (physical and genetic) protein-protein interaction (PPI) data are produced by various high-throughput techniques, and interpretation of these data remains a major challenge. In order to gain insight into the organization and structure of the resultant large complex networks formed by interacting molecules, using simulated annealing, a method based on the node connectivity, we developed ModuleRole, a user-friendly web server tool which finds modules in a PPI network, defines the roles for every node, and produces files for visualization in Cytoscape and Pajek.
For given proteins, it analyzes the PPI network from the BioGRID database, finds and visualizes the modules these proteins form, and then defines the role every node plays in this network, based on two topological parameters, Participation Coefficient and Z-score. This is the first program which provides an interactive and very friendly interface for biologists to find and visualize modules and roles of proteins in a PPI network. It can be tested online at the website http://www.bioinfo.org/modulerole/index.php, which is free and open to all users with no login requirement, with demo data provided by "User Guide" in the menu Help. A non-server application of this program is considered for high-throughput data with more than 200 nodes or the user's own interaction datasets. Users are able to bookmark the web link to the result page and access it at a later time. As an interactive and highly customizable application, ModuleRole requires no expert knowledge in graph theory on the user side and can be used in both Linux and Windows systems, and is thus a very useful tool for biologists to analyze and visualize PPI networks from databases such as BioGRID. Availability: ModuleRole is implemented in Java and C, and is freely available at http://www.bioinfo.org/modulerole/index.php. Supplementary information (user guide, demo data) is also available at this website. The API for ModuleRole used for this program can be obtained upon request.

Item
Assembly and Validation of Versatile Transcription Activator-Like Effector Libraries (Nature Publishing Group, 2014-05-06)
Li, Yi; Ehrhardt, Kristina; Zhang, Michael Q.; Bleris, Leonidas; 0000 0001 2535 9739 (Bleris, L); 0000 0001 1707 1372 (Zhang, MQ); 2012076942 (Bleris, L); 99086074 (Zhang, MQ); Zhang, Michael Q.
The ability to perturb individual genes in genome-wide experiments has been instrumental in unraveling cellular and disease properties.
Here we introduce, describe the assembly, and demonstrate the use of comprehensive and versatile transcription activator-like effector (TALE) libraries. As a proof of principle, we built an 11-mer library that covers all possible combinations of the nucleotides that determine the TALE-DNA binding specificity. We demonstrate the versatility of the methodology by constructing a constraint library, customized to bind to a known p53 motif. To verify the functionality in assays, we applied the 11-mer library in yeast-one-hybrid screens to discover TALEs that activate human SCN9A and miR-34b respectively. Additionally, we performed a genome-wide screen using the complete 11-mer library to confirm known genes that confer cycloheximide resistance in yeast. Considering the highly modular nature of TALEs and the versatility and ease of constructing these libraries, we envision broad implications for high-throughput genomic assays.

Item
Genome Wide Mapping of Foxo1 Binding-Sites in Murine T Lymphocytes (Elsevier Inc, 2014-08-01)
Liao, Will; Ouyang, Weiming; Zhang, Michael Q.; Li, Ming O.; Zhang, Michael Q.
The Forkhead box O (Foxo) family of transcription factors has a critical role in controlling the development, differentiation, and function of T cells. However, the direct target genes of Foxo transcription factors in T cells have not been well characterized. In this study, we focused on mapping the genome wide Foxo1-binding sites in naïve CD4(+) T cells, CD8(+) T cells, and Foxp3(+) regulatory T (Treg) cells. By using chromatin immunoprecipitation coupled with deep sequencing (ChIP-Seq), we identified Foxo1 binding sites that were shared among or specific to the three T cell populations. Here we describe the experiments, quality controls, as well as the deep sequencing data. Part of the data analysis has been published by Ouyang W et al. in Nature 2012 [1] and Kim MV et al.
in Immunity 2013 [2], and the associated data sets were uploaded to NCBI Gene Expression Omnibus.

Item
Hsa-miR-1246, Hsa-miR-320a and Hsa-miR-196b-5p Inhibitors can Reduce the Cytotoxicity of Ebola Virus Glycoprotein in Vitro (Science Press, 2014-09-12)
Sheng, MiaoMiao; Ying, Zhong; Yang, Chen; Du, JianChao; Ju, XiangWu; Chen, Zhao; GuiGen, Zhang; LiFang, Zhang; Liu, KangTai; Yang, Ning; Xie, Peng; Li, DangSheng; Zhang, Michael Q.; Jiang, ChengYu; ATLAS Collaboration; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ); Zhang, Michael Q.
Ebola virus (EBOV) causes a highly lethal hemorrhagic fever syndrome in humans and has been associated with mortality rates of up to 91% in Zaire, the most lethal strain. Though the viral envelope glycoprotein (GP) mediates widespread inflammation and cellular damage, these changes have mainly focused on alterations at the protein level; the role of microRNAs (miRNAs) in the molecular pathogenesis underlying this lethal disease is not fully understood. Here, we report that the miRNAs hsa-miR-1246, hsa-miR-320a and hsa-miR-196b-5p were induced in human umbilical vein endothelial cells (HUVECs) following expression of EBOV GP. Among the proteins encoded by predicted targets of these miRNAs, the adhesion-related molecules tissue factor pathway inhibitor (TFPI), dystroglycan1 (DAG1) and the caspase 8 and FADD-like apoptosis regulator (CFLAR) were significantly downregulated in EBOV GP-expressing HUVECs. Moreover, inhibition of hsa-miR-1246, hsa-miR-320a and hsa-miR-196b-5p, or overexpression of TFPI, DAG1 and CFLAR rescued the cell viability that was induced by EBOV GP.
Our results provide a novel molecular basis for EBOV pathogenesis and may contribute to the development of strategies to protect against future EBOV pandemics.

Item
Gene Module Based Regulator Inference Identifying miR-139 as a Tumor Suppressor in Colorectal Cancer (Royal Society of Chemistry, 2014-09-30)
Gu, J.; Chen, Y.; Huang, H.; Yin, L.; Xie, Z.; Zhang, Michael Q.; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ); Zhang, Michael Q.
Colorectal cancer is one of the most commonly diagnosed cancer types worldwide. Identification of the key regulators of the altered biological networks is crucial for understanding the complex molecular mechanisms of colorectal cancer. We proposed a gene module based approach to infer key miRNAs regulating the major gene network alterations in cancer tissues. By integrating gene differential expression and co-expression information with a protein-protein interaction network, the differential gene expression modules, which captured the major gene network changes, were identified for colorectal cancer. Then, several key miRNAs, which extensively regulate the gene modules, were inferred by analyzing their target gene enrichment in the modules. Among the inferred candidates, three miRNAs, miR-101, miR-124 and miR-139, are frequently down-regulated in colorectal cancers. The following computational and experimental analyses demonstrate that miR-139 can inhibit cell proliferation and cell cycle G1/S transition. A known oncogene ETS1, a key transcription factor in the gene module, was experimentally verified as a novel target of miR-139. miR-139 was found to be significantly down-regulated in early pathological cancer stages and its expression remained at very low levels in advanced stages.
These results indicate that miR-139, inferred by the gene module based approach, should be a key tumor suppressor in early cancer development.

Item
Activity-Dependent FUS Dysregulation Disrupts Synaptic Homeostasis (Natl Acad Sciences, 2014-10-16)
Sephton, Chantelle F.; Tang, Amy A.; Kulkarni, Ashwinikumar; West, James; Brooks, Mieu; Stubblefield, Jeremy J.; Liu, Yun; Zhang, Michael Q.; Green, Carla B.; Huber, Kimberly M.; Huang, Eric J.; Herz, Joachim; Yu, Gang; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ); Zhang, Michael Q.
The RNA-binding protein fused-in-sarcoma (FUS) has been associated with amyotrophic lateral sclerosis (ALS) and frontotemporal lobar degeneration (FTLD), two neurodegenerative disorders that share similar clinical and pathological features. Both missense mutations and overexpression of wild-type FUS protein can be pathogenic in human patients. To study the molecular and cellular basis by which FUS mutations and overexpression cause disease, we generated novel transgenic mice globally expressing low levels of human wild-type protein (FUSWT) and a pathological mutation (FUSR521G). FUSWT and FUSR521G mice that develop severe motor deficits also show neuroinflammation, denervated neuromuscular junctions, and premature death, phenocopying the human diseases. A portion of FUSR521G mice escape early lethality; these escapers have modest motor impairments and altered sociability, which correspond with a reduction of dendritic arbors and mature spines. Remarkably, only FUSR521G mice show dendritic defects; FUSWT mice do not.
Activation of metabotropic glutamate receptors 1/5 in neocortical slices and isolated synaptoneurosomes increases endogenous mouse FUS and FUSWT protein levels but decreases the FUSR521G protein, providing a potential biochemical basis for the dendritic spine differences between FUSWT and FUSR521G mice.

Item
Distinct and Predictive Histone Lysine Acetylation Patterns at Promoters, Enhancers, and Gene Bodies (Genetics Society America, 2014-11-01)
Rajagopal, Nisha; Ernst, Jason; Ray, Pradipta; Wu, Jie; Zhang, Michael Q.; Kellis, Manolis; Ren, Bing; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ); Zhang, Michael Q.
In eukaryotic cells, histone lysines are frequently acetylated. However, unlike modifications such as methylations, histone acetylation modifications are often considered redundant. As such, the functional roles of distinct histone acetylations are largely unexplored. We previously developed an algorithm RFECS to discover the most informative modifications associated with the classification or prediction of mammalian enhancers. Here, we used this tool to identify the modifications most predictive of promoters, enhancers, and gene bodies. Unexpectedly, we found that histone acetylation alone performs well in distinguishing these unique genomic regions. Further, we found the association of characteristic acetylation patterns with genic regions and association of chromatin state with splicing. Taken together, our work underscores the diverse functional roles of histone acetylation in gene regulation and provides several testable hypotheses to dissect these roles.

Item
Chip-Array 2: Integrating Multiple Omics Data to Construct Gene Regulatory Networks (2015-04-27)
Wang, Panwen; Qin, Jing; Qin, Yiming; Zhu, Yun; Wang, Lily Yan; Li, Mulin Jun; Zhang, Michael Q.; Wang, Junwen; 0000 0001 1707 1372 (Zhang, MQ); Zhang, Michael Q.
Transcription factors (TFs) play an important role in gene regulation.
The interconnections among TFs, chromatin interactions, epigenetic marks and cis-regulatory elements form a complex gene transcription apparatus. Our previous work, ChIP-Array, combined TF binding and transcriptome data to construct gene regulatory networks (GRNs). Here we present an enhanced version, ChIP-Array 2, to integrate additional types of omics data including long-range chromatin interaction, open chromatin region and histone modification data to dissect more comprehensive GRNs involving diverse regulatory components. Moreover, we substantially extended our motif database for human, mouse, rat, fruit fly, worm, yeast and Arabidopsis, and curated a large amount of omics data for users to select as input or backend support. With ChIP-Array 2, we compiled a library containing regulatory networks of 18 TFs/chromatin modifiers in mouse embryonic stem cell (mESC). The web server and the mESC library are publicly free and accessible at http://jjwanglab.org/chip-array.

Item
Quantitative Combination of Natural Anti-Oxidants Prevents Metabolic Syndrome by Reducing Oxidative Stress (2015-06-26)
Gao, Mingjing; Zhao, Zhen; Lv, Pengyu; Li, YuFang; Gao, Juntao; Zhang, Michael Q.; Zhao, Baolu; 0000 0001 1707 1372 (Zhang, MQ); Zhang, Michael Q.
Insulin resistance and abdominal obesity are present in the majority of people with the metabolic syndrome. Antioxidant therapy might be a useful strategy for type 2 diabetes and other insulin-resistant states. The combination of vitamin C (Vc) and vitamin E has synthetic scavenging effect on free radicals and inhibition effect on lipid peroxidation. However, there are few studies about how to define the best combination of more than three anti-oxidants as it is difficult or impossible to test the anti-oxidant effect of the combination of every concentration of each ingredient experimentally.
Here we present a math model, which is based on the classical Hill equation to determine the best combination, called Fixed Dose Combination (FDC), of several natural anti-oxidants, including Vc, green tea polyphenols (GTP) and grape seed extract proanthocyanidin (GSEP). Then we investigated the effects of FDC on oxidative stress, blood glucose and serum lipid levels in cultured 3T3-L1 adipocytes, high fat diet (HFD)-fed rats which serve as obesity model, and KK-ay mice as diabetic model. The level of serum malondialdehyde (MDA) in the treated rats was studied and Hematoxylin-Eosin (HE) staining or Oil red slices of liver and adipose tissue in the rats were examined as well. FDC shows excellent antioxidant and anti-glycation activity by attenuating lipid peroxidation. FDC determined in this investigation can become a potential solution to reduce obesity, to improve insulin sensitivity and be beneficial for the treatment of fat and diabetic patients. It is the first time to use the math model to determine the best ratio of three anti-oxidants, which can save much more time and chemical materials than traditional experimental method. This quantitative method represents a potentially new and useful strategy to screen all possible combinations of many natural anti-oxidants, therefore may help develop novel therapeutics with the potential to ameliorate the worldwide metabolic abnormalities.

Item
Histone Deacetylases Positively Regulate Transcription Through the Elongation Machinery (Elsevier B.V., 2015-11-17)
Greer, Celeste B.; Tanaka, Yoshiaki; Kim, Yoon Jung; Xie, Peng; Zhang, Michael Q.; Park, In-Hyun; Kim, Tae Hoon; 0000 0001 1707 1372 (Zhang, MQ); Kim, Yoon Jung; Xie, Peng; Zhang, Michael Q.; Park, In-Hyun
Transcription elongation regulates the expression of many genes, including oncogenes. Histone deacetylase (HDAC) inhibitors (HDACIs) block elongation, suggesting that HDACs are involved in gene activation.
To understand this, we analyzed nascent transcription and elongation factor binding genome-wide after perturbation of elongation with small molecule inhibitors. We found that HDACI-mediated repression requires heat shock protein 90 (HSP90) activity. HDACIs promote the association of RNA polymerase II (RNAP2) and negative elongation factor (NELF), a complex stabilized by HSP90, at the same genomic sites. Additionally, HDACIs redistribute bromodomain-containing protein 4 (BRD4), a key elongation factor involved in enhancer activity. BRD4 binds to newly acetylated sites, and its occupancy at promoters and enhancers is reduced. Furthermore, HDACIs reduce enhancer activity, as measured by enhancer RNA production. Therefore, HDACs are required for limiting acetylation in gene bodies and intergenic regions. This facilitates the binding of elongation factors to properly acetylated promoters and enhancers for efficient elongation.
Copyright © 2015 The Authors. Published by Elsevier Inc. All rights reserved.

Item
Fast Dimension Reduction and Integrative Clustering of Multi-Omics Data Using Low-Rank Approximation: Application to Cancer Molecular Classification (BioMed Central, 2015-12-01)
Wu, Dingming; Wang, Dongfang; Zhang, Michael Q.; Gu, Jin; 0000 0001 1707 1372 (Zhang, MQ); Zhang, Michael Q.
Background: One major goal of large-scale cancer omics study is to identify molecular subtypes for more accurate cancer diagnoses and treatments. To deal with high-dimensional cancer multi-omics data, a promising strategy is to find an effective low-dimensional subspace of the original data and then cluster cancer samples in the reduced subspace.
However, due to data-type diversity and big data volume, few methods can integratively and efficiently find the principal low-dimensional manifold of the high-dimensional cancer multi-omics data.
Results: In this study, we proposed a novel low-rank approximation based integrative probabilistic model to quickly find the shared principal subspace across multiple data types: the convexity of the low-rank regularized likelihood function of the probabilistic model ensures efficient and stable model fitting. Candidate molecular subtypes can be identified by unsupervised clustering of hundreds of cancer samples in the reduced low-dimensional subspace. On testing datasets, our method LRAcluster (low-rank approximation based multi-omics data clustering) runs much faster with better clustering performance than the existing method. Then, we applied LRAcluster on large-scale cancer multi-omics data from TCGA. The pan-cancer analysis results show that the cancers of different tissue origins are generally grouped as independent clusters, except squamous-like carcinomas, while the single cancer type analysis suggests that the omics data have different subtyping abilities for different cancer types.
Conclusions: LRAcluster is a very useful method for fast dimension reduction and unsupervised clustering of large-scale multi-omics data. LRAcluster is implemented in R and freely available via http://bioinfo.au.tsinghua.edu.cn/software/lracluster/
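
As a rough illustration of the general idea in the last abstract (reduce a samples-by-features matrix to a low-dimensional subspace, then cluster samples there), the sketch below computes a plain truncated singular value decomposition by power iteration with deflation. This is an assumption made for illustration only: LRAcluster itself fits a low-rank regularized probabilistic model and is implemented in R, and the function names here (`top_singular_triplet`, `low_rank_approx`, `frobenius_error`) and the toy matrix are hypothetical, not part of that software.

```python
import math
import random


def top_singular_triplet(a, iters=200, seed=0):
    """Power iteration on A^T A; returns (sigma, u, v) for the largest singular value."""
    rng = random.Random(seed)
    n_rows, n_cols = len(a), len(a[0])
    # Positive random start vector, so it is unlikely to be orthogonal to the answer.
    v = [rng.random() + 0.1 for _ in range(n_cols)]
    for _ in range(iters):
        u = [sum(a[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]  # u = A v
        w = [sum(a[i][j] * u[i] for i in range(n_rows)) for j in range(n_cols)]  # w = A^T u
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            # Nothing left to extract (for example, an all-zero residual).
            return 0.0, [0.0] * n_rows, v
        v = [x / norm for x in w]
    u = [sum(a[i][j] * v[j] for j in range(n_cols)) for i in range(n_rows)]
    sigma = math.sqrt(sum(x * x for x in u))
    if sigma > 0.0:
        u = [x / sigma for x in u]
    return sigma, u, v


def low_rank_approx(a, rank):
    """Return (approximation, embedding) from a rank-`rank` truncated SVD.

    `embedding[i]` holds the coordinates of sample (row) i in the reduced subspace.
    """
    n_rows, n_cols = len(a), len(a[0])
    residual = [row[:] for row in a]
    approx = [[0.0] * n_cols for _ in range(n_rows)]
    embedding = [[] for _ in range(n_rows)]
    for k in range(rank):
        sigma, u, v = top_singular_triplet(residual, seed=k)
        for i in range(n_rows):
            embedding[i].append(sigma * u[i])
            for j in range(n_cols):
                term = sigma * u[i] * v[j]
                approx[i][j] += term    # accumulate the rank-one component
                residual[i][j] -= term  # deflate so the next pass finds the next component
    return approx, embedding


def frobenius_error(a, b):
    """Frobenius norm of the difference between two equally sized matrices."""
    return math.sqrt(sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)))


# Toy data: 3 samples (rows) by 3 features (columns), invented for this example.
toy = [[2.0, 2.0, 0.0], [1.0, 4.0, 3.0], [3.0, 6.0, 3.0]]
approx2, coords = low_rank_approx(toy, 2)
print("reduced coordinates per sample:", coords)
print("reconstruction error:", frobenius_error(toy, approx2))
```

The reduced coordinates could then be passed to any standard clustering routine; which clustering method the published tool uses is not stated in the abstract, so none is assumed here.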