CSB Research
Permanent URI for this collection: https://hdl.handle.net/10735.1/3686
Browsing CSB Research by Author "0000 0001 1707 1372 (Zhang, MQ)"
Now showing 1 - 9 of 9

Item: Assembly and Validation of Versatile Transcription Activator-Like Effector Libraries
(Nature Publishing Group, 2014-05-06) Li, Yi; Ehrhardt, Kristina; Zhang, Michael Q.; Bleris, Leonidas; 0000 0001 2535 9739 (Bleris, L); 0000 0001 1707 1372 (Zhang, MQ); 2012076942 (Bleris, L); 99086074 (Zhang, MQ)
The ability to perturb individual genes in genome-wide experiments has been instrumental in unraveling cellular and disease properties. Here we introduce, describe the assembly, and demonstrate the use of comprehensive and versatile transcription activator-like effector (TALE) libraries. As a proof of principle, we built an 11-mer library that covers all possible combinations of the nucleotides that determine the TALE-DNA binding specificity. We demonstrate the versatility of the methodology by constructing a constraint library, customized to bind to a known p53 motif. To verify the functionality in assays, we applied the 11-mer library in yeast-one-hybrid screens to discover TALEs that activate human SCN9A and miR-34b respectively. Additionally, we performed a genome-wide screen using the complete 11-mer library to confirm known genes that confer cycloheximide resistance in yeast. Considering the highly modular nature of TALEs and the versatility and ease of constructing these libraries, we envision broad implications for high-throughput genomic assays.

Item: Characterizing the strand-specific distribution of non-CpG methylation in human pluripotent cells
(Oxford University Press, 2013-12-16) Guo, Weilong; Chung, Wen-Yu; Qian, Minping; Pellegrini, Matteo; Zhang, Michael Q.; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ)
DNA methylation is an important defense and regulatory mechanism. In mammals, most DNA methylation occurs at CpG sites, and asymmetric non-CpG methylation has only been detected at appreciable levels in a few cell types.
We are the first to systematically study the strand-specific distribution of non-CpG methylation. With the divide-and-compare strategy, we show that CHG and CHH methylation are not intrinsically different in human embryonic stem cells (ESCs) and induced pluripotent stem cells (iPSCs). We also find that non-CpG methylation is skewed between the two strands in introns, especially at intron boundaries and in highly expressed genes. Controlling for the proximal sequences of non-CpG sites, we show that the skew of non-CpG methylation in introns is mainly guided by sequence skew. By studying subgroups of transposable elements, we also found that non-CpG methylation is distributed in a strand-specific manner in both short interspersed nuclear elements (SINE) and long interspersed nuclear elements (LINE), but not in long terminal repeats (LTR). Finally, we show that on the antisense strand of Alus, a non-CpG site just downstream of the A-box is highly methylated. Together, the divide-and-compare strategy leads us to identify regions with strand-specific distributions of non-CpG methylation in humans.

Item: Distinct and Predictive Histone Lysine Acetylation Patterns at Promoters, Enhancers, and Gene Bodies
(Genetics Society America, 2014-11-01) Rajagopal, Nisha; Ernst, Jason; Ray, Pradipta; Wu, Jie; Zhang, Michael Q.; Kellis, Manolis; Ren, Bing; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ)
In eukaryotic cells, histone lysines are frequently acetylated. However, unlike modifications such as methylations, histone acetylation modifications are often considered redundant. As such, the functional roles of distinct histone acetylations are largely unexplored. We previously developed an algorithm, RFECS, to discover the most informative modifications associated with the classification or prediction of mammalian enhancers. Here, we used this tool to identify the modifications most predictive of promoters, enhancers, and gene bodies.
Unexpectedly, we found that histone acetylation alone performs well in distinguishing these unique genomic regions. Further, we found the association of characteristic acetylation patterns with genic regions and association of chromatin state with splicing. Taken together, our work underscores the diverse functional roles of histone acetylation in gene regulation and provides several testable hypotheses to dissect these roles.

Item: FastDMA: An Infinium Humanmethylation450 Beadchip Analyzer
(2013-09-05) Wu, D.; Gu, J.; Zhang, Michael Q.; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ)
DNA methylation is vital for many essential biological processes and human diseases. The Illumina Infinium HumanMethylation450 Beadchip is a recently developed platform for studying genome-wide DNA methylation state at more than 480,000 CpG sites and a few CHG sites with high data quality. To analyze the data of this promising platform, we developed FastDMA, which can be used to identify significantly differentially methylated probes. Besides single-probe analysis, FastDMA can also do region-based analysis to identify differentially methylated regions (DMRs). A unified statistical model, analysis of covariance (ANCOVA), is used for all the analyses in FastDMA. We apply FastDMA to three large-scale DNA methylation datasets from The Cancer Genome Atlas (TCGA) and find many differentially methylated genomic sites in different types of cancer. On the testing datasets, FastDMA shows much higher computational efficiency than current tools. FastDMA can benefit the data analyses of large-scale DNA methylation studies with an integrative pipeline and a high computational efficiency.
The software is freely available via http://bioinfo.au.tsinghua.edu.cn/software/fastdma/.

Item: Integrated Omics Study Delineates the Dynamics of Lipid Droplets in Rhodococcus Opacus PD630
(Oxford University Press, 2013-10-22) Chen, Yong; Ding, Yunfeng; Yang, Li; Yu, Jinhai; Liu, Guiming; Wang, Xumin; Zhang, Shuyan; Zhang, Michael Q.; Li, Yanda; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ)
Rhodococcus opacus strain PD630 (R. opacus PD630) is an oleaginous bacterium and one of the few prokaryotic organisms that contain lipid droplets (LDs). The LD is an important organelle not only for lipid storage but also for intercellular communication regarding energy metabolism, and yet it is poorly understood. To understand the dynamics of LDs using a simple model organism, we conducted a series of comprehensive omics studies of R. opacus PD630, including complete genome, transcriptome and proteome analysis. The genome of R. opacus PD630 encodes 8947 genes that are significantly enriched in lipid transport, synthesis and metabolism, indicating a strong capacity for carbon source biosynthesis and catabolism. The comparative transcriptome analysis from three culture conditions revealed the landscape of gene-altered expressions responsible for lipid accumulation. The LD proteomes further identified the proteins that mediate lipid synthesis, storage and other biological functions. Integrating these three omics uncovered 177 proteins that may be involved in lipid metabolism and LD dynamics. An LD structure-like protein, LPD06283, was further verified to affect the LD morphology.
Our omics studies provide not only a first integrated omics study of the prokaryotic LD organelle, but also a systematic platform for facilitating further prokaryotic LD research and biofuel development.

Item: Miror: A Method for Cell-Type Specific MicroRNA Occupancy Rate Prediction
(Royal Soc Chemistry, 2014-03-13) Xie, Peng; Liu, Yu; Li, Yanda; Zhang, Michael Q.; Wang, Xiaowo; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ)
MicroRNA (miRNA) regulation is highly cell-type specific. It is sensitive to both the miRNA-mRNA relative abundance and the competitive endogenous RNA (ceRNA) effect. However, almost all existing miRNA target prediction methods neglect the influence of the cellular environment when analyzing miRNA regulation effects. In this study, we proposed a method, MIROR (miRNA Occupancy Rate predictor), to predict miRNA regulation intensity in a given cell type. The major considerations were the miRNA-mRNA relative abundance and the endogenous competition between different mRNA species. The output of MIROR is the predicted miRNA occupancy rate of each target site. The predicted results correlated significantly with an Ago HITS-CLIP experiment that indicated miRNA binding intensities. When applied to the analysis of the breast invasive carcinoma dataset, MIROR identified a number of differentially regulated miRNA-mRNA pairs with significant miRNA occupancy rate changes between tumor and normal tissues. Many of the predictions were supported by previous research studies, including the ones without a significant change in the mRNA expression level.
These results indicate that MIROR provides a novel strategy to study the miRNA differential regulation in different cell types.

Item: ModuleRole: A Tool for Modulization, Role Determination and Visualization in Protein-Protein Interaction Networks
(Public Library of Science, 2014-05-01) Li, GuiPeng; Li, Ming; Zhang, YiWei; Wang, Dong; Li, Rong; Guimera, Roger; Gao, Juntao Tony; Zhang, Michael Q.; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ)
Rapidly increasing amounts of (physical and genetic) protein-protein interaction (PPI) data are produced by various high-throughput techniques, and interpretation of these data remains a major challenge. To gain insight into the organization and structure of the resulting large, complex networks of interacting molecules, we developed ModuleRole, a user-friendly web server tool that uses simulated annealing, a method based on node connectivity, to find modules in a PPI network, define the role of every node, and produce files for visualization in Cytoscape and Pajek. For given proteins, it analyzes the PPI network from the BioGRID database, finds and visualizes the modules these proteins form, and then defines the role every node plays in this network, based on two topological parameters, Participation Coefficient and Z-score. This is the first program that provides an interactive and very friendly interface for biologists to find and visualize modules and roles of proteins in a PPI network. It can be tested online at the website http://www.bioinfo.org/modulerole/index.php, which is free and open to all users with no login requirement, with demo data provided by "User Guide" in the menu Help. A non-server application of this program can be considered for high-throughput data with more than 200 nodes or for users' own interaction datasets. Users are able to bookmark the web link to the result page and access it at a later time.
As an interactive and highly customizable application, ModuleRole requires no expert knowledge in graph theory on the user side and can be used on both Linux and Windows systems, making it a very useful tool for biologists to analyze and visualize PPI networks from databases such as BioGRID. Availability: ModuleRole is implemented in Java and C, and is freely available at http://www.bioinfo.org/modulerole/index.php. Supplementary information (user guide, demo data) is also available at this website. API for ModuleRole used for this program can be obtained upon request.

Item: Nucleosome Eviction and Multiple Co-Factor Binding Predict Estrogen-Receptor-Alpha-Associated Long-Range Interactions
(Oxford University Press, 2014-04-29) He, C.; Wang, X.; Zhang, Michael Q.; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ)
Many enhancers regulate their target genes via long-distance interactions. High-throughput experiments like ChIA-PET have been developed to map such largely cell-type-specific interactions between cis-regulatory elements genome-wide. In this study, we integrated multiple types of data in order to reveal the general hidden patterns embedded in the ChIA-PET data. We found characteristic distance features related to promoter-promoter, enhancer-enhancer and insulator-insulator interactions. Although a protein may have many binding sites along the genome, our hypothesis is that those sites that share certain open chromatin structure can accommodate a relatively large protein complex consisting of specific regulatory and 'bridging' factors, and may be more likely to form robust long-range deoxyribonucleic acid (DNA) loops. This hypothesis was validated in the estrogen receptor alpha (ERα) ChIA-PET data. An efficient classifier was built to predict ERα-associated long-range interactions solely from the related ChIP-seq data, hence linking distal ERα-dependent enhancers to their target genes.
We further applied the classifier to generate additional novel interactions, which were undetected in the original ChIA-PET paper but were validated by other independent experiments. Our work provides a new insight into the long-range chromatin interactions through deeper and integrative ChIA-PET data analysis and demonstrates DNA looping predictability from ordinary ChIP-seq data.

Item: OLego: Fast and Sensitive Mapping of Spliced mRNA-Seq Reads Using Small Seeds
(Oxford University Press, 2013-04) Wu, Jie; Anczuków, Olga; Krainer, Adrian R.; Zhang, Michael Q.; Zhang, Chaolin; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ)
A crucial step in analyzing mRNA-Seq data is to accurately and efficiently map hundreds of millions of reads to the reference genome and exon junctions. Here we present OLego, an algorithm specifically designed for de novo mapping of spliced mRNA-Seq reads. OLego adopts a multiple-seed-and-extend scheme, and does not rely on a separate external aligner. It achieves high sensitivity of junction detection by strategic searches with small seeds (∼14 nt for mammalian genomes). To improve accuracy and resolve ambiguous mapping at junctions, OLego uses a built-in statistical model to score exon junctions by splice-site strength and intron size. Burrows-Wheeler transform is used in multiple steps of the algorithm to efficiently map seeds, locate junctions and identify small exons. OLego is implemented in C++ with fully multithreaded execution, and allows fast processing of large-scale data. We systematically evaluated the performance of OLego in comparison with published tools using both simulated and real data. OLego demonstrated better sensitivity, higher or comparable accuracy and substantially improved speed. OLego also identified hundreds of novel micro-exons (<30 nt) in the mouse transcriptome, many of which are phylogenetically conserved and can be validated experimentally in vivo.
OLego is freely available at http://zhanglab.c2b2.columbia.edu/index.php/OLego.
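
Illustrative note on the OLego entry: the abstract says the Burrows-Wheeler transform is used to map seeds efficiently. The sketch below is a minimal, textbook illustration of that general idea in Python and is not OLego's C++ implementation. The function names (`bwt`, `count_matches`) and the `$` sentinel are assumptions made for this example, and the naive rotation sort is only suitable for short strings.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transform via sorted rotations (naive; short inputs only)."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in text")
    s = text + sentinel
    # Sort every cyclic rotation of s, then read off the last column.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def count_matches(transformed: str, pattern: str) -> int:
    """Count exact occurrences of pattern in the original text by backward search."""
    # first_index[c]: row of the first rotation that starts with character c.
    first_index = {}
    for row, char in enumerate(sorted(transformed)):
        first_index.setdefault(char, row)

    def occurrences(char: str, end: int) -> int:
        # Occurrences of char in transformed[:end]; a real index would precompute this.
        return transformed[:end].count(char)

    low, high = 0, len(transformed)
    # Extend the match one character at a time, from the end of the pattern.
    for char in reversed(pattern):
        if char not in first_index:
            return 0
        low = first_index[char] + occurrences(char, low)
        high = first_index[char] + occurrences(char, high)
        if low >= high:
            return 0
    return high - low


if __name__ == "__main__":
    genome = "GATTACAGATTACA"  # hypothetical toy reference
    index = bwt(genome)
    print(index, count_matches(index, "TTAC"))
```

The backward search touches only the transformed string, which is why short seeds can be looked up without scanning the reference; production aligners replace the `count` call with precomputed occurrence tables.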