CSB Research
Permanent URI for this collection: https://hdl.handle.net/10735.1/3686
Browsing CSB Research by Author "99086074 (Zhang, MQ)"
Item Assembly and Validation of Versatile Transcription Activator-Like Effector Libraries
(Nature Publishing Group, 2014-05-06)
Li, Yi; Ehrhardt, Kristina; Zhang, Michael Q.; Bleris, Leonidas; 0000 0001 2535 9739 (Bleris, L); 0000 0001 1707 1372 (Zhang, MQ); 2012076942 (Bleris, L); 99086074 (Zhang, MQ)
The ability to perturb individual genes in genome-wide experiments has been instrumental in unraveling cellular and disease properties. Here we introduce, describe the assembly, and demonstrate the use of comprehensive and versatile transcription activator-like effector (TALE) libraries. As a proof of principle, we built an 11-mer library that covers all possible combinations of the nucleotides that determine the TALE-DNA binding specificity. We demonstrate the versatility of the methodology by constructing a constraint library, customized to bind to a known p53 motif. To verify the functionality in assays, we applied the 11-mer library in yeast-one-hybrid screens to discover TALEs that activate human SCN9A and miR-34b respectively. Additionally, we performed a genome-wide screen using the complete 11-mer library to confirm known genes that confer cycloheximide resistance in yeast. Considering the highly modular nature of TALEs and the versatility and ease of constructing these libraries, we envision broad implications for high-throughput genomic assays.

Item Characterizing the strand-specific distribution of non-CpG methylation in human pluripotent cells
(Oxford University Press, 2013-12-16)
Guo, Weilong; Chung, Wen-Yu; Qian, Minping; Pellegrini, Matteo; Zhang, Michael Q.; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ)
DNA methylation is an important defense and regulatory mechanism. In mammals, most DNA methylation occurs at CpG sites, and asymmetric non-CpG methylation has only been detected at appreciable levels in a few cell types.
We are the first to systematically study the strand-specific distribution of non-CpG methylation. With the divide-and-compare strategy, we show that CHG and CHH methylation are not intrinsically different in human embryonic stem cells (ESCs) and induced pluripotent stem cells (iPSCs). We also find that non-CpG methylation is skewed between the two strands in introns, especially at intron boundaries and in highly expressed genes. Controlling for the proximal sequences of non-CpG sites, we show that the skew of non-CpG methylation in introns is mainly guided by sequence skew. By studying subgroups of transposable elements, we also find that non-CpG methylation is distributed in a strand-specific manner in both short interspersed nuclear elements (SINE) and long interspersed nuclear elements (LINE), but not in long terminal repeats (LTR). Finally, we show that on the antisense strand of Alus, a non-CpG site just downstream of the A-box is highly methylated. Together, the divide-and-compare strategy leads us to identify regions with strand-specific distributions of non-CpG methylation in humans.

Item FastDMA: An Infinium Humanmethylation450 Beadchip Analyzer
(2013-09-05)
Wu, D.; Gu, J.; Zhang, Michael Q.; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ)
DNA methylation is vital for many essential biological processes and human diseases. The Illumina Infinium HumanMethylation450 Beadchip is a recently developed platform for studying genome-wide DNA methylation state at more than 480,000 CpG sites and a few CHG sites with high data quality. To analyze the data from this promising platform, we developed FastDMA, which can be used to identify significantly differentially methylated probes. Besides single-probe analysis, FastDMA can also perform region-based analysis to identify differentially methylated regions (DMRs). A uniform statistical model, analysis of covariance (ANCOVA), is used for all the analyses in FastDMA.
We apply FastDMA to three large-scale DNA methylation datasets from The Cancer Genome Atlas (TCGA) and find many differentially methylated genomic sites in different types of cancer. On the testing datasets, FastDMA shows much higher computational efficiency than current tools. FastDMA can benefit the data analyses of large-scale DNA methylation studies with an integrative pipeline and a high computational efficiency. The software is freely available via http://bioinfo.au.tsinghua.edu.cn/software/fastdma/.

Item Integrated Omics Study Delineates the Dynamics of Lipid Droplets in Rhodococcus Opacus PD630
(Oxford University Press, 2013-10-22)
Chen, Yong; Ding, Yunfeng; Yang, Li; Yu, Jinhai; Liu, Guiming; Wang, Xumin; Zhang, Shuyan; Zhang, Michael Q.; Li, Yanda; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ)
Rhodococcus opacus strain PD630 (R. opacus PD630) is an oleaginous bacterium and one of the few prokaryotic organisms that contain lipid droplets (LDs). The LD is an important organelle not only for lipid storage but also for intercellular communication regarding energy metabolism, yet it remains poorly understood. To understand the dynamics of LDs using a simple model organism, we conducted a series of comprehensive omics studies of R. opacus PD630, including complete genome, transcriptome and proteome analysis. The genome of R. opacus PD630 encodes 8947 genes that are significantly enriched in lipid transport, synthesis and metabolism, indicating a strong capacity for carbon source biosynthesis and catabolism. The comparative transcriptome analysis from three culture conditions revealed the landscape of altered gene expression responsible for lipid accumulation. The LD proteomes further identified the proteins that mediate lipid synthesis, storage and other biological functions. Integrating these three omics uncovered 177 proteins that may be involved in lipid metabolism and LD dynamics.
An LD structure-like protein, LPD06283, was further verified to affect the LD morphology. Our omics studies provide not only the first integrated omics study of a prokaryotic LD organelle, but also a systematic platform for facilitating further prokaryotic LD research and biofuel development.

Item Miror: A Method for Cell-Type Specific MicroRNA Occupancy Rate Prediction
(Royal Soc Chemistry, 2014-03-13)
Xie, Peng; Liu, Yu; Li, Yanda; Zhang, Michael Q.; Wang, Xiaowo; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ)
MicroRNA (miRNA) regulation is highly cell-type specific. It is sensitive to both the miRNA-mRNA relative abundance and the competitive endogenous RNA (ceRNA) effect. However, almost all existing miRNA target prediction methods neglect the influence of the cellular environment when analyzing miRNA regulation effects. In this study, we propose a method, MIROR (miRNA Occupancy Rate predictor), to predict miRNA regulation intensity in a given cell type. The major considerations are the miRNA-mRNA relative abundance and the endogenous competition between different mRNA species. The output of MIROR is the predicted miRNA occupancy rate of each target site. The predicted results correlated significantly with Ago HITS-CLIP experimental data, which indicate miRNA binding intensities. When applied to the analysis of the breast invasive carcinoma dataset, MIROR identified a number of differentially regulated miRNA-mRNA pairs with significant miRNA occupancy rate changes between tumor and normal tissues. Many of the predictions were supported by previous research studies, including the ones without a significant change in the mRNA expression level.
These results indicate that MIROR provides a novel strategy to study the miRNA differential regulation in different cell types.

Item ModuleRole: A Tool for Modulization, Role Determination and Visualization in Protein-Protein Interaction Networks
(Public Library of Science, 2014-05-01)
Li, GuiPeng; Li, Ming; Zhang, YiWei; Wang, Dong; Li, Rong; Guimera, Roger; Gao, Juntao Tony; Zhang, Michael Q.; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ)
Rapidly increasing amounts of (physical and genetic) protein-protein interaction (PPI) data are produced by various high-throughput techniques, and interpretation of these data remains a major challenge. In order to gain insight into the organization and structure of the resultant large complex networks formed by interacting molecules, we developed ModuleRole, a user-friendly web server tool based on simulated annealing, a method that relies on node connectivity. ModuleRole finds modules in a PPI network, defines the role of every node, and produces files for visualization in Cytoscape and Pajek. For given proteins, it analyzes the PPI network from the BioGRID database, finds and visualizes the modules these proteins form, and then defines the role every node plays in this network, based on two topological parameters, the participation coefficient and the Z-score. This is the first program that provides an interactive and friendly interface for biologists to find and visualize modules and roles of proteins in a PPI network. It can be tested online at http://www.bioinfo.org/modulerole/index.php, which is free and open to all users with no login requirement; demo data are provided under "User Guide" in the Help menu. A non-server application of this program can be considered for high-throughput data with more than 200 nodes or for users' own interaction datasets. Users are able to bookmark the web link to the result page and access it at a later time.
As an interactive and highly customizable application, ModuleRole requires no expert knowledge in graph theory on the user side and can be used on both Linux and Windows systems, making it a useful tool for biologists to analyze and visualize PPI networks from databases such as BioGRID. Availability: ModuleRole is implemented in Java and C, and is freely available at http://www.bioinfo.org/modulerole/index.php. Supplementary information (user guide, demo data) is also available at this website. The ModuleRole API used for this program can be obtained upon request.

Item OLego: Fast and Sensitive Mapping of Spliced mRNA-Seq Reads Using Small Seeds
(Oxford University Press, 2013-04)
Wu, Jie; Anczuków, Olga; Krainer, Adrian R.; Zhang, Michael Q.; Zhang, Chaolin; 0000 0001 1707 1372 (Zhang, MQ); 99086074 (Zhang, MQ)
A crucial step in analyzing mRNA-Seq data is to accurately and efficiently map hundreds of millions of reads to the reference genome and exon junctions. Here we present OLego, an algorithm specifically designed for de novo mapping of spliced mRNA-Seq reads. OLego adopts a multiple-seed-and-extend scheme, and does not rely on a separate external aligner. It achieves high sensitivity of junction detection by strategic searches with small seeds (∼14 nt for mammalian genomes). To improve accuracy and resolve ambiguous mapping at junctions, OLego uses a built-in statistical model to score exon junctions by splice-site strength and intron size. The Burrows-Wheeler transform is used in multiple steps of the algorithm to efficiently map seeds, locate junctions and identify small exons. OLego is implemented in C++ with fully multithreaded execution, and allows fast processing of large-scale data. We systematically evaluated the performance of OLego in comparison with published tools using both simulated and real data. OLego demonstrated better sensitivity, higher or comparable accuracy and substantially improved speed.
OLego also identified hundreds of novel micro-exons (<30 nt) in the mouse transcriptome, many of which are phylogenetically conserved and can be validated experimentally in vivo. OLego is freely available at http://zhanglab.c2b2.columbia.edu/index.php/OLego.
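
The OLego abstract above mentions mapping small seeds with the Burrows-Wheeler transform. As a rough illustration of that general idea only, the following toy Python sketch builds a Burrows-Wheeler transform of a short reference, then uses backward search to locate exact matches of fixed-length seeds cut from a read. It is not OLego's implementation (OLego is written in C++): the function names, the naive suffix sorting, the non-overlapping seed layout, and the 4-nt seed length in the usage note are all assumptions chosen for readability.

```python
from collections import Counter


def bwt_and_suffix_array(text):
    """Return (bwt, suffix_array) for text + '$' using naive suffix sorting."""
    text = text + "$"  # sentinel; assumed absent from text and smaller than A/C/G/T
    sa = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in sa)  # character preceding each sorted suffix
    return bwt, sa


def build_fm_index(bwt):
    """Return (first, occ) tables used by backward search.

    first[c]  = number of characters in bwt that sort before c
    occ[c][i] = number of occurrences of c in bwt[:i]
    """
    counts = Counter(bwt)
    first, total = {}, 0
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]
    occ = {ch: [0] * (len(bwt) + 1) for ch in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return first, occ


def find_seed(seed, sa, first, occ):
    """Return sorted reference positions where seed matches exactly."""
    lo, hi = 0, len(sa)  # current suffix-array interval [lo, hi)
    for ch in reversed(seed):  # backward search consumes the seed right to left
        if ch not in first:
            return []
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return []
    return sorted(sa[i] for i in range(lo, hi))


def seed_hits(read, k, sa, first, occ):
    """Cut read into non-overlapping k-mers and look each one up.

    Returns a list of (offset_in_read, [reference_positions]).
    """
    return [
        (start, find_seed(read[start:start + k], sa, first, occ))
        for start in range(0, len(read) - k + 1, k)
    ]
```

For example, with the made-up reference `ACGTACGTTAGC` and the read `ACGTTAGC`, calling `seed_hits(read, 4, sa, first, occ)` returns the candidate positions for each 4-nt seed. Hits whose reference position minus read offset agree suggest a contiguous alignment, whereas a spliced-read mapper would additionally have to interpret a jump between consecutive seeds as a candidate intron, a step this sketch does not attempt.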