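Differential Expression Analysis of Breast Cancer RNA-Seq Data and a Preliminary Exploration of ceRNA Networks
王娅丽
Degree type: Master's
Supervisor: 马占山
2017-01
Degree-granting institution: University of Chinese Academy of Sciences
Place of conferral: Beijing
Keywords: breast cancer; RNA-Seq data; differential gene expression; ceRNA network
Abstract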

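Breast cancer currently has the highest incidence of any cancer among women worldwide: about 1.4 million people are diagnosed each year, and about 500,000 die of the disease. Although medical care and diagnostic and treatment techniques keep improving and new drugs continue to be developed and brought to market, the global incidence of breast cancer is still rising every year and the age of onset is gradually falling. The main reason is that the pathogenesis of breast cancer remains unclear, so abnormalities are not detected early, diagnosis is delayed, and the number of patients with advanced breast cancer increases. Studies indicate that the great majority of breast cancers are caused by altered gene expression levels under polygenic control. With the development of sequencing technology, RNA-Seq offers great advantages for studying gene expression at the whole-genome level. Mining the differential expression between breast cancer samples and normal tissue samples lays a foundation for understanding the pathogenesis of breast cancer. In recent years, the effect on disease of lncRNAs functioning as ceRNAs (competing endogenous RNAs) has received growing attention, and studying the role of lncRNAs acting as ceRNAs in breast cancer is an important part of research on its pathogenesis. This study downloaded a breast cancer RNA-Seq dataset from the GEO database. The data fall into five groups: ER+ breast cancer, ER+ AJ (tissue adjacent to ER+ breast cancer), TNBC, TNBC AJ (tissue adjacent to TNBC), and Reduce (healthy tissue removed during breast reduction surgery). First, tophat and cufflinks were used for transcriptome assembly and differential gene expression analysis; next, the differentially expressed genes were given GO and KEGG functional annotation; finally, the gene expression table was combined with the gene interaction relationships in the starbase database to construct ceRNA networks and compare their network properties. The differential expression results show that more than 7000 genes are differentially expressed between cancer tissue and adjacent tissue. The KEGG pathways of ER+ breast cancer are mainly hormone-related functional pathways, whereas TNBC is annotated mainly to cell cycle and cancer pathways. The networks built from the four sample groups are all scale-free networks; the global properties of the cancer-sample networks are significantly smaller than those of the non-cancer samples, the node-level network properties differ significantly between cancer and non-cancer samples, and neither of the two cancer-sample ceRNA networks contains any lncRNA. The KEGG pathway annotation results indicate that ER+ breast cancer arises mainly from dysregulated expression of hormone-related genes, and that regulating the relevant hormone pathways may inhibit ER+ breast cancer. Compared with ER+ breast cancer, the changes in triple-negative breast cancer lie more in apoptosis-related pathways; it is therefore insensitive to hormone therapy and harder to treat, which suggests that apoptosis-related pathways and genes may become important targets for the future treatment of triple-negative breast cancer. The ceRNA network results indicate that lncRNA functioning as ceRNA may be an important feature of breast cancer, and we speculate that the loss of lncRNA activity as ceRNA is very likely an important factor in the occurrence and development of cancer.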

Other Abstract

Breast cancer has the highest incidence of all cancers among women worldwide, with 1.4 million confirmed cases and 500 thousand deaths per year. Although continuing advances in medical treatment and diagnostic technology have brought a steady stream of new drugs, the incidence of breast cancer has not fallen. On the contrary, it has risen, and the age of onset is decreasing. The reason is that the pathogenesis of breast cancer remains unclear, so the disease goes undetected, diagnosis is delayed, and the number of patients with advanced disease grows. Studies have shown that the emergence and growth of breast cancer are related not only to genetic mutation but also to the level of gene expression, which can be analyzed thoroughly with RNA-Seq, the approach used in this thesis. The influence of lncRNA-associated ceRNA (competing endogenous RNA) on disease has drawn increasing attention in recent years, making the study of how lncRNA-associated ceRNA functions in breast cancer an indispensable part of research on its pathogenesis. This thesis uses a set of RNA-Seq data from the GEO database, divided into five groups: ER+, ER+ AJ (ER+ adjacent tissues), TNBC, TNBC AJ (TNBC adjacent tissues), and Reduce (healthy tissues obtained from breast reduction surgery). First, tophat and cufflinks are applied to assemble the transcriptome and identify differentially expressed genes; the differentially expressed genes are then annotated with GO and KEGG. The last step is to build ceRNA networks from the gene expression matrix combined with the gene interactions downloaded from starbase, and to compare the properties of these networks. Over 6000 differentially expressed genes are detected between cancer tissue and its adjacent tissue. The KEGG pathways of ER+ breast cancer are mainly hormone-related, whereas TNBC is annotated to the cell cycle and cancer pathways. All four networks are scale-free; the global properties of the cancer-sample networks are distinctly smaller than those of the non-cancer networks, and the node properties of the cancer-sample networks differ greatly from those of the non-cancer ones. Neither of the ceRNA networks of the two cancer samples contains lncRNA. The KEGG pathway annotation indicates that ER+ breast cancer arises from dysregulated expression of hormone-related genes, and that correcting this by regulating the relevant hormone pathways may inhibit ER+ breast cancer. Triple-negative breast cancer (TNBC), however, is less sensitive to hormone therapy and therefore harder to treat, because its pathway changes are more apoptosis-related than those of ER+ breast cancer; this raises the possibility that apoptosis-related pathways and genes could serve as treatment targets for triple-negative breast cancer. From the ceRNA networks we conclude that lncRNA functioning as ceRNA may be a crucial trait of breast cancer, and we speculate that the loss of lncRNA-associated ceRNA may significantly affect the emergence and growth of cancer.
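
The network-construction step is described only at a high level in this record, so the following is a minimal sketch of one common way such a network can be built, not the thesis's actual procedure. It assumes that a candidate gene pair (standing in for an interaction listed in starbase) is kept as an edge when the two expression profiles have a Pearson correlation at or above a threshold, and it then tabulates the degree distribution, which is the quantity one would inspect when asking whether a network looks scale-free. The function names, the 0.8 threshold, and the toy data are all hypothetical.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def build_cerna_network(expression, candidate_pairs, min_r=0.8):
    """Keep candidate pairs whose expression profiles correlate at or above min_r.

    min_r is an assumed cutoff chosen for illustration only.
    """
    edges = []
    for a, b in candidate_pairs:
        if a in expression and b in expression:
            if pearson(expression[a], expression[b]) >= min_r:
                edges.append((a, b))
    return edges


def degree_distribution(edges):
    """Return {degree: number of nodes with that degree}."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return dict(Counter(degree.values()))


# Toy data invented for illustration; none of it comes from the thesis.
expression = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [4.0, 3.0, 2.0, 1.0],
    "LNC_X": [1.0, 2.1, 2.9, 4.2],
}
# Hypothetical candidate pairs, standing in for database-derived interactions.
candidate_pairs = [("GENE_A", "GENE_B"), ("GENE_A", "GENE_C"), ("GENE_A", "LNC_X")]

edges = build_cerna_network(expression, candidate_pairs)
distribution = degree_distribution(edges)
print(edges)
print(distribution)
```

In a scale-free network the degree distribution is heavy-tailed, with many low-degree nodes and a few hubs; on real data one would fit and test a power law rather than read the distribution off a handful of nodes as here.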

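Subject area: Biology
Discipline category: Science
Language: Chinese
Document type: Thesis
Item identifier: http://ir.kiz.ac.cn/handle/152453/12604
Collection: Research Departments_Computational Biology and Medical Ecology (马占山)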
Recommended citation
GB/T 7714
王娅丽. 乳腺癌RNA-Seq数据的差异表达分析及ceRNA网络初探[D]. 北京: 中国科学院大学, 2017.
Files in this item
File name/size | Document type | Version type | Access type | License
2017硕士学位论文-王娅丽【导师】马占 (1386KB) | Thesis | | Open access | CC BY-NC-SA