Genomic variation is the genetic basis of phenotypic diversity among individuals, including disease susceptibility and drug response. The International HapMap Project aims to provide a roadmap for studying genetic variants associated with complex diseases. Single nucleotide polymorphisms (SNPs) are the basic elements of the HapMap. SNP allele frequencies affect linkage disequilibrium (LD) structure, haplotype construction, and tag SNP selection, and are among the main factors affecting HapMap accuracy. The choice of minor allele frequency (MAF) screening threshold therefore has a profound effect on the accuracy of the map. To date, most researchers have used self-defined thresholds, and few studies have examined how the MAF screening threshold affects HapMap accuracy. To investigate this effect, we took sequencing results for genes in the centromeric region of chromosome 15 in Chinese Han and Tibetan populations and divided the previously obtained data into three groups according to different MAF screening thresholds (≥0.01, ≥0.05, ≥0.10): the 0.01 group, the 0.05 group, and the 0.10 group. We constructed a HapMap for each group and compared the groups in terms of HapMap accuracy, the power of association analysis, and the saved/total cost ratio.
The results showed that the 0.01 group had the highest power for association analysis (versus the 0.05 group: Han, P = 0.019; Tibetan, P = 0.029) and captured the most population-specific haplotypes (versus the 0.05 group, P = 0.012). Within the region examined, the 0.05 threshold, compared with the 0.10 threshold, did not significantly improve either the power of association analysis (Han, P = 0.191; Tibetan, P = 1.000) or the capture of population-specific haplotypes (P = 0.592). Moreover, in the Tibetan population, the 0.05 and 0.10 groups yielded identical values for tag SNP efficiency, the number and average length of LD blocks, the power of association analysis, and the saved/total cost ratio. These results suggest that a lower MAF screening threshold is better suited to studies focused on population-specific haplotypes, and that the optimal MAF screening threshold may differ between populations. Because the number of genes examined in this study was limited, this important issue requires further in-depth investigation.
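The threshold-based grouping described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes biallelic SNPs given as lists of observed alleles, and the function names and SNP identifiers (`snp_a` and so on) are hypothetical, with made-up allele counts chosen only to show how the three groups nest.

```python
from collections import Counter


def minor_allele_frequency(alleles):
    """Frequency of the second most common allele (0.0 if monomorphic)."""
    counts = sorted(Counter(alleles).values(), reverse=True)
    if len(counts) < 2:
        return 0.0
    return counts[1] / sum(counts)


def group_by_maf_threshold(snp_alleles, thresholds=(0.01, 0.05, 0.10)):
    """Map each threshold to the sorted SNP IDs whose MAF is >= that threshold."""
    mafs = {snp: minor_allele_frequency(a) for snp, a in snp_alleles.items()}
    return {t: sorted(s for s, m in mafs.items() if m >= t) for t in thresholds}


# Hypothetical data: 100 observed alleles per SNP (illustration only).
example = {
    "snp_a": ["A"] * 97 + ["G"] * 3,   # MAF 0.03
    "snp_b": ["C"] * 93 + ["T"] * 7,   # MAF 0.07
    "snp_c": ["A"] * 80 + ["G"] * 20,  # MAF 0.20
    "snp_d": ["T"] * 100,              # monomorphic, MAF 0.0
}

groups = group_by_maf_threshold(example)
for threshold, snps in groups.items():
    print(f"MAF >= {threshold:.2f}: {snps}")
```

Because the thresholds are lower bounds, the groups are nested: every SNP in the 0.10 group also belongs to the 0.05 and 0.01 groups, so a lower threshold can only add rarer variants to the map.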