Genetic Analysis of Complex Disease. Group of authors

Book information:

Title: Genetic Analysis of Complex Disease

Author: Group of authors

Publisher: John Wiley & Sons Limited

Genre: Biology

ISBN: 9781119104070

… human leukocyte antigen genes, T‐cell receptor genes, and the myelin basic protein gene, are prime candidates for analysis. Both the strength and the weakness of this approach stem from how much confidence there is in the role of these genes. If there is strong evidence that the genes play a direct role, only a few may need to be tested to find a trait‐associated variant. If the evidence is more circumstantial, many genes may have equal justification for being studied, and little is gained over conducting a genome‐wide screen. Such studies are now most often conducted as follow‐up to prior genomic screens or other hypothesis‐generating experiments.

Analysis

Genomic Analysis

Generally, genome‐wide genotyping or sequencing is the first analytic step. Such studies may use newly collected blood samples or stored blood samples (or extracted DNA or RNA) made available by a biorepository. Depending on the goal of the study and its design, genome‐wide genotyping, sequencing, gene expression, or epigenetic analysis may be performed on these samples. Some studies may be able to re‐use stored genotype or sequence data available from public repositories (such as dbGaP [https://www.ncbi.nlm.nih.gov/gap] or the European Genome‐phenome Archive [https://www.ebi.ac.uk/ega/home]) or from prior studies of the sample being used. The technologies and approaches to these molecular experiments are covered in Chapter 10. In each case, it is important to formulate a quality control plan to detect potential laboratory errors such as sample switches, failed genotyping probes, sequencing errors, and batch effects. When possible, laboratory analysis should be coordinated with the initial analytic quality control, since this is the most effective way to find and correct such errors. If archived genomic data are being used, careful review of the initial quality control protocols and, when possible, further checks in the subsequent analysis are recommended.
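One small piece of such a quality control plan can be sketched in code. The Python example below flags samples and markers with a low call rate, which is one simple way to surface failed genotyping probes or poor-quality samples. The function name `qc_flags`, the 0/1/2 genotype coding, and the thresholds are illustrative assumptions, not taken from the text; a real plan would also check for sample switches and batch effects.

```python
from typing import Dict, List, Optional, Tuple

# Genotype calls coded as 0/1/2 copies of one allele; None marks a failed call.
# (Hypothetical in-memory layout: sample ID -> one call per marker.)
GenotypeTable = Dict[str, List[Optional[int]]]


def qc_flags(
    genotypes: GenotypeTable,
    min_sample_rate: float = 0.95,  # illustrative threshold, an assumption
    min_marker_rate: float = 0.95,  # illustrative threshold, an assumption
) -> Tuple[List[str], List[int]]:
    """Return (sample IDs, marker indices) whose call rate is below threshold."""
    if not genotypes:
        return [], []
    n_markers = len(next(iter(genotypes.values())))
    if any(len(calls) != n_markers for calls in genotypes.values()):
        raise ValueError("every sample must have one call per marker")
    if n_markers == 0:
        return [], []

    # Per-sample call rate: fraction of markers with a non-missing call.
    bad_samples = [
        sample
        for sample, calls in genotypes.items()
        if sum(c is not None for c in calls) / n_markers < min_sample_rate
    ]

    # Per-marker call rate: fraction of samples with a non-missing call.
    n_samples = len(genotypes)
    bad_markers = [
        m
        for m in range(n_markers)
        if sum(calls[m] is not None for calls in genotypes.values()) / n_samples
        < min_marker_rate
    ]
    return bad_samples, bad_markers


# Toy data: marker 3 failed in every sample, and sample S2 has many missing calls.
example = {
    "S1": [0, 1, 2, None],
    "S2": [1, 1, None, None],
    "S3": [2, 0, 1, None],
}
bad_samples, bad_markers = qc_flags(example, min_sample_rate=0.6, min_marker_rate=0.5)
```

In the toy data, sample S2 and the fourth marker fall below the chosen thresholds. In practice, markers are often filtered before samples so that a failed probe does not unfairly penalize every sample.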

Statistical Analysis

The analysis of genetic and phenotypic data for a complex trait is multifaceted and depends on the research question, study design, genomic data available, and phenotypic characteristics. Methods to analyze these data are under constant development, and new approaches are continuously being released. Therefore, the analytic strategy for a genomic study must be reviewed periodically and revised if necessary to take advantage of newly developed approaches. Depending on the study design, the analytic plan may include linkage analysis (Chapter 6) in families or association studies in families or population samples (Chapters 8 and 9). These approaches are not mutually exclusive – a design may start with a linkage analysis of large families followed by association analysis within regions of linkage. Similarly, other multi‐stage studies conduct a GWAS of individual SNPs (Chapter 9) and then incorporate gene–gene and gene–environment interactions to identify additional genetic loci. Additionally, “data mining” approaches may be applied to these datasets to extract even more genetic information using data reduction techniques, set‐based tests, and pathway analyses. These more complex analyses are discussed in detail in Chapter 11.
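To make the idea of testing an individual SNP concrete, here is a minimal sketch of one simple single-marker test: a Pearson chi-square test (1 degree of freedom) on a 2×2 table of allele counts in cases and controls. This is only one common choice and not necessarily the method the later chapters use; GWAS analyses typically rely on regression models with covariates. The function name and the counts are hypothetical.

```python
import math
from typing import Tuple


def allelic_chi_square(
    case_alt: int, case_ref: int, ctrl_alt: int, ctrl_ref: int
) -> Tuple[float, float]:
    """Pearson chi-square statistic and p-value for a 2x2 allele-count table."""
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_case = case_alt + case_ref
    row_ctrl = ctrl_alt + ctrl_ref
    col_alt = case_alt + ctrl_alt
    col_ref = case_ref + ctrl_ref
    if 0 in (row_case, row_ctrl, col_alt, col_ref):
        raise ValueError("a row or column of the allele table is empty")

    # Shortcut formula for a 2x2 table: n * (ad - bc)^2 / (product of margins).
    chi2 = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2 / (
        row_case * row_ctrl * col_alt * col_ref
    )
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 60 alt / 40 ref alleles in cases, 40 alt / 60 ref in controls.
chi2, p_value = allelic_chi_square(60, 40, 40, 60)
```

In a genome-wide setting the same test is repeated for every SNP, so the resulting p-values must be judged against a threshold corrected for multiple testing rather than a nominal 0.05.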

Bioinformatics

The large amount of information generated by any genomic study of a complex trait requires careful attention to quality control, efficient and secure storage, and compliance with data‐sharing requirements and privacy protections. These activities require a well‐designed and secure database system. Such systems have evolved over time from text files to relational databases to large‐scale “data warehouses.” Such datasets also require large‐scale processing power with ample attached storage to facilitate linkage and association studies. High‐throughput sequencing in particular requires a large amount of storage and computational power for genome alignment (or assembly) and base calling. For multi‐site studies, these resources may need to be accessible from multiple locations, with levels of access and security that depend on each user's role in the study and need to access other sites’ information. In addition to maintaining local resources for a study, a bioinformatics team must also be familiar with many different public sources of genomic data (e.g. UCSC and Ensembl browsers, ENCODE databases, sequence repositories, dbGaP) and be able to submit results to public repositories for sharing with the wider research community. These issues are discussed in more detail in Chapter 7.

Follow‐up

Variant Detection

Once a single gene (or region) is implicated by a screen (linkage or association), it is necessary to examine it for potentially functional variations that might explain the linkage or association signal. For positional cloning efforts, this generally consisted of sequencing the minimum candidate region and identifying mutations that segregated with the trait in families. For complex traits, this effort is more difficult, and the variant being sought may be a more common, yet functional, polymorphism. Several strategies, including haplotype analysis, conditional analysis, and exhaustive sequencing, may be used in this case. The analyses required for such efforts are discussed in Chapters 8 and 9. However, statistical analysis of a single dataset only goes so far to establish a trait‐associated variant. Additional studies, including replication in independent datasets and functional studies in cellular and animal models, may be required to ultimately determine if a variant influences the biology underlying the complex trait.

Replication

The literature on most complex traits is at this point littered with initial reports of allelic or genotypic associations that cannot be replicated at all (or are replicated in a small minority of studies). Reproducibility of findings in independent samples is a critical characteristic most investigators seek when weighing the evidence for a trait‐associated variant. Because of this, most studies (particularly those seeking government or foundation funding) now include a plan for replication of findings in a second dataset. These replication datasets should be independent of the initial finding (e.g. have no samples in common with the discovery dataset) and be assessed in similar fashion (e.g. phenotype definitions agree, ascertainment is similar, genetic analysis is comparable). This does not mean that the datasets must be from the same population – indeed, demonstrating replication across populations (e.g. European, Asian, and African) for a common complex trait locus may add strength to the study. However, for rare variants, cross‐population replication might be more difficult (due to population‐specific alleles); for such studies, replication in a second sample from the same population would be desirable.
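The independence requirement described above lends itself to a simple automated check. The Python sketch below verifies that the discovery and replication datasets share no sample IDs and then applies one possible replication criterion: the same direction of effect and a nominally significant p-value in the replication set. The criterion, the `alpha` value, and all names are illustrative assumptions; the text does not prescribe a specific rule.

```python
from typing import Dict, Iterable, List, Union


def check_replication(
    discovery_ids: Iterable[str],
    replication_ids: Iterable[str],
    discovery_beta: float,
    replication_beta: float,
    replication_p: float,
    alpha: float = 0.05,  # illustrative significance level, an assumption
) -> Dict[str, Union[bool, List[str]]]:
    """Summarize whether a finding replicates in an independent dataset."""
    # Independence: no individual may appear in both datasets.
    overlap = sorted(set(discovery_ids) & set(replication_ids))
    independent = not overlap

    # Same direction of effect: both estimates have the same (non-zero) sign.
    same_direction = discovery_beta * replication_beta > 0

    return {
        "independent": independent,
        "overlap": overlap,
        "same_direction": same_direction,
        "replicated": independent and same_direction and replication_p < alpha,
    }


# Hypothetical inputs: disjoint samples, concordant effects, p = 0.01.
ok = check_replication(["A", "B"], ["C", "D"], 0.30, 0.20, 0.01)
# Same effect estimates, but sample "B" appears in both datasets.
overlapping = check_replication(["A", "B"], ["B", "C"], 0.30, 0.20, 0.01)
```

Matching on sample IDs only catches overlap when identifiers are shared across studies. When they are not, overlap is usually assessed from the genotype data themselves.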

Functional Studies

While most disease gene discovery efforts have claimed success based on finding variants that segregate with traits in pedigrees or polymorphisms significantly associated with the trait in population samples, this is, strictly, not sufficient evidence. More conclusive is evidence arising from biological systems (e.g. cultured cells, animal models, or human blood and tissue samples) that the trait can be either induced by introduction of the allele or ameliorated by blocking the action of the allele. In genetically complex traits, where the responsible variation may be a common polymorphism, it is even more critical that such evidence be found before success is declared.

Tests in biological systems can be of several types. Perhaps the most common is to test the action of the gene in a …