Title: Genetic Analysis of Complex Disease
Author: Group of authors
Publisher: John Wiley & Sons Limited
Genre: Biology
ISBN: 9781119104070
Ascertainment
There are three basic designs of case ascertainment for a genetic analysis of a binary (present/absent) trait: collection of a single affected individual (case), of relative pairs from a family, or of extended families with multiple affected and unaffected individuals. Examples of these ascertainment schemes are shown in Figure 3.1. As discussed below, certain sampling schemes limit the types of analyses that can be performed. However, one’s sampling scheme is often dictated by the natural history of the condition under investigation. For example, with late‐onset disorders such as Alzheimer disease, Parkinson disease, and chronic obstructive pulmonary disease, collecting the parents of an affected individual is often not feasible because the parents are frequently already deceased. In such cases, one may be restricted to collecting affected sibpairs or a case‐control sample.
Figure 3.1 Ascertainment schemes for genetic analysis.
Single Affected Individual
Collection of a single affected individual can take place in the context of the traditional epidemiologic cross‐sectional, case‐control, and cohort designs, as well as the case‐parent trio design. The case‐parent trio design is primarily used in family‐based association analysis and includes collection of a case and both their parents. An alternative to the case‐control and trio designs is the case‐only design (Khoury and Flanders 1996). Similar to the trio design, the case‐only approach arose from concerns regarding selection of appropriate controls for the study of genetic factors in the more traditional case‐control approach. The case‐only approach has been promoted as a particularly useful approach in the examination of gene–environment interactions (Piegorsch et al. 1994). From the ascertainment perspective, collection of single affected individuals is more feasible because in complex disorders, large families with multiple affected individuals are often difficult to identify. A disadvantage of this approach is that it limits the statistical genetic analyses to specific types of association methods (discussed in more detail in Chapter 8). Traditional linkage analysis (described in more detail in Chapter 6) usually cannot be performed on a case‐control or trio data set because the necessary family structure is not available for most genetic models.
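As a rough illustration of the case‐only approach to gene–environment interaction, the sketch below computes the odds ratio relating genotype and exposure among cases only. It rests on assumptions that are not stated in the text above: genotype and exposure are independent in the source population and the disease is rare, in which case this odds ratio approximates the departure from multiplicative joint effects. The function name, the Wald‐type interval, and the counts are hypothetical and purely illustrative.

```python
import math


def case_only_interaction_or(g1e1, g1e0, g0e1, g0e0):
    """Case-only odds ratio between genotype (G) and exposure (E).

    Arguments are counts of CASES in each cell of the 2x2 table:
      g1e1: carriers, exposed        g1e0: carriers, unexposed
      g0e1: non-carriers, exposed    g0e0: non-carriers, unexposed

    Assumption (not from the source text): G and E are independent in the
    source population and the disease is rare.
    """
    odds_ratio = (g1e1 * g0e0) / (g1e0 * g0e1)
    # Standard error of the log odds ratio (Wald approximation).
    se_log_or = math.sqrt(1 / g1e1 + 1 / g1e0 + 1 / g0e1 + 1 / g0e0)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, (lower, upper)


# Hypothetical counts among cases only.
estimate, interval = case_only_interaction_or(40, 60, 50, 150)
print(f"case-only OR = {estimate:.2f}, approximate 95% CI = "
      f"({interval[0]:.2f}, {interval[1]:.2f})")
```

No controls enter the calculation, which is why the design sidesteps control selection. It also means the independence assumption cannot be checked from the case sample itself.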
Relative Pairs
The use of relative pairs has been a common ascertainment design in the genetic analysis of complex disorders. This approach may include the use of sibling pairs that are either concordant for the disorder (affected sibpairs) or discordant (one affected and one unaffected). Monozygotic (identical) and dizygotic (nonidentical) twins are a special case of sibling pairs, and the utility of twins in genetic analysis is described in greater detail later in the chapter. Additionally, there are statistical methods that utilize information from other types of relative pairs, such as parent–child, avuncular pairs (e.g. uncle–niece), cousin pairs, and so on (Weeks and Lange 1988; Davis et al. 1996; Kruglyak et al. 1996). In this approach, relatives other than the analysis pair may also be collected, so that linkage analysis can be performed. Keep in mind that in the case of monozygotic twins, only one of the two individuals may be used in the linkage and association analysis because the twins share 100% of their genetic material.
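The relative‐pair types mentioned above differ in how much genetic material the two members are expected to share, which is also why only one member of a monozygotic pair contributes independent information. The sketch below lists the standard expected proportions of autosomal alleles shared identical by descent and shows one simple way to drop a co‐twin before analysis. The dictionary labels, function name, and individual identifiers are hypothetical; which twin to retain is an arbitrary choice here (the first listed).

```python
# Expected proportion of autosomal alleles shared identical by descent.
# Dizygotic twins are genetically equivalent to full siblings.
EXPECTED_IBD_SHARING = {
    "monozygotic twins": 1.0,
    "full siblings": 0.5,
    "parent-child": 0.5,
    "avuncular": 0.25,
    "first cousins": 0.125,
}


def drop_mz_cotwins(individuals, mz_pairs):
    """Return individuals with the second member of each MZ pair removed.

    individuals: list of individual IDs, in the order they should be kept.
    mz_pairs: list of (twin_a, twin_b) tuples; twin_a is retained.
    """
    dropped = {second for _first, second in mz_pairs}
    return [ind for ind in individuals if ind not in dropped]


# Hypothetical sibship of four in which S1 and S2 are monozygotic twins.
sibship = ["S1", "S2", "S3", "S4"]
analysis_set = drop_mz_cotwins(sibship, [("S1", "S2")])
print(analysis_set)
```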
Extended Families
Extended families are large families with many affected individuals across several generations. This study design is optimal for traditional linkage analysis, but such families are rare in complex disorders. If such a family is identified, it is possible that the genetic liability in this particular family is due to a single gene, rather than a more complex etiology. Such a family would provide a unique opportunity to localize a single gene that has a large effect on disease risk in that family but may have a more moderate effect on disease etiology in the general population. Advances in high‐throughput sequencing technologies (described more in Chapter 10) have made genome sequencing of small numbers of affected family members feasible, allowing the direct examination of segregation of variants with disease in these pedigrees (described more in Chapter 6). Association methods may also be used with extended families. However, one must ensure either that the association method accounts for the within‐family dependence (such as the Pedigree Disequilibrium Test (Martin et al. 2000b) or GenABEL (Aulchenko et al. 2007)) or that only one affected individual from each family is selected for the analysis. A special case can be made for analyzing X‐linked variants within families (Choi et al. 2016; Turkmen and Lin 2020).
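The simpler of the two options above, keeping a single affected individual per family so that the analyzed cases are unrelated, can be sketched as follows. The record layout (family ID, individual ID, affection status), the function name, and the random choice of which affected relative to keep are all assumptions made for illustration; a real study might instead pick the proband or the best‐genotyped individual.

```python
import random


def one_affected_per_family(records, seed=0):
    """Pick one affected individual from each family.

    records: iterable of (family_id, individual_id, affected) tuples,
             where affected is True or False.
    Returns a dict mapping family_id -> chosen individual_id. Families
    with no affected members are omitted.
    """
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    affected_by_family = {}
    for family_id, individual_id, affected in records:
        if affected:
            affected_by_family.setdefault(family_id, []).append(individual_id)
    # Iterate families in sorted order so results do not depend on input order.
    return {
        family_id: rng.choice(members)
        for family_id, members in sorted(affected_by_family.items())
    }


# Hypothetical pedigree records.
records = [
    ("F1", "F1-01", True),
    ("F1", "F1-02", False),
    ("F1", "F1-03", True),
    ("F2", "F2-01", True),
    ("F3", "F3-01", False),
]
print(one_affected_per_family(records))
```

This discards information from the other affected relatives, which is the price paid for not modeling the within‐family dependence.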
There are also variations on these three ascertainment schemes. For example, in an analysis of breast cancer in Australia, Hopper and colleagues (1999) employed a “case‐control‐family” design. In this approach, the cases and controls were selected first and subsequently additional family members were recruited based on the family history. If applied correctly, this approach will have the analytic advantages of a family study, and the results can be placed in the context of an epidemiologic study. Statistical issues associated with this design have been reviewed by other investigators (Liang and Pulver 1996; Seybolt et al. 1997) and will not be discussed here.
Many investigators have compared sampling schemes to determine the optimal ascertainment scheme for genetic analysis of complex disorders. McCarthy and colleagues (1998) considered sampling strategies for affected sibpairs and found that the power to detect a disease gene locus is highly dependent on the larger pedigree structure from which the sibpairs were drawn. Furthermore, they concluded that imposing a few restrictions on that pedigree structure (such as the presence of at least one unaffected sibling or parent) can provide a modest increase in power, and that ascertaining random affected sibpairs (regardless of the larger pedigree structure) tends to be a robust approach under a variety of genetic inheritance models. The advantage of restricting the pedigree structure to at most one affected parent is that it reduces the possibility of bilineality (disease alleles entering the pedigree through both parental lineages). Terwilliger and Goring (2000) have argued that, even in the case of complex disorders, ascertainment of large pedigrees is a more successful approach for genetic analysis than a case‐control approach, for two reasons: large pedigrees increase the likelihood of genetic homogeneity, and once large pedigrees have been ascertained, one has more flexibility with regard to the types of analyses that may be performed. For example, one can analyze the entire pedigree for linkage and also, by breaking the family structure into smaller units, consider affected sibpair, affected relative pair, or trio approaches as complementary methods for identifying the disease genes. Badner et al. (1998), however, suggest that there is no benefit to collecting large pedigrees under certain genetic models (a qualitative trait with common alleles under single locus, additive and multiplicative inheritance models). In spite of the many elegant theoretical considerations of sampling schemes, there does not appear to be any consensus with regard