Available for download since Monday, 13 Dey 1400 (3 January 2022)
Table of contents:

1 Introduction to Heterogeneity in Statistical Genetics
1.1 Different Types of Heterogeneity
1.2 A Note on Definitions and Notation Throughout This Book
1.3 Hardy–Weinberg Equilibrium (HWE) Proportions and Their Importance in Gene-Mapping
1.4 Determination of Conditional Genotype Frequencies
1.4.1 Genetic Model-Free Approaches
1.4.1.1 Locus Genotype Frequencies Follow HWE Proportions in Both Populations
1.4.1.2 Locus Genotype Frequencies Follow HWE Proportions in One Population but Not Both
1.4.1.3 Locus Genotype Frequencies Follow HWE Proportions in Neither Population
1.4.2 Genetic Model-Based Approach Through the Use of Genotype Relative Risks
1.4.2.1 Genetic Model-Based Approach Through the Use of Logistic Model
1.5 The Box (and Whiskers) Plot as a Tool for Visualizing Empirical Data Distributions
1.6 Power and Minimum Sample Size (MSSN) for Different Statistical Tests of Genetic Association
1.6.1 Contingency Table for Organizing Categorical Phenotype and Genomic-Data
1.6.1.1 Formula for Chi-Square Test of Independence
1.6.1.2 (Cochran-Armitage) Test of Trend
1.6.1.3 The Transmission Disequilibrium Test for Detecting Linkage in the Presence of Association
1.6.1.4 Computing Power and MSSN for Tests of Genetic Association
1.7 The Expectation–Maximization (EM) Algorithm
1.7.1 Example Application
1.7.1.1 Implementation of the Algorithm
References

2 Overview of Genomic Heterogeneity in Statistical Genetics
2.1 Heterogeneity Due to SNP Genotype Misclassification
2.2 Examples of How Genotype Misclassification May Arise in Practice
2.3 Mathematical Models of Genotype Misclassification
2.4 Genotype Misclassification for Genomic Data with Three or More Categories
2.5 Effects of Misclassification on Statistical Tests
2.5.1 Non-differential Misclassification Error
2.5.2 Differential Misclassification Error
2.5.3 Non-differential Misclassification in Family-Based Tests of Association
2.6 Errors in Next-Generation Sequencing (NGS)
2.6.1 Definitions and Notation
2.6.1.1 What Are Estimated NGS Probabilities for Empirical Data?
2.6.2 Mathematical Model for NGS Data
2.6.3 Empirical Type I Error for Test Statistics Applied to NGS Data with Sequence Error—Simulation Results
2.7 Non-misclassification Forms of Heterogeneity
2.7.1 Mathematical Model for Heterogeneity
2.7.1.1 Notation
2.7.1.2 Mathematical Model for Locus Heterogeneity—Equations
References

3 Phenotypic Heterogeneity
3.1 Phenotype Misclassification
3.2 How Phenotype Misclassification May Arise in Practice
3.2.1 Lack of Access to Gold-Standard Classification
3.2.2 Variability of Phenotype Expression over Time
3.2.3 Variable Age of Onset
3.2.4 Incomplete Knowledge of Gold-Standard Classifications
3.2.5 Model Misspecification
3.3 Effects of Misclassification on Statistical Tests
3.3.1 Non-differential Misclassification Error Example for Single-Stage Genetic Association
3.3.2 Why Do We Observe Such Large Power Loss/MSSN Increase for Phenotype Misclassification?
3.3.3 Multi-stage Phenotype Classification and Limits of Observed Genotype Frequencies
3.3.3.1 Conditional Genotype Frequencies in Presence of Conditionally Independent Phenotype Classification
3.3.3.2 Conditional Genotype Frequencies in the Presence of Biased Phenotype Classification
3.4 Non-misclassification Forms of Heterogeneity
3.5 Summary
References

4 Association Tests Allowing for Heterogeneity
4.1 Introduction
4.2 Statistical Tests that Use Genotype Data
4.2.1 Likelihood Ratio Test that Allows for Random Phenotype and Genotype Misclassification Error (LRTae)
4.2.1.1 Notation and Definitions
4.2.1.2 Log-Likelihoods of the Observed Data
4.2.1.3 Test Statistic—Likelihood Ratio Test Allowing for Error (LRTae)
4.2.1.4 Example Application
4.2.2 Trend Statistic that Allows for Random Phenotype and Genotype Misclassification Error
4.2.2.1 Notation and Definitions
4.2.2.2 Log-Likelihoods of the Observed Data
4.2.2.3 Test Statistic—(Linear) Test of Trend Allowing for Error (LTTae)
4.2.3 Likelihood Ratio Statistic for Family-Based Association that Incorporates Genotype Misclassification Errors (TDTae)
4.2.3.1 Notation
4.2.3.2 Determination of the Bayesian Posterior Probabilities (BPPs) $\tau^{(r)}_{(abc)(xyz)}$
4.2.3.3 TDTae Parameter Estimates
4.2.3.4 Log-Likelihood of Observed Data
4.2.3.5 The TDTae Statistic
4.3 Statistical Tests that Consider Heterogeneity Other Than Misclassification
4.3.1 Mixture Likelihood Ratio Test (MLRT) for Genetic Association in the Presence of Locus Heterogeneity
4.3.1.1 Notation and Definitions
4.3.1.2 Log-Likelihoods of the Observed Data
4.3.1.3 Example Application
4.3.1.4 Computing the MLRT Statistic for the Example Data
4.3.2 Transmission Disequilibrium Test that Allows for Locus Heterogeneity (TDT-HET)
4.3.2.1 Notation
4.3.2.2 Determination of the BPPs $\tau^{(r)}_{(m)(abc)}$
4.3.2.3 TDT-HET Parameter Estimates
4.3.2.4 Log-Likelihood of Observed Data
4.3.2.5 Computing the TDT-HET Statistic
4.3.2.6 Example Calculation
4.3.2.7 How TDT-HET Permutation p-Values Are Computed
4.3.2.8 A Proof of the Robustness of the TDT-HET Statistic’s Type I Error When Null Data Are Drawn from Multiple Sub-populations
4.3.3 Tests that Incorporate Phenotype Heterogeneity
4.3.3.1 Analysis of Data with R (Greater Than Two) Phenotypes and C (Greater Than One) Genomic Data Categories
4.3.3.2 Example Application of Chi-Square Test of Independence to Multiple Phenotypes
4.3.3.3 Does Modeling Phenotypic Heterogeneity Increase Power for Detecting Association? Results from Example Data
4.3.3.4 Other Methods for Addressing Phenotype Heterogeneity
4.3.3.5 Morton’s M-Test for Heterogeneity Applied to Different Groups of Phenotypes
4.4 Statistical Tests that Use Sequence Data
4.4.1 Single-Variant and Multiple Variant Tests of Trend for Genetic Association that Allows for Random and Differential NGS Error LTTae,NGS
4.4.1.1 Notation and Definitions
4.4.1.2 Log-Likelihood of the Observed Data
4.4.1.3 LTTae,NGS Parameter Estimates
4.4.1.4 LTTae,NGS Statistic
4.4.2 Transmission Disequilibrium Test that Allows for Next-Generation Sequence Error (TDT1-NGS)
4.4.2.1 Notation and Definitions
4.4.2.2 Log-Likelihood of the Observed Data
4.4.2.3 TDT1-NGS Parameter Estimates
4.4.2.4 TDT1-NGS Statistic
4.4.2.5 Example Calculations
References

5 Designing Genetic Linkage and Association Studies that Maintain Desired Statistical Power in the Presence of Mixtures
5.1 Parameter Settings, for Example, Calculations
5.1.1 Example Parameter Settings to Compute Power for a Fixed Sample Size and Significance Level
5.1.2 Example Parameter Settings to Compute MSSN for a Fixed Power and Significance Level
5.2 Statistical Tests that Use Genotype Data
5.2.1 Power and MSSN for Population-Based Data in the Presence of Non-differential Genotype Misclassification
5.2.1.1 Example Power Calculation
5.2.2 Power and MSSN for Population-Based Data in the Presence of Non-differential Phenotype Misclassification
5.2.2.1 Example MSSN Calculation
5.2.3 Likelihood Ratio Test that Allows for Random Phenotype and Genotype Misclassification Error (LRTae)—Empirical Power
5.2.3.1 Genetic Model Parameters Determined Using Two Loci
5.2.3.2 Conditional Two-Locus Genotype Frequencies Based on Affection Status
5.2.3.3 Results of Simulations for LRTae Under Two-Locus Model
5.2.4 Trend Statistic that Allows for Random Phenotype and Genotype Misclassification Error
5.2.5 Family-Based Tests of Association—Analytic Solution to Increase in Rejection Rate for TDT in the Presence of Genotype Misclassification Errors
5.2.5.1 Non-centrality Parameter and Inflation in Rejection Rate
5.2.6 Family-Based Tests of Association—Analytic Solution to Increase in Rejection Rate for TDT in the Presence of Phenotype Misclassification Errors
5.2.6.1 Example MSSN Calculations for TDT in the Presence of Phenotype Misclassification
5.3 Statistical Tests that Consider Heterogeneity Other Than Misclassification
5.3.1 Sample Size Calculations in the Presence of Locus Heterogeneity—Population-Based Tests of Genetic Association
5.3.1.1 Example Power Calculation—Test of Trend
5.3.1.2 Factors that Most Significantly Influence MSSN
5.3.2 Power and Sample Size Calculations for Chi-Square Tests of Independence on Allele and Genotype Data for Phenotype Heterogeneity
5.3.3 Family-Based Test of Linkage/Association
5.3.3.1 Example MSSN Calculations for TDT in the Presence of Locus Heterogeneity
5.4 Power Calculations in the Presence of NGS Misclassification
5.4.1 Test of Trend Applied to Multiple NGS Data for SNP Loci
5.4.2 Increase in Interest for NGS Statistics
5.4.3 Empirical Null and Power Simulations for the LTTae,NGS Statistic
5.4.3.1 Empirical Type I Error (Null) Results
5.4.3.2 Empirical Power Results
5.4.3.3 Additional Investigation of Three Factors
5.4.4 Factors that Most Significantly Affect Power for NGS-Based TDT
References

6 Threshold-Selected Quantitative Trait Loci and Pleiotropy
6.1 Quantitative Trait Locus with Single Phenotype
6.1.1 Notation
6.1.2 Conditional Genotype Frequencies for Threshold-Selected Phenotypes
6.1.3 Example Sample Size Calculation for Threshold-Selected Phenotypes
6.1.4 Why Use Threshold-Selected Dichotomous Phenotypes as Compared with Quantitative Phenotypes? Power Comparison with ANOVA
6.2 Quantitative Trait Locus with Multiple Phenotypes
6.2.1 Notation for Multivariate Quantitative Traits
6.2.2 Methods
6.2.3 Thresholds
6.2.4 Example MSSN Calculation
6.2.5 A Final Note on Advantages of the Threshold-Selected Approach
References

Bibliography
Index

File specifications |
|
Title: | Heterogeneity in Statistical Genetics: How to Assess, Address, and Account for Mixtures in Association Studies |
File name: | 599-www.GeneProtocols.ir-Heterogeneity in Statistical Genetics_ How to Assess, Address, and Account for Mixtures in Association Studies(2020).pdf |
Title in Persian: | ناهمگونی در ژنتیک آماری - نحوه ارزیابی و محاسبه اختلاط در مطالعات همخوانی |
Authors: | Derek Gordon, Stephen J. Finch, Wonkuk Kim |
Language: | English |
Year of publication: | 2020 |
ISBN: | 3030611205, 9783030611200 |
Document type: | Book |
File extension: | |
File size: | 6.22 |
Book length in pages: | 366 |