About the book Multivariate Statistical Machine Learning Methods for Genomic Prediction
Book title: Multivariate Statistical Machine Learning Methods for Genomic Prediction
Edition: 1st ed. 2022
Title translated into Persian: روش‌های یادگیری ماشینی آماری چند متغیره برای پیش‌بینی ژنومی
Authors: Osval Antonio Montesinos López, Abelardo Montesinos López, José Crossa
Publisher: Springer
Publication year: 2022
Number of pages: 707
ISBN: 3030890090, 9783030890094
Language: English
Format: PDF
File size: 12 MB
About the book:
This book is open access under a CC BY 4.0 license.
This open access book brings together the latest genome-based prediction models currently being used by statisticians, breeders, and data scientists. It provides an accessible way to understand the theory behind each statistical learning tool, the required preprocessing, the basics of model building, how to train statistical learning methods, the basic R scripts needed to implement each statistical learning tool, and the output of each tool. To do so, for each tool the book provides the background theory, some elements of the R statistical software for its implementation, the conceptual underpinnings, and at least two illustrative examples with data from real-world genomic selection experiments. Lastly, worked-out examples help readers check their own comprehension.
The book will greatly appeal to readers in plant (and animal) breeding, genetics, and statistics, as it presents, in a very accessible way, the necessary theory, the appropriate R code, and illustrative examples for a complete understanding of each statistical learning tool. In addition, it weighs the advantages and disadvantages of each tool.
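To give a concrete flavor of the R workflows the book walks through, here is a minimal sketch (our illustration, not code taken from the book) of genomic prediction with the BGLR R package, which implements Bayesian models of the kind covered in Chapter 6; the wheat example dataset that ships with BGLR stands in for a reader's own marker and phenotype data:

library(BGLR)                            # Bayesian regression models for genomic prediction
data(wheat)                              # example dataset bundled with BGLR
X <- scale(wheat.X)                      # standardized marker matrix (lines x markers)
y <- wheat.Y[, 1]                        # phenotype: first trait-environment combination
ETA <- list(list(X = X, model = "BRR"))  # Bayesian ridge regression prior on marker effects
fm <- BGLR(y = y, ETA = ETA, nIter = 2000, burnIn = 500, verbose = FALSE)
cor(y, fm$yHat)                          # correlation between observed and fitted values

Substituting "BayesA", "BayesB", or "BL" for "BRR" in the ETA term fits other Bayesian variants of the kind listed in Chapter 6.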
Table of contents:
Foreword
Preface
Acknowledgments
Contents
Chapter 1: General Elements of Genomic Selection and Statistical Learning
1.1 Data as a Powerful Weapon
1.2 Genomic Selection
1.2.1 Concepts of Genomic Selection
1.2.2 Why Is Statistical Machine Learning a Key Element of Genomic Selection?
1.3 Modeling Basics
1.3.1 What Is a Statistical Machine Learning Model?
1.3.2 The Two Cultures of Model Building: Prediction Versus Inference
1.3.3 Types of Statistical Machine Learning Models and Model Effects
1.3.3.1 Types of Statistical Machine Learning Models
1.3.3.2 Model Effects
1.4 Matrix Algebra Review
1.5 Statistical Data Types
1.5.1 Data Types
1.5.2 Multivariate Data Types
1.6 Types of Learning
1.6.1 Definition and Examples of Supervised Learning
1.6.2 Definitions and Examples of Unsupervised Learning
1.6.3 Definition and Examples of Semi-Supervised Learning
References
Chapter 2: Preprocessing Tools for Data Preparation
2.1 Fixed or Random Effects
2.2 BLUEs and BLUPs
2.3 Marker Depuration
2.4 Methods to Compute the Genomic Relationship Matrix
2.5 Genomic Breeding Values and Their Estimation
2.6 Normalization Methods
2.7 General Suggestions for Removing or Adding Inputs
2.8 Principal Component Analysis as a Compression Method
Appendix 1
Appendix 2
References
Chapter 3: Elements for Building Supervised Statistical Machine Learning Models
3.1 Definition of a Linear Multiple Regression Model
3.2 Fitting a Linear Multiple Regression Model via the Ordinary Least Square (OLS) Method
3.3 Fitting the Linear Multiple Regression Model via the Maximum Likelihood (ML) Method
3.4 Fitting the Linear Multiple Regression Model via the Gradient Descent (GD) Method
3.5 Advantages and Disadvantages of Standard Linear Regression Models (OLS and MLR)
3.6 Regularized Linear Multiple Regression Model
3.6.1 Ridge Regression
3.6.2 Lasso Regression
3.7 Logistic Regression
3.7.1 Logistic Ridge Regression
3.7.2 Lasso Logistic Regression
Appendix 1: R Code for Ridge Regression Used in Example 2
References
Chapter 4: Overfitting, Model Tuning, and Evaluation of Prediction Performance
4.1 The Problem of Overfitting and Underfitting
4.2 The Trade-Off Between Prediction Accuracy and Model Interpretability
4.3 Cross-validation
4.3.1 The Single Hold-Out Set Approach
4.3.2 The k-Fold Cross-validation
4.3.3 The Leave-One-Out Cross-validation
4.3.4 The Leave-m-Out Cross-validation
4.3.5 Random Cross-validation
4.3.6 The Leave-One-Group-Out Cross-validation
4.3.7 Bootstrap Cross-validation
4.3.8 Incomplete Block Cross-validation
4.3.9 Random Cross-validation with Blocks
4.3.10 Other Options and General Comments on Cross-validation
4.4 Model Tuning
4.4.1 Why Is Model Tuning Important?
4.4.2 Methods for Hyperparameter Tuning (Grid Search, Random Search, etc.)
4.5 Metrics for the Evaluation of Prediction Performance
4.5.1 Quantitative Measures of Prediction Performance
4.5.2 Binary and Ordinal Measures of Prediction Performance
4.5.3 Count Measures of Prediction Performance
References
Chapter 5: Linear Mixed Models
5.1 Generalities of Linear Mixed Models
5.2 Estimation of the Linear Mixed Model
5.2.1 Maximum Likelihood Estimation
5.2.1.1 EM Algorithm
E Step
M Step
5.2.1.2 REML
5.2.1.3 BLUPs
5.3 Linear Mixed Models in Genomic Prediction
5.4 Illustrative Examples of the Univariate LMM
5.5 Multi-trait Genomic Linear Mixed-Effects Models
5.6 Final Comments
Appendix 1
Appendix 2
Appendix 3
Appendix 4
Appendix 5
Appendix 6
Appendix 7
References
Chapter 6: Bayesian Genomic Linear Regression
6.1 Bayes Theorem and Bayesian Linear Regression
6.2 Bayesian Genome-Based Ridge Regression
6.3 Bayesian GBLUP Genomic Model
6.4 Genomic-Enabled Prediction BayesA Model
6.5 Genomic-Enabled Prediction BayesB and BayesC Models
6.6 Genomic-Enabled Prediction Bayesian Lasso Model
6.7 Extended Predictor in Bayesian Genomic Regression Models
6.8 Bayesian Genomic Multi-trait Linear Regression Model
6.8.1 Genomic Multi-trait Linear Model
6.9 Bayesian Genomic Multi-trait and Multi-environment Model (BMTME)
Appendix 1
Appendix 2: Setting Hyperparameters for the Prior Distributions of the BRR Model
Appendix 3: R Code Example 1
Appendix 4: R Code Example 2
Appendix 5
R Code Example 3
R Code for Example 4
References
Chapter 7: Bayesian and Classical Prediction Models for Categorical and Count Data
7.1 Introduction
7.2 Bayesian Ordinal Regression Model
7.2.1 Illustrative Examples
7.3 Ordinal Logistic Regression
7.4 Penalized Multinomial Logistic Regression
7.4.1 Illustrative Examples for Multinomial Penalized Logistic Regression
7.5 Penalized Poisson Regression
7.6 Final Comments
Appendix 1
Appendix 2
Appendix 3
Appendix 4 (Example 4)
Appendix 5
Appendix 6
References
Chapter 8: Reproducing Kernel Hilbert Spaces Regression and Classification Methods
8.1 The Reproducing Kernel Hilbert Spaces (RKHS)
8.2 Generalized Kernel Model
8.2.1 Parameter Estimation Under the Frequentist Paradigm
8.2.2 Kernels
8.2.3 Kernel Trick
8.2.4 Popular Kernel Functions
8.2.5 A Two Separate Step Process for Building Kernel Machines
8.3 Kernel Methods for Gaussian Response Variables
8.4 Kernel Methods for Binary Response Variables
8.5 Kernel Methods for Categorical Response Variables
8.6 The Linear Mixed Model with Kernels
8.7 Hyperparameter Tuning for Building the Kernels
8.8 Bayesian Kernel Methods
8.8.1 Extended Predictor Under the Bayesian Kernel BLUP
8.8.2 Extended Predictor Under the Bayesian Kernel BLUP with a Binary Response Variable
8.8.3 Extended Predictor Under the Bayesian Kernel BLUP with a Categorical Response Variable
8.9 Multi-trait Bayesian Kernel
8.10 Kernel Compression Methods
8.10.1 Extended Predictor Under the Approximate Kernel Method
8.11 Final Comments
Appendix 1
Appendix 2
Appendix 3
Appendix 4
Appendix 5
Appendix 6
Appendix 7
Appendix 8
Appendix 9
Appendix 10
Appendix 11
References
Chapter 9: Support Vector Machines and Support Vector Regression
9.1 Introduction to Support Vector Machine
9.2 Hyperplane
9.3 Maximum Margin Classifier
9.3.1 Derivation of the Maximum Margin Classifier
9.3.2 Wolfe Dual
9.4 Derivation of the Support Vector Classifier
9.5 Support Vector Machine
9.5.1 One-Versus-One Classification
9.5.2 One-Versus-All Classification
9.6 Support Vector Regression
Appendix 1
Appendix 2
Appendix 3
References
Chapter 10: Fundamentals of Artificial Neural Networks and Deep Learning
10.1 The Inspiration for the Neural Network Model
10.2 The Building Blocks of Artificial Neural Networks
10.3 Activation Functions
10.3.1 Linear
10.3.2 Rectifier Linear Unit (ReLU)
10.3.3 Leaky ReLU
10.3.4 Sigmoid
10.3.5 Softmax
10.3.6 Tanh
10.4 The Universal Approximation Theorem
10.5 Artificial Neural Network Topologies
10.6 Successful Applications of ANN and DL
10.7 Loss Functions
10.7.1 Loss Functions for Continuous Outcomes
10.7.2 Loss Functions for Binary and Ordinal Outcomes
10.7.3 Regularized Loss Functions
10.7.4 Early Stopping Method of Training
10.8 The King Algorithm for Training Artificial Neural Networks: Backpropagation
10.8.1 Backpropagation Algorithm: Online Version
10.8.1.1 Feedforward Part
10.8.1.2 Backpropagation Part
10.8.2 Illustrative Example 10.1: A Hand Computation
10.8.3 Illustrative Example 10.2: By-Hand Computation
References
Chapter 11: Artificial Neural Networks and Deep Learning for Genomic Prediction of Continuous Outcomes
11.1 Hyperparameters to Be Tuned in ANN and DL
11.1.1 Network Topology
11.1.2 Activation Functions
11.1.3 Loss Function
11.1.4 Number of Hidden Layers
11.1.5 Number of Neurons in Each Layer
11.1.6 Regularization Type
11.1.7 Learning Rate
11.1.8 Number of Epochs and Number of Batches
11.1.9 Normalization Scheme for Input Data
11.2 Popular DL Frameworks
11.3 Optimizers
11.4 Illustrative Examples
Appendix 1
Appendix 2
Appendix 3
Appendix 4
Appendix 5
References
Chapter 12: Artificial Neural Networks and Deep Learning for Genomic Prediction of Binary, Ordinal, and Mixed Outcomes
12.1 Training DNN with Binary Outcomes
12.2 Training DNN with Categorical (Ordinal) Outcomes
12.3 Training DNN with Count Outcomes
12.4 Training DNN with Multivariate Outcomes
12.4.1 DNN with Multivariate Continuous Outcomes
12.4.2 DNN with Multivariate Binary Outcomes
12.4.3 DNN with Multivariate Ordinal Outcomes
12.4.4 DNN with Multivariate Count Outcomes
12.4.5 DNN with Multivariate Mixed Outcomes
Appendix 1
Appendix 2
Appendix 3
Appendix 4
Appendix 5
References
Chapter 13: Convolutional Neural Networks
13.1 The Importance of Convolutional Neural Networks
13.2 Tensors
13.3 Convolution
13.4 Pooling
13.5 Convolutional Operation for 1D Tensor for Sequence Data
13.6 Motivation of CNN
13.7 Why Are CNNs Preferred over Feedforward Deep Neural Networks for Processing Images?
13.8 Illustrative Examples
13.9 2D Convolution Example
13.10 Critics of Deep Learning
Appendix 1
Appendix 2
References
Chapter 14: Functional Regression
14.1 Principles of Functional Linear Regression Analyses
14.2 Basis Functions
14.2.1 Fourier Basis
14.2.2 B-Spline Basis
14.3 Illustrative Examples
14.4 Functional Regression with a Smoothed Coefficient Function
14.5 Bayesian Estimation of the Functional Regression
Appendix 1
Appendix 2 (Example 14.4)
Appendix 3 (Example 14.5)
Appendix 4 (Example 14.6)
References
Chapter 15: Random Forest for Genomic Prediction
15.1 Motivation of Random Forest
15.2 Decision Trees
15.3 Random Forest
15.4 RF Algorithm for Continuous, Binary, and Categorical Response Variables
15.4.1 Splitting Rules
15.5 RF Algorithm for Count Response Variables
15.6 RF Algorithm for Multivariate Response Variables
15.7 Final Comments
Appendix 1
Appendix 2
Appendix 3
Appendix 4
Appendix 5
Appendix 6
References
Index