About the book Tidy Modeling with R: A Framework for Modeling in the Tidyverse
Title: Tidy Modeling with R: A Framework for Modeling in the Tidyverse
Edition: 1
Title translated into Persian: مدل سازی مرتب با R: چارچوبی برای مدل سازی در Tidyverse
Series:
Authors: Max Kuhn, Julia Silge
Publisher: O'Reilly Media
Publication year: 2022
Number of pages: 626
ISBN: 1492096482, 9781492096481
Language: English
Format: PDF
File size: 20 MB
Table of contents:
Preface
Conventions Used in This Book
Using Code Examples
O’Reilly Online Learning
How to Contact Us
Acknowledgments
I. Introduction
1. Software for Modeling
Fundamentals for Modeling Software
Types of Models
Descriptive Models
Inferential Models
Predictive Models
Connections Between Types of Models
Some Terminology
How Does Modeling Fit into the Data Analysis Process?
Chapter Summary
2. A Tidyverse Primer
Tidyverse Principles
Design for Humans
Reuse Existing Data Structures
Design for the Pipe and Functional Programming
Examples of Tidyverse Syntax
Chapter Summary
3. A Review of R Modeling Fundamentals
An Example
What Does the R Formula Do?
Why Tidiness Is Important for Modeling
Combining Base R Models and the Tidyverse
The tidymodels Metapackage
Chapter Summary
II. Modeling Basics
4. The Ames Housing Data
Exploring Features of Homes in Ames
Chapter Summary
5. Spending Our Data
Common Methods for Splitting Data
What About a Validation Set?
Multilevel Data
Other Considerations for a Data Budget
Chapter Summary
6. Fitting Models with parsnip
Create a Model
Use the Model Results
Make Predictions
parsnip-Extension Packages
Creating Model Specifications
Chapter Summary
7. A Model Workflow
Where Does the Model Begin and End?
Workflow Basics
Adding Raw Variables to the workflow()
How Does a workflow() Use the Formula?
Tree-Based Models
Special Formulas and Inline Functions
Creating Multiple Workflows at Once
Evaluating the Test Set
Chapter Summary
8. Feature Engineering with Recipes
A Simple recipe() for the Ames Housing Data
Using Recipes
How Data Are Used by the recipe()
Examples of Steps
Encoding Qualitative Data in a Numeric Format
Interaction Terms
Spline Functions
Feature Extraction
Row Sampling Steps
General Transformations
Natural Language Processing
Skipping Steps for New Data
Tidy a recipe()
Column Roles
Chapter Summary
9. Judging Model Effectiveness
Performance Metrics and Inference
Regression Metrics
Binary Classification Metrics
Multiclass Classification Metrics
Chapter Summary
III. Tools for Creating Effective Models
10. Resampling for Evaluating Performance
The Resubstitution Approach
Resampling Methods
Cross-Validation
Repeated Cross-Validation
Leave-One-Out Cross-Validation
Monte Carlo Cross-Validation
Validation Sets
Bootstrapping
Rolling Forecasting Origin Resampling
Estimating Performance
Parallel Processing
Saving the Resampled Objects
Chapter Summary
11. Comparing Models with Resampling
Creating Multiple Models with Workflow Sets
Comparing Resampled Performance Statistics
Simple Hypothesis Testing Methods
Bayesian Methods
A Random Intercept Model
The Effect of the Amount of Resampling
Chapter Summary
12. Model Tuning and the Dangers of Overfitting
Model Parameters
Tuning Parameters for Different Types of Models
What Do We Optimize?
The Consequences of Poor Parameter Estimates
Two General Strategies for Optimization
Tuning Parameters in tidymodels
Chapter Summary
13. Grid Search
Regular and Nonregular Grids
Regular Grids
Nonregular Grids
Evaluating the Grid
Finalizing the Model
Tools for Creating Tuning Specifications
Tools for Efficient Grid Search
Submodel Optimization
Parallel Processing
Benchmarking Boosted Trees
Access to Global Variables
Racing Methods
Chapter Summary
14. Iterative Search
A Support Vector Machine Model
Bayesian Optimization
A Gaussian Process Model
Acquisition Functions
The tune_bayes() Function
Simulated Annealing
Simulated Annealing Search Process
The tune_sim_anneal() Function
Chapter Summary
15. Screening Many Models
Modeling Concrete Mixture Strength
Creating the Workflow Set
Tuning and Evaluating the Models
Efficiently Screening Models
Finalizing a Model
Chapter Summary
IV. Beyond the Basics
16. Dimensionality Reduction
What Problems Can Dimensionality Reduction Solve?
A Picture Is Worth a Thousand…Beans
A Starter Recipe
Recipes in the Wild
Preparing a Recipe
Baking the Recipe
Feature Extraction Techniques
Principal Component Analysis
Partial Least Squares
Independent Component Analysis
Uniform Manifold Approximation and Projection
Modeling
Chapter Summary
17. Encoding Categorical Data
Is an Encoding Necessary?
Encoding Ordinal Predictors
Using the Outcome for Encoding Predictors
Effect Encodings in tidymodels
Effect Encodings with Partial Pooling
Feature Hashing
More Encoding Options
Chapter Summary
18. Explaining Models and Predictions
Software for Model Explanations
Local Explanations
Global Explanations
Building Global Explanations from Local Explanations
Back to Beans!
Chapter Summary
19. When Should You Trust Your Predictions?
Equivocal Results
Determining Model Applicability
Chapter Summary
20. Ensembles of Models
Creating the Training Set for Stacking
Blend the Predictions
Fit the Member Models
Test Set Results
Chapter Summary
21. Inferential Analysis
Inference for Count Data
Comparisons with Two-Sample Tests
Log-Linear Models
A More Complex Model
More Inferential Analysis
Chapter Summary
A. Recommended Preprocessing
References
Index
About the Authors