Data Science: Concepts and Practice

Download the book Data Science: Concepts and Practice

Price: 30,000 Toman (in stock)

Data Science: Concepts and Practice, original-language edition

The download link for Data Science: Concepts and Practice will be available after payment.
A description of the book is provided in the details section below.


This is the original edition of the book; it is not in Persian.




Details for Data Science: Concepts and Practice

Title: Data Science: Concepts and Practice
Edition: 2
Persian title: علم داده: مفاهیم و عمل
Series:
Authors: Vijay Kotu, Bala Deshpande
Publisher: Morgan Kaufmann
Year of publication: 2018
Number of pages: 549
ISBN: 012814761X, 9780128147610
Language: English
Format: PDF
File size: 49 MB



The download link will be provided after the payment process is completed. If you register and log in to your account, you will be able to view the list of books you have purchased.



Table of contents:


Cover Data Science: Concepts and Practice Copyright Dedication Foreword Preface Why Data Science? Why This Book? Who Can Use This Book? Acknowledgments
1 Introduction 1.1 AI, Machine Learning, and Data Science 1.2 What is Data Science? 1.2.1 Extracting Meaningful Patterns 1.2.2 Building Representative Models 1.2.3 Combination of Statistics, Machine Learning, and Computing 1.2.4 Learning Algorithms 1.2.5 Associated Fields 1.3 Case for Data Science 1.3.1 Volume 1.3.2 Dimensions 1.3.3 Complex Questions 1.4 Data Science Classification 1.5 Data Science Algorithms 1.6 Roadmap for This Book 1.6.1 Getting Started With Data Science 1.6.2 Practice using RapidMiner 1.6.3 Core Algorithms References
2 Data Science Process 2.1 Prior Knowledge 2.1.1 Objective 2.1.2 Subject Area 2.1.3 Data 2.1.4 Causation Versus Correlation 2.2 Data Preparation 2.2.1 Data Exploration 2.2.2 Data Quality 2.2.3 Missing Values 2.2.4 Data Types and Conversion 2.2.5 Transformation 2.2.6 Outliers 2.2.7 Feature Selection 2.2.8 Data Sampling 2.3 Modeling 2.3.1 Training and Testing Datasets 2.3.2 Learning Algorithms 2.3.3 Evaluation of the Model 2.3.4 Ensemble Modeling 2.4 Application 2.4.1 Production Readiness 2.4.2 Technical Integration 2.4.3 Response Time 2.4.4 Model Refresh 2.4.5 Assimilation 2.5 Knowledge References
3 Data Exploration 3.1 Objectives of Data Exploration 3.2 Datasets 3.2.1 Types of Data Numeric or Continuous Categorical or Nominal 3.3 Descriptive Statistics 3.3.1 Univariate Exploration Measure of Central Tendency Measure of Spread 3.3.2 Multivariate Exploration Central Data Point Correlation 3.4 Data Visualization 3.4.1 Univariate Visualization Histogram Quartile Distribution Chart 3.4.2 Multivariate Visualization Scatterplot Scatter Multiple Scatter Matrix Bubble Chart Density Chart 3.4.3 Visualizing High-Dimensional Data Parallel Chart Deviation Chart Andrews Curves 3.5 Roadmap for Data Exploration References
4 Classification 4.1 Decision Trees 4.1.1 How It Works Step 1: Where to Split Data? Step 2: When to Stop Splitting Data? 4.1.2 How to Implement Implementation 1: To Play Golf or Not? Implementation 2: Prospect Filtering Step 1: Data Preparation Step 2: Divide Dataset Into Training and Testing Samples Step 3: Modeling Operator and Parameters Step 4: Configuring the Decision Tree Model Step 5: Process Execution and Interpretation 4.1.3 Conclusion 4.2 Rule Induction Approaches to Developing a Rule Set 4.2.1 How It Works Step 1: Class Selection Step 2: Rule Development Step 3: Learn-One-Rule Step 4: Next Rule Step 5: Development of Rule Set 4.2.2 How to Implement Step 1: Data Preparation Step 2: Modeling Operator and Parameters Step 3: Results Interpretation Alternative Approach: Tree-to-Rules 4.2.3 Conclusion 4.3 k-Nearest Neighbors 4.3.1 How It Works Measure of Proximity Distance Weights Correlation Similarity Simple Matching Coefficient Jaccard Similarity Cosine Similarity 4.3.2 How to Implement Step 1: Data Preparation Step 2: Modeling Operator and Parameters Step 3: Execution and Interpretation 4.3.3 Conclusion 4.4 Naïve Bayesian 4.4.1 How It Works Step 1: Calculating Prior Probability P(Y) Step 2: Calculating Class Conditional Probability P(Xi|Y) Step 3: Predicting the Outcome Using Bayes' Theorem Issue 1: Incomplete Training Set Issue 2: Continuous Attributes Issue 3: Attribute Independence 4.4.2 How to Implement Step 1: Data Preparation Step 2: Modeling Operator and Parameters Step 3: Evaluation Step 4: Execution and Interpretation 4.4.3 Conclusion 4.5 Artificial Neural Networks 4.5.1 How It Works Step 1: Determine the Topology and Activation Function Step 2: Initiation Step 3: Calculating Error Step 4: Weight Adjustment 4.5.2 How to Implement Step 1: Data Preparation Step 2: Modeling Operator and Parameters Step 3: Evaluation Step 4: Execution and Interpretation 4.5.3 Conclusion 4.6 Support Vector Machines Concept and Terminology 4.6.1 How It Works 4.6.2 How to Implement Implementation 1: Linearly Separable Dataset Step 1: Data Preparation Step 2: Modeling Operator and Parameters Step 3: Process Execution and Interpretation Example 2: Linearly Non-Separable Dataset Step 1: Data Preparation Step 2: Modeling Operator and Parameters Step 3: Execution and Interpretation Parameter Settings 4.6.3 Conclusion 4.7 Ensemble Learners Wisdom of the Crowd 4.7.1 How It Works Achieving the Conditions for Ensemble Modeling 4.7.2 How to Implement Ensemble by Voting Bootstrap Aggregating or Bagging Implementation Boosting AdaBoost Implementation Random Forest Implementation 4.7.3 Conclusion References
5 Regression Methods 5.1 Linear Regression 5.1.1 How It Works 5.1.2 How to Implement Step 1: Data Preparation Step 2: Model Building Step 3: Execution and Interpretation Step 4: Application to Unseen Test Data 5.1.3 Checkpoints 5.2 Logistic Regression 5.2.1 How It Works How Does Logistic Regression Find the Sigmoid Curve? A Simple but Tragic Example 5.2.2 How to Implement Step 1: Data Preparation Step 2: Modeling Operator and Parameters Step 3: Execution and Interpretation Step 4: Using MetaCost Step 5: Applying the Model to an Unseen Dataset 5.2.3 Summary Points 5.3 Conclusion References
6 Association Analysis 6.1 Mining Association Rules 6.1.1 Itemsets Support Confidence Lift Conviction 6.1.2 Rule Generation 6.2 Apriori Algorithm 6.2.1 How It Works Frequent Itemset Generation Rule Generation 6.3 Frequent Pattern-Growth Algorithm 6.3.1 How It Works Frequent Itemset Generation 6.3.2 How to Implement Step 1: Data Preparation Step 2: Modeling Operator and Parameters Step 3: Create Association Rules Step 4: Interpreting the Results 6.4 Conclusion References
7 Clustering Clustering to Describe the Data Clustering for Preprocessing Types of Clustering Techniques 7.1 k-Means Clustering 7.1.1 How It Works Step 1: Initiate Centroids Step 2: Assign Data Points Step 3: Calculate New Centroids Step 4: Repeat Assignment and Calculate New Centroids Step 5: Termination Special Cases Evaluation of Clusters 7.1.2 How to Implement Step 1: Data Preparation Step 2: Clustering Operator and Parameters Step 3: Evaluation Step 4: Execution and Interpretation 7.2 DBSCAN Clustering 7.2.1 How It Works Step 1: Defining Epsilon and MinPoints Step 2: Classification of Data Points Step 3: Clustering Optimizing Parameters Special Cases: Varying Densities 7.2.2 How to Implement Step 1: Data Preparation Step 2: Clustering Operator and Parameters Step 3: Evaluation Step 4: Execution and Interpretation 7.3 Self-Organizing Maps 7.3.1 How It Works Step 1: Topology Specification Step 2: Initialize Centroids Step 3: Assignment of Data Objects Step 4: Centroid Update Step 5: Termination Step 6: Mapping a New Data Object 7.3.2 How to Implement Step 1: Data Preparation Step 2: SOM Modeling Operator and Parameters Step 3: Execution and Interpretation Visual Model Location Coordinates Conclusion References
8 Model Evaluation 8.1 Confusion Matrix 8.2 ROC and AUC 8.3 Lift Curves 8.4 How to Implement Step 1: Data Preparation Step 2: Modeling Operator and Parameters Step 3: Evaluation Step 4: Execution and Interpretation 8.5 Conclusion References
9 Text Mining 9.1 How It Works 9.1.1 Term Frequency–Inverse Document Frequency 9.1.2 Terminology 9.2 How to Implement 9.2.1 Implementation 1: Keyword Clustering Step 1: Gather Unstructured Data Step 2: Data Preparation Step 3: Apply Clustering 9.2.2 Implementation 2: Predicting the Gender of Blog Authors Step 1: Gather Unstructured Data Step 2: Data Preparation Step 3.1: Identify Key Features Step 3.2: Build Models Step 4.1: Prepare Test Data for Model Application Step 4.2: Applying the Trained Models to Testing Data Bias in Machine Learning 9.3 Conclusion References
10 Deep Learning 10.1 The AI Winter AI Winter: 1970s Mid-Winter Thaw of the 1980s The Spring and Summer of Artificial Intelligence: 2006 to Today 10.2 How It Works 10.2.1 Regression Models As Neural Networks 10.2.2 Gradient Descent 10.2.3 Need for Backpropagation 10.2.4 Classifying More Than 2 Classes: Softmax 10.2.5 Convolutional Neural Networks 10.2.6 Dense Layer 10.2.7 Dropout Layer 10.2.8 Recurrent Neural Networks 10.2.9 Autoencoders 10.2.10 Related AI Models 10.3 How to Implement Handwritten Image Recognition Step 1: Dataset Preparation Step 2: Modeling Using the Keras Model Step 3: Applying the Keras Model Step 4: Results 10.4 Conclusion References
11 Recommendation Engines Why Do We Need Recommendation Engines? Applications of Recommendation Engines 11.1 Recommendation Engine Concepts Building up the Ratings Matrix Step 1: Assemble Known Ratings Step 2: Rating Prediction Step 3: Evaluation The Balance 11.1.1 Types of Recommendation Engines 11.2 Collaborative Filtering 11.2.1 Neighborhood-Based Methods User-Based Collaborative Filtering Step 1: Identifying Similar Users Step 2: Deducing Rating From Neighborhood Users Item-Based Collaborative Filtering User-Based or Item-Based Collaborative Filtering? Neighborhood-Based Collaborative Filtering: How to Implement Dataset Implementation Steps Conclusion 11.2.2 Matrix Factorization Matrix Factorization: How to Implement Implementation Steps 11.3 Content-Based Filtering Building an Item Profile 11.3.1 User Profile Computation Content-Based Filtering: How to Implement Dataset Implementation Steps 11.3.2 Supervised Learning Models Supervised Learning Models: How to Implement Dataset Implementation Steps 11.4 Hybrid Recommenders 11.5 Conclusion Summary of the Types of Recommendation Engines References
12 Time Series Forecasting Taxonomy of Time Series Forecasting 12.1 Time Series Decomposition 12.1.1 Classical Decomposition 12.1.2 How to Implement Forecasting Using Decomposed Data 12.2 Smoothing Based Methods 12.2.1 Simple Forecasting Methods Naïve Method Seasonal Naïve Method Average Method Moving Average Smoothing Weighted Moving Average Smoothing 12.2.2 Exponential Smoothing Holt's Two-Parameter Exponential Smoothing Holt-Winters' Three-Parameter Exponential Smoothing 12.2.3 How to Implement R Script for Holt-Winters' Forecasting 12.3 Regression Based Methods 12.3.1 Regression 12.3.2 Regression With Seasonality How to Implement 12.3.3 Autoregressive Integrated Moving Average Autocorrelation Autoregressive Models Stationary Data Differencing Moving Average of Error Autoregressive Integrated Moving Average How to Implement 12.3.4 Seasonal ARIMA How to Implement 12.4 Machine Learning Methods 12.4.1 Windowing Model Training How to Implement Step 1: Set Up Windowing Step 2: Train the Model Step 3: Generate the Forecast in a Loop 12.4.2 Neural Network Autoregressive How to Implement 12.5 Performance Evaluation 12.5.1 Validation Dataset Mean Absolute Error Root Mean Squared Error Mean Absolute Percentage Error Mean Absolute Scaled Error 12.5.2 Sliding Window Validation 12.6 Conclusion 12.6.1 Forecasting Best Practices References
13 Anomaly Detection 13.1 Concepts 13.1.1 Causes of Outliers 13.1.2 Anomaly Detection Techniques Outlier Detection Using Statistical Methods Outlier Detection Using Data Science 13.2 Distance-Based Outlier Detection 13.2.1 How It Works 13.2.2 How to Implement Step 1: Data Preparation Step 2: Detect Outlier Operator Step 3: Execution and Interpretation 13.3 Density-Based Outlier Detection 13.3.1 How It Works 13.3.2 How to Implement Step 1: Data Preparation Step 2: Detect Outlier Operator Step 3: Execution and Interpretation 13.4 Local Outlier Factor 13.4.1 How It Works 13.4.2 How to Implement Step 1: Data Preparation Step 2: Detect Outlier Operator Step 3: Results Interpretation 13.5 Conclusion References
14 Feature Selection 14.1 Classifying Feature Selection Methods 14.2 Principal Component Analysis 14.2.1 How It Works 14.2.2 How to Implement Step 1: Data Preparation Step 2: PCA Operator Step 3: Execution and Interpretation 14.3 Information Theory-Based Filtering 14.4 Chi-Square-Based Filtering 14.5 Wrapper-Type Feature Selection 14.5.1 Backward Elimination 14.6 Conclusion References
15 Getting Started with RapidMiner 15.1 User Interface and Terminology 15.2 Data Importing and Exporting Tools 15.3 Data Visualization Tools Univariate Plots Bivariate Plots Multivariate Plots 15.4 Data Transformation Tools 15.5 Sampling and Missing Value Tools 15.6 Optimization Tools 15.7 Integration with R 15.8 Conclusion References
Comparison of Data Science Algorithms About the Authors Vijay Kotu Bala Deshpande, PhD Index Praise Back Cover

About the book (original-language description):


Learn the basics of data science through an easy-to-understand conceptual framework, and immediately practice using the RapidMiner platform. Whether you are brand new to data science or working on your tenth project, this book shows you how to analyze data and uncover hidden patterns and relationships that aid important decisions and predictions.

Data science has become an essential tool for extracting value from data in any organization that collects, stores, and processes data as part of its operations. This book is ideal for business users, data analysts, business analysts, engineers, analytics professionals, and anyone who works with data.

You'll be able to:

  1. Gain the necessary knowledge of different data science techniques to extract value from data.
  2. Master the concepts and inner workings of 30 commonly used powerful data science algorithms.
  3. Implement the step-by-step data science process using RapidMiner, an open-source, GUI-based data science platform.

Data science techniques covered: exploratory data analysis, visualization, decision trees, rule induction, k-nearest neighbors, naïve Bayesian classifiers, artificial neural networks, deep learning, support vector machines, ensemble models, random forests, regression, recommendation engines, association analysis, k-means and density-based clustering, self-organizing maps, text mining, time series forecasting, anomaly detection, feature selection, and more.
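To give a flavor of one of the techniques listed above, here is a minimal, self-contained sketch of k-means clustering in plain Python. Note that the book itself demonstrates these techniques in RapidMiner rather than in code; this toy example, its naive first-k initialization, and its one-dimensional sample data are purely illustrative:

```python
# Toy k-means on 1-D data: assign each point to the nearest centroid,
# then recompute each centroid as the mean of its assigned points.
def kmeans(points, k, iters=20):
    centroids = points[:k]  # naive initialization: first k points
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:  # assignment step
            i = min(range(k), key=lambda j: abs(p - centroids[j]))
            clusters[i].append(p)
        # update step: mean of each cluster (keep old centroid if empty)
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

# Two well-separated groups around 1.0 and 9.0
points = [1.0, 1.2, 0.8, 9.0, 9.2, 8.8]
centroids, clusters = kmeans(points, k=2)
print(sorted(round(c, 1) for c in centroids))  # [1.0, 9.0]
```

Real implementations (such as RapidMiner's k-Means operator) add smarter initialization, multiple restarts, and convergence checks, but the assign/update loop above is the core of the algorithm.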



