About the book Machine Learning with PySpark: With Natural Language Processing and Recommender Systems
Title: Machine Learning with PySpark: With Natural Language Processing and Recommender Systems
Title in Persian: یادگیری ماشین با PySpark: با پردازش زبان طبیعی و سیستم‌های توصیه‌کننده
Author: Pramod Singh
Publisher: Apress
Publication year: 2021
Pages: 230
ISBN: 1484277767, 9781484277768
Language: English
Format: PDF
File size: 9 MB
Contents:
Table of Contents
About the Author
About the Technical Reviewer
Acknowledgments
Foreword
Introduction
Chapter 1: Introduction to Spark
Data Generation
Before the 1990s
The Internet and Social Media Era
The Machine Data Era
Spark
Setting Up the Environment
Downloading Spark
Installing Spark
Docker
Databricks
Spin a New Cluster
Create a Notebook
Conclusion
Chapter 2: Manage Data with PySpark
Load and Read Data
Data Filtering Using filter
Data Filtering Using where
Pandas UDF
Drop Duplicate Values
Writing Data
CSV
Parquet
Data Handling Using Koalas
Conclusion
Chapter 3: Introduction to Machine Learning
Rise in Data
Increased Computational Efficiency
Improved ML Algorithms
Availability of Data Scientists
Supervised Machine Learning
Unsupervised Machine Learning
Semi-supervised Learning
Reinforcement Learning
Industrial Application and Challenges
Retail
Healthcare
Finance
Travel and Hospitality
Media and Marketing
Manufacturing and Automobile
Social Media
Others
Conclusion
Chapter 4: Linear Regression
Variables
Theory
Interpretation
Evaluation
Code
Conclusion
Chapter 5: Logistic Regression
Probability
Using Linear Regression
Using Logit
Interpretation (Coefficients)
Dummy Variables
Model Evaluation
True Positives
True Negatives
False Positives
False Negatives
Accuracy
Recall
Precision
F1 Score
Probability Cut-Off/Threshold
ROC Curve
Logistic Regression Code
Data Info
Confusion Matrix
Accuracy
Recall
Precision
Conclusion
Chapter 6: Random Forests Using PySpark
Decision Tree
Entropy
Information Gain
Random Forests
Code
Conclusion
Chapter 7: Clustering in PySpark
Applications
K-Means
Deciding on the Number of Clusters (K)
Elbow Method
Hierarchical Clustering
Agglomerative Clustering
Code
Data Info
Conclusion
Chapter 8: Recommender Systems
Recommendations
Popularity-Based RS
Content-Based RS
User Profile
Euclidean Distance
Cosine Similarity
Collaborative Filtering–Based RS
User Item Matrix
Explicit Feedback
Implicit Feedback
Nearest Neighbors–Based CF
Missing Values
Latent Factor–Based CF
Hybrid Recommender Systems
Code
Data Info
Conclusion
Chapter 9: Natural Language Processing
Steps Involved in NLP
Corpus
Tokenize
Stopword Removal
Bag of Words
CountVectorizer
TF-IDF
Text Classification Using Machine Learning
Sequence Embeddings
Embeddings
Conclusion
Index