About the book Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition
Book title: Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition
Edition: 3
Title translated into Persian: پردازش گفتار و زبان مقدمه ای بر پردازش زبان طبیعی، زبان شناسی محاسباتی و تشخیص گفتار
Series:
Authors: Dan Jurafsky and James H. Martin
Publisher:
Year of publication: 2020
Number of pages: 623
Language: English
Format: PDF
File size: 22 MB
Table of contents:
Introduction
Regular Expressions, Text Normalization, Edit Distance
Regular Expressions
Basic Regular Expression Patterns
Disjunction, Grouping, and Precedence
A Simple Example
More Operators
A More Complex Example
Substitution, Capture Groups, and ELIZA
Lookahead Assertions
Words
Corpora
Text Normalization
Unix Tools for Crude Tokenization and Normalization
Word Tokenization
Byte-Pair Encoding for Tokenization
Word Normalization, Lemmatization and Stemming
Sentence Segmentation
Minimum Edit Distance
The Minimum Edit Distance Algorithm
Summary
Bibliographical and Historical Notes
Exercises
N-gram Language Models
N-Grams
Evaluating Language Models
Perplexity
Generalization and Zeros
Unknown Words
Smoothing
Laplace Smoothing
Add-k smoothing
Backoff and Interpolation
Kneser-Ney Smoothing
Huge Language Models and Stupid Backoff
Advanced: Perplexity's Relation to Entropy
Summary
Bibliographical and Historical Notes
Exercises
Naive Bayes and Sentiment Classification
Naive Bayes Classifiers
Training the Naive Bayes Classifier
Worked example
Optimizing for Sentiment Analysis
Naive Bayes for other text classification tasks
Naive Bayes as a Language Model
Evaluation: Precision, Recall, F-measure
Evaluating with more than two classes
Test sets and Cross-validation
Statistical Significance Testing
The Paired Bootstrap Test
Avoiding Harms in Classification
Summary
Bibliographical and Historical Notes
Exercises
Logistic Regression
Classification: the sigmoid
Example: sentiment classification
Learning in Logistic Regression
The cross-entropy loss function
Gradient Descent
The Gradient for Logistic Regression
The Stochastic Gradient Descent Algorithm
Working through an example
Mini-batch training
Regularization
Multinomial logistic regression
Features in Multinomial Logistic Regression
Learning in Multinomial Logistic Regression
Interpreting models
Advanced: Deriving the Gradient Equation
Summary
Bibliographical and Historical Notes
Exercises
Vector Semantics and Embeddings
Lexical Semantics
Vector Semantics
Words and Vectors
Vectors and documents
Words as vectors: document dimensions
Words as vectors: word dimensions
Cosine for measuring similarity
TF-IDF: Weighing terms in the vector
Pointwise Mutual Information (PMI)
Applications of the tf-idf or PPMI vector models
Word2vec
The classifier
Learning skip-gram embeddings
Other kinds of static embeddings
Visualizing Embeddings
Semantic properties of embeddings
Embeddings and Historical Semantics
Bias and Embeddings
Evaluating Vector Models
Summary
Bibliographical and Historical Notes
Exercises
Neural Networks and Neural Language Models
Units
The XOR problem
The solution: neural networks
Feed-Forward Neural Networks
Training Neural Nets
Loss function
Computing the Gradient
Computation Graphs
Backward differentiation on computation graphs
More details on learning
Neural Language Models
Embeddings
Training the neural language model
Summary
Bibliographical and Historical Notes
Sequence Labeling for Parts of Speech and Named Entities
(Mostly) English Word Classes
Part-of-Speech Tagging
Named Entities and Named Entity Tagging
HMM Part-of-Speech Tagging
Markov Chains
The Hidden Markov Model
The components of an HMM tagger
HMM tagging as decoding
The Viterbi Algorithm
Working through an example
Conditional Random Fields (CRFs)
Features in a CRF POS Tagger
Features for CRF Named Entity Recognizers
Inference and Training for CRFs
Evaluation of Named Entity Recognition
Further Details
Bidirectionality
Rule-based Methods
POS Tagging for Morphologically Rich Languages
Summary
Bibliographical and Historical Notes
Exercises
Deep Learning Architectures for Sequence Processing
Language Models Revisited
Recurrent Neural Networks
Inference in RNNs
Training
RNNs as Language Models
Other Applications of RNNs
RNNs for Sequence Classification
Stacked and Bidirectional RNNs
Managing Context in RNNs: LSTMs and GRUs
Long Short-Term Memory
Gated Recurrent Units
Gated Units, Layers and Networks
Self-Attention Networks: Transformers
Transformers as Autoregressive Language Models
Potential Harms from Language Models
Summary
Bibliographical and Historical Notes
Contextual Embeddings
Machine Translation and Encoder-Decoder Models
Language Divergences and Typology
Word Order Typology
Lexical Divergences
Morphological Typology
Referential density
The Encoder-Decoder Model
Encoder-Decoder with RNNs
Training the Encoder-Decoder Model
Attention
Beam Search
Encoder-Decoder with Transformers
Some practical details on building MT systems
Tokenization
MT corpora
Backtranslation
MT Evaluation
Using Human Raters to Evaluate MT
Automatic Evaluation: BLEU
Automatic Evaluation: Embedding-Based Methods
Bias and Ethical Issues
Summary
Bibliographical and Historical Notes
Exercises
Constituency Grammars
Constituency
Context-Free Grammars
Formal Definition of Context-Free Grammar
Some Grammar Rules for English
Sentence-Level Constructions
Clauses and Sentences
The Noun Phrase
The Verb Phrase
Coordination
Treebanks
Example: The Penn Treebank Project
Treebanks as Grammars
Heads and Head Finding
Grammar Equivalence and Normal Form
Lexicalized Grammars
Combinatory Categorial Grammar
Summary
Bibliographical and Historical Notes
Exercises
Constituency Parsing
Ambiguity
CKY Parsing: A Dynamic Programming Approach
Conversion to Chomsky Normal Form
CKY Recognition
CKY Parsing
CKY in Practice
Span-Based Neural Constituency Parsing
Computing Scores for a Span
Integrating Span Scores into a Parse
Evaluating Parsers
Partial Parsing
CCG Parsing
Ambiguity in CCG
CCG Parsing Frameworks
Supertagging
CCG Parsing using the A* Algorithm
Summary
Bibliographical and Historical Notes
Exercises
Dependency Parsing
Dependency Relations
Dependency Formalisms
Projectivity
Dependency Treebanks
Transition-Based Dependency Parsing
Creating an Oracle
Advanced Methods in Transition-Based Parsing
Graph-Based Dependency Parsing
Parsing
Features and Training
Advanced Issues in Graph-Based Parsing
Evaluation
Summary
Bibliographical and Historical Notes
Exercises
Logical Representations of Sentence Meaning
Computational Desiderata for Representations
Model-Theoretic Semantics
First-Order Logic
Basic Elements of First-Order Logic
Variables and Quantifiers
Lambda Notation
The Semantics of First-Order Logic
Inference
Event and State Representations
Representing Time
Aspect
Description Logics
Summary
Bibliographical and Historical Notes
Exercises
Computational Semantics and Semantic Parsing
Information Extraction
Relation Extraction
Relation Extraction Algorithms
Using Patterns to Extract Relations
Relation Extraction via Supervised Learning
Semisupervised Relation Extraction via Bootstrapping
Distant Supervision for Relation Extraction
Unsupervised Relation Extraction
Evaluation of Relation Extraction
Extracting Times
Temporal Expression Extraction
Temporal Normalization
Extracting Events and their Times
Temporal Ordering of Events
Template Filling
Machine Learning Approaches to Template Filling
Earlier Finite-State Template-Filling Systems
Summary
Bibliographical and Historical Notes
Exercises
Word Senses and WordNet
Word Senses
Defining Word Senses
How many senses do words have?
Relations Between Senses
WordNet: A Database of Lexical Relations
Sense Relations in WordNet
Word Sense Disambiguation
WSD: The Task and Datasets
The WSD Algorithm: Contextual Embeddings
Alternate WSD algorithms and Tasks
Feature-Based WSD
The Lesk Algorithm as WSD Baseline
Word-in-Context Evaluation
Wikipedia as a source of training data
Using Thesauruses to Improve Embeddings
Word Sense Induction
Summary
Bibliographical and Historical Notes
Exercises
Semantic Role Labeling
Semantic Roles
Diathesis Alternations
Semantic Roles: Problems with Thematic Roles
The Proposition Bank
FrameNet
Semantic Role Labeling
A Feature-based Algorithm for Semantic Role Labeling
A Neural Algorithm for Semantic Role Labeling
Evaluation of Semantic Role Labeling
Selectional Restrictions
Representing Selectional Restrictions
Selectional Preferences
Primitive Decomposition of Predicates
Summary
Bibliographical and Historical Notes
Exercises
Lexicons for Sentiment, Affect, and Connotation
Defining Emotion
Available Sentiment and Affect Lexicons
Creating Affect Lexicons by Human Labeling
Semi-supervised Induction of Affect Lexicons
Semantic Axis Methods
Label Propagation
Other Methods
Supervised Learning of Word Sentiment
Log Odds Ratio Informative Dirichlet Prior
Using Lexicons for Sentiment Recognition
Other tasks: Personality
Affect Recognition
Lexicon-based methods for Entity-Centric Affect
Connotation Frames
Summary
Bibliographical and Historical Notes
Coreference Resolution
Coreference Phenomena: Linguistic Background
Types of Referring Expressions
Information Status
Complications: Non-Referring Expressions
Linguistic Properties of the Coreference Relation
Coreference Tasks and Datasets
Mention Detection
Architectures for Coreference Algorithms
The Mention-Pair Architecture
The Mention-Rank Architecture
Entity-based Models
Classifiers using hand-built features
A neural mention-ranking algorithm
Evaluation of Coreference Resolution
Winograd Schema problems
Gender Bias in Coreference
Summary
Bibliographical and Historical Notes
Exercises
Discourse Coherence
Coherence Relations
Rhetorical Structure Theory
Penn Discourse TreeBank (PDTB)
Discourse Structure Parsing
EDU segmentation for RST parsing
RST parsing
PDTB discourse parsing
Centering and Entity-Based Coherence
Centering
Entity Grid model
Evaluating Neural and Entity-based coherence
Representation learning models for local coherence
Global Coherence
Argumentation Structure
The structure of scientific discourse
Summary
Bibliographical and Historical Notes
Exercises
Question Answering
Information Retrieval
Term weighting and document scoring
Document Scoring
Inverted Index
Evaluation of Information-Retrieval Systems
IR with Dense Vectors
IR-based Factoid Question Answering
IR-based QA: Datasets
IR-based QA: Reader (Answer Span Extraction)
Entity Linking
Linking based on Anchor Dictionaries and Web Graph
Neural Graph-based linking
Knowledge-based Question Answering
Knowledge-Based QA from RDF triple stores
QA by Semantic Parsing
Using Language Models to do QA
Classic QA Models
Evaluation of Factoid Answers
Bibliographical and Historical Notes
Exercises
Chatbots & Dialogue Systems
Properties of Human Conversation
Chatbots
Rule-based chatbots: ELIZA and PARRY
Corpus-based chatbots
Hybrid architectures
GUS: Simple Frame-based Dialogue Systems
Control structure for frame-based dialogue
Natural language understanding for filling slots in GUS
Other components of frame-based dialogue
The Dialogue-State Architecture
Dialogue Acts
Slot Filling
Dialogue State Tracking
Dialogue Policy
Natural language generation in the dialogue-state model
Evaluating Dialogue Systems
Evaluating Chatbots
Evaluating Task-Based Dialogue
Dialogue System Design
Ethical Issues in Dialogue System Design
Summary
Bibliographical and Historical Notes
Exercises
Phonetics
Speech Sounds and Phonetic Transcription
Articulatory Phonetics
Prosody
Prosodic Prominence: Accent, Stress and Schwa
Prosodic Structure
Tune
Acoustic Phonetics and Signals
Waves
Speech Sound Waves
Frequency and Amplitude; Pitch and Loudness
Interpretation of Phones from a Waveform
Spectra and the Frequency Domain
The Source-Filter Model
Phonetic Resources
Summary
Bibliographical and Historical Notes
Exercises
Automatic Speech Recognition and Text-to-Speech
The Automatic Speech Recognition Task
Feature Extraction for ASR: Log Mel Spectrum
Sampling and Quantization
Windowing
Discrete Fourier Transform
Mel Filter Bank and Log
Speech Recognition Architecture
Learning
CTC
CTC Inference
CTC Training
Combining CTC and Encoder-Decoder
Streaming Models: RNN-T for improving CTC
ASR Evaluation: Word Error Rate
TTS
TTS Preprocessing: Text normalization
TTS: Spectrogram prediction
TTS: Vocoding
TTS Evaluation
Other Speech Tasks
Summary
Bibliographical and Historical Notes
Exercises
Bibliography
Subject Index