# Deep Learning Glossary: Machine Learning Terms, Definitions & Acronyms

Machine learning is a rapidly growing field with a wide range of applications, but deep learning terminology can be unfamiliar to newcomers. The DL glossary below covers the commonly used terms and resources to help readers dive deeper into specific topics. If you are a developer who wants to learn more about machine learning, this guide can serve as a handy reference throughout your learning process. It includes revised and updated terms.

## Deep Learning Glossary

Activation Function

A/B testing

Accuracy

Action

Active learning

AdaGrad

Adam Optimization

Agent

Agglomerative clustering

AI Agent

Algorithm

AlphaGo

AR

Area under the PR curve

Area under the ROC curve

Artificial general intelligence

Artificial Intelligence (AI)

Attribute

AUC (Area under the ROC Curve)

Augmented reality

Automation bias

Autonomous

Average precision

Back Propagation

Backpropagation

Bag of words

Baseline

Batch

Batch normalization

Batch size

Bayesian neural network

Bellman equation

Bias (ethics/fairness)

Bias (math)

Bigram

Binary classification

Binning

Black box

Boosting

Bot

Bounding box

Broadcasting

Bucketing

Calibration layer

Candidate generation

Candidate sampling

Categorical data

Centroid

Centroid-based clustering

Checkpoint

Class

Classification model

Classification threshold

Class-imbalanced dataset

Clipping

Cloud TPU

Clustering

Co-adaptation

Collaborative filtering

Computational learning theory

Computer program

Computer science (CS)

Computer Vision

Confirmation bias

Confusion matrix

Continuous feature

Convenience sampling

Convergence

Convex function

Convex optimization

Convex set

Convolution

Convolutional filter

Convolutional layer

Convolutional Neural Network

Convolutional neural network (CNN)

Convolutional operation

Cost

Cost Function

Counterfactual fairness

Coverage bias

Crash blossom

Critic

Cross-entropy

Cross-validation

Custom Estimator

Data

Data analysis

Data augmentation

Data Cleansing

Data mining

Data Science

Data set or dataset

DataFrame

Dataset API (tf.data)

Datasets

Decision boundary

Decision threshold

Decision tree

Deep learning

Deep model

Deep neural network

Deep neural network (DNN)

Deep Q-Network (DQN)

Demographic parity

Dense feature

Dense layer

Depth

Depthwise separable convolutional neural network (sepCNN)

Device

Dimension reduction

Dimensional reduction

Dimensions

Discrete feature

Discriminative model

Discriminator

Disparate impact

Disparate treatment

Divisive clustering

Downsampling

DQN

Dropout regularization

Dynamic model

Eager execution

Early stopping

Embedding space

Embeddings

Empirical risk minimization (ERM)

Ensemble

Environment

Episode

Epoch

Epsilon greedy policy

Equality of opportunity

Equalized odds

Estimator

Example

Experience replay

Experimenter’s bias

Explainable AI

Exploding gradient problem

Fairness constraint

Fairness metric

False negative (FN)

False positive (FP)

False positive rate (FPR)

Feature

Feature column (tf.feature_column)

Feature cross

Feature engineering

Feature extraction

Feature set

Feature spec

Feature vector

Federated learning

Feedback loop

Feedforward Neural Network

Feedforward neural network (FFN)

Few-shot learning

Fine-tuning

Forget gate

Full softmax

Fully connected layer

GAN

Generalization

Generalization curve

Generalized linear model

Generative adversarial network (GAN)

Generative adversarial networks (GANs)

Generative model

Generator

Gradient

Gradient clipping

Gradient Descent

Graph

Graph execution

Greedy policy

Ground truth

Group attribution bias

Hashing

Heuristic

Heuristics

Hidden layer

Hierarchical clustering

Hinge loss

Holdout data

Hyperparameter

Hyperplane

i.i.d.

Image recognition

Imbalanced dataset

Implicit bias

Incompatibility of fairness metrics

Independently and identically distributed (i.i.d.)

Individual fairness

Inference

In-group bias

Input

Input function

Input layer

Instance

Intelligence

Interpretability

Inter-rater agreement

Intersection over union (IoU)

IoU

Item matrix

Items

Iteration

Keras

Kernel Support Vector Machines (KSVMs)

Key points

K-means

K-median

L1 loss

L1 regularization

L2 loss

L2 regularization

Label

Labeled example

Lambda

Landmarks

Layer

Layers API (tf.layers)

Learning rate

Least squares regression

Linear Algebra

Linear model

Linear regression

Log Loss

Logistic regression

Logits

Log-odds

Long Short-Term Memory (LSTM)

Long Short-Term Memory Network (LSTM)

Long short-term memory networks (LSTMs)

Loss

Loss curve

Loss surface

LSTM

Machine learning (ML)

Machine learning model

Machine perception

Majority class

Markov decision process (MDP)

Markov property

Matplotlib

Matrix factorization

Mean Absolute Error (MAE)

Mean Squared Error (MSE)

Metric

Metrics API (tf.metrics)

Mini-batch

Mini-batch stochastic gradient descent (SGD)

Minimax loss

Minority class

ML

MNIST

Model

Model Capacity

Model function

Model training

Momentum

Multi-class classification

Multi-class logistic regression

Multilayer Perceptron (MLP)

Multinomial classification

NaN trap

Natural Language Processing (NLP)

Natural language understanding

Negative class

Neural Network

Neuron

N-gram

NLU

Node (neural network)

Node (TensorFlow graph)

Noise

Non-response bias

Normalization

Numerical data

NumPy

Objective

Objective function

Offline inference

One-hot encoding

One-shot learning

One-vs.-all

Online inference

Operation (op)

Optimizer

Out-group homogeneity bias

Outliers

Output layer

Overfitting

Pandas

Parameter

Parameter Server (PS)

Parameter update

Partial derivative

Participation bias

Partitioning strategy

Perceptron

Performance

Perplexity

Pipeline

Policy

Pooling

Positive class

Post-processing

PR AUC (area under the PR curve)

Precision

Precision-recall curve

Prediction

Prediction bias

Predictive parity

Predictive rate parity

Premade Estimator

Preprocessing

Pre-trained model

Prior belief

Proxy (sensitive attributes)

Proxy labels

PyTorch

Q-function

Q-learning

Quantile

Quantile bucketing

Quantization

Queue

Random forest

Random policy

Rank (ordinality)

Rank (Tensor)

Rater

Recall

Recommendation system

Rectified Linear Unit (ReLU)

Recurrent Neural Network

Recurrent neural network (RNN)

Regression model

Regularization

Regularization rate

Reinforcement learning (RL)

Replay buffer

Reporting bias

Representation

Re-ranking

Return

Reward

Ridge regularization

RNN

ROC (receiver operating characteristic) Curve

Root directory

Root Mean Squared Error (RMSE)

Rotational invariance

Sampling bias

SavedModel

Saver

Scalar

Scaling

Scikit-learn

Scoring

Selection bias

Semi-supervised learning

Sensitive attribute

Sentiment analysis

Sequence model

Serving

Session (tf.Session)

Shape (Tensor)

Sigmoid function

Similarity measure

Size invariance

Sketching

Softmax

Sparse feature

Sparse representation

Sparse vector

Sparsity

Spatial pooling

Squared hinge loss

Squared loss

State

State-action value function

Static model

Stationarity

Step

Step size

Stochastic gradient descent (SGD)

Stride

Structural risk minimization (SRM)

Subsampling

Summary

Supervised machine learning

Supervised Neural Network

Synthetic feature

Tabular Q-learning

Target

Target network

Temporal data

Tensor

Tensor Processing Unit (TPU)

Tensor rank

Tensor shape

Tensor size

TensorBoard

TensorFlow

TensorFlow Playground

TensorFlow Serving

Termination condition

Test set

tf.Example

tf.keras

Time series analysis

Timestep

Torch

Tower

TPU

TPU chip

TPU device

TPU master

TPU node

TPU Pod

TPU resource

TPU slice

TPU type

TPU worker

Training

Training set

Trajectory

Transfer learning

Translational invariance

Trigram

True negative (TN)

True positive (TP)

True positive rate (TPR)

Turing test

Unawareness (to a sensitive attribute)

Underfitting

Unlabeled example

Unsupervised machine learning

Unsupervised Neural Network

Upweighting

User matrix

Validation

Validation set

Vanishing Gradient Problem

Wasserstein loss

Weight

Weighted Alternating Least Squares (WALS)

Wide model

Width

Word Embedding

**Conclusion:**

As more companies and people learn about the uses of deep learning technology, the relevant tools are becoming increasingly available, and machine learning can be expected to become a more significant part of everyday life. After reading the deep learning glossary above, you will be better placed to understand the interactions and processes involved in various industries.