AI Glossary



AIOps, also known as Artificial Intelligence for IT Operations, refers to the use of artificial intelligence and machine learning for analytics to automate the monitoring of IT Operations.


API, short for Application Programming Interface, is a set of rules that allow different software applications to communicate and interact with one another.


Accuracy is a commonly used metric that measures the correctness of a model's predictions. It represents the ratio of correctly predicted instances to the total number of instances in the dataset.
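The ratio described above can be sketched in a few lines of Python. The function name `accuracy` and the label lists are illustrative assumptions, not part of any particular library.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)

# Hypothetical labels: 4 of the 5 predictions match the truth.
score = accuracy([1, 0, 1, 1, 0], [1, 0, 0, 1, 0])
```

Here `score` is 4 correct predictions divided by 5 instances.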

Activation Function

An activation function in a neural network determines a neuron's output from its weighted input. In its simplest form it acts as a gate that passes a signal only if the input exceeds a threshold; commonly used functions such as ReLU and sigmoid are smoother, non-linear variants of this idea.

Active Learning

A model chooses the most informative examples from an unlabeled set and requests labels for those examples from a human annotator.


An algorithm is a set of instructions that a computer follows to carry out a task. In machine learning, algorithms let a system learn from data how to operate without human assistance.


All of the digital products that are used. These could be inputs, outputs, etc.

Artificial Hallucination

Artificial hallucination occurs when a machine, such as a language model, generates a response that is untrue but presented as fact. Hallucinations can take the form of false news, false advertising, and false portrayals of people, things, and events.

Artificial Superintelligence

Artificial Superintelligence (ASI) is a hypothetical advanced form of artificial intelligence that would surpass human intelligence in every aspect. Notably, ASI would have high-caliber problem-solving and development skills.


AutoML, also known as automated machine learning, refers to the automation of manual tasks involved in the creation and training of machine learning models by data scientists.

BI (Business Intelligence)

The use of different technologies and tools to collect and analyze business data. BI provides companies with useful information and analysis to help in decision-making.


Backpropagation is an algorithm that refines the weights in a neural network to improve its output accuracy. It uses the chain rule to calculate the gradient of the loss function with respect to each weight, propagating the error backward through the network's layers. An optimizer such as gradient descent then uses these gradients to adjust the neuron weights.


A batch is a set of training examples that are processed together during the training phase of a model. The model performs its computations on the batch and then updates its parameters.

Bayesian Networks

Bayesian networks depict the probabilistic ties between variables using graphical models. Using Bayes' theorem, they estimate outcome probabilities from given data. These networks, a subset of Probabilistic Graphical Models, excel in tasks like diagnosis, decision-making, and anomaly detection, deriving insights from both data and expert input.


Bias is a phenomenon that occurs when an algorithm produces prejudiced results due to flawed assumptions in the machine learning process.

Bounding Box

An imaginary rectangle drawn around an object in an image. In image processing projects it serves as a point of reference for object detection and can be used to create collision boxes.


Continuous Delivery (CD) completes the CI/CD practice. Merged changes are automatically tested and integrated into the repository, where they await deployment. Deployment to production relies on manual intervention by staff. CD can also refer to Continuous Deployment, where automation plays a larger role and changes are deployed to production without manual intervention.


Continuous Integration (CI) is an approach where developers merge, integrate, and test code changes in an automated process. It involves merging changes into a shared repository as early as possible, accompanied by automated testing to ensure functionality.


A CPU (central processing unit) is the processor that executes a computer's basic commands, including arithmetic, logical operations, and input/output operations.

Categorical Data

Categorical data is information categorized by names or labels rather than numerical measurements. It is a qualitative type of data that is grouped into distinct classifications.


Checkpoints are snapshots of your working model taken during training and saved in non-volatile memory.


Cloud is the term used to describe the servers that are available through the Internet, and the software and databases that can be accessed through those servers.

Cloud Computing

Anything that involves delivering hosted services through the internet.


Clustering is a machine learning process in which data is organized into subgroups whose members are similar to one another.

Code Editor

Tools commonly used by web developers and programmers to write and edit code.

Collision box

A collision box refers to an imaginary shape that detects interactions between objects, helping determine whether or not objects are colliding with each other.

Computer Vision

A field of computer science concerned with allowing computers to identify and analyze images or videos. It develops algorithms and models that extract information and perform tasks with that information, including object recognition, image classification, image segmentation, and object tracking.


A container is a software unit or package that includes all necessary elements and configurations to reliably run an application across various computing environments.

Continuous Learning

Continuous Learning (CL) refers to a model continually learning from a flow of data. CL involves the model self-learning from previous data and continuously adapting its mechanisms based on its takeaways as it goes through the data stream.

Convolutional Neural Network (CNN)

A convolutional neural network (CNN) is a deep learning neural network designed for processing structured arrays of data such as images.


Cross-validation assesses ML models by training them on subsets of the data and testing them on the complementary subsets. It helps detect overfitting, a phenomenon where models memorize the training data and fail to generalize to new data.
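As a minimal sketch of how the data can be split into training and test subsets, the helper below produces k contiguous folds. The name `k_fold_indices` is hypothetical, and real projects usually shuffle the data first and use a library routine.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, test_idx) pairs for k contiguous folds."""
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test_idx = list(range(start, start + size))
        train_idx = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train_idx, test_idx
        start += size

# Each sample lands in the test subset exactly once across the 3 folds.
folds = list(k_fold_indices(n_samples=6, k=3))
```

A model would be trained on each `train_idx` and evaluated on the matching `test_idx`, and the k scores averaged.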

Curse of Dimensionality

The Curse of Dimensionality refers to challenges that arise when working with high-dimensional data, where the dimensionality corresponds to the number of dataset attributes or features. As dimensions increase, there's an exponential growth in the computational effort needed for data processing and the data points required for effective machine learning. These challenges make analyzing, visualizing, and training machine learning models increasingly difficult.

Data Annotation

The process of labeling individual elements of training data (text, images, audio, or video) to help machines understand what the data contains and what is important in it.

Data Engineering

Data engineering is the process of collecting, analyzing, and transforming raw data into a usable and readable format.

Decision Tree

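A decision tree is a supervised machine learning method that predicts an outcome by asking a sequence of questions about the input, where the answer to each question determines which question is asked next.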


DevOps is a set of cultural philosophies, practices, and tools that improves an organization's ability to deliver applications and services efficiently.


Docker is a software platform that packages everything needed to create and run an application, such as libraries, tools, and code. Docker containers are versatile, efficient, standardized, and scalable.


Dropout is a regularization technique in neural networks where randomly selected nodes are temporarily omitted during training to enhance performance and reduce overfitting. By temporarily removing these nodes and their connections, it simulates training various network architectures in parallel.


An endpoint is a functioning URL that reroutes to a model upon the input of required information.

Ensemble Methods

Ensemble methods combine multiple machine learning models to improve overall performance and reduce overfitting. By leveraging the strengths of various individual models, they often achieve more accurate and robust predictions than any single model alone.


An epoch is a single pass through the entire training dataset during the model's training process.

Error Rate

The error rate measures the degree of prediction error of a model with respect to the true values. For classification, it is the fraction of instances that are predicted incorrectly.

Explainable AI

Explainable AI is artificial intelligence in which humans can understand the reasoning and the decisions produced. This AI model provides explanations, visualizations, and sets of rules to improve transparency and interpretability in AI systems.

F1 Score

The F1 score is the harmonic mean of precision and recall, serving as a performance metric primarily for binary classification, though multi-class extensions are available.
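A minimal sketch of the binary case follows. The function name `f1_score` and the sample labels are illustrative assumptions; returning 0.0 when there are no true positives is a common convention rather than a fixed rule.

```python
def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # convention used here when precision or recall is undefined or zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical labels: 2 true positives, 1 false positive, 1 false negative.
f1 = f1_score([1, 1, 1, 0, 0], [1, 1, 0, 1, 0])
```

With precision and recall both equal to 2/3 in this example, the harmonic mean is also 2/3.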


A feature is a variable that is used as an input to a machine learning model.

Feature Engineering

The manipulation of data – such as addition, deletion, and combination – to improve the machine learning model’s performance.

Feature Store

A feature store is a data system tailored to machine learning that centralizes the storage, processing, and access of commonly used features. This makes it more efficient to reuse those features in future models.

Federated Learning

A machine learning technique that trains an algorithm across several independent sessions or devices, each using its own distinct dataset, without pooling the data in one place.

Few-Shot Learning

Few-shot learning is a machine learning technique in which the training data set contains only a limited number of samples.

Fine Tuning

Fine-tuning involves taking the weights from a pre-trained network and utilizing them as the initial parameters for training a new network.


A GPU (graphics processing unit) is a specialized processor that speeds up graphics rendering. A CPU (central processing unit) works together with a GPU to increase the throughput of data and the number of concurrent calculations within an application.

General AI

General AI, or artificial general intelligence (AGI), can understand, learn, and perform a wide range of intellectual and cognitive tasks that replicate those of humans.

Generative AI

Generative AI uses AI algorithms and machine learning to generate new content in the form of text, images, audio, code, and data given user input.

Generative Adversarial Network (GAN)

A Generative Adversarial Network (GAN) is a machine learning model in which two neural networks compete with each other: a generator produces synthetic data, and a discriminator tries to tell it apart from real data. Through this competition, both networks become more accurate.

Generative Pre-trained Transformer (GPT)

Generative Pre-trained Transformers are a family of neural network models that use the transformer architecture. They are a key advancement in artificial intelligence and power generative AI applications such as ChatGPT.

Gradient Boosting

Gradient boosting is a machine learning method that creates an ensemble of weak prediction models to address regression and classification issues. It builds models iteratively, enhancing accuracy by optimizing a differentiable loss function. By continuously fitting new models to the negative gradient of the loss, weak learners collectively evolve into a robust predictor.

Gradient Descent

An optimization algorithm that iteratively adjusts a neural network's parameters in the direction that reduces the loss, so that the network's loss (error) becomes as small as possible.
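The update rule can be sketched on a one-parameter toy problem. The loss f(x) = (x - 3)^2, the starting point, and the learning rate below are all assumptions chosen for illustration; a real network has many parameters and a far more complex loss.

```python
def gradient_descent(grad, start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient to reduce the loss."""
    x = start
    for _ in range(steps):
        x -= learning_rate * grad(x)
    return x

# Toy loss f(x) = (x - 3)^2 has gradient 2 * (x - 3) and its minimum at x = 3.
x_min = gradient_descent(lambda x: 2 * (x - 3), start=0.0)
```

Each step shrinks the distance to the minimum by a constant factor here, so `x_min` ends up very close to 3.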

Graph Neural Networks (GNNs)

Graph neural networks (GNNs) use deep learning to analyze data structured as graphs, with nodes representing data points and edges symbolizing relationships. These networks make predictions on nodes, edges, or entire graphs by mathematically expressing their elements. GNNs excel in modeling intricate relationships, offering a flexibility beyond traditional neural networks.

Graph Search

The term graph search or graph traversal refers to a class of algorithms that systematically explore the vertices and edges of a graph. Graph-search algorithms can be used to compute many interesting properties of graphs.

Grid Search

A grid search defines the search space as a grid of hyperparameter values and evaluates every position in the grid.
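A minimal sketch of the idea is shown below. The names `grid_search` and `toy_score` are hypothetical; `toy_score` stands in for "train a model with these hyperparameters and return its validation score".

```python
from itertools import product

def grid_search(score_fn, grid):
    """Evaluate every combination in the grid and return the best one."""
    names = list(grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Hypothetical scoring function that peaks at learning_rate=0.1, depth=4.
def toy_score(learning_rate, depth):
    return -((learning_rate - 0.1) ** 2) - (depth - 4) ** 2

best_params, best_score = grid_search(
    toy_score, {"learning_rate": [0.01, 0.1, 1.0], "depth": [2, 4, 8]}
)
```

The cost grows with the product of the list lengths (9 evaluations here), which is why grid search becomes expensive as more hyperparameters are added.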

Heuristic Algorithm

A method for tackling hard problems, such as NP-hard problems, by finding a good-enough solution rather than a guaranteed optimal one. This is done by trading accuracy for speed.

Hidden Layer

A hidden layer in an artificial neural network lies between the input and output layers, processing weighted inputs via an activation function. It is essential to most neural networks, emulating activities similar to those in the human brain.


Hyperparameters are parameters that are determined and set before the training process begins on a machine learning model. Unlike model parameters, which are updated throughout training, hyperparameters are not changed during training and can affect the learning process.

Hyperparameter Search

Hyperparameter search refers to the process of finding the optimal hyperparameters. It involves training models with various hyperparameter settings and observing how well they perform.

Hyperparameter Tuning

Hyperparameter tuning consists of finding a set of optimal hyperparameter values for a learning algorithm and then applying the tuned algorithm to a data set. The chosen combination of hyperparameters maximizes the model's performance by minimizing a predefined loss function, producing better results with fewer errors.


An IDE, short for Integrated Development Environment, is software that combines the most common tools for developing applications, such as text editing, debugging, and compiling.

Image Classification

Image classification refers to the process of assigning labels to images based on trained data of pre-labeled examples. This involves intricate pixel-level analysis to determine the image's overall appropriate label.

Image Segmentation

Image segmentation is a computer vision process that divides images into segments, further extending the concept of object detection. Each segment, determined at the pixel level, precisely outlines and labels objects within the image. The resulting segments, or outputs, are often color-highlighted based on the type of segmentation.

Imbalanced Data

This refers to datasets involving two or more classes in which the classes are not represented equally. For example, you may have a binary classification task with 100 instances, 95% representing one class, and only 5% of the other. This imbalance can skew the learning of the algorithm to be more biased towards the majority class.


In machine learning, inference is the process of using a trained model to make predictions or draw conclusions from new data.


An instance refers to a single row of data within the dataset.


JavaScript Object Notation (JSON) is a lightweight format for data storage and transmission. JSON example: the following defines a customers object, an array of 3 customer records (objects):

    {
      "customers": [
        {"firstName": "Amber", "lastName": "Smith"},
        {"firstName": "Bob", "lastName": "Jackson"},
        {"firstName": "Casandra", "lastName": "Ross"}
      ]
    }
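As a short illustration, the customers example can be parsed and re-serialized with Python's standard `json` module. The variable names are arbitrary choices for this sketch.

```python
import json

text = """
{"customers": [
  {"firstName": "Amber", "lastName": "Smith"},
  {"firstName": "Bob", "lastName": "Jackson"},
  {"firstName": "Casandra", "lastName": "Ross"}
]}
"""

data = json.loads(text)  # JSON text -> Python dict and lists
first_names = [customer["firstName"] for customer in data["customers"]]
round_trip = json.loads(json.dumps(data))  # serialize and parse again
```

JSON objects map to Python dictionaries and JSON arrays to lists, so the records can be handled with ordinary indexing and loops.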


Keras is a high-level deep learning application programming interface (API) written in Python and originally developed by François Chollet at Google. It is used to make implementing neural networks more efficient.


K-means is a machine learning algorithm that groups similar data points into k clusters based on common features. It repeatedly assigns each point to the nearest cluster center and recomputes each center as the mean of the points assigned to it.
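A minimal one-dimensional sketch of this grouping is shown below. It alternates between assigning each point to its nearest cluster center and moving each center to the mean of its points. The function name, the fixed starting centers, and the fixed iteration count are assumptions made to keep the example deterministic.

```python
def k_means_1d(points, centroids, iterations=10):
    """Cluster 1-D points by alternating assignment and center updates."""
    centroids = list(centroids)
    clusters = []
    for _ in range(iterations):
        # Assignment step: put each point in the cluster with the nearest center.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its points
        # (an empty cluster keeps its previous center).
        centroids = [
            sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)
        ]
    return centroids, clusters

centroids, clusters = k_means_1d([1.0, 2.0, 3.0, 10.0, 11.0, 12.0], centroids=[0.0, 5.0])
```

On this toy data the two centers settle at the means of the low group and the high group after a couple of iterations.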


When raw data has been given a label, it has been tagged with more relevant information so that machine learning has context when applying its algorithms.

Language Model for Dialogue Applications (LaMDA)

Language Model for Dialogue Applications (LaMDA) is a recently developed technology that Google created with the intention of launching a dialogue application that is capable of free-flowing conversations about a wide variety of topics.

Large Language Model (LLM)

Advanced AI models with a very large number of parameters, often billions, that focus on natural language processing and natural language generation tasks.


Latency is a measurement of time which indicates the number of seconds it takes to process one unit of data given that only one unit is examined at a time.


Layers are sub-structures within a deep learning model. Each layer takes the information it received from the previous layer, transforms it, and passes the result to the next layer.

Learning Rate (LR)

The learning rate sets the speed at which a model learns from data by adjusting the size of parameter changes.

Linear Regression

Linear regression is a method by which a linear function is used to show the relationship between a response variable and one or more explanatory variables.
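For a single explanatory variable, the best-fitting line can be computed in closed form with ordinary least squares. The sketch below assumes that setting; the function name `fit_line` and the sample points are illustrative.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical points that lie exactly on y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

Because the sample points lie exactly on a line, the fit recovers its slope and intercept; with noisy data it returns the line that minimizes the squared error.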

Loss Function

A loss function is a mathematical function that is typically used during training to calculate the loss, that is, the gap between the model's predictions and the true values, on a batch of examples.


MLOps (Machine Learning Operations) is a core function of Machine Learning engineering, focused on streamlining the process of taking machine learning models to production, and then maintaining and monitoring them.

Machine Learning

The capability of a machine to learn from data in a way that replicates human behavior, improving its accuracy over time.


A model is a software application that examines datasets in order to identify patterns and generate predictions.


Monitoring in machine learning and AI consists of analyzing data from deployed models to track their performance.


Natural language processing (NLP) is a branch of AI that focuses on giving computers the ability to interpret, manipulate, and comprehend human language.


NTrees is an important tuning parameter for ML that dictates the number of trees generated within the model. For reference, decision trees are logic trees where the data is split and sorted repeatedly by decision points referred to as nodes.

Naive Bayes

Naive Bayes classifiers are a family of classification algorithms based on Bayes' theorem that assume the features are independent of one another. They are often used as a first step toward classifying complicated datasets.

Narrow AI

Narrow AI is a learning algorithm that is programmed to complete a single task without human assistance.

Neural Network

A neural network enables computers to recognize patterns, solve complex problems, and model highly volatile data.

No Free Lunch Theorem (NFLT)

The No Free Lunch Theorem (NFLT) posits that all optimization algorithms, on average, perform equally well across all possible objective functions. It underscores that there's no singular best optimization algorithm. Similarly, in the realm of machine learning, there is no one superior algorithm for tasks such as classification and regression.


Notebooks are a collaborative computing service used by data scientists and machine learning practitioners. They provide a platform for writing, visualizing, and collaborating on code.

Object Detection

Object detection, a computer vision method, identifies and locates objects within images or videos using machine learning or deep learning. The aim is to emulate the quick recognition abilities that humans naturally possess.

One-Shot Learning

One-shot learning is a machine learning setup in which the model can recognize objects or concepts when given only one or a very limited number of examples.


In deep learning, an optimizer is an algorithm that adjusts the neural network's weights, and in some cases settings such as the learning rate, to minimize the loss function. It aids in enhancing model accuracy and reducing overall loss.


Orchestration refers to the management of various machine learning tasks to “orchestrate” a larger scale of tasks.


Pandas is a Python library used for data analysis and machine learning. Pandas can efficiently analyze, manipulate, and preprocess big data used in AI applications.


A model parameter is an internal configuration variable within the model, which can be estimated based on the provided data. Parameters are required by the model when making predictions on new data.


A pipeline is a series of automated data processing operations involved in ingesting and moving raw data from disparate sources to a destination.


Post-processing is a set of procedures that help investigate databases and refine acquired knowledge. These procedures include various pruning routines, rule filtering, rule combination, model combination, and more.

Pre-Trained Model

A pre-trained model is a previously created network that was trained to solve a similar problem, providing individuals looking to build an AI model with a starting point.


Pre-processing refers to any preliminary steps, such as formatting and filtering data, that are taken to prepare raw data for further analysis or primary processing.


A prediction is the probable values for unknown variables generated by an algorithm that has been trained on a historical data set.

Principal Component Analysis (PCA)

Principal Component Analysis (PCA) is an unsupervised learning technique used for dimensionality reduction, transforming correlated features into linearly uncorrelated ones called Principal Components through orthogonal transformation. This process aids in exploratory data analysis and predictive modeling by highlighting strong patterns while preserving as much of the data's variance as possible. PCA effectively projects high-dimensional data onto a lower-dimensional surface.


Production systems consist of a global database, production rules, and a control system, which together form a rule-based AI program.


Random Forest

A random forest is a machine learning technique that combines many decision trees to solve classification and regression problems.

Recommendation Systems

A recommendation system is an AI algorithm that assists in narrowing down a large selection to a few desirable items.

Recurrent Neural Networks (RNN)

A recurrent neural network (RNN) is a type of artificial neural network designed for sequential or time series data, often used in tasks like language translation and speech recognition. Unlike traditional neural networks, RNNs have "memory", allowing past inputs to influence current outputs. This unique characteristic enables RNNs to consider the entire context of a sequence, unlike traditional networks that treat inputs and outputs as independent entities.

Regression Model

A regression model makes numerical predictions by modeling the relationship between one or more independent variables and a dependent variable.


Regularization is a crucial machine learning technique that prevents overfitting by introducing a penalty to complex models. By adding this penalty or complexity term, the model is deterred from fitting noise in the training data. This ensures more generalizable results and better performance on unseen data.

Reinforcement Learning (RL)

Reinforcement learning is a group of algorithms in which an agent learns by interacting with an environment, choosing actions so as to maximize a reward.

SHAP Values

SHAP values are a method based on concepts from game theory that can be used to explain the predictions of machine learning models.


Scoring occurs when a model that was built from a past dataset is applied to a new dataset in order to generate predictions and gain insight for future improvement.


A search algorithm is a procedure for reaching a goal state from the initial state of a problem by systematically exploring possible solutions.

Selection Bias

Selection Bias refers to the prejudice that occurs when chosen data sets do not reflect the data distribution of the real world.

Sentiment Analysis

Sentiment analysis is a technique that incorporates statistical and machine learning algorithms to identify the emotional meaning of communications.

Sigmoid Function

The sigmoid function maps input values to a range between 0 and 1, commonly serving as an activation function in feedforward neural networks for binary classification and logistic regression tasks.
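The mapping is sigmoid(x) = 1 / (1 + e^(-x)). The sketch below implements it with the standard library; the two-branch form is an implementation choice to avoid overflow for large negative inputs, not part of the definition.

```python
import math

def sigmoid(x):
    """Map any real number into the open interval (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    # Equivalent form that avoids computing exp of a large positive number.
    z = math.exp(x)
    return z / (1.0 + z)

midpoint = sigmoid(0.0)  # the curve crosses 0.5 at x = 0
```

Outputs can be read as the probability of the positive class in binary classification, with 0.5 as a typical decision threshold.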

Softmax Function

The softmax function transforms a vector of real numbers into a probability distribution suitable for multi-class classification in neural networks. It exponentiates each element for positivity and then normalizes them to ensure a cumulative sum of 1.
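The exponentiate-then-normalize steps can be sketched as follows. Subtracting the maximum value before exponentiating is a common numerical-stability trick added here as an assumption; it does not change the result.

```python
import math

def softmax(values):
    """Turn a list of real numbers into probabilities that sum to 1."""
    shift = max(values)  # stability trick; cancels out in the ratio
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([1.0, 2.0, 3.0])
```

Larger inputs receive larger probabilities, and the ordering of the inputs is preserved in the outputs.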

Software Development Kit (SDK)

A Software Development Kit (SDK) is a set of software-building tools for a specific platform, including the building blocks, debuggers and, often, a framework or group of code libraries such as a set of routines specific to an operating system. SDKs provide a comprehensive collection of tools that enable software developers to build software applications faster and in a more standardized way.

Stable Diffusion

Stable Diffusion is an advanced AI image generator. It is known to be extremely flexible, allowing for users to train their own models based on their individual datasets.

Stochastic Gradient Descent

Stochastic Gradient Descent (SGD) is a variant of Gradient Descent used to optimize machine learning models. Unlike traditional Gradient Descent that uses the entire dataset for each update, SGD uses just one random data point (or a small batch) per iteration. This makes it faster, especially with large datasets, but introduces some randomness, hence "stochastic."
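A minimal sketch of the single-data-point variant is shown below, fitting the slope w of a model y = w * x with a squared-error loss. The function name, toy data, learning rate, and seed are all assumptions chosen for illustration.

```python
import random

def sgd_fit_slope(xs, ys, learning_rate=0.01, steps=200, seed=0):
    """Fit y = w * x using one randomly chosen data point per update."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    w = 0.0
    for _ in range(steps):
        i = rng.randrange(len(xs))  # pick a single random example
        error = w * xs[i] - ys[i]
        # Gradient of (w * x - y)^2 with respect to w is 2 * error * x.
        w -= learning_rate * 2 * error * xs[i]
    return w

# Hypothetical data generated from y = 2x, so the ideal slope is 2.
w = sgd_fit_slope([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```

Each update looks at only one example, so it is cheap, but the path toward the minimum depends on which examples happen to be drawn.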

Structured Data

Structured data is data that has a standardized format for efficient access by software and humans alike. It is typically tabular with rows and columns that clearly define data attributes. Computers can effectively process structured data for insights due to its quantitative nature.

Supervised ML

Supervised machine learning, which is most commonly used to make predictions or classify data, relies on labeled datasets to train models.

Support Vector Machine (SVM)

Support Vector Machine (SVM) is a supervised learning algorithm mainly used for classification tasks. It aims to establish the optimal hyperplane that best separates different classes within an n-dimensional space. This hyperplane is determined by the extreme data points called support vectors, giving the algorithm its name.


In machine learning, a tensor is a multi-dimensional array that stores numerical data. In mathematics and physics, a tensor changes under strict rules when there is a change of coordinates.

Tensor Processing Unit (TPU)

A Tensor Processing Unit (TPU) is a computer chip developed by Google that handles mathematical operations in machine learning swiftly and efficiently.


TensorBoard is a visualization tool that displays how well a machine learning framework, usually TensorFlow, is working.


TensorFlow is a publicly accessible open-source AI framework developed by Google. It is a framework for more efficient use of machine learning and numerical computation.


Tokens are AI-based currencies that are either powered by or utilized in AI products.


Training is the process of teaching an AI model to interpret data properly. The model is primarily taught to perform with accuracy and speed.

Training Set

Training sets are sample data used to train machine learning (ML) models. They are fed to ML algorithms, which learn patterns from the data in order to produce a more accurate model.

Transfer Learning

Transfer learning is an efficient method of reusing the knowledge a model gained on one machine learning task as the starting point for another.


A transformer is a deep learning model that uses self-attention to transform one sequence of data into another, such as a sentence in one language into a sentence in another.

Tree Search

A tree search is a method in AI that navigates through a hierarchical structure to perform various functions. It is called a tree search as the process of finding information resembles a tree.


Underfitting arises when a model fails to grasp the underlying patterns between input and output variables, leading to high errors on training and test data. This issue often stems from overly simplistic models, insufficient training time, inadequate input features, or excessive regularization.

Unstructured Data

Unstructured Data refers to data that does not have a predefined layout and lacks a consistent and organized structure like a database. It instead appears as other various formats such as text documents, images, audio, video, and more, which ultimately results in a relatively complex analyzing process compared to structured data.

Unsupervised ML

Unsupervised ML analyzes, clusters, and associates unlabeled or raw datasets using machine learning algorithms.


In AI, a variable is a labeled placeholder for data, so that AI systems can store and manipulate the data within when needed.


A vector is a quantity that has both magnitude and direction.


In AI, weight refers to a value for the strength in connection between artificial neurons. This determines how information is utilized within a neural network.


XGBoost, which stands for Extreme Gradient Boosting, is a scalable, distributed gradient-boosted decision tree (GBDT) machine learning library. It provides parallel tree boosting and is a widely used machine learning library for regression, classification, and ranking problems.


YAML is a human-readable data serialization standard that can be used in conjunction with all programming languages and is often used to write configuration files. It can be used with nearly any application that needs to store or transmit data because it’s made up of elements of other languages.

Zero-Shot Learning

Zero-shot learning is a machine learning setup in which the model can recognize objects or concepts it has never explicitly been given in training.