AI Glossary
Disclaimer: This glossary is for explanatory purposes only. Examples of AI systems, products or companies are included illustratively and do not imply endorsement or recommendation. The entries are designed to provide neutral definitions and context, not guidance on how or whether to use specific tools in editorial practice.
|
accountability: occurs when two parties are in a relationship where one is answerable for their actions to the other. Regulatory or social bodies can make AI designers and providers accountable for their actions. See also traceability. |
|
agent: a type of AI system that can carry out tasks using different tools and, if designed to have a memory, will retain or adapt to information about the user. Unlike chatbots, which respond mainly within a conversation, agents are designed to decide on and perform a sequence of steps toward a goal. They can use external tools or data, manage multi-step workflows and operate with a higher level of autonomy. Some are built into commercial platforms, while others are custom-made for in-house use only. Eg frameworks such as LangChain agents, or platform-integrated agents like Microsoft Copilot Researcher and similar task automation tools. See also chatbot and virtual assistant. |
|
algorithm: a defined set of instructions that a computer system follows to transform its inputs into outputs. |
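As an illustration only, the short Python sketch below shows an algorithm in this sense: a fixed sequence of steps that turns an input (a piece of text) into an output (a word count). The function name `count_words` is an assumption made for this example.

```python
def count_words(text: str) -> int:
    """Return the number of whitespace-separated words in the text."""
    words = text.split()   # step 1: split the input on whitespace
    return len(words)      # step 2: count the resulting pieces


print(count_words("a defined set of instructions"))  # prints 5
```

The same input always produces the same output, which is what distinguishes a defined set of instructions from a judgement call.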
|
artificial general intelligence (AGI): a hypothetical type of AI system that could match or outperform humans on all kinds of human tasks (intellectual, physical and emotional). Related terms include strong AI, broad AI, general AI and general-purpose AI. Compare artificial narrow intelligence. See also large language model and foundation model. |
|
artificial intelligence (AI): (1) a broad field of scientific research, focussed on computerised systems which perform tasks associated with human cognitive abilities (making choices, recognising patterns, processing language, solving problems etc). AI systems can show some degree of adaptability (to new data or environments) and autonomy (ability to act or make decisions without explicit human programming) during their development and/or deployment; (2) such a computerised system. When used in this sense, 'AI' is often accompanied by the word 'system' to help avoid attribution of human characteristics. See also machine learning and generative AI. |
|
artificial narrow intelligence (ANI): a type of AI system trained to perform one or a few specific tasks. Narrow AI systems typically cannot generalise beyond these tasks; in their specialised domain they tend to outperform more generalised AI systems. Also known as weak AI. Compare artificial general intelligence. See also small language model. |
|
bias: typically, unfair treatment of an individual or group, often based on an attribute such as gender or ethnicity. A bias may be deliberate or unintentional. AI systems can reflect, reproduce or reinforce human biases when they are trained on data which is itself unrepresentative or imbalanced. AI systems can also introduce biases when their algorithms connect data in unanticipated ways. See also ethical AI. |
|
chatbot (bot): a conversational AI system that interacts with human users in natural language inside a single app or chat. Chatbots react and respond to user prompts, which can include user-uploaded files, but do not typically act on files or systems outside the conversation. Chatbots are generally less complex than agents and often follow a scripted flow or single-task design, although some advanced ones can connect to tools for specific functions. Eg ChatGPT and Claude. See also virtual assistant and agent. |
|
confidentiality: shows respect for access or disclosure restrictions imposed on information, typically of a sensitive nature. |
|
context window: a finite quantity of previously seen information which an AI system uses as context for processing a new piece of information. |
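A minimal Python sketch of the idea, assuming the system simply keeps the most recent items and drops older ones; real AI systems may manage their context differently. The names `fit_to_context_window` and `window_size` are hypothetical.

```python
def fit_to_context_window(items: list[str], window_size: int) -> list[str]:
    """Keep only the most recent `window_size` items; older ones are dropped."""
    if window_size <= 0:
        return []
    return items[-window_size:]


conversation = ["first message", "second message", "third message", "fourth message"]
print(fit_to_context_window(conversation, 2))
# prints ['third message', 'fourth message']
```

In this sketch, anything that falls outside the window is no longer available as context for the next piece of information.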
|
corpus (corpora): a collection of information which has been assembled for training an AI system. A corpus typically contains material on topics which are relevant to, and representative of, the data an AI system will work on in deployment. See also data, training data and dataset. |
|
data: any form of information which can be collected, processed and/or interpreted. Data may be structured (organised and consistently formatted) or unstructured, whether sourced from real life or synthetic. AI systems are typically trained on large amounts of data. See also corpus. |
|
dataset: a collection of related data, used for training, testing or validating an AI system. A corpus is a specific and more structured type of dataset. See also data, training data and corpus. |
|
deepfake: an image (still or moving) or audio piece which simulates real people and situations to show something which did not happen. See also disinformation, misinformation and hallucination. |
|
disinformation: information which is false, inaccurate or manipulated and has been produced or shared with the deliberate intention to mislead. See also deepfake, misinformation and hallucination. |
|
ethical AI: follows a framework of moral principles, societal norms and regulatory frameworks during the design, development and deployment of an AI system. See also responsible AI. |
|
explainability: the degree to which the external context and internal functionality of a system can be expressed and/or justified in a reasonably understandable way. Compare transparency. |
|
fine-tuning: a technique in machine learning in which a pre-trained model, such as a large language model, is trained further on a smaller, task-specific dataset so that it performs a more context-specific task. See also large language model, foundation model, artificial narrow intelligence and small language model. |
|
foundation model: a type of generative AI model trained on a large dataset which can perform a large number of tasks. Foundation models can be adapted to perform various more specialised tasks. See also fine-tuning, generative AI, large language model and artificial general intelligence. |
|
generative AI (genAI): a subfield of machine learning concerned with systems that create new content (text, images, audio, code, video) based on patterns they learn in training data and prompts. The term is also used as an umbrella label for applications built on these models, such as chatbots, agents and virtual assistants. See also artificial intelligence, machine learning and large language model. |
|
guardrail: a restriction placed on an AI system to safeguard against potentially harmful or undesirable outputs. |
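A simplified Python sketch of one kind of guardrail: a check applied to an output before it is shown to the user. The blocked terms, the function name `apply_guardrail` and the replacement message are invented for illustration; guardrails in deployed systems are usually more sophisticated than a word list.

```python
# Hypothetical list of terms the system should never pass on to a user.
BLOCKED_TERMS = {"password", "credit card"}


def apply_guardrail(output: str) -> str:
    """Return the output unchanged, or a placeholder if it contains a blocked term."""
    lowered = output.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        return "[withheld by guardrail]"
    return output


print(apply_guardrail("Here is a summary of chapter two."))
# prints Here is a summary of chapter two.
print(apply_guardrail("The admin password is hunter2."))
# prints [withheld by guardrail]
```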
|
hallucination: an output from an AI system that appears plausible or true but is false or unsupported by the data it was trained on. Hallucinations can result from biased training data, incomplete training data or incorrect assumptions made by the system. See also disinformation, misinformation and deepfake. |
|
human in the loop (HITL): processes by which humans critically monitor and review outputs to correct and improve an AI system. A related idea is Expert in the Lead (XITL), where the human is the expert and retains the final authority over the AI’s decisions. See also ethical AI and accountability. |
|
large language model (LLM): a type of machine learning model trained on a vast text dataset and designed for tasks involving natural language. During training, LLMs learn statistical associations between tokens (chunks of text), which they then use to predict likely continuations of a given text. Compare small language model. See also machine learning, foundation model, natural language processing, token and training data. |
|
machine learning (ML): a subfield of artificial intelligence, concerned with machines that learn patterns from training data to make predictions on new data, with minimal human intervention. Such machines can learn to improve their performance over time. See also artificial intelligence, generative AI and training data. |
|
misinformation: information which is false or inaccurate and has been produced or shared without deliberate intention to mislead. See also deepfake, disinformation and hallucination. |
|
natural language processing (NLP): a subfield of machine learning concerned with systems which analyse, interpret, synthesise and generate language in a way that is both meaningful and useful to humans. NLP requires machines to be trained on natural language training data. See also large language model and natural language query. |
|
natural language query (NLQ): a type of prompt that allows users to express their information needs to an AI system in everyday conversational language. See also natural language processing and prompt. |
|
prompt: an input or instruction provided to an AI system to generate an output or response. The format of a prompt can be natural language, code, visual or other. See also prompt engineering. |
|
prompt engineering: the practice of crafting and refining prompts to get an AI system to consistently generate desired outputs. See also prompt. |
|
responsible AI: See ethical AI. |
|
retrieval-augmented generation (RAG): a method where an AI system retrieves relevant, up-to-date information from a database or the web and adds it to the prompt, so that its response draws on both that information and what it learned from its training data. This can improve accuracy and reduce hallucinations, although it does not eliminate them. See also training data, hallucination, generative AI and large language model. |
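The Python sketch below illustrates only the retrieval and prompt-building steps of RAG, under simplifying assumptions: documents are ranked by how many words they share with the query, and no real AI model or search service is called. The function names `retrieve` and `build_prompt` and the sample documents are hypothetical.

```python
def retrieve(query: str, documents: list[str], top_k: int = 1) -> list[str]:
    """Rank documents by the number of words they share with the query."""
    query_words = set(query.lower().split())

    def overlap(document: str) -> int:
        return len(query_words & set(document.lower().split()))

    return sorted(documents, key=overlap, reverse=True)[:top_k]


def build_prompt(query: str, documents: list[str]) -> str:
    """Place the retrieved text ahead of the question, ready to send to a model."""
    context = "\n".join(retrieve(query, documents))
    return f"Context:\n{context}\n\nQuestion: {query}"


documents = [
    "The style guide was updated in March.",
    "Invoices are due within 30 days.",
]
print(build_prompt("when was the style guide updated", documents))
```

Production systems typically rank documents by meaning rather than by shared words, but the overall pattern of retrieving first and generating second is the same.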
|
small language model (SLM): a type of machine learning model specialised for natural language tasks, and trained on a smaller, more specialised dataset than a large language model. Compare large language model. See also artificial narrow intelligence, fine-tuning and generative AI. |
|
tagging: annotating parts (or all) of a phrase with information such as part of speech, grammatical or semantic relationships, cultural views or other information. Tagging is often used to prepare training data for AI models. |
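As a rough illustration, the Python sketch below tags each word of a phrase with its part of speech by looking it up in a tiny hand-made word list. The word list `LEXICON`, the tag labels and the function name `tag_phrase` are assumptions made for this example; real tagging is done by trained annotators or dedicated software.

```python
# Hypothetical mini word list mapping words to part-of-speech tags.
LEXICON = {"the": "DET", "editor": "NOUN", "checks": "VERB", "text": "NOUN"}


def tag_phrase(phrase: str) -> list[tuple[str, str]]:
    """Pair each word with its tag, or 'UNK' if the word is not in the list."""
    return [(word, LEXICON.get(word.lower(), "UNK")) for word in phrase.split()]


print(tag_phrase("The editor checks the text"))
# prints [('The', 'DET'), ('editor', 'NOUN'), ('checks', 'VERB'), ('the', 'DET'), ('text', 'NOUN')]
```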
|
token: a chunk of text, usually a word or part of a word, that an AI model processes as a unit. Input length, context window size and usage cost are typically measured in tokens. See also context window, training data and dataset. |
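The Python sketch below gives a rough feel for tokens by splitting text into words and punctuation marks. This is a deliberate simplification: AI models generally use subword tokenisers, so their token counts for the same text will differ. The function name `naive_tokenise` is hypothetical.

```python
import re


def naive_tokenise(text: str) -> list[str]:
    """Split text into runs of word characters and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


tokens = naive_tokenise("Tokens measure input length.")
print(tokens)       # prints ['Tokens', 'measure', 'input', 'length', '.']
print(len(tokens))  # prints 5
```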
|
traceability: the extent to which it is possible to transparently follow and monitor a system’s decision-making steps and processes. See also accountability and transparency. |
|
training data: the information used to teach an AI system during its development. Training data is fed into machine learning models so that they can learn patterns and associations before being applied to new inputs. It can be structured or unstructured, and may include text, images, audio or other formats. The quality, representativeness and size of training data strongly affect how well an AI system performs. See also dataset, corpus and token. |
|
transparency: appropriate, easily understandable disclosure of information about a model’s external context, such as how people’s data is processed or what assessments were undertaken to confirm compliance. Compare explainability. |
|
virtual assistant: a type of AI system that interacts with users and performs tasks in response to queries. Virtual assistants combine conversational ability with access to specific applications or services, so they can complete tasks beyond the chat itself. Eg Microsoft Copilot in Word/Excel, Siri with Apple Intelligence, Alexa+. See also chatbot and agent. |