Understand AI, ML & Co in Contact Centers: Definitions & Explanations

Written by Gennadiy Bezkorovayniy | December 8, 2023 at 5:55 AM

Whether you haven't officially dabbled with Contact Center AI yet or you are a trailblazer, you will have heard different, confusing, and sometimes conflicting things about what Artificial Intelligence (AI) can and cannot do.

At MiaRec, we have provided hundreds of contact centers with AI-based Voice Analytics and Quality Management solutions to monitor and improve customer service. We see firsthand how difficult it can be to sort through all the hype and noise out there. But it is imperative that you, as a contact center manager, truly understand what AI and its various subsets are and how they overlap and interact with each other.

To save you time (and sanity), we have defined and explained the most important terms for you in a non-technical, easy-to-understand way. By the end of this article, you will be more confident speaking about and evaluating AI solutions for your contact center.

First, let us define and differentiate Artificial Intelligence (AI) and Machine Learning (ML). Because there is a lot of overlap between these terms, many people use them interchangeably. However, in order to understand how AI supports contact centers, it is important for you to understand the nuance between these terms.

Artificial Intelligence (AI)

Columbia’s School of Engineering defines Artificial Intelligence (AI) as "the field of developing computers and robots that are capable of behaving in ways that both mimic and go beyond human capabilities".

"Today, artificial intelligence is at the heart of many technologies we use, including smart devices and voice assistants such as Siri on Apple devices. Companies are incorporating AI-driven techniques such as natural language processing and computer vision — the ability for computers to use human language and interpret images — to automate tasks, accelerate decision making, and enable customer conversations with chatbots."

Artificial Intelligence isn’t anything new. In fact, the father of computer science, Alan Turing, posed the question “Can machines think?” as early as 1950. The term “Artificial Intelligence” was coined by John McCarthy, later a Stanford professor, for the 1956 Dartmouth workshop; he defined it as “the science and engineering of making intelligent machines.”

Artificial Intelligence (AI) is a broad field that encompasses Machine Learning, Deep Learning, Natural Language Processing, Large Language Models, and more. To better understand the relationships between all these terms, check out the diagram in Figure 1. We will define each of these terms soon.

Figure 1. Relationship between AI, ML, DL, NLP, and Conversational AI terms.

Machine Learning (ML)

In "Machine Learning, Explained", published by MIT's Sloan School of Management, MIT professor Thomas W. Malone posits that most people use AI and ML interchangeably because “most of the current advances in AI have involved Machine Learning”. Nevertheless, machine learning is a subset of the broader category of AI.

Machine learning (ML) is a subset of AI that uses algorithms to enable machines and computer systems to learn to recognize complex patterns and insights from data without explicit programming. With ML, a machine can learn to recognize human speech, understand the meaning of texts, or even generate new content, for example, write an essay.

Can we have an AI system without using Machine Learning? Yes, indeed. Simple AI systems can be explicitly programmed by their creators (engineers and scientists). In fact, in the early 1960s, IBM developed and demonstrated "Shoebox", an innovative device capable of recognizing and responding to 16 spoken words, including the ten digits from "0" through "9".

Such explicit programming is tedious and labor-intensive. That's why modern AI systems use machine learning algorithms that let computers learn from data themselves. In general, the more data an algorithm “learns” from, the better it performs.
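For readers who like to see the idea in code, here is a minimal sketch of the difference between an explicitly programmed rule and a system that learns from examples. Everything in it is an illustrative assumption: the function names, the keyword list, and the six sample sentences are made up, and the word-counting "model" is a toy stand-in for real machine learning algorithms.

```python
from collections import Counter

# Explicitly programmed: a person writes the rule by hand.
# The keyword list is hypothetical.
def rule_based_is_complaint(text):
    return any(word in text.lower().split() for word in ("refund", "broken"))

# Toy machine learning: count how often each word appears in labeled
# examples, then score new text with those counts.
def train(examples):
    counts = {True: Counter(), False: Counter()}
    for text, is_complaint in examples:
        counts[is_complaint].update(text.lower().split())
    return counts

def learned_is_complaint(counts, text):
    words = text.lower().split()
    complaint_score = sum(counts[True][w] for w in words)
    other_score = sum(counts[False][w] for w in words)
    return complaint_score > other_score

# Made-up training data: (sentence, is it a complaint?)
examples = [
    ("my order arrived broken", True),
    ("i want a refund now", True),
    ("the agent was rude and unhelpful", True),
    ("thank you for the quick help", False),
    ("great service today", False),
    ("what are your opening hours", False),
]
model = train(examples)

new_call = "your agent was rude"
print(rule_based_is_complaint(new_call))      # False: no hand-written keyword matches
print(learned_is_complaint(model, new_call))  # True: learned from the examples
```

The hand-written rule misses the complaint because nobody thought to add "rude" to the keyword list, while the learned version picks it up from the examples. Adding more labeled examples improves the learned version without anyone rewriting the rule.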

Artificial General Intelligence (AGI)

If we define Artificial Intelligence, then we should also define Artificial General Intelligence, because for many people the distinction is not clear. Artificial General Intelligence (AGI) has also received a lot of hype recently thanks to the remarkable achievements of Large Language Models, but what current systems can do is often overstated in the media.

Artificial General Intelligence (AGI), also called "Strong AI" or "General AI", is a branch of (currently theoretical) AI research aiming to develop an artificial intelligence with a human level of cognitive function, including the ability to teach itself.

In contrast to narrow or weak AI, AGI can theoretically solve many complex problems autonomously across different knowledge specialties without having been specifically trained on them.

The key word here is “theoretical”. As of now, systems with general intelligence do not exist in practice. In contrast, AI systems are used in many areas to solve a single or a small number of problems, like recognizing speech and translating it into text, determining the sentiment of a conversation, extracting information from text, etc.

Difference between Artificial Intelligence and Artificial General Intelligence (AGI)

Artificial Intelligence (AI)	Artificial General Intelligence (AGI)
AI mimics human intelligence.	AGI equals human intelligence, theoretically.
AI is trained by data scientists on specific, singular, or a limited number of tasks.	AGI possesses common sense and creativity and expresses emotions.
Examples: image recognition, speech-to-text transcription, text generation, chatbots.	Has the ability to learn, apply knowledge, generalize, and plan ahead.
Is not self-aware; has no consciousness or ability to think on its own.	Not fully realized yet. Some doubt if it ever will be.

Deep Learning

Deep Learning is a subset of Machine Learning that uses artificial neural networks to mimic the learning process of the human brain. This enables Deep Learning systems to achieve more complex outcomes than other types of AI.

In traditional machine learning, the programmer must be extremely specific when telling the computer what types of things it should look for to decide, for example, whether or not an image contains a dog. This is a laborious process called feature extraction, and the computer's success rate depends entirely upon the programmer's ability to accurately define a feature set for a dog.

The advantage of deep learning is that the program builds the feature set by itself, without a programmer having to define it.

To understand this process, let's look at a computer vision model that predicts people's age from pictures. Computers learn by example. In the paper "Identifying Individual Facial Expressions by Deconstructing a Neural Network" by Arbabzadah et al., the authors demonstrated that the model learned the key indicators of age without human guidance, just by processing many pictures of older and younger people.

Figure 2 shows "three basic features, that according to the model, are indicative for advanced age. Indicator arrow (a) shows that the earlobe is seen as evidence for advanced age. Furthermore, indicator arrow (b) points at a significant wrinkle around the eye region. Wrinkles around the eye are indeed indicators of advanced age. However, the most dominant feature indicating advanced age are saggy eyelids. Gray hair is partially identified as relevant, but not always as the most relevant indicator. Wrinkles, for example, are a more reliable indicator of age."

To achieve good results, Deep Learning requires both a large amount of labeled data and computing power. These two resources can often be in limited supply.

Natural Language Processing

Natural Language Processing (NLP) is a branch of AI that focuses on understanding human (natural) language. It studies how computers can interpret and manipulate human language.

Computers use NLP models to interpret and respond to text-based input, making it particularly useful for applications such as Sentiment Analysis and chatbots.

[Large] Language Model (LLM)

A Language Model is a probabilistic model of a natural language that can generate probabilities of a series of words, based on text corpora it was trained on. For example, for the sequence of words “She is wearing a red …”, the language model might predict the word “dress” as the most probable.
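To make the "most probable next word" idea concrete, here is a minimal sketch of a very simple language model. It only counts which word follows which in a tiny made-up corpus (a so-called bigram model). The corpus and function names are illustrative assumptions, and real language models look at far more context than just the previous word.

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """Count, for every word, which words follow it in the corpus."""
    following = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current_word, next_word in zip(words, words[1:]):
            following[current_word][next_word] += 1
    return following

def next_word_probabilities(model, word):
    """Turn the raw follow-up counts for `word` into probabilities."""
    counts = model[word.lower()]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}

# Made-up training text
corpus = [
    "she is wearing a red dress",
    "he bought a red dress",
    "she is wearing a red hat",
    "they painted the door red",
]
model = train_bigram_model(corpus)

probs = next_word_probabilities(model, "red")
print(max(probs, key=probs.get))  # dress
```

In this corpus, "dress" follows "red" twice and "hat" once, so the model assigns "dress" a probability of 2/3 and predicts it as the next word.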

A Large Language Model (LLM) is a type of language model notable for its ability to achieve general-purpose language understanding and generation. LLMs acquire these abilities by using massive amounts of data to learn billions of parameters during training, and consuming large computational resources during their training and operation.

LLMs demonstrate impressive results on a wide variety of Natural Language Processing tasks, such as summarization, question answering, and translation. Prominent examples of LLMs and LLM-based products include ChatGPT, FLAN-T5, Bard, BLOOM, PaLM, OPT, and LLaMA.

Figure 3. Example of using the ChatGPT 4 model to translate a sentence from Italian to English and explain the grammar.

Conversational AI

Conversational AI is a subset of the Natural Language Processing field that focuses on building chatbots or virtual agents that can interact with humans in the form of a dialogue. They are crucial for contact centers and customer service applications.

Modern Conversational AI systems incorporate multiple technologies under the hood:

Speech Recognition to transform spoken conversations into text format,
Natural Language Processing (NLP) to ingest and process written texts,
Natural Language Understanding (NLU) to analyze and understand intent,
Natural Language Generation (NLG) to generate an appropriate and human-like response, and
Text-to-Speech to transform a text response into speech, if necessary.
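The way these building blocks hand work to one another can be sketched in a few lines of code. This is only an illustration under simplifying assumptions: every function below is a hypothetical placeholder (keyword matching and canned replies with made-up wording), whereas a real system would use trained models at each step.

```python
def speech_to_text(audio):
    # Placeholder for Speech Recognition: the "audio" here is already text.
    return audio

def understand(text):
    # Placeholder for NLP/NLU: keyword matching instead of a trained model.
    words = text.lower().replace("?", "").split()
    if "balance" in words:
        return "check_balance"
    if "hours" in words:
        return "opening_hours"
    return "unknown"

def generate_response(intent):
    # Placeholder for NLG: canned, made-up replies instead of generated text.
    replies = {
        "check_balance": "I can help you check your balance.",
        "opening_hours": "We are open from 9 AM to 5 PM.",
    }
    return replies.get(intent, "Let me connect you with an agent.")

def text_to_speech(text):
    # Placeholder for Text-to-Speech: returns the text unchanged.
    return text

def handle_turn(audio):
    """Run one customer utterance through the whole pipeline."""
    text = speech_to_text(audio)
    intent = understand(text)
    reply = generate_response(intent)
    return text_to_speech(reply)

print(handle_turn("What are your opening hours?"))  # We are open from 9 AM to 5 PM.
```

Each step takes the output of the previous one, which is why a mistake early on (for example, a misheard word) can carry through to the final answer. When the intent is not recognized, this sketch falls back to handing the customer over to a human agent.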

Contact centers use Conversational AI solutions to build virtual agents, chatbots, or real-time agent assist systems. Modern Conversational AI systems use Large Language Models to achieve phenomenal results in understanding human language. The goal is to improve customer service experiences and make interfaces more user-friendly and human-like.

Well-known household applications of Conversational AI are Siri, Alexa, and Google Assistant. In contact centers, you will find Conversational AI in the form of chat or voice bots, real-time translations, and live agent coaching.

Generative AI

According to Gartner, Generative AI is the subset of Machine Learning that “enables machines to learn from a representation of artifacts from data and models, and use it to generate brand-new, completely original artifacts that preserve a likeness to original data or models. Generative AI can produce totally novel content (including text, images, video, audio, and structures), computer code, synthetic data, workflows, and models of physical objects. Generative AI also can be used in art, drug discovery, or material design.”

To do so, Generative AI employs advanced Machine Learning technology, e.g., Generative Adversarial Networks (GANs) and transformer models like GPT-4. The best-known examples of Generative AI are DALL-E and ChatGPT from OpenAI. ChatGPT is a Large Language Model-based chatbot that uses Natural Language Processing to create human-like conversation and dialogue. DALL-E is an AI system that can create realistic images and art from a description in natural language.

Figure 4. An image generated by DALL-E from the description “An astronaut riding a horse in photorealistic style” from OpenAI's website.

Differences Between Conversational AI and Generative AI

Conversational AI and Generative AI are often used interchangeably in the contact center context. However, they are different in terms of their purpose and intent, their underlying technology and functionality, and how they interface with the user:

Conversational AI is all about interacting in a human-like manner, understanding and responding to spoken or written input, and having natural-sounding conversations. To do so, it leverages NLP, NLU, and NLG and usually communicates with the user via a chat interface.
Generative AI focuses on creating new, original written, audio, visual, or other content. In the contact center space, Generative AI is used to generate summaries of meetings or telephone conversations, extract data from conversations or documents, and answer questions from a knowledge base.

While it is important to understand and appreciate the differences between these two AI-based technologies, we will see increasing overlap and convergence between them as Generative AI is used to enhance the capabilities of Conversational AI solutions.

Should You Adopt an AI-based Contact Center Solution?

From traditional pre-programmed to modern machine learning-driven solutions, there are dozens of ways an AI-based solution can support your contact center operations. It can boost agent performance and improve customer experiences, increasing revenue and customer loyalty. It can also automate repetitive processes, reducing agent workloads.

For example, a recent McKinsey report suggests that contact centers adopting analytics solutions, such as AI-driven Voice Analytics, could save up to $5 million in employee costs, and those that deploy AI-driven chatbots and IVR could reduce Average Handle Time (AHT) by up to 40% while improving self-service containment rates by up to 20%.

Contact centers have a lot to gain from adopting an AI-based contact center solution. We hope that these definitions have made it easier for you to understand what AI is and enable you to dive deeper into your research.

If you feel inspired, we recommend you check out our related article "Best Contact Center AI Use Cases" to learn how Artificial Intelligence can be used in contact centers to improve customer service, increase operational efficiency, and much, much more.

If you are new to implementing AI-based solutions for your contact center, or even if you are a seasoned AI-user, we highly recommend checking out our AI Maturity Model. This model can help you to assess where you are in your AI journey and provide you with recommended next steps to further enhance your AI capabilities. You can download the maturity model as a mini 6-slide presentation here.
