Preamble
I have always been keen on making my software products as easy to use as possible. Human-machine interaction has long benefited from AI research. We now take voice interaction for granted thanks to our phones (Siri, thanks to Luc Julia, has been around on Apple platforms for more than a decade now).
With ChatGPT, released by OpenAI at the end of 2022, the public realized that AI seems able to comprehend questions and instructions and to exhibit reasoning-like behaviors. That amazed everyone, including myself. Until then, interacting with chatbots had always been a limited experience, to the point where chatbots had become the subject of jokes. We have all stumbled on cases where we humans had to adapt to the limitations of technology. We have all experienced how quickly a bot would fail to grasp some questions, let alone respond to them.
Since ChatGPT, it almost feels natural that the AI is able to understand and answer in meaningful ways (most of the time), to the point where people use anthropomorphic terms when talking about their experience, as if they could not tell whether they were interacting with a human or a machine.
So it is obvious to me that chatbots should be at the top of the generative AI use cases. But before talking about use cases, why do we speak about "Generative AI"?
Generative AI definition
ChatGPT is based on the Transformer model architecture (a type of neural network which, by the way, was invented by Google, not OpenAI). This type of neural network emerged from NLP research to better learn context by tracking relationships in sequential data, like the words in a sentence. It has since largely displaced RNN models.
These models are trained on gigantic datasets. Training involves hiding part of the text (for GPT-style models, the next token), making the model guess the missing part, and adjusting the model weights until the predictions are good.
Some scientists sum up these models as "they are just trained to predict the next token" (or word) and "they are not intelligent" per se. I tend to agree, and at the same time, those LLMs (Large Language Models) must be learning something from all this content. There is knowledge in what authors write, even if it is not intelligence or common sense as we think of them in the human brain.
This ability to generate a token and, by extension, complete sentences, paragraphs, essays, … is what gave this technology its “generative” name.
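To make "predicting the next token" more concrete, here is a toy sketch in Python. It is not how a Transformer works: instead of a neural network, it simply counts which word follows which in a tiny made-up corpus, then generates text by repeatedly picking the most frequent follower. The function names (train_bigram, generate) and the corpus are illustrative assumptions.

```python
from collections import Counter, defaultdict

def train_bigram(text):
    """Count, for every word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    tokens = text.lower().split()
    for current, following in zip(tokens, tokens[1:]):
        counts[current][following] += 1
    return counts

def generate(counts, start, max_tokens=5):
    """Repeatedly append the most frequent next word (greedy generation)."""
    output = [start]
    for _ in range(max_tokens):
        followers = counts.get(output[-1])
        if not followers:
            break  # no known continuation for this word
        output.append(followers.most_common(1)[0][0])
    return " ".join(output)

# Made-up training text, for illustration only.
corpus = "the cat sat on the mat . the cat sat on the sofa ."
model = train_bigram(corpus)
print(generate(model, "the", max_tokens=4))  # prints "the cat sat on the"
```

A real LLM replaces the counting table with billions of learned weights and looks at a long context rather than a single word, but the generation loop has the same shape: predict one token, append it, repeat.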
These models have since been applied to data other than text: images, audio, video, ... so Generative AI is much more than chatbots.
Answering questions
Any sizeable organization has loads of products and a knowledge base to support its customers’ use of these products. The customer support team is on the front line, dealing with pressure from customers to get their questions answered. These employees can now interact with a generative AI-based “virtual expert” in their own language. Unlike ChatGPT, the “virtual expert” has been fed the enterprise’s proprietary documents and can deal with requests specific to the company’s domain.
Answering questions is not limited to customer support. Bloomberg recently announced that it has built its own LLM, trained specifically on financial datasets, to support the needs of its customers in the financial industry. These customers can converse with BloombergGPT about financial news, transcripts of investor calls, company filings, and press releases.
Not everyone has the means to create their own GPT-like model. We now see off-the-shelf solutions that allow companies to index their own documents and provide a chatbot for internal or customer use. Google has just announced the general availability of its Duet AI for Google Workspace, so chatting with your company’s documents may very well work out of the box.
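As a rough idea of what "indexing your own documents" can mean, here is a minimal Python sketch. It assumes a naive keyword-overlap search; real products typically rely on embeddings and send the retrieved text to an LLM, which this sketch does not do. The document names, their contents, and the helper functions (build_index, retrieve, build_prompt) are all hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(documents):
    """Map each document name to a count of the words it contains."""
    return {name: Counter(tokenize(body)) for name, body in documents.items()}

def retrieve(index, question):
    """Return the name of the document sharing the most words with the question."""
    question_words = set(tokenize(question))
    def score(name):
        return sum(index[name][word] for word in question_words)
    return max(index, key=score)

def build_prompt(documents, index, question):
    """Assemble the text one might send to an LLM (the LLM call itself is omitted)."""
    name = retrieve(index, question)
    return (f"Answer using only this document ({name}):\n"
            f"{documents[name]}\n\nQuestion: {question}")

# Hypothetical company documents, for illustration only.
documents = {
    "returns.txt": "Customers can return a product within 30 days of purchase.",
    "shipping.txt": "Standard shipping takes five business days.",
}
index = build_index(documents)
print(build_prompt(documents, index, "How long does shipping take?"))
```

The point of the design is that the model never needs to be retrained on the company's documents: the relevant passage is looked up at question time and handed to the model as context.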
Generating essays
News outlets all over the world have been flooding our information streams with stories of students caught cheating after turning in homework assignments done with ChatGPT.
Marketing teams are on the clock when it comes to generating content for the various channels they feed. Tools like Jasper.ai let them import documents and templates and provide settings so they can generate content much faster while preserving their branding (what Jasper calls Brand Voice). Marketers will follow advice from prompting experts to maximize their content generation efficiency.
Summarizing
Information overload is a well-known phenomenon that affects both the throughput and the quality of decision making. Having a good summarizer at hand will relieve you of the daunting pressure of ingesting all this content. It will help you decide whether a given document is really worth reading and help you prioritize your reading time.
Tools like Genei.io help you add, organize and manage documents, automatically summarize background reading, and produce blogs, articles, or reports faster.
Translating
It’s never been so easy to reach out to the world and develop your business abroad.
We now have applications that handle even live language-to-language translation, some already integrated into your preferred video call platform. Cloud-based translation APIs make it possible to automatically translate living textual content, so that even a small or medium business can target multiple countries without going through all the hassle of manual translation. A human will likely do a much better job of translating a static web site, but the manual approach is likely not appropriate if your content changes several times a day, and it definitely won’t scale if you are targeting a large number of countries and languages. In those cases, APIs like DeepL will support your internationalization needs.
For example, the European Commission has just announced automatic translation of its press releases. Professional human translation will continue, and will replace the automatic version “as soon as the human versions are available”.
Another recent example is the video posted by Veolia's CEO for the company's 170th anniversary, addressing all its customers, partners and employees in Arabic, Chinese, Czech, Spanish, Italian, French, English, etc. Today, you can speak to the whole world without having the talents of a polyglot.
Certain open source models, like SeamlessM4T from Meta, handle all speech and text translations but are not yet available under a commercial license.
Another form of translation is from natural language to application code. As you learn a programming language or a new library or API, you spend a lot of time searching for code samples and doing lots of copy / paste to assemble the desired outcome in your application code. Now developers using GitHub Copilot can provide a textual description of what they are trying to achieve, and the tool will provide coding suggestions in response.