We are going to get right to the lede: ChatGPT creator OpenAI has announced GPT-4o.
Really, it was only a matter of time before OpenAI released something like this to the public.
Think of it this way: the text-generating ChatGPT, the image-generating DALL-E, and the video-generating Sora each handle one kind of media, like the separate limbs of a transforming robot. GPT-4o is OpenAI’s move toward a single model that works across media.
Now that the limb-parts are being assembled into a whole, we are entering a new era of generative A.I.
This new era could be called the full-suite era, and Google already got in on the trend when it released the multimodal Gemini some months earlier.
So, what is essential to know about this emerging technology, and what impact will it likely have on business owners? We cover all that below.
So, What Exactly Is GPT-4o (and Why Is It Called That)?
Alright, let us get the name thing out of the way first: the little “o” stands for “omni”, which is from the Latin word “omnis”, meaning “all”.
The choice of this little “o” makes sense once you realize that the “all” here refers to GPT-4o’s integration of functions that were previously spread across separate OpenAI projects.
That includes working with text, audio, images, and video. According to OpenAI, you can give it any combination of text, audio, image, and video inputs, and it will generate any combination of text, audio, and image outputs.
Overall, this is like a beefed-up version of familiar A.I. voice assistants such as Apple’s Siri (which is expected to get its very own gen A.I. upgrade pretty soon) and Amazon’s Alexa.
Its voice has that self-aware quality of an actor semi-naturally reading lines at a script reading. You can see GPT-4o at work in the announcement video here.
Keep reading for a closer look at what it means for GPT-4o to be “multimodal” and what impact this will have on the business world.
The Multimodal Experience
When we say a generative A.I. platform is multimodal, we mean that it can work across multiple media, such as text, images, video, and audio, instead of being confined to just one.
The iteration of ChatGPT that was released in November of 2022, for instance, only took text inputs and only offered text outputs.
GPT-4o, by contrast, can take an input like a couple of images of a dog along with the written request to “tell me out loud what breed these dogs are”, and out will come a spoken answer.
Basically, this whole concept of multimodality encompasses A.I. systems that have flexibility in handling inputs and a wide range of abilities in creating outputs.
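For the more technically inclined, here is a minimal Python sketch of what a mixed request might look like as plain data. It does not call GPT-4o or any real API. The Part class, the modality labels, and the file names are all made up for illustration, so treat it as one possible way to picture the idea rather than how OpenAI actually structures things.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical modality labels, chosen for this illustration only.
MODALITIES = ("text", "image", "audio", "video")


@dataclass(frozen=True)
class Part:
    """One piece of a request (hypothetical structure, not a real API type)."""
    modality: str  # one of MODALITIES
    content: str   # the text itself, or a file name standing in for media

    def __post_init__(self):
        if self.modality not in MODALITIES:
            raise ValueError(f"unknown modality: {self.modality}")


def modalities_used(parts: List[Part]) -> List[str]:
    """Return the distinct modalities in a request, in MODALITIES order."""
    present = {p.modality for p in parts}
    return [m for m in MODALITIES if m in present]


def is_multimodal(parts: List[Part]) -> bool:
    """A request counts as multimodal when it mixes more than one modality."""
    return len(modalities_used(parts)) > 1


# A text-only request, like the ones the 2022 ChatGPT accepted.
text_only = [Part("text", "Write a short poem about dogs.")]

# A mixed request: two dog photos (made-up file names) plus a written question.
mixed = [
    Part("image", "dog_1.jpg"),
    Part("image", "dog_2.jpg"),
    Part("text", "What breed are these dogs?"),
]

print(modalities_used(text_only), is_multimodal(text_only))  # ['text'] False
print(modalities_used(mixed), is_multimodal(mixed))          # ['text', 'image'] True
```

The point of the sketch is simply that a multimodal request is a bundle of differently typed pieces, and the system has to accept the bundle as a whole instead of one type at a time.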
So, how will such technology affect those in the business world? We make our predictions below.
The Impact on Business Owners
Already, business owners everywhere are curious about just how A.I. will be able to improve the efficiency of operations.
Multimodal generative A.I. platforms like GPT-4o will help deliver on the promise of A.I. functioning like a multitalented assistant for workers all along a given company’s hierarchy.
In this case, multimodal gen A.I. could help employees quickly create things like emails, visual graphs for presentations, and videos that would otherwise cost a small business too much time and effort to realistically produce.
To put it simply, this technology will streamline the process of communications inside and outside an organization.
In addition to creating things from scratch, a somewhat unsung or maybe just lesser-sung use for this technology is the organization of information. We dive into this use for the technology in the next section.
An Illustrative Example
Let us suppose that you work for an accounting firm called The Number Crunchers.
Every Friday afternoon you are expected to send out to a few clients a summary report of recent market trends. Your company supplies the raw data, and you are expected to create reports that make the data consumable and understandable to the average non-accountant.
This can be time-consuming work, oftentimes leading to situations where you need to bring accounting work home to do over the weekend. So, you could use some help in this department.
You enlist GPT-4o to turn the mass of findings about recent market trends into reports with short paragraphs, data-visualizing graphs, and other eye-catching, easy-to-digest formatting of the information.
The gen A.I. program indeed offers you this, to your joy. All you need to do is a round of editing along with fact-checking, and suddenly you are much more productive in creating these reports.
As you can see, then, the multimodal Swiss Army knife that is the next round of gen A.I. platforms will be a game-changer for people across the world, whether they are in accounting or not.