OpenAI now has native image generation in ChatGPT and Sora.
In a livestream led by CEO Sam Altman and members of the OpenAI team, the company demoed new image generation capabilities driven by the GPT-4o model.
Previously, image generation relied on OpenAI’s DALL-E text-to-image model. Now, GPT-4o handles the image generation, meaning it has the world knowledge and contextual understanding to generate images more seamlessly and conversationally. The model can understand contextual prompts that make no specific reference to an image and can follow prompts for iterating on a generated image, and OpenAI says it’s way better at rendering text.
With image generation in ChatGPT, OpenAI’s goal is to make it more useful rather than just a novelty. That means it can generate diagrams, infographics, logos, social media posts, and other graphics. In Sora, there’s now a new section for generating images (in addition to videos) much like the Midjourney interface.
In the livestream, Altman said that the model leans into “creative freedom,” saying “what we’d like is for the model to not be offensive if you don’t want it to be, but if you want it to be within reason, really let people create what they want.”
Altman seemingly tried to clarify this in an X post, saying, “what we’d like to aim for is that the tool doesn’t create offensive stuff unless you want it to, in which case within reason it does. As we talk about in our model spec, we think putting this intellectual freedom and control in the hands of users is the right thing to do, but we will observe how it goes and listen to society.”
In case that didn’t totally make sense to you either, OpenAI’s stance on blocking images that violate its content policy, “such as child sexual abuse materials and sexual deepfakes,” remains the same.
According to the accompanying blog post, all generated images include C2PA metadata, an embedded and invisible provenance record detailing an image’s origin.
Native image generation for ChatGPT is available today for ChatGPT Plus, Pro, Team, and Free users within the chat experience, with access rolling out to Enterprise and Edu users soon.