Update, 11:50 a.m. Wednesday: DALL-E was previously only accessible to the public via invite, but the AI generator has now been made available to all without a waitlist. Read more about signing up to use DALL-E here.
Original post, 2:10 p.m. Monday: Think of something, type it in, and wait for it …
Imagine an image generated from your thoughts and words in just minutes, all ready for the world to see. That’s what OpenAI’s DALL-E, a neural network designed to convert text descriptions into images, and its newest version, DALL-E 2, are for.
This artificial intelligence-powered image generator seems futuristic and fun, but is it necessary? More importantly, is it safe? And how can technologies like these handle online disinformation, bias, violent content and other harms?
In a recent episode of KQED Forum, host Mina Kim led a conversation on the possibilities and pitfalls of this technology with:
- Lama Ahmad, policy researcher from DALL-E creator OpenAI;
- Hany Farid, deepfake expert at UC Berkeley.
The following interview has been edited for length and clarity.
MINA KIM: What is DALL-E, and how does it work?
LAMA AHMAD: DALL-E is an AI that basically does text-to-image. So someone can type in a prompt, and the output could be anything you can imagine. It learned from lots and lots of pairs of images and captions, and is basically able to construct those concepts entirely from scratch. Even though it learned from existing images, words and language, it can construct what an entirely new image would look like.
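To make that prompt-in, image-out flow concrete, here is a minimal sketch of a text-to-image request using OpenAI’s Python SDK. The model name, image size and prompt below are illustrative assumptions, not details from the interview:

```python
# Minimal text-to-image sketch using OpenAI's Python SDK (v1.x).
# Assumes an API key is set in the OPENAI_API_KEY environment variable.
from openai import OpenAI

client = OpenAI()

# Describe the image you want in plain language; the prompt drives everything.
result = client.images.generate(
    model="dall-e-2",  # illustrative model name
    prompt="a teddy bear painting a self-portrait, oil on canvas",
    n=1,               # number of images to generate
    size="512x512",
)

print(result.data[0].url)  # a temporary URL pointing at the generated image
```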
You can also use inpainting to select a region of an image and modify it, again using words. And you can create variations on existing images, or on images generated by the software itself.
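As a rough sketch of those two features, again assuming OpenAI’s Python SDK (the file names and prompt here are hypothetical): an edit call takes the original image plus a mask whose transparent region marks the area to repaint, while a variation call takes only the source image and no prompt at all.

```python
from openai import OpenAI

client = OpenAI()

# Inpainting: the fully transparent area of mask.png marks the region to be
# redrawn according to the new prompt. File names here are hypothetical.
edited = client.images.edit(
    image=open("original.png", "rb"),
    mask=open("mask.png", "rb"),
    prompt="replace the selected region with a vase of sunflowers",
    n=1,
    size="512x512",
)

# Variations: no prompt at all; the model produces new takes on the input image.
varied = client.images.create_variation(
    image=open("original.png", "rb"),
    n=2,
    size="512x512",
)

print(edited.data[0].url, varied.data[0].url)
```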
The images can look hyperrealistic, fantastical or abstract, if that’s what you would like. It’s all in the words you use and how you craft your prompts.