“One of the most powerful things about this technology is that, like DALL-E, it does what you tell it to do,” said Nate Bennett, one of the researchers working in the University of Washington lab. “From a single prompt, it can generate an endless number of designs.”
To generate images, DALL-E relies on what artificial intelligence researchers call a neural network, a mathematical system loosely modeled on the network of neurons in the brain. This is the same technology that recognizes the commands you bark into your smartphone, enables self-driving cars to identify (and avoid) pedestrians, and translates languages on services like Skype.
A neural network learns skills by analyzing vast amounts of digital data. By pinpointing patterns in thousands of corgi photos, for instance, it can learn to recognize a corgi. With DALL-E, researchers built a neural network that looked for patterns as it analyzed millions of digital images and the text captions that described what each of these images depicted. In this way, it learned to recognize the links between the images and the words.
When you describe an image for DALL-E, a neural network generates a set of key features that this image may include. One feature might be the curve of a teddy bear’s ear. Another might be the line at the edge of a skateboard. Then, a second neural network — called a diffusion model — generates the pixels needed to realize these features.
The diffusion model is trained on a series of images in which noise — imperfection — is gradually added to a photograph until it becomes a sea of random pixels. As it analyzes these images, the model learns to run this process in reverse. When you feed it random pixels, it removes the noise, transforming these pixels into a coherent image.
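The noising half of that process is simple enough to sketch in a few lines of Python. What follows is a minimal illustration, not the code behind DALL-E: the function names, the four-pixel "image," and the fixed per-step noise level `beta` are all assumptions made for the example. It produces the series of increasingly noisy images described above. A real diffusion model would then be trained on such a series to predict, and remove, the noise added at each step.

```python
import math
import random

def add_noise(pixels, steps=10, beta=0.3, seed=0):
    """Return a list of progressively noisier copies of `pixels`.

    `beta` (an assumed, fixed noise level) controls how much of each
    step's output is fresh random noise rather than the previous image.
    """
    rng = random.Random(seed)
    keep = math.sqrt(1.0 - beta)  # share of the previous image that survives
    mix = math.sqrt(beta)         # share of new Gaussian noise blended in
    frames = [list(pixels)]       # frame 0 is the untouched image
    for _ in range(steps):
        prev = frames[-1]
        frames.append([keep * p + mix * rng.gauss(0.0, 1.0) for p in prev])
    return frames

def signal_fraction(steps, beta=0.3):
    """Fraction of the original pixel value still present after `steps` steps."""
    return math.sqrt(1.0 - beta) ** steps

# A toy four-pixel "photograph" with values between -1 and 1.
frames = add_noise([1.0, 0.5, -0.5, -1.0], steps=20)
print(len(frames))          # number of frames, including the original
print(signal_fraction(20))  # how little of the original remains by the end
```

Each pair of neighboring frames gives a training example: the noisier frame is the input and the slightly cleaner one is the target. By the last frame almost nothing of the original survives, which is why a trained model can start from pure random pixels when it generates a new image.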
Researchers at the University of Washington, at other academic labs and at new start-ups are using similar techniques in their effort to create new proteins.
Proteins begin as strings of chemical compounds, which then twist and fold into three-dimensional shapes that define how they behave. In recent years, artificial intelligence labs like DeepMind, owned by Alphabet, the same parent company as Google, have shown that neural networks can accurately guess the three-dimensional shape of any protein in the body based just on the smaller compounds it contains — an enormous scientific advance.