DALL·E, draw me a sheep

January 5, 2021.

Avocado Armchair.

© OpenAI.

The Generative Pre-trained Transformer 3 (GPT-3) is an autoregressive language model that uses deep learning to produce text that could have been written by a human. It can create news articles, write poetry, and even write code. Its creator, OpenAI, has now unveiled a version of GPT-3 trained to generate images from text descriptions. Called DALL·E (a portmanteau of the names of surrealist artist Salvador Dalí and Pixar’s WALL·E), this 12-billion-parameter version of GPT-3 can produce very convincing images from captions such as “an illustration of a baby daikon radish in a tutu walking a dog”, “an armchair in the shape of an avocado”, or “an emoji of a baby penguin wearing a blue hat, red gloves, green shirt, and yellow pants”. Unlike a 3D rendering engine, whose inputs must be specified unambiguously and in complete detail, DALL·E can often “fill in the blanks” when the caption implies a detail that is not explicitly stated. DALL·E can also combine disparate ideas to synthesize objects, some of which are unlikely to exist in the real world, such as a snail-harp. Dalí would approve.

OpenAI, “DALL·E: Creating images from text.”

2021-01-05