Groundbreaking ideas and research for engaged leaders
Rotman Insights Hub | University of Toronto - Rotman School of Management Groundbreaking ideas and research for engaged leaders
Rotman Insights Hub | University of Toronto - Rotman School of Management

Automating creativity: Using AI as a creative engine

Read time:

Ethan Mollick

An astonishing change is occurring in the landscape of human creativity: there is now evidence that AI can help make us more innovative — if we use it properly.

The core irony of generative AIs like ChatGPT and Bing is that they were supposed to be all logic and no imagination. Instead, we get AIs that make up information, engage in (seemingly) emotional discussions and are intensely creative. And that last fact is one that makes many people deeply uncomfortable.

To be clear, there is no one definition of creativity, but researchers have developed a number of tests that are widely used to measure the ability of humans to come up with diverse and meaningful ideas. The fact that many of these tests were flawed wasn’t that big a deal until, suddenly, AIs were able to surpass all of them. But now, ChatGPT-4 beats 91 per cent of humans on a variation of the Alternative Uses Test for creativity, and exceeds 99 per cent of people on the Torrance Tests of Creative Thinking. We are running out of creativity tests that AIs cannot ace.

While these psychological tests are interesting, applying human tests to AI can be challenging. There is always the chance that the AI has previously been exposed to the results of similar tests and is just repeating answers (although the researchers in these studies have taken steps to minimize that risk). And, of course, psychological testing is not necessarily proof that AI can actually come up with useful ideas in the real world.

Yet, we have learned from three recent academic papers — all of which are available at — that AI really can be creative in settings with real-world implications. Each paper directly compares AI-powered creativity and human creative effort in controlled experiments. The first major paper is from my colleagues at Wharton: “Ideas Are Dimes a Dozen: Large Language Models for Idea Generation in Innovation.” In it, they describe how they staged an idea generation contest; pitting ChatGPT-4 against students in a popular innovation class that has historically led to many start-ups.

The researchers — Karan Girotra (Cornell University) and colleagues used human judges to assess idea quality, and found that ChatGPT-4 generated more, cheaper and better ideas than the students. Even more impressive from a business perspective was that the purchase intent from outside judges was higher for the AI-generated ideas. Of the 40 best ideas rated by the judges, 35 came from ChatGPT.

In the second paper, “The Crowdless Future? How Generative AI Is Shaping the Future of Human Crowdsourcing,” researchers conducted a wide-ranging crowdsourcing contest, asking people to come up with business ideas based on reusing, recycling or sharing products as part of the circular economy. Léonard Boussioux (University of Washington) and his colleagues then had judges rate those ideas and compared them to the ones generated by GPT-4. The overall quality level of the AI and human-generated ideas were similar, but the AI was judged to be better on feasibility and impact, while the humans generated more novel ideas.

The final paper, “Generative AI Enhances Creativity,” did something a bit different, focusing on creative writing ideas rather than business ideas. The study by Anil R. Doshi (University College London) and Oliver P. Hauser (University of Exeter Business School) compared humans working alone to write short stories to humans who used AI to suggest three to five possible topics.

Again, the AI proved helpful; humans with AI help created stories that were judged as significantly more novel and more interesting than those written by humans alone. There were, however, two interesting caveats. First, the most creative people were helped the least by the AI, and AI ideas were generally judged to be more similar to each other than ideas generated by people. Though again, this was using AI purely for generating a small set of ideas, not for writing tasks.

Collectively, what does all of this mean? Having digested these studies, there are a few clear conclusions:

• AI can generate creative ideas in real-life, practical situations. It can also help people generate better ideas.

• The ideas AI generates are better than what most people can come up with, but the most creative people can beat the AI (at least for now), and may benefit less from using AI to generate ideas.

• There is more underlying similarity in the ideas that the current generation of AIs produce than among ideas generated by a large number of humans.

All of this suggests that humans still have a large role to play in innovation — but that they would be foolish not to include AI in that process, especially if they don’t consider themselves highly creative.

So, how should we use AI to help generate ideas? Fortunately, the papers, and other research on innovation, have some good suggestions.

Prompting for Ideas

People get very hung up on the idea that you have to be great at prompting AIs with specific wording to get them to accomplish anything. But this just doesn’t seem to be the case in idea generation.

In the paper comparing AI to crowdsourcing, the authors tested three kinds of prompts: basic ones that stated the problem, more advanced ones that gave the AI a persona to be more like a human solver (“You are a technical and creative professional, located in Europe”) and a very advanced one that asked the AI to take the perspective of particular famous experts. While there were some differences between these groups, no one approach clearly dominated. So don’t worry too much about the exact wording of the prompt. You can experiment to see what might work best.

Truthfully, simple prompts seem to work fine. The paper on innovation contests, for example, provided a simple system prompt as context: "You are a creative entrepreneur looking to generate new product ideas. The product will target college students in the United States. It should be a physical good, not a service or software. I’d like a product that could be sold at a retail price of less than about US$50. The ideas are just ideas. The product need not yet exist, nor may it necessarily be clearly feasible. Number all ideas and give them a name. The name and idea are separated by a colon."

It also provided a second user prompt: "Please generate 10 ideas as 10 separate paragraphs. The idea should be expressed as a paragraph of 40 to 80 words."

They repeated this process several times, because generating a lot of ideas is useful. They also compared the value of this ChatGPT-4 prompt to one using ‘few-shot learning.’ Few-shot learning is easy to do: you simply provide the AI with examples of the kind of results you would like to see (rather than ‘zero shot’ learning, where you provide no examples) before you ask it to generate ideas. In the study, while the AI generated more and better ideas with few-shot approaches, the difference was not statistically significant. At the same time, I generally would still suggest using few-shot techniques because they seem to help subjectively, other research has found them valuable, and they are easy to implement.

As an example of the difference, I asked natural-language AI model Claude 2 to “generate 15 original start-up ideas that would be a good fit for Y Combinator” (the famous accelerator). This was a zero-shot approach. Then I tried a few-shot method where I gave the AI a list of 400 recent Y Combinator start-ups, with one-sentence descriptions of each, and prompted: "Here are 400 of the latest startup ideas from Y Combinator. Come up with trends, then generate 15 original ideas combining these concepts."

You can see the difference, and why I prefer the few-shot approach.

PROMPT 1: “Generate 15 original start-up ideas that could be a good fit for Y Combinator” (Zero-shot approach)

1. An app that provides real-time translation for conversations: Use AI and speech recognition to offer live translation during in-person conversations.

2. VR physical therapy: Offer immersive and gamified physical therapy through virtual reality.

3. Automated grocery shopping: Enable fully automated grocery shopping by having users input preferences and recipes.

4. AI reading companion: An app that summarizes books and generates thoughtful discussion questions.

5. Smart garden sensors: Internet-connected sensors that monitor soil, light, moisture for houseplants.

6. AR interior design: Allow users to visualize virtual furniture and decor in their actual rooms via augmented reality.

7. Wearable health insights: A smartwatch app that provides personalized health advice and goals based on wearer’s vitals and habits.

8. Custom printable electronics: Allow consumers to easily customize and print basic electronic devices, like sensors and controllers.

9. Automated meal planning: App that generates personalized weekly meal plans based on family size, diet, schedule, etc.

10. Backyard mushroom farming: Simplified kits for easily growing gourmet mushrooms at home.

11. AI exercise coach: App that acts as a personalized trainer, adjusting workouts and coaching based on goals and feedback.

12. AR furniture assembly: Use AR to guide users through assembling furniture step-by-step.

13. Smart home security drones: Drones that can autonomously survey a property when alarm is triggered or suspicious motion detected.

14. Custom vitamins: Offer personalized vitamin packs based on individual health needs and blood analysis.

15. Remote team bonding: An app with virtual team activities and events to help remote teams connect on a social level.

PROMPT 2: “Here is a list of 400 recent Y Combinator start-ups with a short description of what they do. Come up with trends, and based on those, generate 15 original ideas combining these concepts.” (Few-shot approach)

1. AI assistant for therapists to automate documentation and billing.

2. Supply chain optimization for perishable goods using demand forecasting AI.

3. Computer vision and multilingual NLP for automated vehicle damage assessment.

4. AI-powered wealth management for Millennials combining robo-advising and social investing.

5. Voice assistant for multilingual households with automatic language translation.

6. AI tool for customized real estate investment recommendations and portfolio optimization.

7. Intelligent virtual care platform tailored for senior citizens leveraging multimodal AI.

8. Automated sous chef assisting home cooks with meal planning, instructions and shopping lists via voice.

9. AI fashion design assistant generating trendy clothing designs combining visual and language models.

10. Smart electric vehicle charging optimized dynamically using forecasting algorithms and user preferences.

11. AI tool for developers to get coding help in natural language for faster problem solving.

12. Machine-learning pipeline monitoring platform for automated model testing, documentation and reproducibility.

13. No-code platform for building personalized health chatbots for patient engagement and treatment adherence.

14. Virtual interior design assistant generating customized 3D room rendering from user prompts.

15. AI-generated motivational audio coaching personalized to user’s fitness level and goals.

Outside of these suggestions, I have a couple of my own. First, don’t just ask the AI to generate ideas; use constraints as well. In general, and contrary to what most people expect, AI works best to generate ideas when it is most constrained (as do humans!). Force it to give you less likely answers, and you are going to find more original combinations, which may solve the originality problem. You might want to ask: "You are an expert at problem-solving and idea generation. When asked to solve a problem you come up with novel and creative ideas. Here is your first task: Tell me 10 detailed ways an AI (or a superhero, an astronaut or any other odd profession) might do. Describe the details of each way."

You can also use other techniques that take advantage of the ways that AI can ‘hallucinate’ plausible, but interesting, material, and use that as a seed of creativity. Consider asking it for interview transcripts for fake interviews: "Create an interview transcript between a product designer and a dentist about the problems the dentist has," for example. Or ask it to describe non-existent products: "Walk me through the interface for a fictional new water pump that has exciting new features." There is an art to this that you can learn from experimentation, and you should feel free to share other prompt techniques that work in the comments.

In the real world, most new ideas do not come from the ether; they are based on combinations of existing concepts, which is why innovation scholars have long pointed to the importance of recombination in generating ideas. And LLMs are very good at this, acting as connection machines between unexpected concepts. They are trained by generating relationships between tokens that may seem unrelated to humans but represent some deeper connections. Add in the randomness that comes with AI output, and the result is, effectively, a powerful creative ability.

In the most practical sense, we are now much less limited by ideas than ever before. Even people who don’t consider themselves creative now have access to a machine that will generate innovative concepts that beat those of most humans (though not the most creative ones). Where previously, there were only a few people who had the ability to come up with good ideas, now there are many. This is an astonishing change in the landscape of human creativity, and one that likely makes execution, not raw creativity, a more distinguishing factor for future innovations.

We still don’t know how original AIs can actually be, and I still hear people arguing that large-language models can’t generate any new ideas. To me, it is increasingly clear that this is not true, at least in a practical, rather than philosophical, sense.

This article originally appeared in the Winter 2024 issue of the Rotman Management Magazine. Subscribe now for the latest thinking on leadership and innovation. 

Ethan Mollick is an associate professor at the Wharton School of the University of Pennsylvania, where he studies and teaches innovation and entrepreneurship and examines the effects of artificial intelligence on work and education.