ExploreGen: Large Language Models for Envisioning the Uses and Risks of AI Technologies


Overview of our method. A three-prompt LLM pipeline for generating mobile and wearable uses of AI (prompt #1), classifying each use's risks according to the EU AI Act (prompt #2), and determining whether each generated use is beneficial according to the UN's Sustainable Development Goals (prompt #3). Out of 138 generated uses, as many as 80 were considered high risk according to the EU AI Act, primarily aligning with Sustainable Development Goals 3 (good health and well-being), 10 (reduced inequalities), and 16 (peace and justice). Our method was validated by two experts in mobile and wearable technologies, a legal and compliance expert, and a cohort of nine individuals with legal backgrounds who were recruited from Prolific, confirming its accuracy to be over 85%.

Business developers and engineers seek opportunities to employ the latest AI trends ahead of their competitors, while researchers work in a similarly fast-paced environment to publish their latest AI discoveries. In both roles, these AI practitioners face a growing need to envision the potential uses, risks, and benefits of the technologies they are developing, and to produce impact assessment reports. Recent research shows that AI developers struggle with detailing uses and impacts for model cards and data cards, as well as for the broader societal impacts sections now mandated by some of the top AI conferences. Recommendations for supporting AI practitioners in envisioning the impacts of their technology include encouraging reflexivity, including constructive and data-driven deliberation.

Our research responds to this challenge by exploring the use of Large Language Models (LLMs) to generate AI technology uses and their risk assessments based on the EU AI Act. This aims to support AI practitioners during the initial phases of the AI design process, including reflexivity, brainstorming, and deliberation. Our aim is not to produce an exhaustive list of uses for a given AI technology, nor to provide a definitive risk classification. Instead, we aim to investigate whether LLMs can generate outputs of sufficient quality to support AI practitioners in envisioning the impacts of their technology, particularly focusing on less well-researched uses.

We developed an LLM framework, ExploreGen, which generates realistic and varied uses of AI technology, including those overlooked by research, and classifies their risk level based on the EU AI Act regulation. We evaluated our framework using the case of Facial Recognition and Analysis technology in nine user studies with 25 AI practitioners. Our findings show that ExploreGen is helpful to both developers and compliance experts. They rated the uses as realistic and their risk classification as accurate (94.5%). Moreover, while unfamiliar with many of the uses, they rated them as having high adoption potential and transformational impact.
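To make the generate-then-classify idea concrete, here is a minimal sketch of a two-stage pipeline. It is not the ExploreGen implementation: the prompt wording, the function names (generate_uses, classify_risk, explore), the AssessedUse type, and the four risk labels are all assumptions made for illustration, and the language model is replaced by a deterministic stand-in so the example runs without any model API.

```python
from dataclasses import dataclass
from typing import Callable, List

# Risk tiers commonly associated with the EU AI Act. The exact labels used
# by ExploreGen may differ (assumption).
RISK_LEVELS = ("unacceptable", "high", "limited", "minimal")

# Hypothetical model interface: a prompt string in, generated text out.
LLM = Callable[[str], str]


@dataclass
class AssessedUse:
    description: str
    risk: str


def generate_uses(llm: LLM, technology: str, n: int) -> List[str]:
    """Stage 1: ask the model for up to n uses, one per line."""
    prompt = f"List {n} realistic and varied uses of {technology}, one per line."
    lines = [line.strip(" -\t") for line in llm(prompt).splitlines()]
    return [line for line in lines if line][:n]


def classify_risk(llm: LLM, use: str) -> str:
    """Stage 2: ask for one risk tier; anything unexpected becomes 'unclear'."""
    prompt = (
        "Classify the following AI use under the EU AI Act as one of "
        f"{', '.join(RISK_LEVELS)}. Answer with one word.\nUse: {use}"
    )
    answer = llm(prompt).strip().lower()
    return answer if answer in RISK_LEVELS else "unclear"


def explore(llm: LLM, technology: str, n: int = 3) -> List[AssessedUse]:
    """Run both stages and pair each generated use with its risk label."""
    return [AssessedUse(u, classify_risk(llm, u))
            for u in generate_uses(llm, technology, n)]


def fake_llm(prompt: str) -> str:
    """Deterministic stand-in for a model; its answers are illustrative only,
    not legal assessments."""
    if prompt.startswith("List"):
        return "- Unlocking a phone\n- Identifying people in public spaces\n"
    return "High" if "public spaces" in prompt else "Limited"


if __name__ == "__main__":
    for item in explore(fake_llm, "facial recognition and analysis"):
        print(f"{item.risk:>8}  {item.description}")
```

Passing the model as a plain callable keeps the sketch independent of any provider, and mapping unexpected answers to 'unclear' reflects the stated goal of supporting deliberation rather than issuing a definitive classification.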


Publications

  • ExploreGen: Large Language Models for Envisioning the Uses and Risks of AI Technologies. AIES 2024 PDF

Code and data

