22 Feb 2024

Launching an ‘AI moonshot’ to develop a European large language model is the game changer that Europe needs

Artificial Intelligence (AI) is driving immense changes in society. Among the available AI technologies, language models are currently the most universally beneficial for citizens, industry and governments. Yet the most powerful language models are being developed outside Europe by companies such as OpenAI, Google DeepMind and Anthropic. This is why the European Commission should launch a dedicated mission for a collaborative European language model to ensure Europe doesn’t miss the AI boat.

AI is currently an industry where scale matters. If not enough computation has been spent on learning the patterns in the data, or if the data isn’t comprehensive enough, the model’s outputs will lack quality. Larger investments therefore tend to pay off. In most settings investment yields diminishing returns, but for language models there’s a threshold of investment that must be reached before the model can even be considered a relevant addition to the market.
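To make the threshold intuition concrete, here is a toy sketch in Python. The numbers are invented for illustration, not empirical scaling-law coefficients: quality is modelled as a saturating power law of compute, and a minimum-quality bar implies a minimum viable investment.

```python
# Illustrative only: a toy power-law scaling curve with a market-relevance threshold.
# All coefficients below are invented for illustration, not empirical values.

def model_quality(compute: float, a: float = 100.0, b: float = 500.0, alpha: float = 0.3) -> float:
    """Toy scaling law: quality approaches `a` as compute grows; `b` and `alpha`
    control how much compute is needed before returns appear."""
    return a - b * compute ** -alpha

THRESHOLD = 80.0  # hypothetical minimum quality needed to be a relevant market entrant

for compute in [1e2, 1e4, 1e6, 1e8]:  # arbitrary compute budgets
    q = model_quality(compute)
    status = "relevant" if q >= THRESHOLD else "below threshold"
    print(f"compute={compute:>8.0e}  quality={q:6.1f}  ({status})")
```

Under these made-up numbers, quality only crosses the relevance threshold somewhere between the third and fourth budget, which is the sense in which returns are not smoothly diminishing but gated by a minimum scale.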

Thus, the concentration of power in the AI industry is concerning. Developing an AI model involves many consequential decisions that fundamentally shape it, and only those with deep pockets get to make them. The high infrastructure and development costs of generative AI systems such as ChatGPT prevent the technology from becoming more widely available. ‘Open models’ that depend on corporate-owned computing infrastructure and on the capture of academic research mostly contribute to an ‘AI monoculture’.

Consequently, Europeans are becoming increasingly dependent on how these developers build and price their technology.

The current state of generative AI resembles the late 1950s, when the Soviet Union was leading the space race and the United States needed the Apollo moonshot to catch up. When a disruptive technology develops rapidly, geopolitical power structures are on the line. The outcome of such races creates path dependence: whichever side wins, whether the Soviet Union in the space race or the United States in AI today, can leverage its market dominance and significantly determine the technology’s trajectory. And while the space race was deeply entangled with the Cold War, generative AI could be far more transformative for broader society today than reaching the moon was in the 1960s.

But is it too late to reverse this trend? No, but make no mistake about it, Europe’s window of opportunity will soon close.

Ideally, DG CNECT and DG RTD should take the lead. They can bring together the various instruments through which AI is already being incentivised, such as the AI, Data and Robotics Association, Horizon projects, AI in Science and the Large AI Grand Challenge, and place them all under one mission umbrella. The EU’s top supercomputers can also be enlisted in the process.

This operation could then spin off into a ‘CERN for AI’ that would cover next-generation technologies. This, however, will only be possible over a longer timeframe. For example, CLAIRE estimates that around EUR 100 billion over six years would be needed to build such a ‘CERN for AI’. Building a European language model can be achieved much sooner, and at a fraction of the cost.

The large language model should have three key characteristics. First, it should be open source. For research purposes, it’s key to disclose the data the model was trained on, and industry needs to be able to adapt the model to specific use cases. This flexibility increases the risk of misuse, so the mission should prioritise secure open-source AI. To promote wider adoption, carefully priced hosted access should be paired with adequate maintenance and customer support services.

Second, the large language model should be ethically aligned and legally compliant. Most current language models do not meet the requirements of the EU’s AI Act. By contrast, the European language model should be exemplary and closely follow the ethics guidelines for trustworthy AI. This means traceability for transparency, enhanced societal impact assessments and engagement with the people affected by the technology through a dedicated Citizens’ Assembly on AI. Important technological challenges such as cybersecurity, privacy-enhancing mechanisms and automated fact-checking must also be addressed along the way.

Third, the model should come in three sizes: a small, efficient version that can run on devices such as smartphones and tablets; a medium-sized, low-cost model that can handle frequent tasks requiring less precision; and a very large model in the range of GPT-4 and Gemini for the most challenging tasks.
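As a purely illustrative sketch, the three tiers might be specified along the following lines. The parameter counts are hypothetical placeholders, not figures from this commentary.

```python
# Hypothetical sketch of the three proposed model tiers; the parameter
# counts are illustrative guesses, not proposals from the text.
from dataclasses import dataclass

@dataclass
class ModelTier:
    name: str
    approx_params: str   # rough scale, purely illustrative
    target: str          # intended deployment environment

TIERS = [
    ModelTier("small",  "~3B",  "on-device (smartphones, tablets)"),
    ModelTier("medium", "~30B", "low-cost hosted inference for frequent tasks"),
    ModelTier("large",  "~1T",  "most challenging tasks, GPT-4/Gemini range"),
]

for tier in TIERS:
    print(f"{tier.name:>6}: {tier.approx_params:>5} params -> {tier.target}")
```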

The mission’s success could be tracked by the model’s performance vis-à-vis its market competitors, by its adoption rate, and by whether state-of-the-art transparency and efficiency are embedded in the model’s design and training. To achieve this, sufficient resources must go to data collection and algorithmic design. Finetuning, which adapts an existing model with new data, needs to be made as easy as possible to incentivise downstream use.
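To give a sense of what easy finetuning means in practice, here is a minimal sketch using the widely used Hugging Face transformers library. The checkpoint name europe/euro-llm-small and the corpus file are hypothetical placeholders, not real artefacts.

```python
# A minimal causal-LM finetuning sketch with Hugging Face transformers.
# "europe/euro-llm-small" and "my_domain_corpus.txt" are hypothetical placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_name = "europe/euro-llm-small"  # hypothetical checkpoint name
tokenizer = AutoTokenizer.from_pretrained(model_name)
if tokenizer.pad_token is None:       # some tokenizers lack a pad token
    tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Load a downstream text corpus (placeholder path) and tokenise it.
dataset = load_dataset("text", data_files={"train": "my_domain_corpus.txt"})
tokenized = dataset["train"].map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned-euro-llm",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

The fewer steps a downstream user needs beyond swapping in their own corpus, the stronger the incentive for adoption.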

Moreover, there should be working groups on auditability and trustworthiness. Similarly, industry and civil servants should be able to connect the AI to their internal databases so that it can serve their specific purposes better.
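One common way to connect a model to internal databases is retrieval augmentation: relevant records are looked up first and prepended to the prompt. The sketch below is a toy illustration that scores records by naive word overlap; a production system would use embedding-based search, and all the records and queries shown are invented.

```python
# Toy sketch of retrieval augmentation over an internal database:
# look up the most relevant records, then build a grounded prompt.
# Scoring is naive word overlap; a real system would use embeddings.

internal_db = [  # invented stand-ins for an organisation's records
    "Procurement rule 7: contracts above EUR 140 000 require an open tender.",
    "Travel policy: rail is the default mode for trips under 600 km.",
    "IT policy: internal documents must not be sent to external AI services.",
]

def retrieve(query: str, records: list[str], k: int = 2) -> list[str]:
    """Return the k records sharing the most words with the query."""
    q_words = set(query.lower().split())
    scored = sorted(records,
                    key=lambda r: len(q_words & set(r.lower().split())),
                    reverse=True)
    return scored[:k]

query = "Which rules apply to a EUR 200 000 contract?"
context = "\n".join(retrieve(query, internal_db))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # this prompt would then be sent to the language model
```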

The spillover effects of such a ‘mission’ could be huge (and hugely beneficial). The Apollo mission catalysed technological development far beyond its original goal of space exploration, paving the way for over 1 500 successful spin-offs, including breakthroughs such as the CT scan, heart monitors and solar panels. Publicly funded research is estimated to deliver long-term returns of around 20 % per year on average, and since AI adoption has only just started, there’s vast untapped potential to explore and take advantage of.

Like the Apollo mission, a mission to develop a European large language model could benefit countless industries and governments. It could empower citizens in their daily lives and deliver new breakthroughs in secure, transparent and open-source AI.

That’s why the European Commission shouldn’t hesitate any further – this really is an opportunity too promising to ignore.

This Expert Commentary is part of a series that will be published prior to the CEPS Ideas Lab on 4-5 March 2024 to showcase some of the innovative ideas we’ll be rigorously debating with our participants. More info can be found on the official Ideas Lab 2024 website.