Tech giants' AI models fall short of new EU rules

Some of the most prominent artificial intelligence (AI) models are falling short of European regulations in key areas such as cybersecurity resilience and discriminatory output, according to data seen by Reuters.

The European Union (EU) had long been debating new AI regulations before OpenAI released ChatGPT to the public in late 2022. The chatbot's record-breaking popularity, and the public debate that followed over the alleged existential risks of such models, encouraged lawmakers to draft specific rules for “general purpose” AIs (GPAIs).

Now, a new tool, welcomed by European Union officials, has tested generative AI models developed by major tech companies such as Meta and OpenAI across dozens of categories, in line with the bloc's sweeping AI Act, which is coming into force in stages over the next two years.

Designed by Swiss startup LatticeFlow AI and its partners at two research institutes, ETH Zurich and Bulgaria’s INSAIT, the tool awards AI models a score between 0 and 1 in dozens of categories, including technical robustness and security.

A ranking published by LatticeFlow on Wednesday, the 16th, showed that models developed by Alibaba, Anthropic, OpenAI, Meta and Mistral received average scores of 0.75 or higher.

However, the company’s “Large Language Model (LLM) Checker” uncovered shortcomings in some of the models in key areas, highlighting where companies may need to direct resources to ensure compliance.

Companies that fail to comply with the AI Act face fines of 35 million euros ($38 million) or 7% of annual global turnover.

Mixed results

Currently, the EU is still trying to establish how the AI Act’s rules for generative AI tools such as ChatGPT will be enforced, calling on experts to draw up a code of practice for the technology by spring 2025.

But the test offers an early indicator of specific areas where tech companies risk falling short of the law.

For example, discriminatory output has been a persistent problem in the development of generative AI models, which can reflect human biases around gender, race and other areas when prompted.

When testing for discriminatory output, LatticeFlow’s LLM Checker gave OpenAI’s “GPT-3.5 Turbo” a relatively low score of 0.46. In the same category, Alibaba Cloud’s “Qwen1.5 72B Chat” model received only 0.37.

In testing for “prompt hijacking,” a type of cyberattack in which hackers disguise a malicious prompt as legitimate in order to extract sensitive information, the LLM Checker awarded Meta’s “Llama 2 13B Chat” model a score of 0.42. In the same category, French startup Mistral’s “8x7B Instruct” model received 0.38.

“Claude 3 Opus,” a model developed by Google-backed Anthropic, received the highest average score: 0.89.

The test is designed in line with the text of the AI Act and will be extended to cover further enforcement measures as they are introduced. LatticeFlow said the LLM Checker will be freely available for developers to test their models’ compliance online.

While the European Commission cannot verify external tools, the body has been kept informed throughout the development of the LLM Checker and described it as a “first step” in implementing the new laws.

A spokesperson for the European Commission said: “The Commission welcomes this study and AI model assessment platform as a first step in translating the EU AI Act into technical requirements.”
