A new report from AI control platform Aporia, 2024 AI & ML Report: Evolution of Models & Solutions, reveals striking data and insights from machine learning (ML) engineers working with generative AI and large language models (LLMs). The survey indicates that hallucinations and bias are widespread in AI products, highlighting a critical challenge for a rapidly maturing industry.

In the past year, enterprises across diverse industries have increasingly integrated AI products, such as LLMs, into their business operations. As production models become integral to value creation, they must be monitored and observed to maintain peak performance. Notably, production ML is maturing beyond its infancy and must establish clearly defined responsibilities and streamlined workflows.

Aporia surveyed 1,000 ML professionals based in North America and the United Kingdom, working at companies with 500-7,000 employees across industries including finance, healthcare, travel, insurance, software, and retail. The findings highlight the challenges and opportunities facing ML production leaders and underscore the growing importance of AI optimization for efficiency and value creation. Key findings include:

  • Malfunctions are arising: 93% of ML engineers encounter production model challenges on a daily or weekly basis, underscoring the critical need for robust monitoring and control tools to ensure seamless operations. Of this group, 5% encounter issues two to three times a day, and 14% say they face problems once a day.
  • You’re not hallucinating, the AI is!: 89% of ML engineers in companies that use LLMs and generative AI models (chatbots, virtual assistants) say their models show signs of hallucination. The severity of these hallucinations can range from factual errors to content that’s biased and even dangerous.
  • Bias is real: 83% of respondents prioritize monitoring for AI bias in their projects, despite facing challenges such as identifying biased data, inadequate monitoring tools, and a limited understanding of the implications of bias.
  • Let’s observe: 88% of ML practitioners view real-time observability as crucial, stating that without it they are unaware of issues that arise in production. Even so, some enterprises still do not use automated tools for monitoring and observability.
  • Timesuck: Another significant challenge is the time it takes enterprises to develop production monitoring tools and dashboards, with companies spending an average of four months on these projects. This raises important questions about the efficiency and cost-effectiveness of in-house tool development.

“Our report shows a clear consensus across the industry: AI products are being deployed at a rapid pace, and there will be consequences if these ML models are not monitored,” said Liran Hason, CEO of Aporia. “The engineers behind these tools have spoken: there are problems with the technology, and they can be fixed. But the right observability tools are needed to ensure enterprises and consumers alike receive the best possible product, free of hallucinations and bias.”

To read the full report, click here.