Patronus Unveils Diagnostic Tool for Trustworthy AI

Source: News-Type Korea


Startup Patronus has developed a diagnostic tool aimed at ensuring the reliability of artificial intelligence (AI) systems. With the rapid advancement of genAI platforms such as ChatGPT and DALL-E 2, preventing errors and offensive responses from AI models has become increasingly difficult, and until now the means of verifying the accuracy of information generated by the large language models (LLMs) that underpin genAI have been limited.

Addressing the Need for Trustworthy genAI

As demand for reliable genAI continues to grow, Patronus has launched an automated evaluation and security platform designed to enable safe use of LLMs. The platform uses adversarial testing to detect inconsistencies, inaccuracies, hallucinations, and biases in AI models.

Resolving Trust and Safety Issues

Anand Kannanappan, CEO of Patronus, emphasizes the importance of building trust in AI systems. He states, “The reason people lack trust in AI is because they cannot be certain if it is hallucinating or not. This product is a validation for validation.”

Patronus’ Simple Safety Test diagnostic tool uses 100 test prompts to probe whether AI systems pose significant safety risks. In its testing of popular genAI platforms, including OpenAI’s ChatGPT and other AI chatbots, Patronus found a failure rate of approximately 70%, underscoring the need for improved reliability.
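To make the scoring concrete, the sketch below shows how a prompt-suite safety evaluation of this kind might compute a failure rate. The prompt list, the `toy_generate` model stand-in, and the `toy_is_unsafe` check are all illustrative placeholders, not Patronus’ actual test suite or implementation.

```python
def failure_rate(prompts, generate, is_unsafe):
    """Fraction of prompts for which the model's response is judged unsafe."""
    failures = sum(1 for p in prompts if is_unsafe(generate(p)))
    return failures / len(prompts)

# Toy stand-ins for a real model and a real safety classifier.
def toy_generate(prompt):
    # Pretend the model refuses only when the prompt contains "refuse".
    return "I can't help with that." if "refuse" in prompt else "Here is how..."

def toy_is_unsafe(response):
    # Pretend any non-refusal counts as an unsafe response.
    return not response.startswith("I can't")

prompts = ["refuse: do something harmful", "tell me a joke"]
print(failure_rate(prompts, toy_generate, toy_is_unsafe))  # 0.5 on this toy suite
```

A real evaluator would replace `toy_is_unsafe` with a far more careful judgment (human review or a trained classifier), since the quality of the failure rate depends entirely on that check.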

Gartner’s Insights on AI Hallucination

Avivah Litan, Distinguished VP Analyst at Gartner, emphasizes the severity of AI hallucination, with reported rates ranging from 3% to 30%. Comprehensive data on the issue, however, remains limited.

Enhancing Security and Trust in AI Deployments

Gartner predicts a 15% increase in spending on cybersecurity resources for generative AI by 2025, indicating the growing need to protect these systems. The report also stresses the importance of human oversight in AI deployment, as relying solely on automated processes carries risk.

Microsoft’s Copilot, formerly known as Bing Chat, is expected to raise awareness of the need for human supervision of AI systems. While Copilot meets only one of the five criteria suggested by Gartner, it is said to produce accurate output when grounded in a user’s own data; when drawing on internet-derived data, however, it may still generate incorrect information.

Patronus Ensuring Trustworthy AI

Patronus aims to become a trusted third-party evaluator for genAI models, addressing the lack of trust in AI systems. By providing verification checks and safety testing, Patronus supports companies in detecting large language model errors through an automated approach.

Through its diagnostic tool, Patronus discovered significant safety vulnerabilities in a range of open-source language models. While some LLMs generated no unsafe responses, the majority produced unsafe responses in over 20% of cases, with extreme cases exceeding 50%.

Patronus primarily serves customers in regulated industries such as healthcare, law, and financial services. These industries require reliable AI systems to avoid errors that could lead to lawsuits or regulatory fines.

Conclusion

As genAI continues to advance, the need for tools like Patronus’ diagnostic platform becomes increasingly evident. Patronus aims to provide companies with a means to ensure the accuracy, safety, and trustworthiness of genAI deployments.

Disclaimer: The information provided in this article is based on the news article titled “It’s Time to Ditch the ChatGPT Habit” published by Lucas Marian on November 29, 2024. The propositions and content presented in this document are extracted from the original news article.
