Smileshark Introduces an LLM-as-a-Judge–Based Validation Framework at AWS Summit Seoul 2026
Choi Byung-joo, Solutions Architect at Smileshark, giving a presentation titled “Strategies for Securing the Reliability of Generative AI: MKAX’s Hallucination Control Case” at AWS Summit Seoul 2026. Photo courtesy of Smileshark
Smileshark (CEO Jang Jin-hwan) announced that it applied the “LLM-as-a-Judge” approach, in which the output generated by one AI model is evaluated and reviewed by another AI model, to an AI podcast operated by MKAX, the digital organization under Maeil Business Newspaper. According to the company, factual errors fell from a monthly average of 15–20 cases to fewer than one case per month, and the human review time required for verification was cut by more than 90%.
Smileshark Solutions Architect Choi Byung-joo presented these results on the 20th at the “AWS Summit Seoul 2026” Industry Day session under the theme “MKAX’s Hallucination Control Case.”
The project was conducted to address hallucination, the generation of information that is not factually correct, which is regarded as the most significant obstacle to introducing generative AI. Under this approach, when an AI model produces an output, a separate evaluation AI compares it against the original text and assesses it according to predefined evaluation criteria to verify the reliability of the result.
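The verification flow described above can be illustrated with a minimal Python sketch. This is not Smileshark's or MKAX's implementation, which the announcement does not detail. The names `Verdict`, `stub_judge`, `review`, and `needs_human_review` are hypothetical, and `stub_judge` is a simple rule-based stand-in for the evaluation model: it only checks that every number in the generated text also appears in the original. In a real system, that function would be replaced by a call to a separate LLM.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Verdict:
    criterion: str
    passed: bool
    reason: str


# A judge receives (source_text, generated_text, criterion) and returns a Verdict.
# In production this would wrap a call to a separate evaluation model (assumption).
Judge = Callable[[str, str, str], Verdict]

NUMBER = re.compile(r"\d+(?:\.\d+)?")


def stub_judge(source: str, generated: str, criterion: str) -> Verdict:
    """Hypothetical stand-in for the evaluation AI: flags numbers absent from the source."""
    source_numbers = set(NUMBER.findall(source))
    unsupported = [n for n in NUMBER.findall(generated) if n not in source_numbers]
    if unsupported:
        return Verdict(criterion, False, f"numbers not found in source: {unsupported}")
    return Verdict(criterion, True, "all numbers appear in source")


def review(source: str, generated: str, criteria: List[str], judge: Judge) -> List[Verdict]:
    """Evaluate the generated text against the source once per predefined criterion."""
    return [judge(source, generated, c) for c in criteria]


def needs_human_review(verdicts: List[Verdict]) -> bool:
    """Escalate to a human reviewer only when at least one criterion fails."""
    return any(not v.passed for v in verdicts)


if __name__ == "__main__":
    # Illustrative data only; not taken from the MKAX podcast.
    source = "Revenue rose 12% to 340 billion won in the third quarter."
    draft = "Third-quarter revenue rose 21% to 340 billion won."
    for verdict in review(source, draft, ["numeric consistency"], stub_judge):
        print(verdict)
```

Routing only failed verdicts to a person is one plausible way such a pipeline could reduce human review time, since outputs that pass every criterion need no manual check.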
Architect Choi stated, “A quality evaluation framework in the operation phase is essential for generative AI services,” adding, “Stable service operation is only possible when not only the core model performance but also error verification and the management of failure cases are carried out together.”
Visitors at the Smileshark booth at AWS Summit Seoul 2026. Photo courtesy of Smileshark
The cloud industry also expects that, as generative AI spreads, demand will grow rapidly for technologies that verify the reliability of AI outputs and ensure operational stability. CEO Jang Jin-hwan said, “Inquiries from companies seeking to apply AI to real work environments are continuously increasing,” and added, “We plan to expand the sharing of case studies and know-how based on operational experience.”
Meanwhile, marking its 12th edition this year, “AWS Summit Seoul” is a flagship domestic AI and cloud event hosted by AWS. This year, a variety of sessions were held focusing on generative AI technologies and their applications in industrial settings. Smileshark participated as a platinum sponsor and presented its AI commercialization strategies and operational experience.
In addition, at its on-site booth, the company ran a baseball team position test that recommended cloud services based on the result, as well as a customized merchandise program. During the event, Smileshark also provided consulting on AI adoption for startups and small and medium-sized enterprises.
ⓒ dongA.com. All rights reserved. Reproduction, redistribution, or use for AI training prohibited.