RWS's TrainAI LLM Benchmarking Study Ranks Claude Sonnet, GPT and Gemini Pro as Leaders in Synthetic Data Generation
TrainAI’s LLM synthetic data generation study benchmarks nine popular large language models on six data generation tasks across eight languages using human expert evaluators
Unlike typical automated LLM benchmarks that assess performance on closed questions, TrainAI’s LLM Synthetic Data Generation Study used human expert evaluators to test the ability of popular LLMs to generate sentences and conversations, assessing their general natural language processing (NLP) skills across a variety of languages.
“We conducted this study because reports suggest that the largest companies behind today’s state-of-the-art LLMs are running out of data¹ to train their newest models,” explains Tomáš
Nine LLMs were tested on six data generation tasks of varying complexity, across eight carefully selected languages with varying representation. For each language, three native-speaking language specialists evaluated the LLM-generated outputs against specific criteria (such as grammar and naturalness). Overall, 38,000 sentences were generated, 115,000 annotations were submitted, and 250,000 ratings on a scale from 1 (very poor) to 5 (very good) were provided by 27 linguists across the globe.
“Because AI is built for humans, we chose humans – not AI – to evaluate LLM performance. Our study found that no single model outperformed the rest when generating synthetic data across languages and tasks, but some models performed better than others on key criteria like language proficiency, instruction adherence, creativity, speed and cost,” said Vasagi Kothandapani, President of Enterprise Services at RWS. “The study underscores the importance of assessing the strengths and limitations of multiple LLMs for specific AI use cases or applications. Only then can genuine value and positive business impact be realized.”
Notes to editors:
- Download your copy of TrainAI’s LLM Synthetic Data Generation Study.
- TrainAI by RWS provides complete, end-to-end data collection, annotation validation, and generative AI training and fine-tuning services for all types of AI data, in any language, at any scale, based on the principles of responsible AI.
About RWS
Our purpose is unlocking global understanding. By combining cultural understanding, client understanding and technical understanding, our services and technology assist our clients to acquire and retain customers, deliver engaging user experiences, maintain compliance and gain actionable insights into their data and content.
Over the past 20 years we’ve been evolving our own AI solutions as well as helping clients to explore, build and use multilingual AI applications. With 45+ AI-related patents and more than 100 peer-reviewed papers, we have the experience and expertise to support clients on their AI journey.
We work with over 80% of the world’s top 100 brands, more than three-quarters of Fortune’s 20 ‘Most Admired Companies’ and almost all of the top pharmaceutical companies, investment banks, law firms and patent filers. Our client base spans
Founded in 1958, RWS is headquartered in the
For further information, please visit: www.rws.com.
______________________
1 Villalobos, P., Ho, A.,
View source version on businesswire.com: https://www.businesswire.com/news/home/20250425786534/en/
RWS
Corporate Communications
ddavies@rws.com
+44 1628 410105
Source: