The best language models for digital products
The Trustbit LLM Leaderboards
LLM Recommendations from Trustbit's Machine Learning Experts
The monthly LLM Leaderboards help you find the best Large Language Model for digital product development.
Each month, we re-evaluate how well different LLMs handle specific challenges, using real benchmark data from our own software products. We examine categories such as document processing, CRM integration, external integration, marketing support, and code generation.
Rely on us to take your projects to the next level!
Benchmarks for April 2024
This month you can expect the following insights & highlights:
Gemini Pro 1.5 from Google - Improvement of Pro 1.0, now available in the EU
Command-R and Command-R Plus from Cohere - mediocre results
New GPT-4 Turbo - OpenAI has done it again!
Llama 3: 70B is fine, but 8B is really promising
Long-term trends
The benchmark categories in detail
These categories describe the capabilities evaluated in the Trustbit LLM Leaderboard:
- Document processing: How well can the model work with large documents and knowledge bases?
- CRM integration: How well does the model support work with product catalogs and marketplaces?
- External integration: Can the model easily interact with external APIs, services and plugins?
- Marketing support: How well can the model support marketing activities, e.g. brainstorming, idea generation and text generation?
- Reasoning: How well can the model reason and draw conclusions in a given context?
- Code generation: Can the model generate code and help with programming?
- Cost: The estimated cost of running the workload. For cloud-based models, we calculate the cost according to the provider's pricing. For on-premises models, we estimate the cost based on the GPU requirements of each model, GPU rental cost, model speed, and operational overhead.
- Speed: The estimated speed of the model in requests per second (without batching). The higher the speed, the better.
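To illustrate how the on-premises cost estimate could combine the factors listed above, here is a minimal sketch. It is not Trustbit's actual formula: the `OnPremModel` class, the `workload_cost` function, the multiplicative `overhead_factor`, and all numbers in the usage note are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class OnPremModel:
    """Hypothetical inputs for an on-premises cost estimate."""
    gpus_required: int            # GPUs needed to host the model
    gpu_rental_per_hour: float    # rental price per GPU per hour
    requests_per_second: float    # speed without batching
    overhead_factor: float = 1.0  # assumed multiplier for operational overhead


def workload_cost(model: OnPremModel, num_requests: int) -> float:
    """Estimate the cost of serving num_requests one after another."""
    if model.requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive")
    # Time the GPUs are occupied, converted from seconds to hours.
    hours = num_requests / model.requests_per_second / 3600
    # Rental cost of all GPUs for one hour.
    hourly_rate = model.gpus_required * model.gpu_rental_per_hour
    return hours * hourly_rate * model.overhead_factor
```

With made-up inputs of 2 GPUs at 2.0 per GPU-hour, 0.5 requests per second, and an overhead factor of 1.25, a workload of 3600 requests occupies the GPUs for 2 hours and is estimated at 10.0. The sketch also shows why speed matters for cost: a faster model occupies the rented GPUs for less time per workload.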
Curious about how the scores have evolved? Here you can find links to all previously published leaderboards.
LLM PERFORMANCE DEEP DIVE
Batching strategies for optimal LLM performance
In this series, our Innovation & Machine Learning expert Rinat Abdullin explores how to use batching strategies to maximize the performance of Large Language Models (LLMs), increasing efficiency and quality in various applications.
More business value through the use of ChatGPT and Co.
Learn how Trustbit deploys Large Language Models in enterprises, what to consider, and why our customers benefit strongly from our partnerships in this context.