The best language models for digital products

The Trustbit LLM Leaderboards

LLM Recommendations from Trustbit's Machine Learning Experts

The monthly LLM Leaderboards help you find the best Large Language Model for digital product development.

Each month we re-evaluate how well different LLMs handle specific challenges, using real benchmark data from our own software products. We examine categories such as document processing, CRM integration, external integration, marketing support, and code generation.

Rely on us to take your projects to the next level!

Benchmarks for April 2024

This month you can expect the following insights & highlights:

  • Gemini Pro 1.5 from Google - an improvement over Pro 1.0, now available in the EU

  • Command-R and Command-R Plus from Cohere - mediocre results

  • New GPT-4 Turbo - OpenAI has done it again!

  • Llama 3: 70B is fine, but 8B is really promising

  • Long-term trends

The benchmark categories in detail

The Trustbit LLM Leaderboard reports the following categories and columns for each model:

  • How well can the model work with large documents and knowledge bases?

  • How well does the model support work with product catalogs and marketplaces?

  • Can the model easily interact with external APIs, services and plugins?

  • How well can the model support marketing activities, e.g. brainstorming, idea generation and text generation?

  • How well can the model reason and draw conclusions in a given context?

  • Can the model generate code and help with programming?

  • The estimated cost of running the workload. For cloud-based models, we calculate the cost from the provider's pricing. For on-premises models, we estimate the cost from each model's GPU requirements, GPU rental cost, model speed, and operational overhead.

  • The "Speed" column indicates the estimated speed of the model in requests per second (without batching). The higher the speed, the better.
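The cost and speed columns described above can be illustrated with a little arithmetic. The following is a minimal sketch, not Trustbit's actual formula: the function names, the per-1,000-token pricing model, and the way overhead is applied as a multiplier are all assumptions made for illustration.

```python
def cloud_cost(prompt_tokens, completion_tokens,
               price_in_per_1k, price_out_per_1k):
    """Cost of a workload on a cloud-hosted model.

    Assumption: the provider bills separately per 1,000 prompt tokens
    and per 1,000 completion tokens.
    """
    return (prompt_tokens / 1000 * price_in_per_1k
            + completion_tokens / 1000 * price_out_per_1k)


def on_prem_cost(requests, requests_per_second,
                 gpu_count, gpu_hourly_rate, overhead=0.0):
    """Estimated cost of a workload on a self-hosted model.

    Assumptions (hypothetical, for illustration only):
    - requests run one after another at `requests_per_second`
      (no batching, matching the "Speed" column),
    - every GPU the model needs is rented for the whole run,
    - operational overhead is a fraction added on top of the GPU rent.
    """
    if requests_per_second <= 0:
        raise ValueError("requests_per_second must be positive")
    hours = requests / requests_per_second / 3600
    return hours * gpu_count * gpu_hourly_rate * (1 + overhead)


if __name__ == "__main__":
    # Made-up numbers, chosen only to show the calculation.
    print(cloud_cost(2000, 500, price_in_per_1k=0.01, price_out_per_1k=0.03))
    print(on_prem_cost(3600, requests_per_second=1.0,
                       gpu_count=2, gpu_hourly_rate=1.5, overhead=0.2))
```

In this sketch a faster model lowers the on-premises cost directly, because the rented GPUs are needed for fewer hours to finish the same workload.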

LLM PERFORMANCE DEEP DIVE

Batching strategies for optimal LLM performance

In this series, our Innovation & Machine Learning expert Rinat Abdullin explores how to use batching strategies to maximize the performance of Large Language Models (LLMs), increasing efficiency and quality in various applications.

More business value through the use of ChatGPT and Co.

Learn how Trustbit deploys Large Language Models in enterprises, what to consider when doing so, and how our customers benefit from our partnerships.

Would you like to learn more about the use of ChatGPT and Co.?

Then we look forward to hearing from you.

christoph.hasenzagl@trustbit.tech

+43 664 88454881