August 2023

Benchmarks for ChatGPT & Co:

[Image: a white tablet displaying the Large Language Model Leaderboard table for August 2023]

Updated monthly: the Trustbit LLM Leaderboard compares Large Language Models such as ChatGPT so you can evaluate their suitability for use in product development.

Trustbit Leaderboard
August 2023

| model | code | crm | docs | integrate | marketing | reason | final |
|---|---:|---:|---:|---:|---:|---:|---:|
| OpenAI GPT4 v2-0613 💰 | 85 | 94 | 100 | 67 | 88 | 60 | 82 |
| OpenAI GPT4 v1-0314 💰 | 76 | 97 | 89 | 67 | 75 | 76 | 80 |
| Claude v1 💰 | 62 | 77 | 69 | 58 | 88 | 61 | 69 |
| OpenAI GPT3.5 v2-0613 💰 | 49 | 77 | 84 | 83 | 84 | 39 | 69 |
| Open Models | 46 | 62 | 62 | 100 | 84 | 22 | 63 |
| Llama2 13B Nous Hermes q5_K_M ✅ | 46 | 62 | 62 | 100 | 56 | 21 | 58 |
| Claude v2 💰 | 38 | 58 | 41 | 67 | 82 | 51 | 56 |
| Claude v1 instant 💰 | 72 | 54 | 47 | 67 | 55 | 17 | 52 |
| Vicuna v1.1 13B q4_1 | 30 | 45 | 57 | 83 | 71 | 19 | 51 |
| Vicuna v1.1 13B q8_0 | 31 | 45 | 52 | 42 | 84 | 16 | 45 |
| Vicuna v1.3 13B q5_1 | 36 | 51 | 47 | 50 | 61 | 19 | 44 |
| Vicuna v1.1 13B q5_1 | 31 | 45 | 42 | 33 | 84 | 18 | 42 |
| Puffin v1.3 13B q5_K_M ✅ | 28 | 48 | 53 | 33 | 25 | 22 | 35 |
| Wizard Vicuna 13B Unlocked q5_K_M | 22 | 39 | 53 | 33 | 56 | 0 | 34 |
| Llama2 13B Guanaco q5_1 ✅ | 19 | 42 | 62 | 17 | 38 | 0 | 30 |
| Llama 7B q8_0 | 25 | 30 | 28 | 25 | 50 | 0 | 26 |
| Llama 13B q5_1 | 34 | 9 | 38 | 17 | 44 | 9 | 25 |
| Llama2 7B chat ✅ | 7 | 33 | 11 | 17 | 62 | 14 | 24 |
| Llama2 7B chat Unlocked q8_0 ✅ | 14 | 33 | 33 | 33 | 25 | 0 | 23 |
| Llama2 13B chat q8_0 ✅ | 7 | 33 | 17 | 0 | 66 | 11 | 22 |
| Open Llama 7B instruct q8_0 | 16 | 17 | 38 | 17 | 22 | 14 | 21 |
| Llama 13B q2_K | 0 | 5 | 47 | 33 | 25 | 0 | 19 |
| Llama2 7B ✅ | 18 | 0 | 0 | 0 | 0 | 0 | 3 |
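
The leaderboard does not say how the "final" column is derived. The published numbers look consistent with an unweighted mean of the six category scores (code, crm, docs, integrate, marketing, reason), rounded to a whole number. The sketch below checks that assumption against four rows copied from the table. The names `ROWS`, `mean_score`, and `compare` are hypothetical helpers for illustration and are not part of any Trustbit tooling.

```python
# Category scores (code, crm, docs, integrate, marketing, reason) and the
# published "final" value, copied from the August 2023 table above.
ROWS = {
    "OpenAI GPT4 v2-0613": ([85, 94, 100, 67, 88, 60], 82),
    "Claude v1": ([62, 77, 69, 58, 88, 61], 69),
    "Llama2 13B Nous Hermes q5_K_M": ([46, 62, 62, 100, 56, 21], 58),
    "Llama 13B q2_K": ([0, 5, 47, 33, 25, 0], 19),
}


def mean_score(scores):
    """Unweighted mean of the category scores, rounded to an integer.

    ASSUMPTION: the source does not document its aggregation; a plain
    mean is only a guess that happens to match the published values.
    """
    return round(sum(scores) / len(scores))


def compare(rows):
    """Map each model name to (recomputed mean, published final)."""
    return {
        name: (mean_score(scores), final)
        for name, (scores, final) in rows.items()
    }


if __name__ == "__main__":
    for name, (computed, published) in compare(ROWS).items():
        note = "match" if computed == published else "differs"
        print(f"{name}: computed {computed}, published {published} ({note})")
```

Three of the four rows reproduce the published value exactly. Llama 13B q2_K comes out at 18 against a published 19, which may mean the final score is computed from unrounded category scores, or that the aggregation is not a simple mean after all.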