MacFoo

Evaluation Results

LLM Response Quality Evaluation

Date: Jan 20, 2024, 10:30 AM
ID: eval-2024-01-20-001
Total Tests: 25
Pass Rate by Provider
gpt-4: 96.0%
claude-3: 92.0%
llama-3: 76.0%
Passed Tests by Provider
gpt-4: 24/25
claude-3: 23/25
llama-3: 19/25
Total Tests by Provider
gpt-4: 25
claude-3: 25
llama-3: 25
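
The pass rates above appear to be each provider's passed count divided by its total, expressed as a percentage with one decimal place. The report does not state how they were computed, so what follows is a minimal sketch under that assumption. The helper name `pass_rate` and the `results` dictionary are hypothetical and not part of the evaluation tool; the counts are copied from the summary.

```python
def pass_rate(passed: int, total: int) -> float:
    """Return passed/total as a percentage, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * passed / total, 1)


# (passed, total) per provider, copied from the summary above.
results = {
    "gpt-4": (24, 25),
    "claude-3": (23, 25),
    "llama-3": (19, 25),
}

# Hypothetical recomputation of the "Pass Rate by Provider" figures.
rates = {provider: pass_rate(p, t) for provider, (p, t) in results.items()}

for provider, rate in rates.items():
    print(f"{provider}: {rate:.1f}%")
```

Under this assumption the recomputed values match the reported 96.0%, 92.0%, and 76.0%.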

Test Results

Comparing 3 providers: gpt-4, claude-3, llama-3

Uncategorized

2 tests, 1 passed (50.0%)
Showing 2 test results across 1 category