Chia Jeng Yang
1 min read · Oct 14, 2024

Hi Bob, thanks for the comment! RAGAS is used for the underlying evaluation. If you've used RAGAS (or any other eval tool) before, I think you'll find that some absolute deviation from the golden test set is unavoidable unless you only accept answers that use exactly the words in the golden dataset, which isn't realistic in many qualitative RAG scenarios. What you want to measure is the relative deviation. Our system also produced 85% in that scenario, not 76% :)

We used RAGAS here to provide an objective evaluation, as we thought that having human assessors in the process would introduce a separate level of arbitrariness (even though I personally think human assessment is the most meaningful way to evaluate).

It's certainly a tricky pickle as you can imagine.

To your original question: yes, we can achieve 100% accuracy, depending on the scenario and effort. This is very much a matter of tuning the system to specific scenarios. The great thing about graph structures is that they are auditable, human-readable representations of answers that can be tweaked, unlike black-box, vector-chunk-only systems.
