Discussion about this post

ToxSec

‘For tasks with deterministic correct answers, you can use your system’s schema or rules to generate both the input and the ground truth’

makes sense. this was a good read ty :)

Andreea

Shouldn’t the synthetic questions generated for a RAG system be reviewed by a domain expert? Sometimes the synthetic questions don’t make sense to a real user who knows the knowledge base. This happens even in the simple case where you generate a question from just one chunk.

But what about a more complex situation, where answering a question requires context from 2 chunks that are in different parts of a document, or even in different documents? As an engineer, you don’t know whether a question generated from 2 random chunks is one a real user would actually ask. What’s your opinion on this?
