Extensibility points for Microsoft.Extensions.AI.Evaluation.Reporting.Formats.Html reports #6826
mikeholczer
started this conversation in
Ideas
Replies: 0 comments
I really like the AI evaluation reports that are generated by Microsoft.Extensions.AI.Evaluation.Reporting.Formats.Html.HtmlReportWriter, but it would be really nice if there were some extensibility that would make them more actionable and let them integrate with a larger system.

I'm currently working on an internal tool for our folks who are doing the prompt engineering. It will let them set up a set of messages and test, in an interactive manner, how various changes to the system prompt (or a tool description) would impact the model's response. I'd like to be able to integrate it with some other internal tools we're building.

We plan to run evaluations on the interactions users have with the system. My plan is to automate going through those results to find the ones where the evaluation results were below a threshold, use HtmlReportWriter to make a nice human-readable report of them, and send it over to the prompt engineering folks. I haven't built that all out yet, but I think that much would work as things currently are.

The piece that I'd like to have is a button/link on each test iteration in the report that would open the prompt playground UI and pass in the execution, scenario, and iteration names on the query string, so that the playground app can load the messages for the conversation and show the original evaluation scores and statistics. To do that, I'd like to be able to pass a URL and a label value into HtmlReportWriter.WriteReportAsync() (or a new method/property) and have the HTML template use them to add a link/button next to each test result with those query string arguments appended.

I hope this makes sense and that this was the right place to submit an idea like this. Thanks.
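To make the ask a bit more concrete, here is a rough sketch of the link-building step, written in TypeScript purely for illustration. Everything in it is an assumption on my part: the `IterationLinkOptions` and `IterationKey` shapes, the `buildIterationUrl` function, and the `execution`, `scenario`, and `iteration` query parameter names are all hypothetical and not part of HtmlReportWriter today.

```typescript
// Hypothetical options a caller might supply when writing the report.
// None of these names exist in the library; they only illustrate the idea.
interface IterationLinkOptions {
  baseUrl: string; // URL of the external tool, e.g. the prompt playground
  label: string;   // text to show on the link/button
}

// Identifies a single test result in the report.
interface IterationKey {
  executionName: string;
  scenarioName: string;
  iterationName: string;
}

// Appends the three names to the base URL as query string arguments.
// Any query parameters already present on baseUrl are preserved,
// and values are percent-encoded by URLSearchParams.
function buildIterationUrl(options: IterationLinkOptions, key: IterationKey): string {
  const url = new URL(options.baseUrl);
  url.searchParams.set("execution", key.executionName);
  url.searchParams.set("scenario", key.scenarioName);
  url.searchParams.set("iteration", key.iterationName);
  return url.toString();
}

// Example usage with made-up values.
const playgroundLink: IterationLinkOptions = {
  baseUrl: "https://playground.example.internal/open?tenant=a",
  label: "Open in playground",
};

const exampleUrl = buildIterationUrl(playgroundLink, {
  executionName: "nightly-42",
  scenarioName: "Refund flow",
  iterationName: "1",
});
```

The report would then render one link per test result using `label` as the text and the generated URL as the target. The playground app on the other end only needs to read the three query parameters to look up the conversation.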