-
Notifications
You must be signed in to change notification settings - Fork 82
Add trace ID to datasets, update UTs accordingly #31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…g csv's using pandas.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
||
The JSON file is expected to have the following format: | ||
{ | ||
"ground_truths": [ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does ground truths mean in relation to the examples, I thought the examples has the ground truth (with the actual_output
field)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks
@@ -195,17 +252,21 @@ def add_from_csv( | |||
"Please install pandas to use this method. 'pip install pandas'" | |||
) | |||
|
|||
df = pd.read_csv(file_path) | |||
df = pd.read_csv(file_path, dtype={'trace_id': str}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What else can dtype
be?
A trace_id
or trace_file
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pandas naturally reads numbers in data files as ints, not strings (can lead to unexpected behavior)
"ground_truths": [ | ||
{ | ||
"input": "test input", | ||
"actual_output": null, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the difference between ground_truths
and examples
that ground_truths
don't have an actual_output
(this is for the LLM system to produce)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM again, grape UT changes!
…add_trace_to_datasets
…ues are resolved and now UTs pass.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, great UT fixes
When adding the
trace_id
field toExample
s, I forgot to add them toEvalDataset
s as well. This PR takes care of that issue and also updates the UTs for theEvalDataset
class accordingly.