-
Notifications
You must be signed in to change notification settings - Fork 83
Ahh/new error format (#273) #288
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
* feat: start of new record_error * feat: error property on TraceSpan * chore(docs): update observe also records Error * chore: cleanup capture exception impl * chore: better comment on _serialize_value --------- Co-authored-by: Minh Pham <62208322+Mandolaro@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hello @abhishekg999, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
Summary of Changes
Hello team, gemini-code-assist here to provide a summary of this pull request. This PR, titled "Ahh/new error format", introduces a dedicated mechanism for capturing and storing error information within trace spans. The primary goal is to move away from embedding error details within the span's output and instead provide a structured error
field. This involves adding the new field to the TraceSpan
data model, implementing methods to record errors on spans, and modifying the tracing decorators (@observe
) to catch exceptions and record them using the new format. Documentation is also updated to reflect that errors are now captured.
Highlights
- Dedicated Error Field: A new optional
error
field (dictionary) has been added to theTraceSpan
data model (src/judgeval/data/trace.py
, line 17) to store structured error information, separating it from the function's output. - Record Error Method: A
record_error
method is added to theTraceClient
class (src/judgeval/common/tracer.py
, line 501) allowing errors to be explicitly associated with the current active span. - Exception Capturing Logic: A new helper function
_capture_exception_for_trace
(src/judgeval/common/tracer.py
, line 680) is introduced to format Python exception details (type, message, traceback) into the dictionary structure expected by the newerror
field. The tracing decorators (async_wrapper
andwrapper
insrc/judgeval/common/tracer.py
, lines 1157, 1178, 1224, 1246) are updated to catch exceptions and call this helper, recording the error on the span before re-raising the exception. Thesys.settrace
handler (_trace
insrc/judgeval/common/tracer.py
, line 871) is also updated to use this helper and now explicitly ignores common generator-related exceptions (StopIteration
,StopAsyncIteration
,GeneratorExit
). - Serialization Refactor: The serialization logic within
TraceSpan
(src/judgeval/data/trace.py
, line 74) is refactored. The separate_serialize_inputs
and_serialize_output
methods are removed, replaced by a single_serialize_value
method that handles serialization for inputs, output, and the new error field, supporting Pydantic models and gracefully handling non-serializable types. - Documentation Update: The documentation for the
@observe
decorator (docs/monitoring/tracing.mdx
, line 90) is updated to explicitly state that it captures errors in addition to input and output.
Changelog
Click here to see the changelog
- docs/monitoring/tracing.mdx
- Updated
@observe
documentation to mention error capture.
- Updated
- src/judgeval/common/tracer.py
- Added
record_error
method toTraceClient
. - Added
_capture_exception_for_trace
helper for formatting exceptions. - Modified
_trace
handler to use_capture_exception_for_trace
and ignore generator exceptions. - Wrapped function calls in
async_wrapper
andwrapper
with try/except blocks to capture and record exceptions using the new mechanism.
- Added
- src/judgeval/data/trace.py
- Added
error
field toTraceSpan
model. - Refactored serialization logic: removed
_serialize_inputs
and_serialize_output
, added_serialize_value
. - Updated
model_dump
to use_serialize_value
for inputs, output, and error.
- Added
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command>
or @gemini-code-assist <command>
. Below is a summary of the supported commands.
Feature | Command | Description |
---|---|---|
Code Review | /gemini review |
Performs a code review for the current pull request in its current state. |
Pull Request Summary | /gemini summary |
Provides a summary of the current pull request in its current state. |
Comment | @gemini-code-assist | Responds in comments when explicitly tagged, both in issue comments and review comments. |
Help | /gemini help |
Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/
folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Code runs, then it breaks,
A traceback error makes,
Now spans hold the fault,
No longer in output vault,
For easier debug stakes.
Footnotes
-
Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
This PR was merged into main into staging, backmerging to update staging |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code Review
This pull request introduces a new error recording mechanism, which is a valuable addition for tracing and debugging. Key changes include adding an error
field to TraceSpan
, a new record_error
method, and a helper function _capture_exception_for_trace
for consistent error formatting. The exception handling in the @observe
decorator and _DeepTracer
has also been improved.
Overall, the changes are well-implemented and enhance the robustness and utility of the tracing system. The refactoring of serialization logic into a unified _serialize_value
method is a notable improvement for code clarity and maintainability. The code generally adheres to PEP 8 style guidelines.
I have one suggestion for improving type clarity, detailed in the comments.
Summary of Findings
- Type Clarity for
TraceSpan.error
: Suggested usingTypedDict
for theTraceSpan.error
field to provide a more specific type definition thanOptional[Dict[str, Any]]
, enhancing type safety and clarity for the structured error information. This was provided as a review comment. - Comment Clarity in
_DeepTracer._trace
: Insrc/judgeval/common/tracer.py
, line 875 (RIGHT), the comment// exception handling will take priority.
within thereturn
event handling block could be slightly confusing as theexception
event would typically handle exceptions. This is a minor clarity point and was not commented on due to review settings (low severity).
Merge Readiness
The pull request is in good shape and introduces valuable improvements to error handling in the tracing system. The code is clear and the refactoring efforts are commendable. There is one medium-severity suggestion regarding the use of TypedDict
for the error
field in TraceSpan
to enhance type safety, which I recommend considering.
I am not authorized to approve pull requests. However, from my review, after addressing or considering the provided feedback, this PR should be ready for merging pending further reviews and approvals from the team.
@@ -14,6 +14,7 @@ class TraceSpan(BaseModel): | |||
parent_span_id: Optional[str] = None | |||
span_type: Optional[str] = "span" | |||
inputs: Optional[Dict[str, Any]] = None | |||
error: Optional[Dict[str, Any]] = None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error
field is currently typed as Optional[Dict[str, Any]]
. While flexible, this doesn't fully convey the expected structure of the error dictionary ({'type': str, 'message': str, 'traceback': List[str]}
).
Consider using a TypedDict
to define the structure of the formatted exception. This would improve type safety and make it clearer to consumers of the trace data what to expect in the error
field. What are your thoughts on this approach for better type definition?
from typing import TypedDict, List, Optional, Dict, Any # Ensure TypedDict is imported
class FormattedException(TypedDict):
type: str
message: str
traceback: List[str]
class TraceSpan(BaseModel):
span_id: str
trace_id: str
function: Optional[str] = None
depth: int
created_at: Optional[Any] = None
parent_span_id: Optional[str] = None
span_type: Optional[str] = "span"
inputs: Optional[Dict[str, Any]] = None
error: Optional[FormattedException] = None # Use TypedDict here
output: Optional[Any] = None
duration: Optional[float] = None
annotation: Optional[List[Dict[str, Any]]] = None
evaluation_runs: Optional[List[EvaluationRun]] = []
expected_tools: Optional[List[Tool]] = None
additional_metadata: Optional[Dict[str, Any]] = None
feat: start of new record_error
feat: error property on TraceSpan
chore(docs): update observe also records Error
chore: cleanup capture exception impl
chore: better comment on _serialize_value
📝 Summary
🎯 Purpose
🎥 Demo of Changes
🧪 Testing
✅ Checklist
📌 Linear Issue
✏️ Additional Notes