Ahh/new error format (#273) #288

abhishekg999 · 2025-05-29T02:54:23Z

feat: start of new record_error
feat: error property on TraceSpan
chore(docs): update observe also records Error
chore: cleanup capture exception impl
chore: better comment on _serialize_value

📝 Summary

🎯 Purpose

🎥 Demo of Changes

🧪 Testing

✅ Checklist

Self-review
Video demo of changes
Unit Tests and CI/CD tests are passing
Reviewers assigned

📌 Linear Issue

✏️ Additional Notes

* feat: start of new record_error * feat: error property on TraceSpan * chore(docs): update observe also records Error * chore: cleanup capture exception impl * chore: better comment on _serialize_value --------- Co-authored-by: Minh Pham <62208322+Mandolaro@users.noreply.github.com>

gemini-code-assist

Hello @abhishekg999, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

Hello team, gemini-code-assist here to provide a summary of this pull request. This PR, titled "Ahh/new error format", introduces a dedicated mechanism for capturing and storing error information within trace spans. The primary goal is to move away from embedding error details within the span's output and instead provide a structured error field. This involves adding the new field to the TraceSpan data model, implementing methods to record errors on spans, and modifying the tracing decorators (@observe) to catch exceptions and record them using the new format. Documentation is also updated to reflect that errors are now captured.

Highlights

Dedicated Error Field: A new optional error field (dictionary) has been added to the TraceSpan data model (src/judgeval/data/trace.py, line 17) to store structured error information, separating it from the function's output.
Record Error Method: A record_error method is added to the TraceClient class (src/judgeval/common/tracer.py, line 501) allowing errors to be explicitly associated with the current active span.
Exception Capturing Logic: A new helper function _capture_exception_for_trace (src/judgeval/common/tracer.py, line 680) is introduced to format Python exception details (type, message, traceback) into the dictionary structure expected by the new error field. The tracing decorators (async_wrapper and wrapper in src/judgeval/common/tracer.py, lines 1157, 1178, 1224, 1246) are updated to catch exceptions and call this helper, recording the error on the span before re-raising the exception. The sys.settrace handler (_trace in src/judgeval/common/tracer.py, line 871) is also updated to use this helper and now explicitly ignores common generator-related exceptions (StopIteration, StopAsyncIteration, GeneratorExit).
Serialization Refactor: The serialization logic within TraceSpan (src/judgeval/data/trace.py, line 74) is refactored. The separate _serialize_inputs and _serialize_output methods are removed, replaced by a single _serialize_value method that handles serialization for inputs, output, and the new error field, supporting Pydantic models and gracefully handling non-serializable types.
Documentation Update: The documentation for the @observe decorator (docs/monitoring/tracing.mdx, line 90) is updated to explicitly state that it captures errors in addition to input and output.

Changelog

Click here to see the changelog

docs/monitoring/tracing.mdx
- Updated @observe documentation to mention error capture.
src/judgeval/common/tracer.py
- Added record_error method to TraceClient.
- Added _capture_exception_for_trace helper for formatting exceptions.
- Modified _trace handler to use _capture_exception_for_trace and ignore generator exceptions.
- Wrapped function calls in async_wrapper and wrapper with try/except blocks to capture and record exceptions using the new mechanism.
src/judgeval/data/trace.py
- Added error field to TraceSpan model.
- Refactored serialization logic: removed _serialize_inputs and _serialize_output, added _serialize_value.
- Updated model_dump to use _serialize_value for inputs, output, and error.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in issue comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Code runs, then it breaks,
A traceback error makes,
Now spans hold the fault,
No longer in output vault,
For easier debug stakes.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

abhishekg999 · 2025-05-29T02:55:52Z

This PR was merged into main into staging, backmerging to update staging

gemini-code-assist

Code Review

This pull request introduces a new error recording mechanism, which is a valuable addition for tracing and debugging. Key changes include adding an error field to TraceSpan, a new record_error method, and a helper function _capture_exception_for_trace for consistent error formatting. The exception handling in the @observe decorator and _DeepTracer has also been improved.

Overall, the changes are well-implemented and enhance the robustness and utility of the tracing system. The refactoring of serialization logic into a unified _serialize_value method is a notable improvement for code clarity and maintainability. The code generally adheres to PEP 8 style guidelines.

I have one suggestion for improving type clarity, detailed in the comments.

Summary of Findings

Type Clarity for TraceSpan.error: Suggested using TypedDict for the TraceSpan.error field to provide a more specific type definition than Optional[Dict[str, Any]], enhancing type safety and clarity for the structured error information. This was provided as a review comment.
Comment Clarity in _DeepTracer._trace: In src/judgeval/common/tracer.py, line 875 (RIGHT), the comment // exception handling will take priority. within the return event handling block could be slightly confusing as the exception event would typically handle exceptions. This is a minor clarity point and was not commented on due to review settings (low severity).

Merge Readiness

The pull request is in good shape and introduces valuable improvements to error handling in the tracing system. The code is clear and the refactoring efforts are commendable. There is one medium-severity suggestion regarding the use of TypedDict for the error field in TraceSpan to enhance type safety, which I recommend considering.

I am not authorized to approve pull requests. However, from my review, after addressing or considering the provided feedback, this PR should be ready for merging pending further reviews and approvals from the team.

gemini-code-assist · 2025-05-29T02:56:05Z

src/judgeval/data/trace.py

@@ -14,6 +14,7 @@ class TraceSpan(BaseModel):
    parent_span_id: Optional[str] = None
    span_type: Optional[str] = "span"
    inputs: Optional[Dict[str, Any]] = None
+    error: Optional[Dict[str, Any]] = None


The error field is currently typed as Optional[Dict[str, Any]]. While flexible, this doesn't fully convey the expected structure of the error dictionary ({'type': str, 'message': str, 'traceback': List[str]}).

Consider using a TypedDict to define the structure of the formatted exception. This would improve type safety and make it clearer to consumers of the trace data what to expect in the error field. What are your thoughts on this approach for better type definition?

from typing import TypedDict, List, Optional, Dict, Any # Ensure TypedDict is imported class FormattedException(TypedDict): type: str message: str traceback: List[str] class TraceSpan(BaseModel): span_id: str trace_id: str function: Optional[str] = None depth: int created_at: Optional[Any] = None parent_span_id: Optional[str] = None span_type: Optional[str] = "span" inputs: Optional[Dict[str, Any]] = None error: Optional[FormattedException] = None # Use TypedDict here output: Optional[Any] = None duration: Optional[float] = None annotation: Optional[List[Dict[str, Any]]] = None evaluation_runs: Optional[List[EvaluationRun]] = [] expected_tools: Optional[List[Tool]] = None additional_metadata: Optional[Dict[str, Any]] = None

gemini-code-assist bot reviewed May 29, 2025

View reviewed changes

abhishekg999 enabled auto-merge (squash) May 29, 2025 02:55

abhishekg999 requested a review from JCamyre May 29, 2025 02:56

gemini-code-assist bot reviewed May 29, 2025

View reviewed changes

justinsheu merged commit fa1bfe3 into staging May 30, 2025
4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Ahh/new error format (#273) #288

Ahh/new error format (#273) #288

Uh oh!

abhishekg999 commented May 29, 2025

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

abhishekg999 commented May 29, 2025

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot May 29, 2025

Uh oh!

Uh oh!

Uh oh!

Ahh/new error format (#273) #288

Ahh/new error format (#273) #288

Uh oh!

Conversation

abhishekg999 commented May 29, 2025

📝 Summary

🎯 Purpose

🎥 Demo of Changes

🧪 Testing

✅ Checklist

📌 Linear Issue

✏️ Additional Notes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Summary of Changes

Highlights

Changelog

Footnotes

Uh oh!

abhishekg999 commented May 29, 2025

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Summary of Findings

Merge Readiness

Uh oh!

gemini-code-assist bot May 29, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!