Ahh/new error format #273

abhishekg999 · 2025-05-22T03:47:39Z

📝 Summary

Adds recording of errors and traceback in deep trace and regular, in the new TraceSpan error field.

🎯 Purpose

🎥 Demo of Changes

🧪 Testing

✅ Checklist

Self-review
Video demo of changes
Unit Tests and CI/CD tests are passing
Reviewers assigned

📌 Linear Issue

✏️ Additional Notes

gemini-code-assist

Hello @abhishekg999, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

Hello! Gemini here, providing a summary of this pull request to help everyone get oriented. This PR, titled "Ahh/new error format", aims to enhance the tracing system by introducing a dedicated field for recording errors and their tracebacks within trace spans. The primary goal is to capture detailed error information during both regular and deep tracing, for both synchronous and asynchronous function calls, and store it directly within the relevant TraceSpan object rather than potentially mixing it with regular output.

Highlights

Dedicated Error Recording Method: A new method record_error(self, error: Any) was added to the tracer class (src/judgeval/common/tracer.py line 502) to easily associate an error dictionary with the current trace span.
Error Capture in Tracing Wrappers: The tracing wrappers (async_wrapper and wrapper) for both regular and deep tracing have been modified to include try...except Exception blocks. When an exception occurs, the exception details (type, message, traceback) are captured, formatted into a dictionary, and recorded using the new record_error method before the exception is re-raised (src/judgeval/common/tracer.py lines 1163-1174, 1195-1206, 1252-1263, 1285-1296).
Serialization Refactor: The serialization logic within TraceSpan was refactored. The separate _serialize_inputs and _serialize_output methods were removed (src/judgeval/data/trace.py lines 46-68, 97-121) and replaced by a single generic _serialize_value method (src/judgeval/data/trace.py line 74) used for inputs, outputs, and the new error field.

Changelog

src/judgeval/common/tracer.py
- Added record_error method to associate errors with the current span (lines 502-508).
- Updated _trace method to use record_error instead of record_output for exceptions (lines 886-887).
- Added try...except blocks in async_wrapper to capture and record exceptions using record_error for both deep and regular async paths (lines 1163-1174, 1195-1206).
- Added try...except blocks in wrapper to capture and record exceptions using record_error for both deep and regular sync paths (lines 1252-1263, 1285-1296).
src/judgeval/data/trace.py
- Added error: Optional[Dict[str, Any]] = None field to TraceSpan model (line 17).
- Updated model_dump to include the new error field and use the refactored _serialize_value for inputs, outputs, and errors (lines 31-33).
- Removed the old _serialize_inputs method (lines 46-68).
- Renamed _serialize_output to _serialize_value and refactored it to handle any value (lines 74-98).

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature	Command	Description
Code Review	`/gemini review`	Performs a code review for the current pull request in its current state.
Pull Request Summary	`/gemini summary`	Provides a summary of the current pull request in its current state.
Comment	@gemini-code-assist	Responds in comments when explicitly tagged, both in issue comments and review comments.
Help	`/gemini help`	Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Code runs, then it breaks,
A traceback error makes,
Now spans hold the pain,
So we can debug again,
No more guessing mistakes.

Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩

gemini-code-assist

Code Review

This pull request introduces a new error field to the TraceSpan model and updates the tracing mechanism to record exceptions and their tracebacks into this field. This is a valuable addition for debugging and monitoring. The core logic for capturing errors is implemented in several places.

The main areas for improvement are adding documentation for the new record_error method, updating a docstring for a refactored method, and addressing significant code duplication in the exception handling logic within the tracer's wrappers. Consolidating the error formatting and recording into a helper function would greatly enhance maintainability.

Summary of Findings

Code Duplication in Exception Handling: The logic for formatting exception details (type, message, traceback) and recording them using current_trace.record_error() is duplicated in four places within the async_wrapper and wrapper methods in src/judgeval/common/tracer.py. Similar logic also exists in _DeepTracer._trace. This should be refactored into a common helper function to improve maintainability.
Missing Docstring: The new public method record_error in src/judgeval/common/tracer.py lacks a docstring, which makes it harder to understand its purpose and usage without reading the code.
Outdated Docstring: The docstring for the refactored method _serialize_value in src/judgeval/data/trace.py (formerly _serialize_output) still refers to serializing only 'output data', whereas it's now used for inputs, outputs, and errors. The docstring should be updated to reflect its broader utility.

Merge Readiness

The pull request introduces a useful feature for error tracking. However, there's a significant instance of code duplication in the exception handling logic that should be addressed by refactoring it into a helper function. Additionally, some docstrings for new and modified methods need to be added or updated for clarity.

Due to the high-severity code duplication issue, I recommend that these changes be addressed before merging. I am not authorized to approve pull requests, so please ensure other reviewers approve these changes once the feedback is incorporated.

src/judgeval/common/tracer.py

gemini-code-assist · 2025-05-22T03:49:26Z

src/judgeval/common/tracer.py

@@ -498,7 +498,15 @@ def record_output(self, output: Any):
            return span # Return the created entry
        # Removed else block - original didn't have one
        return None # Return None if no span_id found
-
+
+    def record_error(self, error: Any):


The new record_error method is a good addition for specifically handling errors within a span. Could we add a docstring to explain its purpose, parameters, and what it returns? This would improve code clarity and maintainability for other developers.

src/judgeval/data/trace.py

yifan1207

This is awesome Abhi! If you have time can you also update our Docs in judeval with these new changes?

abhishekg999 added 2 commits May 20, 2025 19:43

feat: start of new record_error

4d8484e

feat: error property on TraceSpan

b93b65f

gemini-code-assist bot reviewed May 22, 2025

View reviewed changes

gemini-code-assist bot suggested changes May 22, 2025

View reviewed changes

yifan1207 reviewed May 23, 2025

View reviewed changes

chore(docs): update observe also records Error

0adb534

abhishekg999 requested a review from yifan1207 May 24, 2025 00:30

abhishekg999 and others added 3 commits May 23, 2025 17:39

chore: cleanup capture exception impl

1918bf7

chore: better comment on _serialize_value

781644f

Merge branch 'main' into ahh/new-error-format

c973649

yifan1207 approved these changes May 25, 2025

View reviewed changes

abhishekg999 merged commit e2e30e1 into main May 26, 2025
2 of 3 checks passed

abhishekg999 deleted the ahh/new-error-format branch May 26, 2025 08:44

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Ahh/new error format #273

Ahh/new error format #273

Uh oh!

abhishekg999 commented May 22, 2025

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

gemini-code-assist bot left a comment

Uh oh!

Uh oh!

gemini-code-assist bot May 22, 2025

Uh oh!

Uh oh!

yifan1207 left a comment

Uh oh!

Uh oh!

Uh oh!

Ahh/new error format #273

Ahh/new error format #273

Uh oh!

Conversation

abhishekg999 commented May 22, 2025

📝 Summary

🎯 Purpose

🎥 Demo of Changes

🧪 Testing

✅ Checklist

📌 Linear Issue

✏️ Additional Notes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Summary of Changes

Highlights

Changelog

Footnotes

Uh oh!

gemini-code-assist bot left a comment

Choose a reason for hiding this comment

Code Review

Summary of Findings

Merge Readiness

Uh oh!

Uh oh!

gemini-code-assist bot May 22, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

yifan1207 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!