-
Notifications
You must be signed in to change notification settings - Fork 50
Description
Overview
As described by #558 (comment), API for evaluating the WCS
's transforms is not the same between different methods for evaluation such as .__call__
, .invert
, and .transform
. Moreover, there is quite a bit of code
path duplication related to all these methods that should be streamlined.
Note that #559, #557, #445, #550 are likely related to this.
Goals
For GWCS 1.0 we should have the goals to:
- Have a well defined API for all the different methods for evaluating the GWCS.
- This API should attempt to be intuitive in its operation by default whenever possible. The output type should follow from the input type.
- The API should be as explicit as possible, especially when having it work in a no default fashion.
- Conform to the APE 14 specifications where necessary.
- Be well documented.
Background
All this being said the main crux of the issue follows from an "oversight" in APE 14's description of its interface. In APE 14, the notion of "high-level" and "low-level" objects is introduced, where
- "low-level" objects are pure scalars and arrays (NumPy arrays) without any descriptive metadata. It is not entirely clear, if it allows more descriptive objects which can "act" like scalars and array, where in those interfaces will ignore any of this metadata.
- "high-level" objects contain not only the data but also descriptive information. Effectively, this is describing things like
astropy
'sQuantity
andastropy.coordinates
"coordinate" objects.
Additionally, APE 14 only discusses the "pixel" and "world" systems with no discussion of intermediates like GWCS does; meaning that, all it discusses is the pixel-to-world and world-to-pixel coordinate transforms. The "oversight" in APE 14 lies in the fact that it is implicitly assuming that there are no "high-level" objects for the "pixel" coordinates. Namely, it describes
Essentially, this means that APE 14 is assuming the pixel coordinates are just arrays whose index conveys all the metadata. In some sense, it is assuming that the pixel inputs are always in u.pix
units. Though it is not explicitly stating this. The consequence of this is that APE 14 creates an ambiguity about what "high-level" means in the pixel coordinates.
This oversight, does not effect APE 14 itself because it is limited to only the "pixel" and "world" coordinates. This however creates an ambiguity for GWCS. Since GWCS is describing more than just pixel and world coordinates, namely "intermediate" coordinates, but is framing all the transforms through the APE 14 "pixel" and "world" coordinates. This means that transforms moving "forward" (towards the world) assume that Quantity
objects inputs are "low-level" in a lot of ways, while at the same time assuming they are "high-level" objects as outputs. This means that for the non APE 14 methods like .__call__
and .invert
, Quantity
inputs are not treated the same which is not intuitive.
A partial attempt to resolve this issue was introduced to GWCS with the with_units
key word argument, which is a simple boolean True
/False
. As this argument is implemented, it creates more issues than it solves. For example if one passes a Quantity
to __call__
, intuitively one might expect to get a Quantity
back, but with_units
defaults to False
meaning that it in theory should not be returning a Quantity
by default. However, as #558 has uncovered, this is not always the case. Moreover, if one then looks at .invert
things are even more muddy. Essentially, with_units
is doing more harm than good.
Moreover, there are code path issues involving .invert
involving:
Lines 302 to 303 in bfedbdf
if is_high_level(*args, low_level_wcs=self): | |
args = high_level_objects_to_values(*args, low_level_wcs=self) |
.__call__
does not. There are also duplicate code paths, and even more things muddying up behavior. Further analysis also leads to subtle inconsistencies arising in all this with respect to the APE 14 defined API.
Proposal
As @chris-simpson has pointed out in #558 (comment) the behavior in all these cases should not be different. I would also interpret some of the concerns raised in #558 as falling into the category of GWCS's API being "surprising" in its behavior.
Thus I propose that several actions be taken:
- Create specific, unambiguous, and easy to find documentation defining "high-level" and "low-level" objects for GWCS.
- We should attempt to indicate this somehow in the API "type" listing in the docstrings/API docs.
- For clear reasons, this is important for us to lay out as in some sense is what has caused the API issues to start with.
- Possibly submit and update to APE 14 so that it is more clear about this as well?
- All GWCS "evaluation" API calls should try to funnel into a single code path rather than many duplicate ones. That is there are many cases of functions duplicating the same or similar code paths. This should be centralized into a single generalized "evaluation" method that the specific "evaluation" methods all create inputs for.
- I propose
Lines 1060 to 1082 in bfedbdf
def transform( self, from_frame: str | CoordinateFrame, to_frame: str | CoordinateFrame, *args, with_units: bool = False, **kwargs, ): """ Transform positions between two frames. Parameters ---------- from_frame : str or `~gwcs.coordinate_frames.CoordinateFrame` Initial coordinate frame. to_frame : str, or instance of `~gwcs.coordinate_frames.CoordinateFrame` Coordinate frame into which to transform. args : float or array-like Inputs in ``from_frame``, separate inputs for each dimension. with_bounding_box : bool, optional If True(default) values in the result which correspond to any of the inputs being outside the bounding_box are set to ``fill_value``. """ - By choosing this function any pre and post processing such as the
high_level_objects_to_value
calls can be centralized so that there is no behavioral deviations in other specific calls. - Basically, one expects GWCS's evaluation to act the same way no matter what "kind" of evaluation one is taking about.
- I propose
- Remove (deprecate?) and replace the
with_units
key word boolean with anoutput_level
key word string.- The term
with_units
is a bit misleading because it can in fact do more than just outputQuantity
objects, it basically casts to "high-level" objects that are referenced in theCoordinateFrame
object for the target (such as "world") frame. Something likeoutput_level
more clearly describes the intent as controlling what you get out and references the notion "object" level woven throughout this discussion.- I am open to alternate names to
output_level
.
- I am open to alternate names to
output_level
should be a string literal rather than a boolean so that it is descriptive and has more than two options. I suggest:"match"
as the default. Meaning the "level" of the output matches the "level" of the input. Meaning if you pass aQuantity
you expect to get the listed "high-level" object for theto_frame
, which maybe aQuantity
but could also be something likeSkyCoord
. Similarly, passing a pure array or scalar would result in a pure array or scalar."high"
, meaning the level of the output should be the "high-level" listed by theto_frame
."low"
, meaning the level of the output should be a scalar or array.
- By being a literal string, this leaves us the opportunity to extend this behavior easily if we come across more cases to account for.
- By having
"match"
as the default we follow the "least surprising" idea.
- The term
- Add something like an
.evaluate
method that acts so that.evaluate(..., direction="forward")
evaluates the full forward transform from the pixel frame to the world frame and.evaluate(..., direction="backward")
evaluates the full backward transform from the world frame to the pixel frame.- For those wishing be explicit in their usage, this acts as a clear way to indicate how evaluation is occurring. It also indicates a symmetry in the API.
.__call__
and.invert
should effectively be aliases of.evaluate(direction="forward")
and.evaluate(direction="backward")
respectively.- Default should be
direction="forward
, so this acts as a "match" to.invert
in some sense.
- Review and update the APE 14 related API so that it conforms with APE 14 AND uses the same code paths as the rest of GWCS.
- Provide a detailed documentation of the internal implementation mechanisms. Much of this is about keep the internal implementation consistent. I don't want us to deviate from this by accident or misunderstanding of the code.
Additional Notes
- Since it is possible to define astropy models which require quantities, there are complexities around dealing with the pure scalar/array inputs and passing them to the transforms themselves. This also may result in
Quantity
outputs which need to be accounted for as well.- I know PR Fixes for bounding box units #554 and issue Numerical Inverse needs to be a bit careful about units #550 are related to this because the specific implementations of the numerical inverse and FITS SIP stuff calls into methods that are not clear about their expectations.
- The origin of Identity and Mapping transforms always return units (wrongly) in v0.24 #558 is rooted in an issue related to models requiring units or not, so this will require special consideration.
Warning
It is important to note that despite its age, GWCS has still not made a stable release. This mean's its API and functionality is still evolving. While reasonable attempts will be made to avoid breaking changes with older versions, this may happen accidentally or deliberately. Our goal is to move towards a clean and consistent API that we can maintain in a stable version soon.
- Hopefully, finalizing this work will be (one) of the last steps prior to a stable release.
- Downstream users are strongly encouraged to provide their opinions and feedback towards the goals laid out here.
- We recommend that downstream users test their code against the API changes and provide prompt feedback. We are more than willing to assist you if we break things and cannot easily find a way to provide a bridging method for backwards compatibility.
- Indeed, we strongly encourage our downstream users to request (or submit a PR) adding their package to the GWCS downstream testing. "Real" use cases are extremely useful and this will help us avoid disruption to those users as much as possible.