Commit 1396273

Implementation, docs, tests
1 parent baf3b49 commit 1396273

6 files changed: +146 -34 lines changed

astroquery/mast/missions.py

Lines changed: 10 additions & 2 deletions
@@ -475,8 +475,16 @@ def filter_products(self, products, *, extension=None, **filters):
475475
Column-based filters to apply to the products table.
476476
477477
Each keyword corresponds to a column name in the table, with the argument being one or more
478-
acceptable values for that column. AND logic is applied between filters, OR logic within
479-
each filter set. For example: type="science", extension=["fits", "jpg"]
478+
acceptable values for that column. AND logic is applied between filters.
479+
480+
Within each column's filter set:
481+
482+
- Positive (non-negated) values are combined with OR logic.
483+
- Any negated values (prefixed with "!") are combined with AND logic against the ORed positives.
484+
This results in: (NOT any_negatives) AND (any_positives)
485+
Examples:
486+
``file_suffix=['A', 'B', '!C']`` → (file_suffix != C) AND (file_suffix == A OR file_suffix == B)
487+
``size=['!14400', '<20000']`` → (size != 14400) AND (size < 20000)
480488
481489
For columns with numeric data types (int or float), filter values can be expressed
482490
in several ways:
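The filter-set semantics added to this docstring can be checked by hand with plain NumPy. The following is a minimal sketch, not the library's implementation; the sample column values are hypothetical and do not come from a real MAST products table.

```python
import numpy as np

# Hypothetical sample column values (assumption, for illustration only).
file_suffix = np.array(["A", "B", "C", "D"])
size = np.array([11520, 14400, 17280, 25000])

# file_suffix=['A', 'B', '!C']
#   -> (file_suffix != C) AND (file_suffix == A OR file_suffix == B)
suffix_mask = (file_suffix != "C") & ((file_suffix == "A") | (file_suffix == "B"))

# size=['!14400', '<20000']
#   -> (size != 14400) AND (size < 20000)
size_mask = (size != 14400) & (size < 20000)

# AND logic is applied between the two column filters.
combined = suffix_mask & size_mask
```

Only rows that pass every column's filter set survive in `combined`.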

astroquery/mast/observations.py

Lines changed: 9 additions & 3 deletions
@@ -560,10 +560,16 @@ def filter_products(self, products, *, mrp_only=False, extension=None, **filters
560560
`here <https://masttest.stsci.edu/api/v0/_productsfields.html>`__.
561561
562562
Each keyword corresponds to a column name in the table, with the argument being one or more
563-
acceptable values for that column. AND logic is applied between filters, OR logic within
564-
each filter set.
563+
acceptable values for that column. AND logic is applied between filters.
565564
566-
For example: type="science", extension=["fits", "jpg"]
565+
Within each column's filter set:
566+
567+
- Positive (non-negated) values are combined with OR logic.
568+
- Any negated values (prefixed with "!") are combined with AND logic against the ORed positives.
569+
This results in: (NOT any_negatives) AND (any_positives)
570+
Examples:
571+
``productType=['A', 'B', '!C']`` → (productType != C) AND (productType == A OR productType == B)
572+
``size=['!14400', '<20000']`` → (size != 14400) AND (size < 20000)
567573
568574
For columns with numeric data types (int or float), filter values can be expressed
569575
in several ways:
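One consequence of the documented semantics may be worth noting: when the positive values are exact string matches, an extra negated value does not change the result, because any row equal to A or B is already different from C. Negation only narrows the result when the positives are broader conditions, such as numeric comparisons. The sketch below illustrates this with hypothetical column values; it does not call astroquery.

```python
import numpy as np

# Hypothetical sample column values (assumption, for illustration only).
product_type = np.array(["A", "B", "C", "D", "A"])
size = np.array([11520, 14400, 17280, 25000])

# productType=['A', 'B', '!C'] compared with productType=['A', 'B']
with_negation = (product_type != "C") & np.isin(product_type, ["A", "B"])
without_negation = np.isin(product_type, ["A", "B"])

# size=['!14400', '<20000'] compared with size='<20000'
narrowed = (size != 14400) & (size < 20000)
plain = size < 20000
```

Here `with_negation` and `without_negation` are identical, while `narrowed` drops the 14400 row that `plain` keeps.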

astroquery/mast/tests/test_mast.py

Lines changed: 35 additions & 1 deletion
@@ -374,7 +374,7 @@ def test_missions_filter_products(patch_post):
374374
filtered = mast.MastMissions.filter_products(products, extension='fits')
375375
assert len(filtered) > 0
376376

377-
# Numeric filtering
377+
# -------- Numeric filtering tests --------
378378
# Single integer value
379379
filtered = mast.MastMissions.filter_products(products, size=11520)
380380
assert all(filtered['size'] == 11520)
@@ -404,6 +404,40 @@ def test_missions_filter_products(patch_post):
404404
filtered = mast.MastMissions.filter_products(products, size=[14400, '>20000'])
405405
assert all((filtered['size'] == 14400) | (filtered['size'] > 20000))
406406

407+
# -------- Negative operator tests --------
408+
# Negate a single numeric value
409+
filtered = mast.MastMissions.filter_products(products, size='!11520')
410+
assert all(filtered['size'] != 11520)
411+
412+
# Negate a comparison
413+
filtered = mast.MastMissions.filter_products(products, size='!<15000')
414+
assert all(filtered['size'] >= 15000)
415+
416+
# Negate one element in a list with one other condition
417+
filtered = mast.MastMissions.filter_products(products, size=['!14400', '>20000'])
418+
assert all((filtered['size'] != 14400) & (filtered['size'] > 20000))
419+
420+
# Negate one element in a list with two other conditions
421+
filtered = mast.MastMissions.filter_products(products, size=['!14400', '<20000', '>30000'])
422+
assert all((filtered['size'] != 14400) & ((filtered['size'] < 20000) | (filtered['size'] > 30000)))
423+
424+
# Negate a range
425+
filtered = mast.MastMissions.filter_products(products, size='!14400..17280')
426+
assert all(~((filtered['size'] >= 14400) & (filtered['size'] <= 17280)))
427+
428+
# Negate a string match
429+
filtered = mast.MastMissions.filter_products(products, category='!CALIBRATED')
430+
assert all(filtered['category'] != 'CALIBRATED')
431+
432+
# Negate one string in a list
433+
filtered = mast.MastMissions.filter_products(products, category=['!CALIBRATED', 'UNCALIBRATED'])
434+
assert all((filtered['category'] != 'CALIBRATED') & (filtered['category'] == 'UNCALIBRATED'))
435+
436+
# Negate two strings in a list
437+
filtered = mast.MastMissions.filter_products(products, category=['!CALIBRATED', '!UNCALIBRATED'])
438+
assert all((filtered['category'] != 'CALIBRATED') & (filtered['category'] != 'UNCALIBRATED'))
439+
# ------------------------------------------
440+
407441
with pytest.raises(InvalidQueryError, match="Could not parse numeric filter 'invalid' for column 'size'"):
408442
# Invalid filter value
409443
mast.MastMissions.filter_products(products, size='invalid')
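A note on the assertions that mix `&` and `|`: in Python, `&` binds more tightly than `|`, so `a & b | c` is evaluated as `(a & b) | c`, not `a & (b | c)`. The sketch below uses hypothetical sizes to show when the two groupings agree and when they differ; it is an illustration, not part of the test suite.

```python
import numpy as np

size = np.array([14400, 15000, 25000, 35000])  # hypothetical values

below = size < 20000
above = size > 30000

# Excluding 14400: every size above 30000 already differs from 14400,
# so both groupings happen to select the same rows.
not_14400 = size != 14400
same_ungrouped = not_14400 & below | above      # parsed as (not_14400 & below) | above
same_grouped = not_14400 & (below | above)

# Excluding 35000: the excluded value satisfies a positive condition,
# so the grouping now matters.
not_35000 = size != 35000
diff_ungrouped = not_35000 & below | above      # lets 35000 back in
diff_grouped = not_35000 & (below | above)      # keeps 35000 excluded
```

Explicit parentheses around the ORed positives make the assertion match the documented `(NOT negatives) AND (positives)` form regardless of the values involved.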

astroquery/mast/utils.py

Lines changed: 74 additions & 26 deletions
@@ -347,7 +347,41 @@ def remove_duplicate_products(data_products, uri_key):
347347
return unique_products
348348

349349

350-
def parse_numeric_product_filter(val):
350+
def _combine_positive_negative_masks(mask_funcs):
351+
"""
352+
Combines a list of mask functions into a single mask function according to:
353+
- OR logic among positive masks
354+
- AND logic among negated masks, then AND with the ORed positives
355+
356+
Parameters
357+
----------
358+
mask_funcs : list of tuples (func, is_negated)
359+
Each element is a tuple where:
360+
- func: a callable that takes an array and returns a boolean mask
361+
- is_negated: boolean, True if the mask should be negated before combining
362+
363+
Returns
364+
-------
365+
combined : callable
366+
Function that takes a column array and returns the combined boolean mask.
367+
"""
368+
def combined(col):
369+
positive_masks = [f(col) for f, neg in mask_funcs if not neg]
370+
negative_masks = [~f(col) for f, neg in mask_funcs if neg]
371+
372+
# Use OR logic between positive masks
373+
pos_mask = np.logical_or.reduce(positive_masks) if positive_masks else np.ones(len(col), dtype=bool)
374+
375+
# Use AND logic between negative masks
376+
neg_mask = np.logical_and.reduce(negative_masks) if negative_masks else np.ones(len(col), dtype=bool)
377+
378+
# Use AND logic to combine positive and negative masks
379+
return pos_mask & neg_mask
380+
381+
return combined
382+
383+
384+
def parse_numeric_product_filter(vals):
351385
"""
352386
Parses a numeric product filter value and returns a function that can be used to filter
353387
a column of a product table.
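The `np.ones(...)` fallbacks in `_combine_positive_negative_masks` matter because reducing an empty list returns the ufunc's identity rather than an array: `False` for `logical_or`, `True` for `logical_and`. Without the fallback for positives, a filter set containing only negated values would select nothing. The sketch below is a standalone approximation of the combination step; `combine` is a hypothetical name, not the function in `utils.py`.

```python
import numpy as np

# Reducing an empty sequence yields the identity element of the ufunc.
empty_or = np.logical_or.reduce([])    # identity of OR
empty_and = np.logical_and.reduce([])  # identity of AND


def combine(positive_masks, negative_masks, n):
    """Hypothetical standalone version of the combination step."""
    # An empty list on either side falls back to an all-True mask of length n.
    pos = np.logical_or.reduce(positive_masks) if positive_masks else np.ones(n, dtype=bool)
    neg = np.logical_and.reduce(negative_masks) if negative_masks else np.ones(n, dtype=bool)
    return pos & neg


size = np.array([11520, 14400, 25000])  # hypothetical values

# Equivalent of size='!14400': no positives, one negated mask.
only_negated = combine([], [~(size == 14400)], len(size))
```

With the fallback in place, `only_negated` keeps every row except the one equal to 14400.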
@@ -369,30 +403,35 @@ def parse_numeric_product_filter(val):
369403
# Regular expression to match range patterns
370404
range_pattern = re.compile(r'[+-]?(\d+(\.\d*)?|\.\d+)\.\.[+-]?(\d+(\.\d*)?|\.\d+)')
371405

372-
def single_condition(cond):
373-
"""Helper function to create a condition function for a single value."""
374-
if isinstance(cond, (int, float)):
375-
return lambda col: col == float(cond)
376-
if cond.startswith('>='):
377-
return lambda col: col >= float(cond[2:])
378-
elif cond.startswith('<='):
379-
return lambda col: col <= float(cond[2:])
380-
elif cond.startswith('>'):
381-
return lambda col: col > float(cond[1:])
382-
elif cond.startswith('<'):
383-
return lambda col: col < float(cond[1:])
384-
elif range_pattern.fullmatch(cond):
385-
start, end = map(float, cond.split('..'))
406+
def base_condition(cond_str):
407+
"""Create a mask function for a numeric condition string (no negation handling here)."""
408+
if isinstance(cond_str, (int, float)):
409+
return lambda col: col == float(cond_str)
410+
elif cond_str.startswith('>='):
411+
return lambda col: col >= float(cond_str[2:])
412+
elif cond_str.startswith('<='):
413+
return lambda col: col <= float(cond_str[2:])
414+
elif cond_str.startswith('>'):
415+
return lambda col: col > float(cond_str[1:])
416+
elif cond_str.startswith('<'):
417+
return lambda col: col < float(cond_str[1:])
418+
elif range_pattern.fullmatch(cond_str):
419+
start, end = map(float, cond_str.split('..'))
386420
return lambda col: (col >= start) & (col <= end)
387421
else:
388-
return lambda col: col == float(cond)
422+
return lambda col: col == float(cond_str)
389423

390-
if isinstance(val, list):
391-
# If val is a list, create a condition for each value and combine them with logical OR
392-
conditions = [single_condition(v) for v in val]
393-
return lambda col: np.logical_or.reduce([cond(col) for cond in conditions])
394-
else:
395-
return single_condition(val)
424+
vals = [vals] if not isinstance(vals, list) else vals
425+
mask_funcs = []
426+
for v in vals:
427+
# Check if the value is negated and strip the negation if present
428+
is_negated = isinstance(v, str) and v.startswith('!')
429+
v = v[1:] if is_negated else v
430+
431+
func = base_condition(v)
432+
mask_funcs.append((func, is_negated))
433+
434+
return _combine_positive_negative_masks(mask_funcs)
396435

397436

398437
def apply_column_filters(products, filters):
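The parsing flow in this hunk (strip a leading `!`, then build a mask function from what remains) can be sketched in isolation as follows. `parse_condition` and `RANGE` are hypothetical names; the range pattern is copied from the diff. Unlike the code in the diff, this sketch converts the number eagerly, so it is an approximation rather than a copy of the library behaviour.

```python
import re

import numpy as np

# Range pattern copied from the diff, e.g. '14400..17280'.
RANGE = re.compile(r'[+-]?(\d+(\.\d*)?|\.\d+)\.\.[+-]?(\d+(\.\d*)?|\.\d+)')


def parse_condition(cond):
    """Hypothetical helper: return (mask_function, is_negated) for one filter value."""
    is_negated = isinstance(cond, str) and cond.startswith('!')
    if is_negated:
        cond = cond[1:]
    if isinstance(cond, (int, float)):
        return (lambda col: col == float(cond)), is_negated
    # Two-character operators are checked before their one-character prefixes.
    for op, compare in (('>=', np.greater_equal), ('<=', np.less_equal),
                        ('>', np.greater), ('<', np.less)):
        if cond.startswith(op):
            bound = float(cond[len(op):])
            return (lambda col, f=compare, b=bound: f(col, b)), is_negated
    if RANGE.fullmatch(cond):
        start, end = map(float, cond.split('..'))
        return (lambda col: (col >= start) & (col <= end)), is_negated
    value = float(cond)  # raises ValueError for input such as 'invalid'
    return (lambda col: col == value), is_negated


size = np.array([11520, 14400, 17280, 25000])  # hypothetical values

func, negated = parse_condition('!14400..17280')
outside_range = ~func(size) if negated else func(size)

func, negated = parse_condition('!<15000')
not_below = ~func(size) if negated else func(size)
```

Because negation is recorded as a flag rather than baked into the mask function, the caller can decide how negated and positive conditions are combined.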
@@ -426,11 +465,20 @@ def apply_column_filters(products, filters):
426465
except ValueError:
427466
raise InvalidQueryError(f"Could not parse numeric filter '{vals}' for column '{colname}'.")
428467
else: # Assume string or list filter
429-
if isinstance(vals, str):
430-
vals = [vals]
431-
this_mask = np.isin(col_data, vals)
468+
vals = [vals] if isinstance(vals, str) else vals
469+
mask_funcs = []
470+
for val in vals:
471+
# Check if the value is negated and strip the negation if present
472+
is_negated = isinstance(val, str) and val.startswith('!')
473+
v = val[1:] if is_negated else val
474+
475+
def func(col, v=v):
476+
return np.isin(col, [v])
477+
mask_funcs.append((func, is_negated))
478+
479+
this_mask = _combine_positive_negative_masks(mask_funcs)(col_data)
432480

433-
# Combine the current column mask with the overall mask
481+
# AND logic across different columns
434482
col_mask &= this_mask
435483

436484
return col_mask
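The `v=v` default argument in the string-filter branch is what binds each mask function to its own value. A function defined in a loop otherwise looks up the loop variable when it is called, by which time the loop has finished. The sketch below demonstrates the difference with hypothetical category values; it is an illustration of the Python behaviour, not code from the commit.

```python
import numpy as np

category = np.array(["CALIBRATED", "UNCALIBRATED", "AUXILIARY"])  # hypothetical values

late_bound = []
early_bound = []
for v in ["CALIBRATED", "UNCALIBRATED"]:
    # Looks up ``v`` at call time, so every function sees the final loop value.
    late_bound.append(lambda col: np.isin(col, [v]))
    # The default argument captures the value of ``v`` at definition time.
    early_bound.append(lambda col, v=v: np.isin(col, [v]))

late_results = [f(category).tolist() for f in late_bound]
early_results = [f(category).tolist() for f in early_bound]
```

Both entries of `late_results` match only "UNCALIBRATED", whereas `early_results` matches each value in turn.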

docs/mast/mast_missions.rst

Lines changed: 9 additions & 1 deletion
@@ -242,7 +242,15 @@ In many cases, you will not need to download every product that is associated wi
242242
`~astroquery.mast.MastMissionsClass.filter_products` function allows for filtering based on file extension (``extension``)
243243
and any other of the product fields.
244244

245-
The **AND** operation is performed for a list of filters, and the **OR** operation is performed within a filter set.
245+
The **AND** operation is applied between filters, and the **OR** operation is applied within each filter set, except in the case of negated values.
246+
247+
A filter value can be negated by prefixing it with ``!``, meaning that rows matching that value will be excluded from the results.
248+
When any negated value is present in a filter set, any positive values in that set are combined with **OR** logic, and the negated
249+
values are combined with **AND** logic against the positives.
250+
251+
For example:
252+
- ``file_suffix=['A', 'B', '!C']`` → (file_suffix != C) AND (file_suffix == A OR file_suffix == B)
253+
- ``size=['!14400', '<20000']`` → (size != 14400) AND (size < 20000)
246254

247255
For columns with numeric data types (``int`` or ``float``), filter values can be expressed in several ways:
248256
- A single number: ``size=100``

docs/mast/mast_obsquery.rst

Lines changed: 9 additions & 1 deletion
@@ -285,7 +285,15 @@ In many cases, you will not need to download every product that is associated wi
285285
`~astroquery.mast.ObservationsClass.filter_products` function allows for filtering based on minimum recommended products
286286
(``mrp_only``), file extension (``extension``), and any other of the `CAOM products fields <https://mast.stsci.edu/api/v0/_productsfields.html>`_.
287287

288-
The **AND** operation is performed for a list of filters, and the **OR** operation is performed within a filter set.
288+
The **AND** operation is applied between filters, and the **OR** operation is applied within each filter set, except in the case of negated values.
289+
290+
A filter value can be negated by prefixing it with ``!``, meaning that rows matching that value will be excluded from the results.
291+
When any negated value is present in a filter set, any positive values in that set are combined with **OR** logic, and the negated
292+
values are combined with **AND** logic against the positives.
293+
294+
For example:
295+
- ``productType=['A', 'B', '!C']`` → (productType != C) AND (productType == A OR productType == B)
296+
- ``size=['!14400', '<20000']`` → (size != 14400) AND (size < 20000)
289297

290298
For columns with numeric data types (``int`` or ``float``), filter values can be expressed in several ways:
291299
- A single number: ``size=100``
