Use omics_processing_output_association for data object queries #1711

naglepuff · 2025-07-24T18:41:42Z

No description provided.

Copilot

Pull Request Overview

This PR migrates the codebase from using a direct foreign key relationship between DataObject and OmicsProcessing to using a many-to-many association table called omics_processing_output_association. This change allows data objects to be associated with multiple omics processing records instead of just one.

Key changes:

Replace direct foreign key queries with association table joins in query and aggregation functions
Update the data ingestion pipeline to populate the new association table
Remove the legacy data generation relation update function
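
To make the shape of the migration concrete, here is a minimal sketch of a many-to-many association table. It uses Python's standard-library `sqlite3`, not the project's actual SQLAlchemy models. Only the table name `omics_processing_output_association` comes from this PR. The column names (`omics_processing_id`, `data_object_id`), the sample IDs, and the helper functions are assumptions made for illustration.

```python
import sqlite3
from typing import List


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a many-to-many association table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE omics_processing (id TEXT PRIMARY KEY);
        CREATE TABLE data_object (id TEXT PRIMARY KEY);
        -- Column names below are assumptions, not taken from the PR.
        CREATE TABLE omics_processing_output_association (
            omics_processing_id TEXT NOT NULL REFERENCES omics_processing(id),
            data_object_id TEXT NOT NULL REFERENCES data_object(id),
            PRIMARY KEY (omics_processing_id, data_object_id)
        );
        """
    )
    conn.executemany("INSERT INTO omics_processing VALUES (?)", [("op1",), ("op2",)])
    conn.executemany("INSERT INTO data_object VALUES (?)", [("do1",), ("do2",)])
    # do1 is linked to two omics processing records, which a single
    # foreign key column on data_object could not express.
    conn.executemany(
        "INSERT INTO omics_processing_output_association VALUES (?, ?)",
        [("op1", "do1"), ("op2", "do1"), ("op2", "do2")],
    )
    return conn


def data_objects_for(conn: sqlite3.Connection, omics_processing_id: str) -> List[str]:
    """Return IDs of data objects linked to one omics processing record."""
    rows = conn.execute(
        """
        SELECT d.id
        FROM data_object AS d
        JOIN omics_processing_output_association AS a ON a.data_object_id = d.id
        WHERE a.omics_processing_id = ?
        ORDER BY d.id
        """,
        (omics_processing_id,),
    ).fetchall()
    return [row[0] for row in rows]


def omics_processings_for(conn: sqlite3.Connection, data_object_id: str) -> List[str]:
    """Return IDs of omics processing records linked to one data object."""
    rows = conn.execute(
        """
        SELECT a.omics_processing_id
        FROM omics_processing_output_association AS a
        WHERE a.data_object_id = ?
        ORDER BY a.omics_processing_id
        """,
        (data_object_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

In this sketch, `omics_processings_for(conn, "do1")` returns both `op1` and `op2`, which is the case the association table enables.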

Reviewed Changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

File	Description
nmdc_server/query.py	Updates data object filter queries to use association table joins
nmdc_server/models.py	Adds new many-to-many relationship using association table
nmdc_server/ingest/pipeline.py	Populates association table during data ingestion
nmdc_server/ingest/data_object.py	Removes legacy foreign key update function
nmdc_server/ingest/all.py	Removes call to legacy foreign key update function
nmdc_server/crud.py	Updates to use new relationship property
nmdc_server/aggregations.py	Updates aggregation queries to use association table
tests/test_download.py	Adds test data to populate new relationship
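
One thing that may be worth checking when aggregations go through an association table: a data object linked to several omics processing records appears once per link, so a plain row count can exceed the number of distinct data objects. The sketch below illustrates this with standard-library `sqlite3`. It is not the code in `nmdc_server/aggregations.py`. The column names, sample rows, and function names are assumptions.

```python
import sqlite3
from typing import Sequence, Tuple


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory association table with hypothetical sample links."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE omics_processing_output_association ("
        "omics_processing_id TEXT NOT NULL, "
        "data_object_id TEXT NOT NULL, "
        "PRIMARY KEY (omics_processing_id, data_object_id))"
    )
    conn.executemany(
        "INSERT INTO omics_processing_output_association VALUES (?, ?)",
        [("op1", "do1"), ("op2", "do1"), ("op2", "do2")],
    )
    return conn


def count_outputs(
    conn: sqlite3.Connection, omics_processing_ids: Sequence[str]
) -> Tuple[int, int]:
    """Return (association rows, distinct data objects) for the given IDs."""
    if not omics_processing_ids:
        return (0, 0)
    # One "?" placeholder per ID keeps the query parameterized.
    placeholders = ", ".join("?" for _ in omics_processing_ids)
    sql = (
        "SELECT COUNT(*), COUNT(DISTINCT data_object_id) "
        "FROM omics_processing_output_association "
        f"WHERE omics_processing_id IN ({placeholders})"
    )
    row_count, distinct_count = conn.execute(sql, list(omics_processing_ids)).fetchone()
    return (row_count, distinct_count)
```

With the sample rows above, selecting both `op1` and `op2` matches three association rows but only two distinct data objects, because `do1` is linked to both.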

naglepuff · 2025-07-24T21:06:02Z

I'm over-assigning reviewers to this one in an effort to get eyes on it ASAP. I believe the changes here are straightforward but I'm happy to add details if needed. @aclum this should address the data object aggregation issue on the bulk download widget, as well as some other things that needed to be changed for the bulk download workflow to accommodate the data model changes in #1701.

eecavanna

From skimming the diff, I think this PR does the following:

Updates some SQLAlchemy queries that used to involve the DataObject model, so that they now involve the omics_processing_output_association table
Updates an attribute access to get the first "omics_processing" from a list, instead of getting the "omics_processing" from a scalar variable
Removes code that would update references between the "data object" and "data generation" tables during ingest. Adds ingest code that establishes those references in a different way.
Defines an omics_processings relationship (model)
Adds something to a test (this change, I don't really understand)
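
The second point, together with the later commit "Check for empty omics_processings list", can be sketched as follows. The dataclasses stand in for the real ORM models, and `first_omics_processing` is a hypothetical helper, not a function from the PR. Only the attribute name `omics_processings` comes from this thread.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class OmicsProcessing:
    id: str


@dataclass
class DataObject:
    id: str
    # Previously a scalar reference; with the association table it is a list.
    omics_processings: List[OmicsProcessing] = field(default_factory=list)


def first_omics_processing(data_object: DataObject) -> Optional[OmicsProcessing]:
    """Return the first linked omics processing record, or None if there are none."""
    if not data_object.omics_processings:
        # Indexing [0] on an empty list would raise IndexError.
        return None
    return data_object.omics_processings[0]
```

Returning `None` for an unlinked data object is one possible choice. The caller still has to decide what to do when no omics processing record is available.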

eecavanna · 2025-07-24T21:20:46Z

I approved in the interest of facilitating a merge once the changes are finalized (e.g. when the tests are passing). I am not familiar with the code that was changed and so my approval represents me saying, "I looked at all the changes and didn't see anything that looks to me like it was done by mistake."

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

naglepuff added 3 commits July 24, 2025 14:54

Use existing assoc. table for OP to DO

b6b8009

Hook up test data correctly

e764dc0

Use correct column name for DataObject schema

6f961b4

naglepuff force-pushed the fix-1701-bulk-download-summary branch from fdfb24b to 6f961b4 Compare July 24, 2025 20:30

Fix bulk download path generation

12c4786

naglepuff force-pushed the fix-1701-bulk-download-summary branch from db7a028 to 12c4786 Compare July 24, 2025 21:03

naglepuff requested review from marySalvi, eecavanna and pkalita-lbl July 24, 2025 21:03

eecavanna requested a review from Copilot July 24, 2025 21:05

eecavanna assigned naglepuff Jul 24, 2025

Copilot AI reviewed Jul 24, 2025

nmdc_server/crud.py (resolved)

nmdc_server/ingest/pipeline.py (resolved)

eecavanna approved these changes Jul 24, 2025

pkalita-lbl approved these changes Jul 24, 2025

Check for empty omics_processings list

01e8948

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

naglepuff force-pushed the fix-1701-bulk-download-summary branch from e05ab13 to 01e8948 Compare July 24, 2025 21:41

naglepuff commented Jul 24, 2025

tests/test_download.py (resolved)

naglepuff merged commit da517e7 into main Jul 24, 2025
2 checks passed

naglepuff deleted the fix-1701-bulk-download-summary branch July 24, 2025 22:00
