-
Notifications
You must be signed in to change notification settings - Fork 0
Use omics_processing_output_association for data object queries #1711
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
fdfb24b
to
6f961b4
Compare
db7a028
to
12c4786
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR migrates the codebase from using a direct foreign key relationship between DataObject and OmicsProcessing to using a many-to-many association table called omics_processing_output_association
. This change allows data objects to be associated with multiple omics processing records instead of just one.
Key changes:
- Replace direct foreign key queries with association table joins in query and aggregation functions
- Update the data ingestion pipeline to populate the new association table
- Remove the legacy data generation relation update function
Reviewed Changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.
Show a summary per file
File | Description |
---|---|
nmdc_server/query.py | Updates data object filter queries to use association table joins |
nmdc_server/models.py | Adds new many-to-many relationship using association table |
nmdc_server/ingest/pipeline.py | Populates association table during data ingestion |
nmdc_server/ingest/data_object.py | Removes legacy foreign key update function |
nmdc_server/ingest/all.py | Removes call to legacy foreign key update function |
nmdc_server/crud.py | Updates to use new relationship property |
nmdc_server/aggregations.py | Updates aggregation queries to use association table |
tests/test_download.py | Adds test data to populate new relationship |
I'm over-assigning reviewers to this one in an effort to get eyes on it ASAP. I believe the changes here are straightforward but I'm happy to add details if needed. @aclum this should address the data object aggregation issue on the bulk download widget, as well as some other things that needed to be changed for the bulk download workflow to accommodate the data model changes in #1701. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From skimming the diff, I think this PR does the following:
- Updates some SQLAlchemy queries that used to involve the
DataObject
model, so that they now involve theomics_processing_output_association
table - Updates an attribute access to get the first "omics_processing" from a list, instead of getting the "omics_processing" from a scalar variable
- Removes code that would update references between the "data object" and "data generation" tables during ingest. Adds ingest code that establishes those references in a different way.
- Defines an
omics_processings
relationship (model) - Adds something to a test (this change, I don't really understand)
I approved in the interest of facilitating a merge once the changes are finalized (e.g. when the tests are passing). I am not familiar with the code that was changed and so my approval represents me saying, "I looked at all the changes and didn't see anything that looks to me like it was done by mistake." |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
e05ab13
to
01e8948
Compare
No description provided.