Skip to content

Conversation

BryanCutler
Copy link
Contributor

@BryanCutler BryanCutler commented Dec 18, 2024

Description

This adds support for row filtering and column masking to access control as part of governance requirements. Filters/masks are supplied as part of an access control implementation and then applied to queries according to matching table and column names supplied by access control.

These changes are cherry-picked from the following commits:

Adding support for row filters
Cherry-pick of trinodb/trino@fae3147
Co-authored-by: Martin Traverso mtraverso@gmail.com

Adding support for column masks
Cherry-pick of trinodb/trino@7e0d88e
Co-authored-by: Martin Traverso mtraverso@gmail.com

Disallow multiple masks on a given column
Cherry-pick of trinodb/trino@bdd1cb5
Co-authored-by: Martin Traverso mtraverso@gmail.com

Add access control SPI function for table column masks.
Cherry-pick of trinodb/trino@5da64a9
Co-authored-by: Cristian Osiac mosiac@bloomberg.net

Update core access control to fetch column masks in bulk
Cherry-pick of trinodb/trino@1dbbcb3
Co-authored-by: Cristian Osiac mosiac@bloomberg.net

For original IBM Presto port
Co-authored-by: Reetika Agrawal agrawal.reetika786@gmail.com

From #24278

Motivation and Context

Governance requirements include the ability to restrict access to certain rows of tables or sensitive data in certain columns. This adds the ability to provide filters/masks as expressions to Presto where they will then be applied to relevant queries.

Impact

Additions to SPI include access control interfaces to get row filters and column masks.

ConnectorAccessControl

    /**
     * Get row filters associated with the given table and identity.
     * <p>
     * Each filter must be a scalar SQL expression of boolean type over the columns in the table.
     *
     * @return the list of filters, or empty list if not applicable
     */
    default List<ViewExpression> getRowFilters(ConnectorTransactionHandle transactionHandle, ConnectorIdentity identity, AccessControlContext context, SchemaTableName tableName)
    {
        return Collections.emptyList();
    }

    /**
     * Bulk method for getting column masks for a subset of columns in a table.
     * <p>
     * Each mask must be a scalar SQL expression of a type coercible to the type of the column being masked. The expression
     * must be written in terms of columns in the table.
     *
     * @return a mapping from columns to masks, or an empty map if not applicable. The keys of the return Map are a subset of {@code columns}.
     */
    default Map<ColumnMetadata, ViewExpression> getColumnMasks(ConnectorTransactionHandle transactionHandle, ConnectorIdentity identity, AccessControlContext context, SchemaTableName tableName, List<ColumnMetadata> columns)
    {
        return Collections.emptyMap();
    }

Test Plan

Unit tests added.

Contributor checklist

  • Please make sure your submission complies with our contributing guide, in particular code style and commit standards.
  • PR description addresses the issue accurately and concisely. If the change is non-trivial, a GitHub Issue is referenced.
  • Documented new properties (with its default value), SQL syntax, functions, or other functionality.
  • If release notes are required, they follow the release notes guidelines.
  • Adequate tests were added if applicable.
  • CI passed.

Release Notes

Please follow release notes guidelines and fill in the release notes below.

== RELEASE NOTES ==

Security Changes
* Add support for row filtering and column masking in access control.

@prestodb-ci prestodb-ci added the from:IBM PR from IBM label Dec 18, 2024
@prestodb-ci prestodb-ci requested review from a team, imjalpreet and wanglinsong and removed request for a team December 18, 2024 19:07
@BryanCutler
Copy link
Contributor Author

There are some followup commits that I did not include here, we can add these in or look into as a followup:

Apply filters and masks for view object
trinodb/trino@6721cfa

Apply filters and masks for tables referenced by views
trinodb/trino@84429d8

Add test for join on masked column
trinodb/trino@705ed21

Record filter and mask information in query event
trinodb/trino@1b55b86

@jaystarshot
Copy link
Member

@tdcmeehan Its high time that we consider merging this stuff, as we discussed last time in the meeting doing in connector optimizer is not feasible for all cases and offerrs no benefit

@BryanCutler BryanCutler force-pushed the qaag-forward-fit-trino branch 2 times, most recently from 0c28887 to 3d1ecda Compare December 19, 2024 19:35
@tdcmeehan
Copy link
Contributor

I'm in favor of this approach. While it does increase our SPI footprint, I think it's nice to align with Trino's approach in this case because it makes porting plugins easier for folks. I think this is worth the small increase in footprint.

@BryanCutler
Copy link
Contributor Author

Updated to include followup commits to retrieve a list of row filters and multiple column masks with a single call to the interface.

{
String column = columnMetadata.getName();
if (analysis.hasColumnMask(tableName, column, currentIdentity)) {
throw new PrestoException(INVALID_ROW_FILTER, format("Column mask for '%s.%s' is recursive", tableName, column), null);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Can you please use INVALID_COLUMN_MASK error code in this method?


planBuilder = subqueryPlanner.handleSubqueries(planBuilder, mask, mask, context);

Map<VariableReferenceExpression, RowExpression> assignments = new LinkedHashMap<>();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please move the creation of assignments and the new planBuilder outside the for loop? This will allow invoking planBuilder.withNewRoot only once per method invocation, rather than creating a new planBuilder for each mask. It will improve efficiency and make the logic cleaner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that sounds better.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I got it working, but still new working with these parts so let me know if it can be improved.

Map<String, Expression> columnMasks = analysis.getColumnMasks(table);

List<VariableReferenceExpression> mappings = plan.getFieldMappings();
TranslationMap translations = new TranslationMap(plan, analysis, lambdaDeclarationToVariableMap);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: We can simplify this section by reusing some code. Instead of directly using translations.rewrite(mask), you could initialize the PlanBuilder with PlanBuilder planBuilder = initializePlanBuilder(plan);. Then, use planBuilder.rewrite(mask) to leverage the translations map already stored internally within planBuilder.

@BryanCutler BryanCutler force-pushed the qaag-forward-fit-trino branch from 74b9a9f to f1039f8 Compare February 18, 2025 19:42
@BryanCutler
Copy link
Contributor Author

Thanks for reviewing @uarturm , there is still some discussion on the RFC, then I'll remove from draft.

I am debating including the cherry pick of trinodb/trino@6721cfa with this. I don't want to deviate too much from the initial design, but this could be useful as well. What are you thoughts on it? cc @tdcmeehan @aaneja

@BryanCutler BryanCutler force-pushed the qaag-forward-fit-trino branch 2 times, most recently from 3321f0f to 83db787 Compare March 7, 2025 22:59
@BryanCutler BryanCutler marked this pull request as ready for review March 8, 2025 00:55
@BryanCutler BryanCutler requested a review from presto-oss March 8, 2025 00:55
@prestodb-ci prestodb-ci requested a review from a team March 8, 2025 00:55
@ScrapCodes ScrapCodes self-requested a review March 13, 2025 15:26
@BryanCutler BryanCutler force-pushed the qaag-forward-fit-trino branch from 83db787 to 1ebad17 Compare March 13, 2025 16:57
@BryanCutler
Copy link
Contributor Author

The test failure seems to be sporadic, or maybe from some state not getting reset. I'll have to look into it further.

column names. The returned filters and masks will be in the form of a ``ViewExpression``
that is then applied to the query plan before execution. Filters and masks can also be
supplied at the connector level from a ``ConnectorAccessControl`` implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@steveburnett please take a look at the doc additions, thanks!

steveburnett
steveburnett previously approved these changes Apr 18, 2025
Copy link
Contributor

@steveburnett steveburnett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! (docs)

Pull branch, local doc build, looks good. Thanks!

@BryanCutler BryanCutler force-pushed the qaag-forward-fit-trino branch from b8b256f to 76f2b2b Compare April 18, 2025 18:19
@BryanCutler BryanCutler requested a review from aaneja April 21, 2025 17:04
aaneja
aaneja previously approved these changes Apr 22, 2025
@BryanCutler
Copy link
Contributor Author

@jaystarshot is this something you are available to review and help merge?

@BryanCutler BryanCutler force-pushed the qaag-forward-fit-trino branch from 76f2b2b to d95968f Compare April 24, 2025 18:21
BryanCutler and others added 4 commits April 24, 2025 11:24
Cherry-pick of trinodb/trino@fae3147

This adds support for getting row filters from access control and
applying them to a query plan.

Co-authored-by: Martin Traverso <mtraverso@gmail.com>
Cherry-pick of trinodb/trino@7e0d88e

This adds support for getting column masks from access control and
applying them to a query plan.

With additional new tests for plan validation of row filters and column masks.

Co-authored-by: Martin Traverso <mtraverso@gmail.com>
Cherry-pick of trino trinodb/trino@bdd1cb5

Co-authored-by: Martin Traverso <mtraverso@gmail.com>
Cherry-pick of trinodb/trino@827de57
Cherry-pick of trinodb/trino@ae66a8b

Allow returning multiple filters and masks in ConnectorAccessControl
and SystemAccessControl.

Co-authored-by: Krzysztof Sobolewski <krzysztof.sobolewski@starburstdata.com>
@BryanCutler BryanCutler force-pushed the qaag-forward-fit-trino branch from d95968f to f043bdc Compare April 24, 2025 18:34
Copy link
Contributor

@ZacBlanco ZacBlanco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great improvement, just a few nits and a couple tests for complex types would be great

assertions.assertFails("DELETE FROM orders", "\\Qline 1:1: Delete from table with column mask is not supported\\E");
});
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we also do a few tests with complex/nested types such as ROW and ARRAY types

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose it would be good to make sure a ViewExpression can apply to a complex type. It would take me some time to work this out, would that be ok to work on as a followup?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you at least open a ticket and link it to this discussion so we don't lose track of this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue created: #24998

BryanCutler and others added 2 commits April 25, 2025 13:51
Cherry-pick of trinodb/trino@1dbbcb3

With related commit:
Add access control SPI function for table column masks
Cherry-pick of trinodb/trino@5da64a9

Co-authored-by: Cristian Osiac <mosiac@bloomberg.net>
This improves plan rewrite for column masks to only build a new root
node once, instead of creating a plan builder for each mask.

Add documentation for row filters and column masks access control implementation.
@BryanCutler BryanCutler dismissed stale reviews from aaneja and steveburnett via 5af314f April 25, 2025 20:51
@BryanCutler BryanCutler force-pushed the qaag-forward-fit-trino branch from f043bdc to 5af314f Compare April 25, 2025 20:52
Copy link
Contributor

@ZacBlanco ZacBlanco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just open the ticket for complex types first

assertions.assertFails("DELETE FROM orders", "\\Qline 1:1: Delete from table with column mask is not supported\\E");
});
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you at least open a ticket and link it to this discussion so we don't lose track of this?

@BryanCutler BryanCutler merged commit 1abad21 into prestodb:master Apr 28, 2025
99 checks passed
@BryanCutler BryanCutler deleted the qaag-forward-fit-trino branch April 28, 2025 20:29
@ZacBlanco ZacBlanco mentioned this pull request May 29, 2025
21 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
from:IBM PR from IBM
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants