-
Notifications
You must be signed in to change notification settings - Fork 5.5k
Add support for row filtering and column masking access control #24277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for row filtering and column masking access control #24277
Conversation
There are some followup commits that I did not include here, we can add these in or look into as a followup: Apply filters and masks for view object Apply filters and masks for tables referenced by views Add test for join on masked column Record filter and mask information in query event |
@tdcmeehan Its high time that we consider merging this stuff, as we discussed last time in the meeting doing in connector optimizer is not feasible for all cases and offerrs no benefit |
0c28887
to
3d1ecda
Compare
I'm in favor of this approach. While it does increase our SPI footprint, I think it's nice to align with Trino's approach in this case because it makes porting plugins easier for folks. I think this is worth the small increase in footprint. |
3d1ecda
to
1aefdf0
Compare
Updated to include followup commits to retrieve a list of row filters and multiple column masks with a single call to the interface. |
{ | ||
String column = columnMetadata.getName(); | ||
if (analysis.hasColumnMask(tableName, column, currentIdentity)) { | ||
throw new PrestoException(INVALID_ROW_FILTER, format("Column mask for '%s.%s' is recursive", tableName, column), null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can you please use INVALID_COLUMN_MASK error code in this method?
|
||
planBuilder = subqueryPlanner.handleSubqueries(planBuilder, mask, mask, context); | ||
|
||
Map<VariableReferenceExpression, RowExpression> assignments = new LinkedHashMap<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please move the creation of assignments and the new planBuilder outside the for loop? This will allow invoking planBuilder.withNewRoot only once per method invocation, rather than creating a new planBuilder for each mask. It will improve efficiency and make the logic cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that sounds better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I got it working, but still new working with these parts so let me know if it can be improved.
Map<String, Expression> columnMasks = analysis.getColumnMasks(table); | ||
|
||
List<VariableReferenceExpression> mappings = plan.getFieldMappings(); | ||
TranslationMap translations = new TranslationMap(plan, analysis, lambdaDeclarationToVariableMap); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: We can simplify this section by reusing some code. Instead of directly using translations.rewrite(mask)
, you could initialize the PlanBuilder with PlanBuilder planBuilder = initializePlanBuilder(plan);
. Then, use planBuilder.rewrite(mask)
to leverage the translations map already stored internally within planBuilder.
74b9a9f
to
f1039f8
Compare
Thanks for reviewing @uarturm , there is still some discussion on the RFC, then I'll remove from draft. I am debating including the cherry pick of trinodb/trino@6721cfa with this. I don't want to deviate too much from the initial design, but this could be useful as well. What are you thoughts on it? cc @tdcmeehan @aaneja |
3321f0f
to
83db787
Compare
83db787
to
1ebad17
Compare
The test failure seems to be sporadic, or maybe from some state not getting reset. I'll have to look into it further. |
987a1b5
to
b8b256f
Compare
column names. The returned filters and masks will be in the form of a ``ViewExpression`` | ||
that is then applied to the query plan before execution. Filters and masks can also be | ||
supplied at the connector level from a ``ConnectorAccessControl`` implementation. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@steveburnett please take a look at the doc additions, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! (docs)
Pull branch, local doc build, looks good. Thanks!
b8b256f
to
76f2b2b
Compare
@jaystarshot is this something you are available to review and help merge? |
76f2b2b
to
d95968f
Compare
Cherry-pick of trinodb/trino@fae3147 This adds support for getting row filters from access control and applying them to a query plan. Co-authored-by: Martin Traverso <mtraverso@gmail.com>
Cherry-pick of trinodb/trino@7e0d88e This adds support for getting column masks from access control and applying them to a query plan. With additional new tests for plan validation of row filters and column masks. Co-authored-by: Martin Traverso <mtraverso@gmail.com>
Cherry-pick of trino trinodb/trino@bdd1cb5 Co-authored-by: Martin Traverso <mtraverso@gmail.com>
Cherry-pick of trinodb/trino@827de57 Cherry-pick of trinodb/trino@ae66a8b Allow returning multiple filters and masks in ConnectorAccessControl and SystemAccessControl. Co-authored-by: Krzysztof Sobolewski <krzysztof.sobolewski@starburstdata.com>
d95968f
to
f043bdc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great improvement, just a few nits and a couple tests for complex types would be great
presto-main-base/src/main/java/com/facebook/presto/security/FileBasedSystemAccessControl.java
Outdated
Show resolved
Hide resolved
presto-main-base/src/main/java/com/facebook/presto/security/AllowAllSystemAccessControl.java
Outdated
Show resolved
Hide resolved
presto-main-base/src/main/java/com/facebook/presto/security/AllowAllSystemAccessControl.java
Outdated
Show resolved
Hide resolved
presto-main-base/src/main/java/com/facebook/presto/security/FileBasedSystemAccessControl.java
Outdated
Show resolved
Hide resolved
presto-main-base/src/test/java/com/facebook/presto/security/TestAccessControlManager.java
Show resolved
Hide resolved
presto-main-base/src/test/java/com/facebook/presto/sql/query/TestColumnMask.java
Show resolved
Hide resolved
assertions.assertFails("DELETE FROM orders", "\\Qline 1:1: Delete from table with column mask is not supported\\E"); | ||
}); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we also do a few tests with complex/nested types such as ROW and ARRAY types
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose it would be good to make sure a ViewExpression
can apply to a complex type. It would take me some time to work this out, would that be ok to work on as a followup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you at least open a ticket and link it to this discussion so we don't lose track of this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Issue created: #24998
presto-main-base/src/test/java/com/facebook/presto/sql/query/TestRowFilter.java
Show resolved
Hide resolved
...n-toolkit/src/main/java/com/facebook/presto/plugin/base/security/FileBasedAccessControl.java
Outdated
Show resolved
Hide resolved
...n-toolkit/src/main/java/com/facebook/presto/plugin/base/security/FileBasedAccessControl.java
Outdated
Show resolved
Hide resolved
Cherry-pick of trinodb/trino@1dbbcb3 With related commit: Add access control SPI function for table column masks Cherry-pick of trinodb/trino@5da64a9 Co-authored-by: Cristian Osiac <mosiac@bloomberg.net>
This improves plan rewrite for column masks to only build a new root node once, instead of creating a plan builder for each mask. Add documentation for row filters and column masks access control implementation.
f043bdc
to
5af314f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, just open the ticket for complex types first
assertions.assertFails("DELETE FROM orders", "\\Qline 1:1: Delete from table with column mask is not supported\\E"); | ||
}); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you at least open a ticket and link it to this discussion so we don't lose track of this?
Description
This adds support for row filtering and column masking to access control as part of governance requirements. Filters/masks are supplied as part of an access control implementation and then applied to queries according to matching table and column names supplied by access control.
These changes are cherry-picked from the following commits:
Adding support for row filters
Cherry-pick of trinodb/trino@fae3147
Co-authored-by: Martin Traverso mtraverso@gmail.com
Adding support for column masks
Cherry-pick of trinodb/trino@7e0d88e
Co-authored-by: Martin Traverso mtraverso@gmail.com
Disallow multiple masks on a given column
Cherry-pick of trinodb/trino@bdd1cb5
Co-authored-by: Martin Traverso mtraverso@gmail.com
Add access control SPI function for table column masks.
Cherry-pick of trinodb/trino@5da64a9
Co-authored-by: Cristian Osiac mosiac@bloomberg.net
Update core access control to fetch column masks in bulk
Cherry-pick of trinodb/trino@1dbbcb3
Co-authored-by: Cristian Osiac mosiac@bloomberg.net
For original IBM Presto port
Co-authored-by: Reetika Agrawal agrawal.reetika786@gmail.com
From #24278
Motivation and Context
Governance requirements include the ability to restrict access to certain rows of tables or sensitive data in certain columns. This adds the ability to provide filters/masks as expressions to Presto where they will then be applied to relevant queries.
Impact
Additions to SPI include access control interfaces to get row filters and column masks.
ConnectorAccessControl
Test Plan
Unit tests added.
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.