Add customizable PARTITION BY support for ClickHouse tables #164

Merged: bakwc merged 4 commits into master from feature/custom-partition-by on Jun 29, 2025

@bakwc (Owner) commented Jun 29, 2025

Summary

Adds support for customizable PARTITION BY expressions in ClickHouse table creation to address issues with Snowflake IDs creating too many partitions.

Changes

  • New config option: partition_bys with database/table filtering (similar to indexes)
  • Custom expressions: Override default intDiv(id, 4294967) with user-defined partition logic
  • Backward compatible: Falls back to existing behavior when not configured
  • Test coverage: Modified existing test to verify custom partition functionality

Configuration Example

partition_bys:
  - databases: '*'
    tables: ['test_table']
    partition_by: 'toYYYYMM(created_at)'

Problem Solved

Fixes an issue where Snowflake-style IDs (e.g., 1849360358546407424), combined with the default partitioning, create an excessive number of partitions and trigger ClickHouse's max_partitions_per_insert_block limit. Users can now specify time-based partitioning such as toYYYYMM(created_at).
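
To see why the default expression misbehaves, here is a rough back-of-the-envelope sketch. It assumes a Twitter-style Snowflake layout in which the low 22 bits hold the worker and sequence numbers, so one millisecond advances the ID by 2^22 = 4194304; the actual ID layout in a given system may differ. Under that assumption the default divisor 4294967 covers only about one millisecond of IDs per partition.

```python
# Divisor from the default expression intDiv(id, 4294967).
DEFAULT_DIVISOR = 4294967

# Assumption: Twitter-style Snowflake IDs keep the timestamp (in ms)
# above the low 22 bits, so +1 ms means +2**22 on the ID.
ID_STEP_PER_MS = 1 << 22


def default_partition(row_id):
    """Partition key the default expression would produce for an ID."""
    return row_id // DEFAULT_DIVISOR


def simulate_ids(start_id, count, step_ms):
    """Generate IDs for rows created step_ms milliseconds apart."""
    return [start_id + i * step_ms * ID_STEP_PER_MS for i in range(count)]


# 1000 rows created 1 ms apart, starting from the example ID in this PR.
ids = simulate_ids(1849360358546407424, 1000, 1)
distinct = len({default_partition(i) for i in ids})

# Roughly one partition per millisecond of data.
print(f"{len(ids)} rows -> {distinct} distinct partitions")
```

One second of such inserts therefore spans hundreds of partitions, far above the usual max_partitions_per_insert_block default of 100, whereas toYYYYMM(created_at) would place the same rows in a single monthly partition unless they straddle a month boundary.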

Fixes #161

bakwc added 4 commits June 29, 2025 21:08
- Add partition_bys config option similar to indexes with database/table filtering
- Support custom PARTITION BY expressions to override default intDiv(id, 4294967)
- Useful for time-based partitioning like toYYYYMM(created_at) for Snowflake IDs
- Maintains backward compatibility with existing default behavior
- Add test verification for custom partition_by functionality

Fixes #161

- Add proper deterministic partition_by expression: intDiv(id, 1000000)
- Update test to verify custom vs default partition expressions
- Ensure both CONFIG_FILE and CONFIG_FILE_MARIADB tests pass
- Fix CI failures caused by non-deterministic partition expressions

@bakwc merged commit 3727e3d into master on Jun 29, 2025 (1 check passed)
@bakwc deleted the feature/custom-partition-by branch on June 29, 2025 18:31

Linked issue

[FEATURE] Allow Custom PARTITION BY When Using Snowflake ID (bigint) in Initial Migration