[FLINK-36546] Handle batch sources in DataSinkTranslator #3646


Open · wants to merge 1 commit into base: master

Conversation

@morozov (Contributor) commented Oct 16, 2024

There's no public API in Flink to detect the boundedness of a stream, so this patch duplicates the code from StreamGraphGenerator that Flink itself uses to instantiate CommitterOperatorFactory.
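
For illustration, the check that StreamGraphGenerator performs looks roughly like the sketch below; the helper class and method names are made up here and this is not the exact code of the patch. A stream is treated as bounded only if neither its own transformation nor any of its transitive predecessors is an unbounded source.

import org.apache.flink.api.connector.source.Boundedness;
import org.apache.flink.api.dag.Transformation;
import org.apache.flink.streaming.api.transformations.WithBoundedness;

// Hypothetical helper mirroring the boundedness check duplicated from StreamGraphGenerator.
final class BoundednessSketch {

    // Bounded only if neither the transformation itself nor any transitive
    // predecessor is an unbounded source.
    static boolean isBounded(Transformation<?> transformation) {
        if (isUnboundedSource(transformation)) {
            return false;
        }
        return transformation.getTransitivePredecessors().stream()
                .noneMatch(BoundednessSketch::isUnboundedSource);
    }

    private static boolean isUnboundedSource(Transformation<?> transformation) {
        return transformation instanceof WithBoundedness
                && ((WithBoundedness) transformation).getBoundedness() != Boundedness.BOUNDED;
    }
}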

@lvyanquan (Contributor)

I think a test for this is still necessary.
We have two bounded sources (values, and MySQL with scan.startup.mode set to snapshot). I don't understand the point where you said "the sink never receives the end-of-input signal"; can you provide a more detailed description? We can add some logging in the flush method when endOfInput is true to verify this, as sketched below.
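
As a rough sketch of that suggestion (the writer class and log message below are made up, not code from this repository), the logging could live in a SinkWriter's flush method:

import java.io.IOException;
import org.apache.flink.api.connector.sink2.SinkWriter;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical writer that logs when the end-of-input flush arrives,
// so a bounded run can be verified from the logs.
class EndOfInputLoggingWriter<T> implements SinkWriter<T> {
    private static final Logger LOG = LoggerFactory.getLogger(EndOfInputLoggingWriter.class);

    @Override
    public void write(T element, Context context) throws IOException, InterruptedException {
        // write the element to the downstream system here
    }

    @Override
    public void flush(boolean endOfInput) throws IOException, InterruptedException {
        if (endOfInput) {
            LOG.info("flush() received endOfInput=true; the bounded input has finished");
        }
    }

    @Override
    public void close() throws Exception {
        // release resources here
    }
}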

@morozov (Contributor, Author) commented Oct 16, 2024

@lvyanquan please see the temporary reproducer in the second commit.

I don't understand the point where you said "the sink never receives the end-of-input signal"; can you provide a more detailed description?

I got confused. The problem is not that the sink never receives the end-of-input signal but that the two-phase committing sink doesn't commit the last checkpoint.

In order to reproduce the issue I had to make the following temporary changes:

  1. Make ValuesSink a TwoPhaseCommittingSink (see the sketch after this list).
  2. Bypass the usage of reflection in DataSinkTranslator. Otherwise, the test would fail with an IllegalAccessException, which I didn't know how to address.
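
As a rough illustration of item 1 (not the actual reproducer code; all names below are made up), a minimal TwoPhaseCommittingSink pairs a pre-committing writer with a committer, and logging in the committer makes a missing final commit visible:

import java.util.Collection;
import java.util.Collections;
import org.apache.flink.api.connector.sink2.Committer;
import org.apache.flink.api.connector.sink2.TwoPhaseCommittingSink;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Hypothetical sketch: the writer hands one dummy committable to the committer
// per checkpoint/end-of-input, and the committer logs how many it receives.
class LoggingCommittingSink<T> implements TwoPhaseCommittingSink<T, String> {
    private static final Logger LOG = LoggerFactory.getLogger(LoggingCommittingSink.class);

    @Override
    public PrecommittingSinkWriter<T, String> createWriter(InitContext context) {
        return new PrecommittingSinkWriter<T, String>() {
            @Override
            public void write(T element, Context writeContext) {
                // buffer or stage the element here
            }

            @Override
            public Collection<String> prepareCommit() {
                // committables returned here are handed to the committer
                return Collections.singletonList("committable");
            }

            @Override
            public void flush(boolean endOfInput) {}

            @Override
            public void close() {}
        };
    }

    @Override
    public Committer<String> createCommitter() {
        return new Committer<String>() {
            @Override
            public void commit(Collection<CommitRequest<String>> requests) {
                LOG.info("Committing {} committables.", requests.size());
            }

            @Override
            public void close() {}
        };
    }
}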

The point is that with the fix from the first commit, when DataSinkTranslatorBatchModeIT runs, it produces the following log message:

4890 [PostPartition -> Sink Writer: Value Sink -> Sink Committer: Value Sink (1/1)#0] INFO org.apache.flink.cdc.connectors.values.sink.ValuesDataSink$ValuesSink - Find me in the logs. Committing 1 committables.

If the first commit is reverted, this message won't be produced because the sink won't commit.

@morozov (Contributor, Author) commented Nov 15, 2024

@lvyanquan how do we proceed with this change? I propose that you verify the correctness of the fix by reverting it and running the updated test. Then I drop the commit that modifies the test. Otherwise, can you recommend an approach that would work for you?

@lvyanquan (Contributor) commented Dec 9, 2024

Hi @morozov, sorry for my late reply. I've verified your fix and it looks good to me, so you can revert the last commit if you want.

By the way, is the bounded source the MySQL CDC source running in snapshot-only mode?

@morozov force-pushed the FLINK-36546-handle-batch-sources branch from 2274b7e to a0a1c3e on December 9, 2024 17:36
@morozov (Contributor, Author) commented Dec 9, 2024

Hi @morozov, sorry for my late reply. I've verified your fix and it looks good to me, so you can revert the last commit if you want.

Thank you! Done.

By the way, is the bounded source the MySQL CDC source running in snapshot-only mode?

No. I'm working on a proprietary DataSink implementation and I use ValuesDataSource for integration testing – that's the bounded source.

BTW, is master the right target branch for this? Please let me know if you want it to be retargeted against release-3.2.

@lvyanquan (Contributor) left a comment

LGTM.

I guess the master branch is enough, as we plan to release 3.3.0 in the coming weeks.

@lvyanquan (Contributor)

Hi @leonardBang, PTAL.
The CI failure is unrelated to this PR, as the failure happened in the SQL Server connector.

@lvyanquan (Contributor)

Hi @morozov.
Could you add a test that uses Doris or Paimon as the sink, since they are TwoPhaseCommittingSinks?

@morozov (Contributor, Author) commented Apr 2, 2025

@lvyanquan I made a couple of attempts at writing a test and realized that there are no Doris or Paimon pipeline integration tests which I could reuse (or, more specifically, which would fail without this change if they existed).

In fact, there's only a single connector test (MySqlParallelizedPipelineITCase) that tests the connector as part of the pipeline. I don't think I can write such a test myself in a reasonable time. The test that I did write earlier (see 2274b7e) clearly demonstrated the issue.

Given all that, could we proceed without a test for now? I will be happy to assist if someone else is willing to write such a test.

@lvyanquan (Contributor)

Hi, @morozov.

We can build on the extensive test cases in #3812 to test it.
After #3812 is merged, we can rebase this and run CI to verify it. If CI passes, this PR should no longer have a blocker.

@lvyanquan (Contributor)

Hi @morozov. You can rebase onto master and test it with the existing cases in batch mode, referring to #3812.

@morozov (Contributor, Author) commented Apr 24, 2025

@lvyanquan on the one hand, it looks like #3812 itself could be used as a solution to my problem. In my PR, I'm trying to detect whether the pipeline needs to run in batch mode based on the source, while the PR you mentioned adds an explicit configuration option to enable batch mode. I could use the explicit configuration and withdraw my PR.

What I don't understand is how schema change events are handled in batch mode. I see that there is a new BatchSchemaOperator with the following code:

public void processElement(StreamRecord<Event> streamRecord) throws Exception {
    Event event = streamRecord.getValue();
    // Only catch create table event and data change event in batch mode
    if (event instanceof CreateTableEvent) {
        handleCreateTableEvent((CreateTableEvent) event);
    } else if (event instanceof DataChangeEvent) {
        if (!alreadyMergedCreateTableTables) {
            handleFirstDataChangeEvent();
            alreadyMergedCreateTableTables = true;
        }
        handleDataChangeEvent((DataChangeEvent) event);
    } else {
        throw new RuntimeException("Unknown event type in Batch record: " + event);
    }
}

It looks like it will fail to process any schema change event other than CreateTableEvent. Won't it?

Internally, I'm using the code changes from my other PR (#3999), and I'm trying to understand how this logic should apply to the batch schema operator.

Could you clarify how the batch mode is meant to handle schema changes?

@lvyanquan self-assigned this on Apr 29, 2025
@morozov (Contributor, Author) commented Jun 10, 2025

@lvyanquan could you take a look at my previous comment?
