Commit 4f73a6f

author
linjc13
committed
[FLINK-36796][pipeline-connector][oracle]add oracle pipeline connector.
1 parent 95fe4d3 commit 4f73a6f

43 files changed (+7392, -13 lines changed)

Lines changed: 287 additions & 0 deletions
@@ -0,0 +1,287 @@
1+
---
2+
title: "ORACLE"
3+
weight: 2
4+
type: docs
5+
aliases:
6+
- /connectors/pipeline-connectors/oracle
7+
---
8+
<!--
9+
Licensed to the Apache Software Foundation (ASF) under one
10+
or more contributor license agreements. See the NOTICE file
11+
distributed with this work for additional information
12+
regarding copyright ownership. The ASF licenses this file
13+
to you under the Apache License, Version 2.0 (the
14+
"License"); you may not use this file except in compliance
15+
with the License. You may obtain a copy of the License at
16+
17+
http://www.apache.org/licenses/LICENSE-2.0
18+
19+
Unless required by applicable law or agreed to in writing,
20+
software distributed under the License is distributed on an
21+
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
22+
KIND, either express or implied. See the License for the
23+
specific language governing permissions and limitations
24+
under the License.
25+
-->
26+
27+
# Oracle Connector
28+
29+
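The Oracle CDC Pipeline connector reads both snapshot data and incremental data from Oracle databases and provides end-to-end, full-database data synchronization. This document describes how to set up the Oracle CDC Pipeline connector.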
30+
31+
32+
## Example
33+
34+
A pipeline that reads data from Oracle and synchronizes it to Doris can be defined as follows:
35+
36+
```yaml
source:
  type: oracle
  name: Oracle Source
  hostname: 127.0.0.1
  port: 1521
  username: debezium
  password: dbz
  database: ORCLDB
  tables: adb.\.*, bdb.user_table_[0-9]+, [app|web].order_\.*

sink:
  type: doris
  name: Doris Sink
  fenodes: 127.0.0.1:8030
  username: root
  password: pass

pipeline:
  name: Oracle to Doris Pipeline
  parallelism: 4
```
58+
59+
## Connector Options
60+
61+
<div class="highlight">
62+
<table class="colwidths-auto docutils">
63+
<thead>
64+
<tr>
65+
<th class="text-left" style="width: 10%">Option</th>
66+
<th class="text-left" style="width: 8%">Required</th>
67+
<th class="text-left" style="width: 7%">Default</th>
68+
<th class="text-left" style="width: 10%">Type</th>
69+
<th class="text-left" style="width: 65%">Description</th>
70+
</tr>
71+
</thead>
72+
<tbody>
73+
<tr>
74+
<td>hostname</td>
75+
<td>required</td>
76+
<td style="word-wrap: break-word;">(none)</td>
77+
<td>String</td>
78+
<td>IP address or hostname of the Oracle database server.</td>
79+
</tr>
80+
<tr>
81+
<td>port</td>
82+
<td>optional</td>
83+
<td style="word-wrap: break-word;">1521</td>
84+
<td>Integer</td>
85+
<td>Integer port number of the Oracle database server.</td>
86+
</tr>
87+
<tr>
88+
<td>username</td>
89+
<td>required</td>
90+
<td style="word-wrap: break-word;">(none)</td>
91+
<td>String</td>
92+
<td>Name of the Oracle user to use when connecting to the Oracle database server.</td>
93+
</tr>
94+
<tr>
95+
<td>password</td>
96+
<td>required</td>
97+
<td style="word-wrap: break-word;">(none)</td>
98+
<td>String</td>
99+
<td>Password to use when connecting to the Oracle database server.</td>
100+
</tr>
101+
<tr>
102+
<td>tables</td>
103+
<td>required</td>
104+
<td style="word-wrap: break-word;">(none)</td>
105+
<td>String</td>
106+
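<td>Table names of the Oracle database to monitor. Regular expressions are supported, so a single pattern can match multiple tables.<br>
Note that the dot (.) is treated as the delimiter between database and table names. To use a dot (.) in a regular expression to match any character, escape it with a backslash.<br>
For example: db0.\.*, db1.user_table_[0-9]+, db[1-2].[app|web]order_\.*</td>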
109+
</tr>
110+
<tr>
111+
<td>schema-change.enabled</td>
112+
<td>optional</td>
113+
<td style="word-wrap: break-word;">true</td>
114+
<td>Boolean</td>
115+
<td>Whether to send schema change events so that downstream sinks can respond to them and keep table structures in sync. Defaults to true.</td>
116+
</tr>
117+
<tr>
118+
<td>scan.incremental.snapshot.chunk.size</td>
119+
<td>optional</td>
120+
<td style="word-wrap: break-word;">8096</td>
121+
<td>Integer</td>
122+
<td>Chunk size (number of rows) of the table snapshot. When a table snapshot is read, the captured table is split into multiple chunks.</td>
123+
</tr>
124+
<tr>
125+
<td>scan.snapshot.fetch.size</td>
126+
<td>optional</td>
127+
<td style="word-wrap: break-word;">1024</td>
128+
<td>Integer</td>
129+
<td>Maximum number of rows fetched per read when reading a table snapshot.</td>
130+
</tr>
131+
<tr>
132+
<td>scan.startup.mode</td>
133+
<td>optional</td>
134+
<td style="word-wrap: break-word;">initial</td>
135+
<td>String</td>
136+
<td>Optional startup mode for the Oracle CDC consumer. Valid modes are "initial" and "latest-offset".</td>
138+
</tr>
139+
<tr>
140+
<td>debezium.*</td>
141+
<td>optional</td>
142+
<td style="word-wrap: break-word;">(none)</td>
143+
<td>String</td>
144+
<td>Passes Debezium properties to the embedded Debezium engine, which is used to capture data changes from the Oracle server.
For example: <code>'debezium.snapshot.mode' = 'never'</code>.
See more about the <a href="https://debezium.io/documentation/reference/1.9/connectors/oracle.html#oracle-connector-properties">Debezium Oracle connector properties</a>.</td>
147+
</tr>
148+
<tr>
149+
<td>scan.incremental.close-idle-reader.enabled</td>
150+
<td>optional</td>
151+
<td style="word-wrap: break-word;">false</td>
152+
<td>Boolean</td>
153+
<td>Whether to close idle readers once the snapshot phase has finished. This feature requires Flink 1.14 or later, with 'execution.checkpointing.checkpoints-after-tasks-finish.enabled' set to true.<br>
From Flink 1.15 onward, 'execution.checkpointing.checkpoints-after-tasks-finish.enabled' defaults to true, so it does not need to be set explicitly.</td>
155+
</tr>
156+
<tr>
157+
<td>metadata.list</td>
158+
<td>optional</td>
159+
<td style="word-wrap: break-word;">(none)</td>
160+
<td>String</td>
161+
<td>
162+
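List of additional metadata fields to read from the SourceRecord, separated by commas (`,`). These fields can then be used directly in the transform module. Currently available value: op_ts.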
163+
</td>
164+
</tr>
165+
</tbody>
166+
</table>
167+
</div>
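
As a rough illustration of how the optional settings in the table above might be combined, the sketch below extends the `source` block from the example at the top of this page. All values are illustrative rather than recommendations. The `debezium.log.mining.strategy` entry assumes the Debezium Oracle property `log.mining.strategy` is forwarded through the `debezium.*` prefix; this page does not list that property itself.

```yaml
source:
  type: oracle
  hostname: 127.0.0.1
  port: 1521
  username: debezium
  password: dbz
  database: ORCLDB
  tables: adb.\.*
  # Optional settings from the table above; the values are illustrative only.
  schema-change.enabled: true
  scan.incremental.snapshot.chunk.size: 4096
  scan.snapshot.fetch.size: 512
  scan.startup.mode: initial
  # Per the table above, needs Flink 1.14+ with
  # execution.checkpointing.checkpoints-after-tasks-finish.enabled set to true.
  scan.incremental.close-idle-reader.enabled: true
  metadata.list: op_ts
  # Assumption: forwarded to the embedded Debezium engine via the debezium.* prefix.
  debezium.log.mining.strategy: online_catalog
```

Options that are left out fall back to the defaults listed in the table, so a real pipeline would normally set only the ones it needs to change.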
168+
169+
## Startup Mode
170+
171+
The `scan.startup.mode` option specifies the startup mode of the Oracle CDC consumer. Valid values are:
172+
- `initial` (default): Performs an initial snapshot of the monitored database tables on first startup, then continues to read the latest archive log.
173+
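- `latest-offset`: Never performs a snapshot of the monitored database tables on first startup. The connector reads only from the end of the archive log, so it captures only changes made after the connector has started.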
174+
175+
For example, the startup mode can be specified in the YAML configuration file as follows:
176+
177+
```yaml
source:
  type: oracle
  scan.startup.mode: initial # Snapshot first, then read the log (default)
  # scan.startup.mode: latest-offset # Alternative: skip the snapshot and start from the latest offset
  # ...
```
184+
185+
## Data Type Mapping
186+
187+
<div class="wy-table-responsive">
188+
<table class="colwidths-auto docutils">
189+
<thead>
190+
<tr>
191+
<th class="text-left" style="width:30%;">Oracle type<a href="https://docs.oracle.com/cd/B13789_01/win.101/b10118/o4o00675.htm"></a></th>
192+
<th class="text-left" style="width:10%;">CDC type</th>
193+
<th class="text-left" style="width:60%;">NOTE</th>
194+
</tr>
195+
</thead>
196+
<tbody>
197+
<tr>
198+
<td>NUMBER(p,s)</td>
199+
<td>DECIMAL(p, s)/BIGINT</td>
200+
<td>DECIMAL(p, s) is used when s is greater than 0; otherwise BIGINT is used.</td>
201+
</tr>
202+
<tr>
203+
<td>
204+
LONG<br>
205+
</td>
206+
<td>BIGINT</td>
207+
<td></td>
208+
</tr>
209+
<tr>
210+
<td>
211+
DATE
212+
</td>
213+
<td>TIMESTAMP [(6)]</td>
214+
<td></td>
215+
</tr>
216+
<tr>
217+
<td>
218+
FLOAT<br>
219+
BINARY_FLOAT<br>
220+
</td>
221+
<td>FLOAT</td>
222+
<td></td>
223+
</tr>
224+
<tr>
225+
<td>
226+
BINARY_DOUBLE<br>
227+
DOUBLE
228+
</td>
229+
<td>DOUBLE</td>
230+
<td></td>
231+
</tr>
232+
<tr>
233+
<td>
234+
TIMESTAMP(p)
235+
</td>
236+
<td>TIMESTAMP [(p)]</td>
237+
<td></td>
238+
</tr>
239+
<tr>
240+
<td>
241+
TIMESTAMP(p) WITH TIME ZONE
242+
</td>
243+
<td>TIMESTAMP_TZ [(p)]</td>
244+
<td></td>
245+
</tr>
246+
<tr>
247+
<td>
248+
TIMESTAMP(p) WITH LOCAL TIME ZONE
249+
</td>
250+
<td>TIMESTAMP_LTZ [(p)]</td>
251+
<td></td>
252+
</tr>
253+
<tr>
254+
<td>
255+
INTERVAL YEAR(2) TO MONTH<br>
256+
INTERVAL DAY(3) TO SECOND(2)
257+
</td>
258+
<td>BIGINT</td>
259+
<td></td>
260+
</tr>
261+
<tr>
262+
<td>
263+
VARCHAR(n)<br>
264+
VARCHAR2(n)<br>
265+
NVARCHAR2(n)<br>
266+
NCHAR(n)<br>
267+
CHAR(n)<br>
268+
</td>
269+
<td>VARCHAR(n)</td>
270+
<td></td>
271+
</tr>
272+
<tr>
273+
<td>
274+
CLOB<br>
275+
BLOB<br>
276+
TEXT<br>
277+
NCLOB<br>
278+
SDO_GEOMETRY<br>
279+
XMLTYPE
280+
</td>
281+
<td>STRING</td>
282+
<td>Currently, for the BLOB data type in Oracle, only blobs with a length of at most 2147483647 (2**31-1) are supported.</td>
283+
</tr>
284+
</tbody>
285+
</table>
286+
</div>
287+
{{< top >}}
