feat: list common disconnect reasons (r58) #3128

Open
wants to merge 4 commits into
base: release-5.8
49 changes: 45 additions & 4 deletions en_US/admin/cli.md
@@ -102,9 +102,9 @@ Loaded 'Mod' module on []: ok

## conf cluster_sync

This command is mostly for troubleshooting when there is something wrong with cluster-calls used to sync configuration changes between the nodes in the cluster.
This command is mostly for troubleshooting when there is something wrong with cluster-calls used to sync configuration changes between the nodes in the cluster.

::: tip
::: tip

In EMQX 5.0.x, this command was named `cluster_call`. This old command is still available in EMQX 5.1 but it is not displayed in the help information.

@@ -704,7 +704,7 @@ This command is like the `trace` command, but applies on all nodes in the cluste

## traces

This command is similar to the `trace` command, but it starts or stops a tracer across all nodes in the cluster.
This command is similar to the `trace` command, but it starts or stops a tracer across all nodes in the cluster.

| Command | Description |
| ------------------------------------------------------- | --------------------------------- |
@@ -787,6 +787,7 @@ tcp:default
running : true
current_conn : 12
max_conns : 5000000
shutdown_count : [{takenover,2},{discarded,1}]
ws:default
listen_on : 0.0.0.0:8083
acceptors : 16
@@ -803,6 +804,46 @@ wss:default
max_conns : 5000000
```

#### Common Shutdown Reasons

For TCP listeners, EMQX reports a `shutdown_count` field, which records how many client disconnections occurred, grouped by reason. This can help identify why clients are getting disconnected from the TCP listener.

```bash
shutdown_count : [{takenover,2},{discarded,1}]
```

In the example above:

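- 2 clients were disconnected because a new connection with the same `clientid` took over their session (`takenover`).
- 1 client was disconnected because a new connection with the same `clientid` started a clean session, discarding the old one (`discarded`).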
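
When this field is fed into scripts or dashboards, it can help to turn the printed list into a plain mapping. The following is a minimal Python sketch, not part of EMQX: the function name `parse_shutdown_count` is hypothetical, and it assumes every entry is a simple `{reason,count}` pair with a lowercase atom as the reason, as in the example above.

```python
import re

# Matches one `{reason,count}` pair as printed in the listener output.
# Assumption: reasons are plain lowercase atoms; other term shapes are ignored.
PAIR_RE = re.compile(r"\{\s*([a-z_][a-z0-9_]*)\s*,\s*(\d+)\s*\}")


def parse_shutdown_count(line: str) -> dict[str, int]:
    """Return a {reason: count} mapping for one `shutdown_count` line."""
    return {reason: int(count) for reason, count in PAIR_RE.findall(line)}


if __name__ == "__main__":
    sample = "shutdown_count  : [{takenover,2},{discarded,1}]"
    print(parse_shutdown_count(sample))  # {'takenover': 2, 'discarded': 1}
```

A line without any `{reason,count}` pairs yields an empty mapping, so the function can be applied to every line of the listener output without special-casing.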

Below is a list of common shutdown reasons that may appear in this field.

| Reason | Description |
| ----------------------------- | ------------------------------------------------------------ |
| `banned` | The client is blacklisted due to ACL violations, rate limiting, or IP restrictions. |
| `closed` | The connection was closed either by the server or the client. |
| `discarded` | A new client with the same `clientid` and `clean_start = true` connected while the previous session was still active. |
| `takenover` | A new client with the same `clientid` and `clean_start = false` connected while the previous session was still active. |
| `einval` | An invalid argument or socket error occurred, often caused by attempting to write to a socket that has already been closed (a race condition between socket state change notification and data write). |
| `frame_too_large` | The MQTT packet exceeds the maximum allowed frame size. |
| `idle_timeout` | No `CONNECT` packet was received within the allowed time after the TCP/SSL connection was established. |
| `invalid_proto_name` | The protocol name in the `CONNECT` packet is invalid or not `"MQTT"`. |
| `invalid_topic` | The client used an invalid topic (e.g., containing illegal characters or forbidden by the broker). |
| `keepalive_timeout` | The client failed to send any packets within the keepalive interval. |
| `malformed_packet` | The MQTT packet is corrupted or does not conform to the MQTT specification. |
| `not_authorized` | The client attempted an unauthorized action, rejected by ACL. |
| `ssl_closed` | The SSL/TLS connection was closed by the peer. |
| `ssl_error` | An error occurred during the SSL/TLS handshake or data transmission. |
| `ssl_upgrade_timeout` | The SSL/TLS handshake did not complete within the allowed time. |
| `unexpected_packet` | The client sent a packet that was unexpected in the current connection state. |
| `zero_remaining_len` | The packet has a zero remaining length field, which is invalid in most contexts. |
| `bad_username_or_password` | Authentication failed due to incorrect username or password. |
| `client_identifier_not_valid` | The provided `clientid` is invalid or locked by another client during login. |
| `protocol_error` | A generic MQTT protocol violation occurred. |
| `tcp_closed` | The TCP connection was closed by the client or due to a network issue. |
| `timeout`                     | A general timeout occurred (e.g., during authentication).    |
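
To make the counts easier to read during troubleshooting, they can be paired with short descriptions taken from the table above. This is only an illustrative Python sketch: the `REASONS` mapping and the `describe` function are hypothetical names, the mapping covers just a few rows, and the descriptions are paraphrased from the table rather than produced by EMQX.

```python
# A few reasons with descriptions paraphrased from the table above (not exhaustive).
REASONS = {
    "takenover": "same clientid reconnected with clean_start = false",
    "discarded": "same clientid reconnected with clean_start = true",
    "keepalive_timeout": "no packets within the keepalive interval",
    "idle_timeout": "no CONNECT packet received in time",
}


def describe(counts: dict[str, int]) -> list[str]:
    """Return one summary line per reason, most frequent first."""
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        f"{count:>6}  {reason}: {REASONS.get(reason, 'not in the local table')}"
        for reason, count in ordered
    ]


if __name__ == "__main__":
    for line in describe({"discarded": 1, "takenover": 2}):
        print(line)
```

Reasons missing from the local mapping are still listed, with a placeholder description, so that unfamiliar values are not silently dropped.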

### listeners stop \<Identifier\>

```bash
@@ -832,7 +873,7 @@ Restarted tcp:default listener successfully.
Restarting a listener causes all the connected clients to disconnect.
:::

### listeners enable \<Identifier\> <true/false>
### listeners enable \<Identifier\> <true/false>

```bash
$ emqx ctl listeners enable tcp:default true
41 changes: 41 additions & 0 deletions zh_CN/admin/cli.md
@@ -782,6 +782,7 @@ tcp:default
running : true
current_conn : 12
max_conns : 5000000
shutdown_count : [{takenover,2},{discarded,1}]
ws:default
listen_on : 0.0.0.0:8083
acceptors : 16
@@ -798,6 +799,46 @@ wss:default
max_conns : 5000000
```

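#### Common Connection Shutdown Reasons

For TCP listeners, EMQX reports a `shutdown_count` field that counts client disconnections, grouped by reason. This field helps analyze why clients were disconnected.

```
shutdown_count : [{takenover,2},{discarded,1}]
```

In the example above:

- 2 clients were disconnected because a new session took over the same `clientid` (`takenover`).
- 1 client was discarded because it was replaced by a new clean session (`discarded`).

The following are some common disconnect reasons that may appear in the `shutdown_count` field:

| Reason                        | Description                                                  |
| ----------------------------- | ------------------------------------------------------------ |
| `banned`                      | The client was blacklisted and disconnected for violating ACL rules, triggering rate limiting, or having its IP restricted. |
| `closed`                      | The connection was actively closed by the server or the client. |
| `discarded`                   | A new client with the same `clientid` and `clean_start = true` connected while the old session was still active, so the old session was discarded. |
| `takenover`                   | A new client with the same `clientid` and `clean_start = false` connected while the old session was still active, so the old session was taken over. |
| `einval`                      | An invalid argument or socket error occurred, usually because of an attempt to write data to a socket that the system had already closed (possibly a race condition between the socket state change notification and the data write). |
| `frame_too_large`             | The received MQTT packet exceeds the maximum allowed frame size. |
| `idle_timeout`                | No `CONNECT` packet was received within the configured timeout after the TCP/SSL connection was established. |
| `invalid_proto_name`          | The protocol name in the `CONNECT` packet is invalid or is not `"MQTT"`. |
| `invalid_topic`               | The client used an invalid or forbidden topic (e.g., one containing illegal characters or restricted by the broker). |
| `keepalive_timeout`           | The client did not send any packets within the Keep Alive interval. |
| `malformed_packet`            | The MQTT packet is corrupted or does not conform to the MQTT specification. |
| `not_authorized`              | The client attempted an unauthorized operation and was rejected by ACL. |
| `ssl_closed`                  | The SSL/TLS connection was closed by the peer.               |
| `ssl_error`                   | An error occurred during the SSL/TLS handshake or data transmission. |
| `ssl_upgrade_timeout`         | The SSL/TLS handshake did not complete within the allowed time. |
| `unexpected_packet`           | The client sent a packet that should not appear in the current connection state. |
| `zero_remaining_len`          | The remaining length field of the MQTT packet is 0, which is invalid in most scenarios. |
| `bad_username_or_password`    | Client authentication failed because the username or password was incorrect. |
| `client_identifier_not_valid` | The provided `clientid` is invalid, or is locked by another client that is logging in. |
| `protocol_error`              | A generic MQTT protocol error occurred.                      |
| `tcp_closed`                  | The TCP connection was closed by the client or the network layer. |
| `timeout`                     | A generic timeout occurred (e.g., during authentication or the handshake). |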

### listeners stop \<Identifier\>

```bash