Label Matching Problem in PromQL Queries for TCP Retransmission and Syn Retransmission Rates

### Issue Description
There is a label matching problem in the following PromQL queries:
```promql
sum by (instance) (rate(node_netstat_Tcp_RetransSegs{%(clusterLabel)s="$cluster"}[%(grafanaIntervalVar)s]) / rate(node_netstat_Tcp_OutSegs{%(clusterLabel)s="$cluster"}[%(grafanaIntervalVar)s]) * on (%(clusterLabel)s,namespace,pod) kube_pod_info{host_network="false"})
sum by (instance) (rate(node_netstat_TcpExt_TCPSynRetrans{%(clusterLabel)s="$cluster"}[%(grafanaIntervalVar)s]) / rate(node_netstat_Tcp_RetransSegs{%(clusterLabel)s="$cluster"}[%(grafanaIntervalVar)s]) * on (%(clusterLabel)s,namespace,pod) kube_pod_info{host_network="false"})
```
### Problem
The outer `sum by (instance)` aggregation is applied to the whole expression, but the join with `kube_pod_info` matches only on the cluster label, `namespace`, and `pod`; `instance` does not appear in the `on` clause, and `kube_pod_info` is not aggregated on it either. In PromQL, a one-to-one match with `on (...)` keeps only the listed labels in its result, so `instance` is dropped before the outer aggregation runs. As a result, the values grouped "by instance" can be incorrect or misleading.

### Steps to Reproduce
1. Execute the above PromQL queries in Prometheus:
    ```promql
    sum by (instance) (rate(node_netstat_Tcp_RetransSegs{%(clusterLabel)s="$cluster"}[%(grafanaIntervalVar)s]) / rate(node_netstat_Tcp_OutSegs{%(clusterLabel)s="$cluster"}[%(grafanaIntervalVar)s]) * on (%(clusterLabel)s,namespace,pod) kube_pod_info{host_network="false"})
    ```
    ```promql
    sum by (instance) (rate(node_netstat_TcpExt_TCPSynRetrans{%(clusterLabel)s="$cluster"}[%(grafanaIntervalVar)s]) / rate(node_netstat_Tcp_RetransSegs{%(clusterLabel)s="$cluster"}[%(grafanaIntervalVar)s]) * on (%(clusterLabel)s,namespace,pod) kube_pod_info{host_network="false"})
    ```
2. Observe the rendered queries, in which the interval variable is expanded to `1m0s`:
    ```promql
    sum by (instance) (rate(node_netstat_Tcp_RetransSegs{%(clusterLabel)s="$cluster"}[1m0s]) / rate(node_netstat_Tcp_OutSegs{%(clusterLabel)s="$cluster"}[1m0s]) * on (%(clusterLabel)s,namespace,pod) kube_pod_info{host_network="false"})
    ```
    ```promql
    sum by (instance) (rate(node_netstat_TcpExt_TCPSynRetrans{%(clusterLabel)s="$cluster"}[1m0s]) / rate(node_netstat_Tcp_RetransSegs{%(clusterLabel)s="$cluster"}[1m0s]) * on (%(clusterLabel)s,namespace,pod) kube_pod_info{host_network="false"})
    ```
3. Notice that the results are incorrect due to the mismatch in labels used in the join operation.
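
One way to check which labels survive the join is to run the inner expression without the outer aggregation. The sketch below reuses the first query from this issue unchanged apart from dropping `sum by (instance)`; the template placeholders still need to be rendered before Prometheus will accept it.

```promql
# Inner expression only: the outer sum by (instance) is removed so the
# label set produced by the join can be inspected directly.
rate(node_netstat_Tcp_RetransSegs{%(clusterLabel)s="$cluster"}[%(grafanaIntervalVar)s])
  / rate(node_netstat_Tcp_OutSegs{%(clusterLabel)s="$cluster"}[%(grafanaIntervalVar)s])
  * on (%(clusterLabel)s,namespace,pod) kube_pod_info{host_network="false"}
```

If the returned series carry only the cluster label, `namespace`, and `pod`, the outer `sum by (instance)` has no `instance` label left to group on.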

### Expected Behavior
The queries should correctly aggregate and join the metrics on the appropriate labels to avoid misleading results.

### Possible Solution
To fix the issue, ensure that the `instance` label is considered in the join operation or modify the aggregation strategy. One possible solution might be to aggregate `kube_pod_info` on the `instance` label as well.
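
As an illustration of "considering `instance` in the join", the sketch below adds `group_left ()` to the first query so that the result keeps the labels of the left-hand side, including `instance`. This is an assumption about one workable approach, not necessarily the fix adopted upstream, and it assumes `kube_pod_info{host_network="false"}` yields at most one series per cluster, `namespace`, and `pod` combination (otherwise the match fails with a many-to-many error).

```promql
# Sketch only: group_left () makes the join many-to-one, so the result
# keeps the left-hand labels (including instance) for the outer sum.
sum by (instance) (
  rate(node_netstat_Tcp_RetransSegs{%(clusterLabel)s="$cluster"}[%(grafanaIntervalVar)s])
    / rate(node_netstat_Tcp_OutSegs{%(clusterLabel)s="$cluster"}[%(grafanaIntervalVar)s])
    * on (%(clusterLabel)s,namespace,pod) group_left ()
      kube_pod_info{host_network="false"}
)
```

The same change would apply to the SYN retransmission query, with `node_netstat_TcpExt_TCPSynRetrans` and `node_netstat_Tcp_RetransSegs` as the numerator and denominator.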

### Changes
The label matching problem was introduced in the following commit:

[d63872c](https://github.com/kubernetes-monitoring/kubernetes-mixin/commit/d63872c76e241aec57413fa0d07d4907c37b60f0#diff-a5b2c3fdeb5b2ee8cc465db32527a2c0bef17d1586d9b74ce2040f937a2814baR283)

Label Matching Problem in PromQL Queries for TCP Retransmission and Syn Retransmission Rates #949
