Configure circuit-breaking rules, then test the configuration by intentionally "tripping" the circuit breaker. This lab follows the official Istio circuit-breaking task and reuses the Bookinfo environment from the previous post.
Deploy the httpbin sample application
- Deploy httpbin in the default namespace, which already has automatic sidecar injection enabled:
# Copyright Istio Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
##################################################################################################
# httpbin service
##################################################################################################
apiVersion: v1
kind: ServiceAccount
metadata:
  name: httpbin
---
apiVersion: v1
kind: Service
metadata:
  name: httpbin
  labels:
    app: httpbin
    service: httpbin
spec:
  ports:
  - name: http
    port: 8000
    targetPort: 80
  selector:
    app: httpbin
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpbin
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpbin
      version: v1
  template:
    metadata:
      labels:
        app: httpbin
        version: v1
    spec:
      serviceAccountName: httpbin
      containers:
      - image: docker.io/kennethreitz/httpbin
        imagePullPolicy: IfNotPresent
        name: httpbin
        ports:
        - containerPort: 80
Note: with manual injection, deploy it as follows:
$ kubectl apply -f <(istioctl kube-inject -f samples/httpbin/httpbin.yaml)
- Create a destination rule that applies circuit-breaking settings to calls to the httpbin service:
kubectl apply -f - <<EOF
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: httpbin
spec:
  host: httpbin
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 1
      http:
        http1MaxPendingRequests: 1
        maxRequestsPerConnection: 1
    outlierDetection:
      consecutiveErrors: 1
      interval: 1s
      baseEjectionTime: 3m
      maxEjectionPercent: 100
EOF
The DestinationRule sets maxConnections: 1 and http1MaxPendingRequests: 1. These rules mean that if the number of concurrent connections and requests exceeds one, istio-proxy will short-circuit the excess requests and connections.
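To see how these two limits translate into 503s, here is a minimal sketch of the admission math (plain Python, not Envoy's actual implementation): a burst of simultaneous requests can occupy at most maxConnections + http1MaxPendingRequests slots, and anything beyond that is rejected immediately.

```python
# Simplified model (an assumption for illustration, not Envoy source code):
# one active-connection slot plus one pending-request slot, matching
# tcp.maxConnections: 1 and http.http1MaxPendingRequests: 1 above.

MAX_CONNECTIONS = 1   # tcp.maxConnections
MAX_PENDING = 1       # http.http1MaxPendingRequests

def admit(concurrent_requests):
    """Return (accepted, rejected) counts for a burst of simultaneous requests."""
    accepted = min(concurrent_requests, MAX_CONNECTIONS + MAX_PENDING)
    rejected = concurrent_requests - accepted  # short-circuited with 503
    return accepted, rejected

for c in (1, 2, 3):
    ok, overflow = admit(c)
    print(f"concurrency {c}: {ok} admitted, {overflow} rejected")
```

This is why the runs below behave differently at concurrency 2 versus 3: two simultaneous requests just fit into the 1 + 1 budget, while a third has nowhere to queue.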
Add a test client
Deploy the Fortio client to send traffic to the httpbin service.
Fortio is a fast, small (3 MB Docker image, minimal dependencies), reusable, embeddable Go library as well as a command-line tool and server. The server includes a simple web UI and graphical rendering of results (single latency graphs, plus multi-result comparisons of min, max, average, QPS, and percentiles).
Fortio also includes a set of server-side features (similar to httpbin) to aid debugging and testing: request echoing including headers, probabilistic injection of delays or error codes, TCP echo, TCP proxy, HTTP fan-out/scatter-gather proxy, gRPC echo/health, and more.
- Inject the Istio sidecar proxy into the client so that Istio manages its network interactions:
$ kubectl apply -f <(istioctl kube-inject -f samples/httpbin/sample-client/fortio-deploy.yaml)
Set the FORTIO_POD variable:
FORTIO_POD=$(kubectl get pod | grep fortio | awk '{ print $1 }')
- Log in to the client pod and use the Fortio tool to call the httpbin service. The -curl flag sends a single call:
[root@master01 istio-1.10.3]# kubectl exec -it $FORTIO_POD -c fortio -- /usr/bin/fortio load -curl http://httpbin:8000/get
HTTP/1.1 200 OK
server: envoy
date: Tue, 10 Aug 2021 02:53:08 GMT
content-type: application/json
content-length: 594
access-control-allow-origin: *
access-control-allow-credentials: true
x-envoy-upstream-service-time: 11
{
  "args": {},
  "headers": {
    "Host": "httpbin:8000",
    "User-Agent": "fortio.org/fortio-1.11.3",
    "X-B3-Parentspanid": "1410263b9341228f",
    "X-B3-Sampled": "0",
    "X-B3-Spanid": "ddb57a39b067b8a3",
    "X-B3-Traceid": "5044667c51dbf7281410263b9341228f",
    "X-Envoy-Attempt-Count": "1",
    "X-Forwarded-Client-Cert": "By=spiffe://cluster.local/ns/default/sa/httpbin;Hash=d8906081c4cdfada648017c91bd4f7779d8fcfcc07505ae719e65d50df149ccb;Subject=\"\";URI=spiffe://cluster.local/ns/default/sa/default"
  },
  "origin": "127.0.0.6",
  "url": "http://httpbin:8000/get"
}
Trip the circuit breaker
The DestinationRule above sets maxConnections: 1 and http1MaxPendingRequests: 1. If concurrent connections and requests exceed one, subsequent requests or connections may be short-circuited by istio-proxy.
- Send 20 requests (-n 20) with two concurrent connections (-c 2):
[root@master01 istio-1.10.3]# kubectl exec -it $FORTIO_POD -c fortio -- /usr/bin/fortio load -c 2 -qps 0 -n 20 -loglevel Warning http://httpbin:8000/get
02:56:29 I logger.go:127> Log level is now 3 Warning (was 2 Info)
Fortio 1.11.3 running at 0 queries per second, 5->5 procs, for 20 calls: http://httpbin:8000/get
Starting at max qps with 2 thread(s) [gomax 5] for exactly 20 calls (10 per thread + 0)
02:56:29 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:56:29 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
Ended after 119.101006ms : 20 calls. qps=167.92
Aggregated Function Time : count 20 avg 0.010222645 +/- 0.01167 min 0.000775367 max 0.059878998 sum 0.204452894
# range, mid point, percentile, count
>= 0.000775367 <= 0.001 , 0.000887683 , 5.00, 1
> 0.003 <= 0.004 , 0.0035 , 10.00, 1
> 0.006 <= 0.007 , 0.0065 , 30.00, 4
> 0.007 <= 0.008 , 0.0075 , 65.00, 7
> 0.008 <= 0.009 , 0.0085 , 80.00, 3
> 0.01 <= 0.011 , 0.0105 , 90.00, 2
> 0.014 <= 0.016 , 0.015 , 95.00, 1
> 0.05 <= 0.059879 , 0.0549395 , 100.00, 1
# target 50% 0.00757143
# target 75% 0.00866667
# target 90% 0.011
# target 99% 0.0579032
# target 99.9% 0.0596814
Sockets used: 4 (for perfect keepalive, would be 2)
Jitter: false
Code 200 : 18 (90.0 %)
Code 503 : 2 (10.0 %)
Response Header Sizes : count 20 avg 207.1 +/- 69.03 min 0 max 231 sum 4142
Response Body/Total Sizes : count 20 avg 765.8 +/- 174.9 min 241 max 825 sum 15316
All done 20 calls (plus 0 warmup) 10.223 ms avg, 167.9 qps
In the run above, only 2 of the 20 requests failed. Even though the rule sets maxConnections: 1 and http1MaxPendingRequests: 1, the proxy can in practice admit up to the connection-plus-pending budget (1 + 1) at a time, so at concurrency 2 most calls still get through.
- Raise the number of concurrent connections to 3:
[root@master01 istio-1.10.3]# kubectl exec -it $FORTIO_POD -c fortio -- /usr/bin/fortio load -c 3 -loglevel Warning http://httpbin:8000/get
02:59:15 I logger.go:127> Log level is now 3 Warning (was 2 Info)
Fortio 1.11.3 running at 8 queries per second, 5->5 procs, for 5s: http://httpbin:8000/get
Starting at 8 qps with 3 thread(s) [gomax 5] for 5s : 13 calls each (total 39)
02:59:15 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:16 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:16 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:16 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:17 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:17 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:17 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:17 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:17 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:18 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:18 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:18 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:19 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:19 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:19 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:19 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:19 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:20 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:20 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:20 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
02:59:20 W http_client.go:693> Parsed non ok code 503 (HTTP/1.1 503)
Ended after 4.882695412s : 39 calls. qps=7.9874
Sleep times : count 36 avg 0.3991201 +/- 0.006301 min 0.377749958 max 0.404587224 sum 14.3683237
Aggregated Function Time : count 39 avg 0.0063830586 +/- 0.006152 min 0.001316094 max 0.027898116 sum 0.248939285
# range, mid point, percentile, count
>= 0.00131609 <= 0.002 , 0.00165805 , 35.90, 14
> 0.002 <= 0.003 , 0.0025 , 48.72, 5
> 0.003 <= 0.004 , 0.0035 , 51.28, 1
> 0.004 <= 0.005 , 0.0045 , 53.85, 1
> 0.007 <= 0.008 , 0.0075 , 69.23, 6
> 0.008 <= 0.009 , 0.0085 , 79.49, 4
> 0.009 <= 0.01 , 0.0095 , 82.05, 1
> 0.01 <= 0.011 , 0.0105 , 89.74, 3
> 0.014 <= 0.016 , 0.015 , 92.31, 1
> 0.02 <= 0.025 , 0.0225 , 97.44, 2
> 0.025 <= 0.0278981 , 0.0264491 , 100.00, 1
# target 50% 0.0035
# target 75% 0.0085625
# target 90% 0.0142
# target 99% 0.0267679
# target 99.9% 0.0277851
Sockets used: 22 (for perfect keepalive, would be 3)
Jitter: false
Code 200 : 18 (46.2 %)
Code 503 : 21 (53.8 %)
Response Header Sizes : count 39 avg 106.25641 +/- 114.8 min 0 max 231 sum 4144
Response Body/Total Sizes : count 39 avg 510.17949 +/- 290.7 min 241 max 825 sum 19897
All done 39 calls (plus 3 warmup) 6.383 ms avg, 8.0 qps
In this run, 21 of the 39 requests failed with 503. Under heavier concurrency the failure rate climbs sharply, and the circuit breaker takes effect.
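The two runs reduce to simple arithmetic; the counts below are taken from the Fortio summaries above:

```python
# Failure rates from the two Fortio runs: once concurrency exceeds
# maxConnections + http1MaxPendingRequests (= 2), the 503 rate jumps.
runs = {
    "-c 2": {"code_200": 18, "code_503": 2},
    "-c 3": {"code_200": 18, "code_503": 21},
}
for name, r in runs.items():
    total = r["code_200"] + r["code_503"]
    print(f"{name}: {r['code_503']}/{total} short-circuited = {r['code_503']/total:.1%}")
```

Note that the number of successful (200) responses is roughly the same in both runs; adding concurrency only added rejected calls.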
- Query the istio-proxy stats for more circuit-breaking detail:
[root@master01 istio-1.10.3]# kubectl exec $FORTIO_POD -c istio-proxy -- pilot-agent request GET stats | grep httpbin | grep pending
cluster.outbound|8000||httpbin.default.svc.cluster.local.circuit_breakers.default.remaining_pending: 1
cluster.outbound|8000||httpbin.default.svc.cluster.local.circuit_breakers.default.rq_pending_open: 0
cluster.outbound|8000||httpbin.default.svc.cluster.local.circuit_breakers.high.rq_pending_open: 0
cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_active: 0
cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_failure_eject: 0
cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_overflow: 58
cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_total: 176
upstream_rq_pending_overflow is 58, meaning that so far 58 calls have been flagged and short-circuited by the circuit breaker.
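The stats output is plain `name: value` text, so it is easy to post-process. A small illustrative parser (the two sample lines are copied from the output above; the helper itself is hypothetical, not part of any Istio tooling):

```python
# Parse Envoy stat lines and compute what fraction of pending requests
# overflowed (were short-circuited). Sample values come from the lab output.
stats_text = """\
cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_overflow: 58
cluster.outbound|8000||httpbin.default.svc.cluster.local.upstream_rq_pending_total: 176
"""

def parse_stats(text):
    """Map the trailing stat name (after the last dot) to its integer value."""
    stats = {}
    for line in text.strip().splitlines():
        key, _, value = line.rpartition(": ")
        stats[key.rsplit(".", 1)[-1]] = int(value)
    return stats

s = parse_stats(stats_text)
rate = s["upstream_rq_pending_overflow"] / s["upstream_rq_pending_total"]
print(f"{rate:.1%} of pending requests were short-circuited")  # → 33.0%
```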
Clean up the rule, service, and client
$ kubectl delete destinationrule httpbin
Remove the httpbin service and the client:
$ kubectl delete deploy httpbin fortio-deploy
$ kubectl delete svc httpbin
The fortio test image used in this lab is worth keeping around: it is a handy load-testing client for other containerized environments as well.