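From the logs we can see:
1. Every call went to one of the two zone1 instances. As long as zone1 has instances available, only zone1 instances are called.
2. Thread pools are isolated per instance, as the two "call url" lines in the following log show: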
2020-05-27 08:00:07.835 INFO [service-provider,452ccc4ba7304c2c,5f56b4a90f40e5cd] [10276] [bulkhead-service-provider:192.168.0.142:8001-1][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: GET -> http://192.168.0.142:8001/test-simple
2020-05-27 08:00:07.901 INFO [service-provider,452ccc4ba7304c2c,452ccc4ba7304c2c] [10276] [XNIO-2 task-1][com.github.hashjang.hoxton.service.consumer.controller.TestFeignController:28]:testGet zone1
2020-05-27 08:00:07.906 INFO [service-provider,452ccc4ba7304c2c,51e05504dafe88fc] [10276] [bulkhead-service-provider:192.168.0.142:8002-1][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: GET -> http://192.168.0.142:8002/test-simple
2020-05-27 08:00:07.943 INFO [service-provider,452ccc4ba7304c2c,452ccc4ba7304c2c] [10276] [XNIO-2 task-1][com.github.hashjang.hoxton.service.consumer.controller.TestFeignController:29]:testGet zone1
The call to 192.168.0.142:8001 runs on the thread bulkhead-service-provider:192.168.0.142:8001-1, and the call to 192.168.0.142:8002 runs on the thread bulkhead-service-provider:192.168.0.142:8002-1. Each thread belongs to the Resilience4j thread pool of its own instance.
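The isolation described above can be illustrated with plain java.util.concurrent. This is only a sketch under assumptions: the class PerInstancePools and its methods are hypothetical and are not the project's code or the Resilience4j API; only the thread-name pattern (bulkhead-service:host:port-n) is taken from the logs.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical helper: one small fixed thread pool per service instance. */
final class PerInstancePools {
    private final Map<String, ExecutorService> pools = new ConcurrentHashMap<>();
    private final int threadsPerInstance;

    PerInstancePools(int threadsPerInstance) {
        this.threadsPerInstance = threadsPerInstance;
    }

    /** Key format mirrors the thread names in the logs: service:host:port. */
    static String key(String service, String host, int port) {
        return service + ":" + host + ":" + port;
    }

    /** Lazily creates the pool of an instance; threads are named bulkhead-<key>-<n>. */
    private ExecutorService poolFor(String key) {
        return pools.computeIfAbsent(key, k -> {
            AtomicInteger seq = new AtomicInteger();
            ThreadFactory factory = r -> {
                Thread t = new Thread(r, "bulkhead-" + k + "-" + seq.incrementAndGet());
                t.setDaemon(true); // do not keep the JVM alive for this demo
                return t;
            };
            return Executors.newFixedThreadPool(threadsPerInstance, factory);
        });
    }

    /** Runs the task on the pool that belongs to the given instance and waits for the result. */
    <T> T call(String key, Supplier<T> task) {
        return CompletableFuture.supplyAsync(task, poolFor(key)).join();
    }

    public static void main(String[] args) {
        PerInstancePools pools = new PerInstancePools(2);
        String a = key("service-provider", "192.168.0.142", 8001);
        String b = key("service-provider", "192.168.0.142", 8002);
        // Each call reports the name of the thread it ran on.
        System.out.println(pools.call(a, () -> Thread.currentThread().getName()));
        System.out.println(pools.call(b, () -> Thread.currentThread().getName()));
    }
}
```

Because every instance has its own pool, a slow instance can exhaust only its own threads and cannot block calls to the other instance.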
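3. On a readTimeout, GET requests are retried; POST, PUT and DELETE requests are not.
For the GET retry, the first call goes to 192.168.0.142:8001, which times out. After the readTimeout the request is retried on 192.168.0.142:8002 and returns successfully: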
2020-05-27 09:47:27.975 INFO [service-provider,5de10ccae7fe4636,cca178414a880ec1] [6416] [bulkhead-service-provider:192.168.0.142:8001-5][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: GET -> http://192.168.0.142:8001/test-read-time-out
2020-05-27 09:47:28.992 INFO [service-provider,5de10ccae7fe4636,947b76400677ab80] [6416] [bulkhead-service-provider:192.168.0.142:8002-5][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: GET -> http://192.168.0.142:8002/test-read-time-out
2020-05-27 09:47:29.002 INFO [service-provider,5de10ccae7fe4636,5de10ccae7fe4636] [6416] [XNIO-2 task-1][com.github.hashjang.hoxton.service.consumer.controller.TestFeignController:56]:testTimeoutGet zone1
POST, PUT and DELETE requests are not retried after a readTimeout:
2020-05-27 09:47:33.059 INFO [service-provider,5de10ccae7fe4636,a065f86f497efeff] [6416] [bulkhead-service-provider:192.168.0.142:8001-8][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: PUT -> http://192.168.0.142:8001/test-read-time-out
2020-05-27 09:47:34.062 ERROR [service-provider,5de10ccae7fe4636,5de10ccae7fe4636] [6416] [XNIO-2 task-1][com.github.hashjang.hoxton.service.consumer.controller.TestFeignController:74]:testTimeoutPut error: Read timed out
4. On an interface error (HTTP 500), GET requests are retried; POST, PUT and DELETE requests are not.
Retry of a GET request:
2020-05-27 09:47:35.098 INFO [service-provider,5de10ccae7fe4636,cca178414a880ec1] [6416] [bulkhead-service-provider:192.168.0.142:8001-5][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: GET -> http://192.168.0.142:8001/test-exception-thrown
2020-05-27 09:47:35.099 INFO [service-provider,5de10ccae7fe4636,18c0b280dbb66634] [6416] [bulkhead-service-provider:192.168.0.142:8002-1][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: GET -> http://192.168.0.142:8002/test-exception-thrown
2020-05-27 09:47:35.104 INFO [service-provider,5de10ccae7fe4636,5de10ccae7fe4636] [6416] [XNIO-2 task-1][com.github.hashjang.hoxton.service.consumer.controller.TestFeignController:85]:testExceptionThrownGet zone1
POST, PUT and DELETE requests are not retried after an HTTP 500 error:
2020-05-27 09:47:37.119 INFO [service-provider,5de10ccae7fe4636,14213e88b6256cc4] [6416] [bulkhead-service-provider:192.168.0.142:8001-1][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: POST -> http://192.168.0.142:8001/test-exception-thrown
2020-05-27 09:47:37.130 ERROR [service-provider,5de10ccae7fe4636,5de10ccae7fe4636] [6416] [XNIO-2 task-1][com.github.hashjang.hoxton.service.consumer.controller.TestFeignController:95]:testExceptionThrownPost error: HTTP/1.1 500 Internal Server Error
connection: keep-alive
content-type: application/json
date: Wed, 27 May 2020 09:47:37 GMT
transfer-encoding: chunked
feign.httpclient.ApacheHttpClient$1@282032be
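2. Testing whether a connectTimeout and an open circuit breaker trigger retries
Remove all the sleep code (TimeUnit.SECONDS.sleep(2)) and call the endpoints again. This trips the circuit breaker of the instance 192.168.0.142:8001, and logs similar to the following appear.
GET requests go straight to the healthy instance: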
2020-05-27 10:21:22.344 INFO [service-provider,646176448e323aae,9ad3431d2f902590] [18552] [bulkhead-service-provider:192.168.0.142:8002-2][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: GET -> http://192.168.0.142:8002/test-exception-thrown
2020-05-27 10:21:22.355 INFO [service-provider,646176448e323aae,646176448e323aae] [18552] [XNIO-2 task-1][com.github.hashjang.hoxton.service.consumer.controller.TestFeignController:84]:testExceptionThrownGet zone1
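POST, PUT and DELETE requests are retried as well (with the circuit breaker open the request was never sent, so it can safely be retried on another healthy instance), and a log message reports the retry: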
2020-05-27 10:21:22.368 INFO [service-provider,646176448e323aae,646176448e323aae] [18552] [XNIO-2 task-1][com.github.hashjang.hoxton.service.consumer.config.CustomizedCircuitBreakerAspect:66]:retry on circuit breaker is on: CircuitBreaker 'service-provider:192.168.0.142:8001' is OPEN and does not permit further calls
2020-05-27 10:21:22.371 INFO [service-provider,646176448e323aae,72a9649927744693] [18552] [bulkhead-service-provider:192.168.0.142:8002-4][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: POST -> http://192.168.0.142:8002/test-exception-thrown
2020-05-27 10:21:22.375 INFO [service-provider,646176448e323aae,646176448e323aae] [18552] [XNIO-2 task-1][com.github.hashjang.hoxton.service.consumer.controller.TestFeignController:91]:testExceptionThrownPost zone1
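Shut down the instance zone1-service-provider-instance1 and test immediately. Because the instance list is refreshed with a delay, requests are still routed to zone1-service-provider-instance1, which triggers a connectTimeout.
GET requests are retried: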
2020-05-27 11:01:19.994 INFO [service-provider,1b89146721dbc129,734840868a313100] [18552] [bulkhead-service-provider:192.168.0.142:8001-3][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: GET -> http://192.168.0.142:8001/test-simple
2020-05-27 11:01:21.000 INFO [service-provider,1b89146721dbc129,4d69c74c46e4ad18] [18552] [bulkhead-service-provider:192.168.0.142:8002-2][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: GET -> http://192.168.0.142:8002/test-simple
2020-05-27 11:01:21.005 INFO [service-provider,1b89146721dbc129,1b89146721dbc129] [18552] [XNIO-2 task-3][com.github.hashjang.hoxton.service.consumer.controller.TestFeignController:28]:testGet zone1
POST, PUT and DELETE requests are not retried:
2020-05-27 11:01:22.021 INFO [service-provider,1b89146721dbc129,d944884521bb654b] [18552] [bulkhead-service-provider:192.168.0.142:8001-5][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: POST -> http://192.168.0.142:8001/test-simple
2020-05-27 11:01:23.021 ERROR [service-provider,1b89146721dbc129,1b89146721dbc129] [18552] [XNIO-2 task-3][com.github.hashjang.hoxton.service.consumer.controller.TestFeignController:39]:testPost error: Connect to 192.168.0.142:8001 [/192.168.0.142] failed: connect timed out
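The retry behavior observed so far can be condensed into a small decision table. The sketch below is only a summary of the logs; the class RetryDecision and the enum Failure are hypothetical names and do not come from the project.

```java
import java.util.Locale;

/** Hypothetical decision table derived from the observed logs, not the project's code. */
final class RetryDecision {
    enum Failure { CIRCUIT_OPEN, CONNECT_TIMEOUT, READ_TIMEOUT, HTTP_500 }

    static boolean shouldRetry(String method, Failure failure) {
        // An open circuit breaker rejects the call before it is sent,
        // so retrying on another instance is safe for every method.
        if (failure == Failure.CIRCUIT_OPEN) {
            return true;
        }
        // For the other failures the logs show that only GET is retried.
        return "GET".equals(method.toUpperCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        for (String method : new String[] {"GET", "POST", "PUT", "DELETE"}) {
            for (Failure failure : Failure.values()) {
                System.out.println(method + " + " + failure + " -> retry=" + shouldRetry(method, failure));
            }
        }
    }
}
```

Keeping the rule in one pure function makes it easy to unit-test every method and failure combination.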
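3. Testing the behavior when the zone has no instances
Shut down all service-provider instances in zone1, wait about 10 seconds (for the instance cache to refresh), then test. No service-provider instance from another zone is returned: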
2020-05-27 11:07:28.653 WARN [service-provider,d8ad2db60925852f,d8ad2db60925852f] [18552] [boundedElastic-3][com.github.hashjang.hoxton.service.consumer.config.RoundRobinBaseOnTraceIdLoadBalancer:52]:No servers available for service: service-provider
2020-05-27 11:07:28.655 WARN [service-provider,d8ad2db60925852f,d8ad2db60925852f] [18552] [XNIO-2 task-4][org.springframework.cloud.openfeign.loadbalancer.FeignBlockingLoadBalancerClient:67]:Load balancer does not contain an instance for the service service-provider
2020-05-27 11:07:28.664 WARN [service-provider,d8ad2db60925852f,d8ad2db60925852f] [18552] [boundedElastic-3][com.github.hashjang.hoxton.service.consumer.config.RoundRobinBaseOnTraceIdLoadBalancer:52]:No servers available for service: service-provider
2020-05-27 11:07:28.665 WARN [service-provider,d8ad2db60925852f,d8ad2db60925852f] [18552] [XNIO-2 task-4][org.springframework.cloud.openfeign.loadbalancer.FeignBlockingLoadBalancerClient:67]:Load balancer does not contain an instance for the service service-provider
2020-05-27 11:07:28.666 ERROR [service-provider,d8ad2db60925852f,d8ad2db60925852f] [18552] [XNIO-2 task-4][com.github.hashjang.hoxton.service.consumer.controller.TestFeignController:32]:testGet error: [503] during [GET] to [http://service-provider/test-simple] [ServiceProviderTestFeignCleint#testGet(Map)]: [Load balancer does not contain an instance for the service service-provider]
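The strict zone isolation seen in this test amounts to a filter with no fallback. The following is a minimal sketch under assumptions: SameZoneFilter, Instance and the sample addresses are hypothetical and only illustrate the behavior, not the project's load balancer.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical sketch of strict zone filtering: no fallback to other zones. */
final class SameZoneFilter {
    static final class Instance {
        final String address;
        final String zone;

        Instance(String address, String zone) {
            this.address = address;
            this.zone = zone;
        }
    }

    /** Keeps only instances of the caller's zone; the result may be empty. */
    static List<Instance> sameZoneOnly(List<Instance> all, String zone) {
        return all.stream()
                .filter(i -> zone.equals(i.zone))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Sample data is made up: only a zone2 instance is still registered.
        List<Instance> registered = Arrays.asList(new Instance("10.0.0.2:8003", "zone2"));
        List<Instance> candidates = sameZoneOnly(registered, "zone1");
        if (candidates.isEmpty()) {
            // Corresponds to the "No servers available" situation in the logs.
            System.out.println("No servers available for zone1");
        }
    }
}
```

Returning an empty list instead of falling back keeps traffic inside the zone, at the cost of a 503 when the whole zone is down.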
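4. Testing whether the configurations of different microservices are isolated
Start all zone1 service-provider instances again, then test the following code in service-consumer: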
@RequestMapping("/testVariousServiceFeign")
public void testVariousServiceFeign() {
    // Read-timeout endpoint on service-provider
    try {
        log.info("service-provider testTimeoutGet {}", serviceProviderTestReadTimeoutFeignCleint.testTimeoutGet());
    } catch (Exception e) {
        log.error("service-provider testTimeoutGet error: {}", e.getMessage());
    }
    // Read-timeout endpoint on service-provider2
    try {
        log.info("service-provider2 testTimeoutGet {}", serviceProvider2TestReadTimeoutFeignCleint.testTimeoutGet());
    } catch (Exception e) {
        log.error("service-provider2 testTimeoutGet error: {}", e.getMessage());
    }
    // Exception endpoint on service-provider (the log label says Delete, but the call is a GET)
    try {
        log.info("service-provider testExceptionThrownDelete {}", serviceProviderTestExceptionThrownFeignCleint.testExceptionThrownGet());
    } catch (Exception e) {
        log.error("service-provider testExceptionThrownDelete error: {}", e.getMessage());
    }
    // Exception endpoint on service-provider2 (same label mismatch)
    try {
        log.info("service-provider2 testExceptionThrownDelete {}", serviceProvider2TestExceptionThrownFeignCleint.testExceptionThrownGet());
    } catch (Exception e) {
        log.error("service-provider2 testExceptionThrownDelete error: {}", e.getMessage());
    }
}
The configuration is:
feign:
  hystrix:
    enabled: false
  client:
    config:
      default:
        connectTimeout: 1000
        readTimeout: 1000
      service-provider2:
        connectTimeout: 1000
        readTimeout: 8000
resilience4j.retry:
  configs:
    default:
      maxRetryAttempts: 1
      waitDuration: 1
      retryExceptions:
        - java.lang.Exception
    service-provider2:
      maxRetryAttempts: 4
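With this configuration, the readTimeout of the microservice service-provider is 1 second and that of service-provider2 is 8 seconds, so a call to service-provider hits a readTimeout while a call to service-provider2 does not. As for retries, service-provider is attempted at most once (maxRetryAttempts: 1, so no further attempt follows a failure), while service-provider2 is attempted up to 4 times.
The logs confirm this: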
2020-05-27 11:35:11.461 INFO [service-provider,22ae9c0806b2a1a1,9b407dbd29038074] [11272] [bulkhead-service-provider:192.168.0.142:8001-1][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: GET -> http://192.168.0.142:8001/test-read-time-out
2020-05-27 11:35:12.489 ERROR [service-provider,22ae9c0806b2a1a1,22ae9c0806b2a1a1] [11272] [XNIO-2 task-1][com.github.hashjang.hoxton.service.consumer.controller.TestFeignController:126]:service-provider testTimeoutGet error: Read timed out
2020-05-27 11:35:12.511 INFO [service-provider,22ae9c0806b2a1a1,e4811a2a098219bf] [11272] [bulkhead-service-provider2:192.168.0.142:8004-1][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: GET -> http://192.168.0.142:8004/test-read-time-out
2020-05-27 11:35:17.531 INFO [service-provider,22ae9c0806b2a1a1,22ae9c0806b2a1a1] [11272] [XNIO-2 task-1][com.github.hashjang.hoxton.service.consumer.controller.TestFeignController:129]:service-provider2 testTimeoutGet zone1
2020-05-27 11:35:17.538 INFO [service-provider,22ae9c0806b2a1a1,d46022986064751e] [11272] [bulkhead-service-provider:192.168.0.142:8001-2][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: GET -> http://192.168.0.142:8001/test-exception-thrown
2020-05-27 11:35:17.560 ERROR [service-provider,22ae9c0806b2a1a1,22ae9c0806b2a1a1] [11272] [XNIO-2 task-1][com.github.hashjang.hoxton.service.consumer.controller.TestFeignController:136]:service-provider testExceptionThrownDelete error: HTTP/1.1 500 Internal Server Error
connection: keep-alive
content-type: application/json
date: Wed, 27 May 2020 11:35:17 GMT
transfer-encoding: chunked
feign.httpclient.ApacheHttpClient$1@d77e01d
2020-05-27 11:35:17.564 INFO [service-provider,22ae9c0806b2a1a1,27b1c085d8d575e8] [11272] [bulkhead-service-provider2:192.168.0.142:8004-2][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: GET -> http://192.168.0.142:8004/test-exception-thrown
2020-05-27 11:35:18.078 INFO [service-provider,22ae9c0806b2a1a1,9b58bfb4335cf943] [11272] [bulkhead-service-provider2:192.168.0.142:8004-3][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: GET -> http://192.168.0.142:8004/test-exception-thrown
2020-05-27 11:35:18.588 INFO [service-provider,22ae9c0806b2a1a1,3352ea0e472e64de] [11272] [bulkhead-service-provider2:192.168.0.142:8004-4][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: GET -> http://192.168.0.142:8004/test-exception-thrown
2020-05-27 11:35:19.100 INFO [service-provider,22ae9c0806b2a1a1,a40281d44d528908] [11272] [bulkhead-service-provider2:192.168.0.142:8004-5][com.github.hashjang.hoxton.service.consumer.config.LoadBalancerConfig$CircuitBreakableClient:113]:call url: GET -> http://192.168.0.142:8004/test-exception-thrown
2020-05-27 11:35:19.115 ERROR [service-provider,22ae9c0806b2a1a1,22ae9c0806b2a1a1] [11272] [XNIO-2 task-1][com.github.hashjang.hoxton.service.consumer.controller.TestFeignController:141]:service-provider2 testExceptionThrownDelete error: HTTP/1.1 500 Internal Server Error
connection: keep-alive
content-type: application/json
date: Wed, 27 May 2020 11:35:19 GMT
transfer-encoding: chunked
feign.httpclient.ApacheHttpClient$1@107e2ead
2020-05-27 11:40:05.435 INFO [service-provider,,] [11272] [AsyncResolver-bootstrap-executor-0][com.netflix.discovery.shared.resolver.aws.ConfigClusterResolver:43]:Resolving eureka endpoints via configuration
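The isolation tested here boils down to "use the entry named after the service, otherwise fall back to default". The sketch below illustrates that lookup with the values from the configuration; PerServiceSettings and Settings are hypothetical names, and the real lookup is done by Spring Cloud OpenFeign and Resilience4j, whose internals are not shown in this article.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lookup: per-service settings with a fallback to the default entry. */
final class PerServiceSettings {
    static final class Settings {
        final int connectTimeoutMs;
        final int readTimeoutMs;
        final int maxAttempts;

        Settings(int connectTimeoutMs, int readTimeoutMs, int maxAttempts) {
            this.connectTimeoutMs = connectTimeoutMs;
            this.readTimeoutMs = readTimeoutMs;
            this.maxAttempts = maxAttempts;
        }
    }

    private final Settings defaults;
    private final Map<String, Settings> overrides = new HashMap<>();

    PerServiceSettings(Settings defaults) {
        this.defaults = defaults;
    }

    void put(String service, Settings settings) {
        overrides.put(service, settings);
    }

    /** Returns the service-specific settings, or the defaults if none are registered. */
    Settings forService(String service) {
        return overrides.getOrDefault(service, defaults);
    }

    public static void main(String[] args) {
        // Values taken from the configuration above.
        PerServiceSettings settings = new PerServiceSettings(new Settings(1000, 1000, 1));
        settings.put("service-provider2", new Settings(1000, 8000, 4));
        System.out.println("service-provider readTimeout=" + settings.forService("service-provider").readTimeoutMs);
        System.out.println("service-provider2 readTimeout=" + settings.forService("service-provider2").readTimeoutMs);
    }
}
```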
5. Testing api-gateway encryption and decryption
Start api-gateway and send a request:
curl --location --request POST 'http://127.0.0.1:8201/service-provider/test-simple' \
--header 'Content-Type: application/json' \
--data-raw '{
    "test":"test1234",
    "key":"key1234"
}'
On the service-provider instance we can see:
2020-05-28 07:20:40.198 INFO [service-provider,1abdf34e5bd1e6bc,7e472f324fa44d94] [16604] [XNIO-2 task-458][com.github.hashjang.hoxton.service.provider.controller.TestServiceController:26]:test called POST, body {decrypted=test1234}
The log on api-gateway:
2020-05-28 07:20:40.176 INFO [service-api-gateway,1abdf34e5bd1e6bc,1abdf34e5bd1e6bc] [1056] [reactor-http-nio-3][com.github.hashjang.hoxton.api.gateway.filter.EncryptFilter:50]: decrypt data: EncryptFilter.DecryptResult(successful=true, result={"decrypted":"test1234"}, key=key1234)
2020-05-28 07:20:40.185 INFO [service-api-gateway,1abdf34e5bd1e6bc,1abdf34e5bd1e6bc] [1056] [boundedElastic-56][com.github.hashjang.hoxton.api.gateway.filter.InstanceCircuitBreakerFilter:53]: try to send request to: http://192.168.0.142:8002/test-simple: stats: {"numberOfNotPermittedCalls":0,"numberOfSlowCalls":0,"numberOfBufferedCalls":0,"numberOfSlowSuccessfulCalls":0,"numberOfSuccessfulCalls":0,"numberOfSlowFailedCalls":0,"numberOfFailedCalls":0,"slowCallRate":-1.0,"failureRate":-1.0}
2020-05-28 07:20:40.203 INFO [service-api-gateway,1abdf34e5bd1e6bc,1abdf34e5bd1e6bc] [1056] [reactor-http-nio-4][com.github.hashjang.hoxton.api.gateway.filter.EncryptFilter$1:106]: encrypt response: zone1
The response is:
zone1 - key1234
Both the request body and the response body were rewritten successfully. This endpoint will be load-tested afterwards.
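Judging only from the request, the logs and the response above, the body rewriting behaves like the stand-in below. This is an assumption-laden sketch: BodyRewriteSketch and its methods are hypothetical, the request body is modeled as a Map instead of parsed JSON, and the real EncryptFilter's algorithm is not shown in this article.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical stand-in that reproduces the observed input and output, nothing more. */
final class BodyRewriteSketch {
    /** Mirrors the fields printed in the gateway log: successful, result, key. */
    static final class DecryptResult {
        final boolean successful;
        final Map<String, String> result;
        final String key;

        DecryptResult(boolean successful, Map<String, String> result, String key) {
            this.successful = successful;
            this.result = result;
            this.key = key;
        }
    }

    /** Turns {"test": X, "key": K} into {"decrypted": X} and remembers K. */
    static DecryptResult decryptRequest(Map<String, String> body) {
        String payload = body.get("test");
        String key = body.get("key");
        if (payload == null || key == null) {
            return new DecryptResult(false, new LinkedHashMap<>(), null);
        }
        Map<String, String> result = new LinkedHashMap<>();
        result.put("decrypted", payload);
        return new DecryptResult(true, result, key);
    }

    /** Appends the key to the upstream response, matching the observed "zone1 - key1234". */
    static String encryptResponse(String upstreamBody, String key) {
        return upstreamBody + " - " + key;
    }

    public static void main(String[] args) {
        Map<String, String> body = new LinkedHashMap<>();
        body.put("test", "test1234");
        body.put("key", "key1234");
        DecryptResult decrypted = decryptRequest(body);
        System.out.println(decrypted.result);
        System.out.println(encryptResponse("zone1", decrypted.key));
    }
}
```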
6. api-gateway retries
Request the endpoint that triggers a readTimeout on instance1:
curl --location --request GET 'http://127.0.0.1:8201/service-provider/test-read-time-out'
A retry is observed:
2020-05-28 07:23:55.746 INFO [service-api-gateway,96379b064aa79589,96379b064aa79589] [1056] [boundedElastic-57][com.github.hashjang.hoxton.api.gateway.filter.InstanceCircuitBreakerFilter:53]: try to send request to: http://192.168.0.142:8001/test-read-time-out: stats: {"numberOfNotPermittedCalls":0,"numberOfSlowCalls":0,"numberOfBufferedCalls":0,"numberOfSlowSuccessfulCalls":0,"numberOfSuccessfulCalls":0,"numberOfSlowFailedCalls":0,"numberOfFailedCalls":0,"slowCallRate":-1.0,"failureRate":-1.0}
2020-05-28 07:23:56.851 INFO [service-api-gateway,96379b064aa79589,96379b064aa79589] [1056] [boundedElastic-57][com.github.hashjang.hoxton.api.gateway.filter.InstanceCircuitBreakerFilter:53]: try to send request to: http://192.168.0.142:8002/test-read-time-out: stats: {"numberOfNotPermittedCalls":0,"numberOfSlowCalls":0,"numberOfBufferedCalls":0,"numberOfSlowSuccessfulCalls":0,"numberOfSuccessfulCalls":0,"numberOfSlowFailedCalls":0,"numberOfFailedCalls":0,"slowCallRate":-1.0,"failureRate":-1.0}
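With concurrent requests from multiple threads, the circuit breaker of instance1 opens, and for a while requests are sent only to instance2.
Interface errors are retried in the same way. Request the endpoint that throws an exception on instance1: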
curl --location --request GET 'http://127.0.0.1:8201/service-provider/test-exception-thrown'
This request is retried as well:
2020-05-28 07:27:01.155 INFO [service-api-gateway,a9859e558a3f985e,a9859e558a3f985e] [1056] [boundedElastic-58][com.github.hashjang.hoxton.api.gateway.filter.InstanceCircuitBreakerFilter:53]: try to send request to: http://192.168.0.142:8001/test-exception-thrown: stats: {"numberOfNotPermittedCalls":0,"numberOfSlowCalls":0,"numberOfBufferedCalls":1,"numberOfSlowSuccessfulCalls":0,"numberOfSuccessfulCalls":0,"numberOfSlowFailedCalls":0,"numberOfFailedCalls":1,"slowCallRate":-1.0,"failureRate":-1.0}
2020-05-28 07:27:01.314 INFO [service-api-gateway,a9859e558a3f985e,a9859e558a3f985e] [1056] [boundedElastic-58][com.github.hashjang.hoxton.api.gateway.filter.InstanceCircuitBreakerFilter:53]: try to send request to: http://192.168.0.142:8002/test-exception-thrown: stats: {"numberOfNotPermittedCalls":0,"numberOfSlowCalls":0,"numberOfBufferedCalls":0,"numberOfSlowSuccessfulCalls":0,"numberOfSuccessfulCalls":0,"numberOfSlowFailedCalls":0,"numberOfFailedCalls":0,"slowCallRate":-1.0,"failureRate":-1.0}
With concurrent requests from multiple threads, the circuit breaker of instance1 opens, and for a while requests are sent only to instance2.
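The effect of a per-instance circuit breaker ("for a while requests go only to instance2") can be sketched as follows. This is a deliberate simplification and an assumption: InstanceBreakers is a hypothetical class that opens after a number of consecutive failures, whereas Resilience4j's CircuitBreaker works with a failure rate over a sliding window and has a half-open state, neither of which is modeled here.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified per-instance breaker: opens after N consecutive failures. */
final class InstanceBreakers {
    private static final class State {
        int consecutiveFailures;
        long openUntilMs;
    }

    private final Map<String, State> states = new HashMap<>();
    private final int failureThreshold;
    private final long openMillis;

    InstanceBreakers(int failureThreshold, long openMillis) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
    }

    /** True if calls to the instance are currently allowed. */
    synchronized boolean permits(String instance, long nowMs) {
        State s = states.get(instance);
        return s == null || nowMs >= s.openUntilMs;
    }

    /** Records the outcome of a call; opens the breaker once the threshold is reached. */
    synchronized void record(String instance, boolean success, long nowMs) {
        State s = states.computeIfAbsent(instance, k -> new State());
        if (success) {
            s.consecutiveFailures = 0;
            return;
        }
        s.consecutiveFailures++;
        if (s.consecutiveFailures >= failureThreshold) {
            s.openUntilMs = nowMs + openMillis;
            s.consecutiveFailures = 0;
        }
    }

    /** Returns the first candidate whose breaker permits calls, or null if none does. */
    synchronized String pick(List<String> candidates, long nowMs) {
        for (String candidate : candidates) {
            if (permits(candidate, nowMs)) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        InstanceBreakers breakers = new InstanceBreakers(2, 1000);
        List<String> instances = Arrays.asList("192.168.0.142:8001", "192.168.0.142:8002");
        breakers.record("192.168.0.142:8001", false, 0);
        breakers.record("192.168.0.142:8001", false, 10);
        // While the breaker of 8001 is open, only 8002 is picked.
        System.out.println(breakers.pick(instances, 500));
        // After the open period has elapsed, 8001 is eligible again.
        System.out.println(breakers.pick(instances, 1010));
    }
}
```

Passing the current time as a parameter keeps the sketch deterministic and easy to test without sleeping.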