Maciej Zdeb
2018-10-19 11:37:25 UTC
Hi,
We're seeing inconsistent state between backend servers that run health
checks and the servers that track them.
Our configuration usually looks like this:
frontend front
    mode http
    bind 10.0.10.10:80 alpn h2,http/1.1 process 1
    bind 10.0.10.10:80 alpn h2,http/1.1 process 2
    default_backend back

backend back
    mode http
    server slot_0_checker 127.127.127.127:60001 check weight 0 disabled
    server slot_1_checker 127.127.127.127:60002 check weight 0 disabled
    server slot_0_0 127.127.127.127:60001 source 10.10.10.1 track back/slot_0_checker weight 50 disabled
    server slot_1_0 127.127.127.127:60002 source 10.10.10.1 track back/slot_1_checker weight 50 disabled
    server slot_0_1 127.127.127.127:60001 source 10.10.10.2 track back/slot_0_checker weight 50 disabled
    server slot_1_1 127.127.127.127:60002 source 10.10.10.2 track back/slot_1_checker weight 50 disabled
Note the servers named *_checker (weight 0), which are used only for
health checks, and the servers without the _checker suffix, which are
duplicated once per source address and track the corresponding checker.
Our management software discovers changes in our environment and updates
server addresses through the haproxy socket. For example, when adding a
member to a backend, our software executes these commands:
set server back/slot_0_checker addr 10.20.0.1 port 80; set server back/slot_0_checker weight 0; enable server back/slot_0_checker
set server back/slot_0_0 addr 10.20.0.1 port 80; set server back/slot_0_0 weight 50; enable server back/slot_0_0
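For reference, here is a rough sketch of how such command lines could be generated. This is not our actual management code: the function names, the default of two source addresses per slot, and the assumption that slot_0_1 receives the same commands as slot_0_0 are all illustrative.

```python
def build_enable_commands(backend, server, addr, port, weight):
    """Return one CLI line that re-addresses, re-weights and enables a server."""
    target = f"{backend}/{server}"
    return "; ".join([
        f"set server {target} addr {addr} port {port}",
        f"set server {target} weight {weight}",
        f"enable server {target}",
    ])


def commands_for_slot(backend, slot, addr, port, sources=2, weight=50):
    """Build the line for the checker first, then one line per tracking server.

    Hypothetical helper: the slot_<n>_checker / slot_<n>_<i> naming follows
    the configuration above, but 'sources' and 'weight' are assumed defaults.
    """
    lines = [build_enable_commands(backend, f"slot_{slot}_checker", addr, port, 0)]
    for i in range(sources):
        lines.append(
            build_enable_commands(backend, f"slot_{slot}_{i}", addr, port, weight)
        )
    return lines


if __name__ == "__main__":
    for line in commands_for_slot("back", 0, "10.20.0.1", 80):
        print(line)
```

Each returned line would be sent to the stats socket as a single request.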
Usually everything is fine, but sometimes this operation results in an
inconsistency: one of the 8 haproxy processes sees slot_0_checker correctly
as UP but reports slot_0_0 and slot_0_1 as DOWN 3/3, while on the other 7
processes the state of slot_0_checker and slot_0_x is consistent and UP.
A screenshot of this situation and a dump of "show servers state" are
attached.
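A check along the following lines could spot the mismatch in a "show servers state" dump taken from one process. It is only a sketch: the SAMPLE text is made up and trimmed to the first few columns (it is not the attached dump), the TRACKING map is written by hand from the configuration above, and it assumes srv_op_state 0 means DOWN and 2 means UP. Column positions are taken from the "#" header line rather than hard-coded.

```python
OP_STATE = {0: "DOWN", 1: "STARTING", 2: "UP", 3: "STOPPING"}

# Tracking server -> server it tracks (hand-written from the config above).
TRACKING = {
    "back/slot_0_0": "back/slot_0_checker",
    "back/slot_0_1": "back/slot_0_checker",
}

# Made-up, trimmed sample; a real dump has more columns per row.
SAMPLE = """1
# be_id be_name srv_id srv_name srv_addr srv_op_state srv_admin_state
3 back 1 slot_0_checker 10.20.0.1 2 0
3 back 3 slot_0_0 10.20.0.1 0 0
3 back 5 slot_0_1 10.20.0.1 0 0
"""


def parse_servers_state(dump):
    """Map 'backend/server' to its integer srv_op_state."""
    columns = None
    states = {}
    for line in dump.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Header line names the columns of the rows that follow.
            columns = line.lstrip("# ").split()
            continue
        if columns is None:
            continue  # format version line that precedes the header
        row = dict(zip(columns, line.split()))
        if not {"be_name", "srv_name", "srv_op_state"} <= row.keys():
            continue  # truncated row
        states[f"{row['be_name']}/{row['srv_name']}"] = int(row["srv_op_state"])
    return states


def find_mismatches(states, tracking):
    """List (tracking server, its state, tracked server, its state) where they differ."""
    mismatches = []
    for follower, checker in sorted(tracking.items()):
        f_state, c_state = states.get(follower), states.get(checker)
        if f_state is None or c_state is None or f_state == c_state:
            continue
        mismatches.append((follower, OP_STATE.get(f_state, f_state),
                           checker, OP_STATE.get(c_state, c_state)))
    return mismatches


if __name__ == "__main__":
    for item in find_mismatches(parse_servers_state(SAMPLE), TRACKING):
        print(item)
```

Since every process keeps its own server state, the dump would have to be collected and checked separately for each of the 8 processes.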
We tried to reproduce it manually, but without success. We would appreciate
any help! Thanks!
We're using:
HA-Proxy version 1.8.14-52e4d43 2018/09/20
Copyright 2000-2018 Willy Tarreau <***@haproxy.org>
Build options :
TARGET = linux2628
CPU = generic
CC = gcc
CFLAGS = -O2 -g -fno-strict-aliasing -Wdeclaration-after-statement -fwrapv -fno-strict-overflow -Wno-unused-label -DIP_BIND_ADDRESS_NO_PORT=24
OPTIONS = USE_GETADDRINFO=1 USE_ZLIB=1 USE_DL=1 USE_OPENSSL=1 USE_LUA=1 USE_PCRE=1
Default settings :
maxconn = 2000, bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
Built with OpenSSL version : OpenSSL 1.0.2h 3 May 2016
Running on OpenSSL version : OpenSSL 1.0.2j 26 Sep 2016
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : SSLv3 TLSv1.0 TLSv1.1 TLSv1.2
Built with Lua version : Lua 5.3.1
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Encrypted password support via crypt(3): yes
Built with multi-threading support.
Built with PCRE version : 8.31 2012-07-06
Running on PCRE version : 8.31 2012-07-06
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Built with zlib version : 1.2.8
Running on zlib version : 1.2.8
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with network namespace support.
Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.
Available filters :
[SPOE] spoe
[COMP] compression
[TRACE] trace