Jesse Szwedko
2018-12-05 15:05:58 UTC
Hi all,
This is closely related to an old thread,
https://www.mail-archive.com/***@formilux.org/msg22988.html, but I
couldn't find a good way to reply to it since I wasn't on the list at that
point.
I'd like to describe a use case that is similar to, but slightly different
from, what Pavlos described. It might highlight the benefit of tracking per
connection, or perhaps someone can point me in the direction of a better
approach :).
I am running HAProxy in a number of application VMs to handle TLS
termination and connection pooling to the local application. Incoming HTTP
requests are load balanced across these by an AWS load balancer. I use
HTTP Keep-Alive so that clients don't have to constantly set up and tear
down TCP connections (and re-perform the TLS negotiation), but my clients
send a continuous stream of requests, meaning they never hit the keep-alive
timeout and thus typically use the same TCP connection indefinitely. As
a result they stick to the HAProxy instance that they initially connect to,
and thus the application they are initially proxied to.
The issue I have is that the long-lived HTTP clients, while having
generally consistent request characteristics within a given client, do not
have similar characteristics across clients (e.g. client A may send 1
MB/s over 10 req/s while client B may send 100 MB/s over 20 req/s). Because
the clients behave differently from one another, some applications can end
up with heavier clients than others, producing an overall imbalance of load
due to their stickiness. I'm hoping to help alleviate this imbalance by
occasionally requiring clients to reconnect. I realize that I can use
`Connection: Close`, but this relies on the client respecting this header.
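To put rough numbers on that imbalance, here is a toy Python sketch. It
assumes the long-lived connections are handed out round-robin, which is an
assumption made purely for illustration and not a statement about how the
AWS load balancer actually distributes them; `per_instance_load` is a
made-up helper name.

```python
def per_instance_load(client_rates_mb_s, instances):
    """Assign each long-lived connection round-robin, then sum traffic per instance."""
    load = [0] * instances
    for i, rate in enumerate(client_rates_mb_s):
        # Each client stays on the instance it first connected to.
        load[i % instances] += rate
    return load


# Two light clients (1 MB/s) and two heavy ones (100 MB/s), arriving alternately.
print(per_instance_load([1, 100, 1, 100], instances=2))  # [2, 200]
```

Both instances hold the same number of connections, yet one carries 100x
the traffic of the other, and nothing rebalances it while the connections
stay open.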
I do plan to eventually move the HAProxies off onto a separate tier which
would load balance across all of the available back-ends (where http-reuse
could help me), but until then I've come up with the following
configuration using a stick-table in an attempt to mimic a
"http-max-keepalive-requests" directive:
frontend https-in
    bind *:443 ssl crt /etc/ssl/private/cert.pem
    stick-table type binary len 20 size 1000 expire 75s store gpc0
    acl close_connection sc0_get_gpc0 ge 100
    acl exceeded_connection sc0_get_gpc0 gt 100
    acl reset_connection sc0_clr_gpc0 ge 0
    acl mark_connection sc0_inc_gpc0 gt 0
    timeout http-keep-alive 75s
    timeout client 75s
    http-request set-header X-Concat %[src]:%[src_port]:%[dst]:%[dst_port]
    http-request set-header X-Concat-SHA %[req.fhdr(X-Concat),sha1]
    http-request track-sc0 req.fhdr(X-Concat-SHA)
    http-request sc-inc-gpc0(0)
    # reject clients that have exceeded the maximum number of connections
    # while handling new client connections that managed to reuse the same port
    http-request reject if !{ http_first_req } exceeded_connection
    http-response set-header Connection Keep-Alive unless close_connection
    http-response set-header Keep-Alive timeout=75\ max=100 if { http_first_req } reset_connection mark_connection
    http-response set-header Connection Close if close_connection
    default_backend https

backend https
    server localhost 127.0.0.1:8080 maxconn 1000
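To make the intent of the counter logic easier to follow, here is a rough
Python model of what I am trying to achieve per connection. It is only a
sketch of the intended behaviour, not a faithful emulation of how HAProxy
evaluates the ACLs; `Connection`, `handle_request` and `MAX_REQUESTS` are
hypothetical names, with the limit of 100 taken from the thresholds above.

```python
from dataclasses import dataclass

MAX_REQUESTS = 100  # mirrors "max=100" and the gpc0 thresholds in the config


@dataclass
class Connection:
    """Hypothetical stand-in for one tracked connection (one stick-table entry)."""
    requests: int = 0


def handle_request(conn: Connection, first_request: bool) -> str:
    """Return the intended outcome for a single request on this connection."""
    if first_request:
        # A new connection may reuse the same address/port tuple, so start over.
        conn.requests = 0
    conn.requests += 1
    if conn.requests > MAX_REQUESTS:
        return "reject"      # the client ignored "Connection: Close"
    if conn.requests == MAX_REQUESTS:
        return "close"       # respond with "Connection: Close"
    return "keep-alive"      # respond with "Connection: Keep-Alive"


conn = Connection()
outcomes = [handle_request(conn, first_request=(i == 0)) for i in range(101)]
print(outcomes[0], outcomes[98], outcomes[99], outcomes[100])
# keep-alive keep-alive close reject
```

The last outcome is the case I would like to handle more gracefully: a
client that ignores the close header only finds out on its 101st request.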
The things I don't like about it are:
* Is there an easier way to track on the TCP connection? I wasn't able to
find a fetch parameter that seemed to correspond to this, so I ended up
making dummy headers to create a unique tuple identifying the connection to
track.
* Is it possible to close the connection at the same time I am adding the
`Connection: Close` header? Right now, clients that ignore the header get
an abruptly closed connection on their next request. Admittedly this is a
client problem, but I've found that a decent number of our clients don't
respect this header, and I would prefer to handle them a little more
gracefully.
I was also hoping someone might have suggestions for improving my approach.
Cheers!
-Jesse