Discussion:
HAProxy bytes in/bytes out stats are not updated
Sergey Arlashin
2018-11-18 14:23:23 UTC
Hi!

We have a TCP service that is load-balanced with HAProxy. It works pretty well, however the stats page doesn't seem to report correct traffic statistics. Even though we have data transferred all the time, stats show the same amount of bytes in/out.

Our traffic is mainly long-running TCP sessions that, once established, remain in the ESTABLISHED state for a very long time. Could that be related?

Can anyone please help me solve this issue?

Thank you!

Regards,
Sergey
Willy Tarreau
2018-11-18 17:47:31 UTC
Hi Sergey,
Post by Sergey Arlashin
Hi!
We have a TCP service that is load-balanced with HAProxy. It works pretty
well, however the stats page doesn't seem to report correct traffic
statistics. Even though we have data transferred all the time, stats show the
same amount of bytes in/out.
Our traffic is mainly long-running TCP sessions that, once established,
remain in the ESTABLISHED state for a very long time. Could that be
related?
Can anyone please help me solve this issue?
Stats are usually updated only at session termination. There is "option
contstats" to allow such counters to be updated upon each transfer, but
starting around 1.3.16 or so, it became less effective since it's only
performed at the upper layers while direct forwarding automatically
happens at much lower layers. With this said, with this option, an
update will be performed at least once every 2 GB, which I admit is
already not often enough for most use cases, but it's only a side effect
of the fact that we don't schedule more than 2 GB to be forwarded at once.

At the moment I don't know what could be done to force these counters to
be updated more often. I suspect that it would be possible to implement
a dummy filter to force this to happen, which could possibly be a nice
option instead of a one-size-fits-all, but I'm not certain about this.

If anyone else has an idea, I'm interested as well :-)

Willy
Sergey Arlashin
2018-11-20 14:35:14 UTC
Hi Willy,

Thank you for the answer. I checked "option contstats" and I see it is actually working (HAProxy 1.8.1).
Even small requests are reflected in the traffic stats.


Regards,
Sergey
Willy Tarreau
2018-11-20 14:51:46 UTC
Post by Sergey Arlashin
Hi Willy,
Thank you for the answer. I checked contstats and I see it is actually working. HAProxy - 1.8.1.
Even small requests are reflected in the traffic stats.
Ah you're right, I completely forgot I addressed this two years ago
with this commit :

commit def0d22cc54229072a8abb6a850e6805208790f5
Author: Willy Tarreau <***@1wt.eu>
Date: Tue Nov 8 22:03:00 2016 +0100

MINOR: stream: make option contstats usable again

Quite a lot of people have been complaining about option contstats not
working correctly anymore since about 1.4. The reason was that one reason
for the significant performance boost between 1.3 and 1.4 was the ability
to forward data between a server and a client without waking up the stream
manager. And we couldn't afford to force sessions to constantly wake it
up given that most of the people interested in contstats are also those
interested in high performance transmission.
(...)

It now forces the streams to wake up at least every 5 seconds to update
the counters. It's even documented for the option. Be careful that with a
large number of concurrent connections (hundreds of thousands) it can cause
an increase of CPU usage even when the connections are idle, just because
each of them will wake up every 5 seconds. But usually it's not a problem
if you're facing issues with jumps in stats.

Great, I'm happy to have nothing to do and that something I did and did
not remember makes a user happy :-)

Willy
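
For readers who want to try this, here is a minimal configuration sketch of the option discussed above. It is only an illustration: the section names, addresses, ports, and timeout values are hypothetical placeholders that do not come from this thread, and it assumes a version that includes the commit quoted above.

```haproxy
# Hypothetical example: names, addresses, ports, and timeouts are placeholders.
defaults
    mode tcp
    timeout connect 5s
    timeout client  1h
    timeout server  1h

listen tcp_service
    bind :9000
    # Update the byte counters while sessions are still open,
    # instead of only when they terminate.
    option contstats
    server app1 192.0.2.10:9000 check
    server app2 192.0.2.11:9000 check

listen stats
    mode http
    bind 127.0.0.1:8404
    stats enable
    stats uri /stats
    # Refresh the page at the same pace as the counter updates.
    stats refresh 5s
```

As noted in the message above, with a very large number of concurrent connections the periodic wake-ups may increase CPU usage, so it may be preferable to enable the option only on the proxies where live counters matter.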
Sergey Arlashin
2018-11-20 19:29:43 UTC
Also, I just noticed that when I reload HAProxy in master-worker mode with SIGUSR2, stats stop being updated for already established sessions. I need to re-establish the sessions in order to see stat updates.

Is this the intended behaviour, or is there a way to fix it?


Thanks!

Regards,
Sergey
Lukas Tribus
2018-11-20 20:44:41 UTC
Hello Sergey,

On Tue, 20 Nov 2018 at 20:30, Sergey Arlashin
Post by Sergey Arlashin
Also, I just noticed that when I reload HAProxy in master-worker mode with SIGUSR2,
stats stop being updated for already established sessions. I need to re-establish the
sessions in order to see stat updates.
Is this the intended behaviour?
The established session is connected to the *old* process, which has
nothing to do with the new process (whose statistics are the ones you
are looking at).

Either restart the process hard, or limit how long those old sessions
keep running by lowering timeouts and/or setting hard-stop-after.


Regards,
Lukas
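
A rough sketch of what that suggestion could look like in the configuration follows. The durations are arbitrary placeholders, not recommendations from this thread, and the right values depend on how long sessions can be allowed to survive a reload.

```haproxy
# Hypothetical example: all durations are placeholders.
global
    master-worker
    # After a reload, old processes are stopped at most 30 minutes later,
    # even if they still hold established sessions.
    hard-stop-after 30m

defaults
    mode tcp
    timeout connect 5s
    # Inactivity timeouts: they only close sessions that stay idle this long,
    # so they bound the lifetime of quiet sessions, not busy ones.
    timeout client  10m
    timeout server  10m
```

Note that hard-stop-after forcibly closes whatever sessions remain on the old process when the delay expires, so it is only suitable when clients can tolerate being disconnected and reconnecting.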
Willy Tarreau
2018-11-21 03:50:40 UTC
Post by Lukas Tribus
Either restart the process hard, or limit how long those old sessions
keep running by lowering timeouts and/or setting hard-stop-after.
Indeed that's also a very good solution, it depends if it's acceptable
to stop older sessions or not (which I don't know).

Some protocols using long sessions (like RDP) support very well being
disconnected and reconnected, it's even transparent for the user. For
some other (like SSH) it's clearly not acceptable. But to be honest
I've never understood why some people load-balance SSH :-) Here I
don't know what it is.

Willy

Willy Tarreau
2018-11-21 03:47:57 UTC
Post by Sergey Arlashin
Also, I just noticed that when I reload HAProxy in master-worker mode with
SIGUSR2, stats stop being updated for already established sessions. I need to
re-establish the sessions in order to see stat updates.
Is this the intended behaviour, or is there a way to fix it?
No, they are still updated, it's just that your stats requests are sent
to the new process, which still doesn't have new connections. There is
no real "fix", if you have long sessions and you reload frequently, you
end up with multiple processes each having their own sessions and own
stats. With the new master-worker features in 1.9 you'll be able to
check the stats of older processes as well (via the CLI) in case you
really need this. Another option could be to figure out why you need to
reload often, which seems a bit contradictory with the need for long
sessions. Maybe some operations don't require a reload.

Willy