Sharing OpenSSL CTX between multiple sockets

Discussion:

Julian Wiesener

2018-11-22 17:11:26 UTC

Hello,

one of our clients runs a haproxy setup with 2000+ SSL certificates on multiple IPs. As an OpenSSL CTX needs to be created for each certificate on each socket, restarting or reloading the config takes several minutes. Therefore I would like to propose sharing the CTX across multiple sockets, which reduces the reload times to acceptable values (~9 secs + 0.5 per IP instead of 8 per IP on our test setup).

The attached patch is a POC based on 73373ab43a4599e8bcb209687e9ec1c7be37779a. I'm aware that it is not committable in its current state, but I would like to discuss the further direction.

The first issue is that it needs to be configurable and should be disabled by default, as it removes the ability to set different SSL options on different IPs. One way would be a global option, which is good enough for me, but then you could not run a client-certificate setup on only one IP and share the server certificates with the others. Enabling it per bind line is another option that should not cause too much trouble; however, we would still need to decide how to name it. I would propose share_ssl_ctx.

There is also some naming confusion in the code. At the moment there is struct shared_context, which is not for sharing a CTX; it is a shared session cache. Therefore I introduced shared_ssl_ctx_list, which does share the context. I could live with it, but I think that might lead to confusion in the future.

I have not yet taken care of reloading the config. If someone disabled the proposed sharing option and reloaded, it might end in disaster; I'll need to look into the code. Removing or changing certificates without a restart is also TBD.

It is also not yet clear to me whether I need a lock around a CTX at any time. If only one thread is running whenever the config is reloaded, I think there is no need for one, but I might have missed cases which could lead to deinitialization of a CTX.

So please let me know what you think and what needs to be done to get it upstream.

Kind regards,
Julian

Julian Wiesener

2018-11-22 17:13:37 UTC

Hi again,

of course I forgot to attach the patch...

Kind regards,
Julian

Lukas Tribus

2018-11-22 18:39:11 UTC

Hello Julian,

Post by Julian Wiesener
Hello,
one of our clients runs a haproxy setup with 2000+ SSL certificates on multiple IPs.
As an OpenSSL CTX needs to be created for each certificate on each socket,
restarting or reloading the config takes several minutes. Therefore I would like to propose
sharing the CTX across multiple sockets, which reduces the reload times to
acceptable values (~9 secs + 0.5 per IP instead of 8 per IP on our test setup).

Trying to understand the use-case better here, binding to any IP is
not acceptable? Your client *needs* to bind to specific IPs?

Like:
bind :443 ssl crt /etc/...

Binding to different IPs should also be possible though:

bind 10.0.0.1:443,10.0.0.2:443,10.0.0.4:443 ssl crt /etc/

I'd assume such a configuration would only create a single CTX, do you
know whether that is in fact the case (as it's just an assumption)?

Just looking for the simplest possible approach here ...

Regards,
Lukas

Julian Wiesener

2018-11-22 19:09:07 UTC

Hi Lukas,

On Thu, 22 Nov 2018 19:39:11 +0100

Post by Lukas Tribus
Trying to understand the use-case better here, binding to any IP is
not acceptable? Your client *needs* to bind to specific IPs?

one bind for multiple IPs would reduce the flexibility of the config: you could no longer set different backends on different IPs that share one certificate (directory), for example. There are clearly ways to reduce the reload times by changing the configuration; however, there is a strong preference to keep the current configuration layout and solve the problem in code, if there are no strong reasons against it.

Post by Lukas Tribus
I'd assume such a configuration would only create a single CTX, do you
know whether that is in fact the case (as it's just an assumption)?

I did not verify it, but based on my understanding of the code, I would assume the same.

Regards,
Julian

Lukas Tribus

2018-11-22 20:10:43 UTC

Hello Julian,

Post by Julian Wiesener
Hi Lukas,
On Thu, 22 Nov 2018 19:39:11 +0100

Post by Lukas Tribus
Trying to understand the use-case better here, binding to any IP is
not acceptable? Your client *needs* to bind to specific IPs?

one bind for multiple IPs would reduce the flexibility of the config, you
could no longer set different backends on different IPs that share one
certificate (directory) for example. There are clearly ways to reduce
the reload times by changing the configuration, however there is a
strong preference to keep the current configuration layout and solve
the problem in code, if there are no strong reasons against it.

You are advocating for a new code path along with an additional
configuration knob. Feature bloat is a problem and if a use-case can
be covered by modifying the configuration slightly, I'd personally opt
for the latter. We should certainly look at all possibilities to cover
the use-case. But I want to make it clear: I'm not the SSL
maintainer or the author, this is merely my personal opinion about
this as a contributor.

As for flexibility, with my 2 proposals you can access the destination
IP in an ACL with the dst variable, which I believe gives you
everything you need, including backend selection based on the
connected IP, all with a single bind statement. I understand that was
just an example, but with the ACL variables you should be able to do
*a lot*.
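
A minimal sketch of what this could look like, assuming two listening addresses, a hypothetical certificate directory, and hypothetical backend names (back-a, back-b) and server addresses. None of these come from Julian's setup, and a defaults section with mode and timeouts is assumed to exist elsewhere:

```
frontend public
    # a single bind line covering both addresses
    bind 10.0.0.1:443,10.0.0.2:443 ssl crt /etc/haproxy/certs/

    # dst is the address the client connected to
    acl to_ip1 dst 10.0.0.1
    acl to_ip2 dst 10.0.0.2

    # backend selection based on the connected IP
    use_backend back-a if to_ip1
    use_backend back-b if to_ip2

backend back-a
    server s1 192.0.2.10:80

backend back-b
    server s1 192.0.2.20:80
```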

It sounds like the reason for this is not that you need the same
certificate on multiple IPs, but that all certificates are in one
directory - which quite frankly would be better addressed in the
provisioning layer. The crt-list feature can also be used to
selectively load certificates at scale, but it does require the
provisioning layer to know which certificates belong to which bind
statement.
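
As a rough illustration, with one crt-list file per bind line, the layout might look like the sketch below. All file names, frontend names and domains here are hypothetical, and the provisioning layer would be responsible for generating the list files:

```
# haproxy.cfg
frontend fe_ip1
    bind 10.0.0.1:443 ssl crt-list /etc/haproxy/crt-list-ip1.txt

frontend fe_ip2
    bind 10.0.0.2:443 ssl crt-list /etc/haproxy/crt-list-ip2.txt

# /etc/haproxy/crt-list-ip1.txt
# one certificate file per line, optionally followed by SNI filters
/etc/haproxy/certs/example.com.pem example.com www.example.com
/etc/haproxy/certs/example.org.pem
```

Each bind line then only loads the certificates listed in its own file instead of everything in a shared directory.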

Just my two cents ...

Best regards,
Lukas

Willy Tarreau

2018-11-23 06:35:00 UTC

Hi guys,

Post by Lukas Tribus

Post by Julian Wiesener
one bind for multiple IPs would reduce the flexibility of the config, you
could no longer set different backends on different IPs that share one
certificate (directory) for example. There are clearly ways to reduce
the reload times by changing the configuration, however there is a
strong preference to keep the current configuration layout and solve
the problem in code, if there are no strong reasons against it.

In fact, after Aurelien's patches to update SSL certs over the CLI,
we've had a long meeting here with Emeric and William trying to spot
the various corner cases. We came to the conclusion that there
definitely is some value in being able to merge same certs, but that
the name alone is not sufficient. One simple example is when you
present different certs with different key sizes depending on the
listener (the bind line in fact). The common case is when you have
a partner connecting over a VPN using a crappy old client with some
algorithms or key size limitations and that TLS is almost useless on
this path, but you also want to present a strong certificate to the
net. It can also be the opposite, a partner demanding very strong
security for critical operations but you don't want to affect your
regular users. And the same distinction sometimes exists between
internal and public networks. So we figured that since in the end
it's about loading files and these files end up being the only
discriminant on the file system, which can be atomically updated, the
path to the cert files definitely had to be used as the key to
distinguish all these certs. We also found that it's important to
distinguish the cert types (RSA/DSA/ECDSA) as not all bind lines
necessarily want to share all of them, thus you may end up creating
different contexts for the same cert just based on the types used.

One valid use case for sharing certs is when some people use multiple
"bind" lines to bind to different processes/threads for improved
performance. I want to address these by moving the process and thread
masks from the bind_conf struct to the listener itself. The code
already supports having multiple listeners per bind_conf since it
supports port ranges and comma-delimited addresses. So it's not very
difficult but it requires quite some changes spread all over the code
that I'm not enthusiastic about doing right now.
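
For reference, the multi-bind pattern described above might look roughly like this in a configuration of that era. This is only a sketch: it assumes a multi-process setup using nbproc and the per-bind process keyword, and the certificate path is hypothetical:

```
global
    nbproc 2

frontend public
    # the same certificate is named on each bind line,
    # one bind line per process
    bind :443 ssl crt /etc/haproxy/certs/site.pem process 1
    bind :443 ssl crt /etc/haproxy/certs/site.pem process 2
```

Both bind lines refer to the same file, which is why deduplicating by file path would help here.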

There are some other (ugly) use cases where people load certs from
directories, and instead of having one directory per hosted site,
they load all the certs in all frontends... It uses quite some RAM and
some startup time! We've seen one such case where haproxy was using
something like 3.5 GB on startup because a few tens of thousands of
certs present in a directory were loaded multiple times.

Post by Lukas Tribus
As for flexibility, with my 2 proposals you can access the destination
IP in an ACL with the dst variable, which I believe gives you
everything you need, including backend selection based on the
connected IP, all with a single bind statement. I understand that was
just an example, but with the ACL variables you should be able to do
*a lot*.

In general I agree that it is one of the best solutions, which happens
to be very flexible and can even be updated without reloading, over the
CLI. Some even use map files for this, something roughly like this:

frontend public
    bind :443 ...
    use_backend %[dst,map(ip-to-back.map)]

And your ip-to-back.map file contains things like this:

    10.0.0.1 back-cust1
    10.0.0.2 back-cust2
    ...

That said, I'd like us to manage to deduplicate certs at startup
after 1.9, at least in order to ease updates over the CLI. Boot time and
RAM usage savings will mostly be a byproduct of this. I don't want to
encourage bad usages when solutions exist but if new features come with
improvements to existing ones, that's fine :-)

Cheers,
Willy