[doc] Split peer and root CA for trusted certificates; Add purpose #6792

minglumlu · 2025-12-12T06:50:27Z

Split peer and root CA for user installed trusted certificates.
Add purpose for user installed certificates.

robhoes · 2025-12-12T09:13:39Z

The doc doc/content/design/pool-certificates.md is a historical design document from a few years ago with status "released". We should leave it as is, and add a new doc for proposed changes and additions.

psafont · 2025-12-12T09:55:16Z

It makes sense to keep the historical document as is, but it should also be clear that it has been amended or obsoleted by another, newer design. RFCs usually use links to designate this relationship between them, for example RFC 6106 and RFC 8106.

psafont · 2025-12-12T12:08:54Z

doc/content/design/pool-certificates.md

+| Trusted root CA for \<PURPOSE\>| /etc/stunnel/certs-ca-\<PURPOSE\>/      | no (derived from the \<PURPOSE\>) | Root CA certificates that users have installed to validate the server certificate for \<PURPOSE\>
+| Trusted peer for \<PURPOSE\>| /etc/stunnel/certs-peer-\<PURPOSE\>/       | no (derived from the \<PURPOSE\>) | Peer certificates that users have installed to validate the server certificate for \<PURPOSE\>


The first directory is user-configurable, through the API, as per the new use-case (the user needs to configure the root before trusting anything)

The second one, because it relies on certificate pinning, I think it shouldn't and the API for users to control them should be removed. If this doesn't ring true, then I think more explanation should be added about the threat model this is meant to protect against, so that it's clear what's needed and what isn't.

I understand that the locations of individual certificates are internal to XAPI, and that the bundle locations need to be known by local TLS communication endpoints.
I can't see a use case for making the locations configurable by users.

But installing certificates for a purpose is being able to configure the certificates, at least from what is presented in the table

I don't quite get the point yet, so please bear with me. The full path of the store in the filesystem is determined by XAPI; only a part of it is determined by the purpose received from the API. Say the purposes are wlb and licensing: the certificates with these purposes will be stored under /etc/trusted-certs/ca-wlb/name1.pem or /etc/trusted-certs/peer-wlb/name1.pem and /etc/trusted-certs/ca-licensing/name1.pem or /etc/trusted-certs/peer-licensing/name1.pem.
The corresponding bundles would be:
/etc/trusted-certs/peer-bundle-wlb.pem or /etc/trusted-certs/peer-bundle-wlb.pem and
/etc/trusted-certs/peer-bundle-licensing.pem or /etc/trusted-certs/peer-bundle-licensing.pem.
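To make the naming scheme in this comment concrete, here is a minimal sketch of how a store could derive the paths from the purpose. The helper names are hypothetical, the root directory is copied from the examples above, and the `ca-bundle-<purpose>.pem` name is an assumption by analogy with the per-certificate directories (the comment only spells out the peer bundle name).

```python
import re
from pathlib import PurePosixPath

# Assumption: root directory copied from the example paths in the comment.
TRUST_ROOT = PurePosixPath("/etc/trusted-certs")
KINDS = ("ca", "peer")
_SAFE = re.compile(r"[A-Za-z0-9_-]+")


def _component(value: str) -> str:
    """Reject values that could escape the store directory (e.g. '../x')."""
    if not _SAFE.fullmatch(value):
        raise ValueError(f"unsafe path component: {value!r}")
    return value


def cert_path(kind: str, purpose: str, name: str) -> PurePosixPath:
    """Path of one installed certificate, e.g. .../ca-wlb/name1.pem."""
    if kind not in KINDS:
        raise ValueError(f"unknown kind: {kind!r}")
    return TRUST_ROOT / f"{kind}-{_component(purpose)}" / f"{_component(name)}.pem"


def bundle_path(kind: str, purpose: str) -> PurePosixPath:
    """Path of the bundle for a purpose (the 'ca' variant is an assumed name)."""
    if kind not in KINDS:
        raise ValueError(f"unknown kind: {kind!r}")
    return TRUST_ROOT / f"{kind}-bundle-{_component(purpose)}.pem"
```

Only the purpose (and, today, the name) comes from the API caller; everything else in the path is fixed by the store, which is the point being made above.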

I see where the misunderstanding comes from:
I have to be clear that:

Users are not meant to change the certificate files at all, not a single one of the certificates, even if the table says they can be configured.

The only way to configure a certificate is using the API.

Certificates that can be installed, or removed using the API are considered to be configurable.

There are some certificates that the users are never meant to be aware of, these are considered to not be configurable.

The second one, because it relies on certificate pinning, I think it shouldn't and the API for users to control them should be removed.

The idea is actually to let users directly install "pinned" certificates.

An alternative, which might be what you are thinking of, would be to indirectly install them through other calls. For example a call that configures the address of the WLB or license server would take the certificate as a new parameter.

psafont · 2025-12-12T12:09:43Z

doc/content/design/pool-certificates.md

+This means that the TLS clients need to only trust the user-installed certificates when establishing a TLS connection for a specific purpose. Conversely pool connections will ignore these certificates.
+Two kinds of trusted certificates can be supported when validating a server certificate, which also represent two validation modes:
+* root CA certificate, which is used in standard PKI validation where the server’s certificate chain is built with it;
+* peer certificate, which represents the expected server leaf certificate and hence can be used to compare with the received server leaf certificate. This model supports quick trust establishment, including Trust‑On‑First‑Use (TOFU) and self-signed server certificate scenarios where the first received leaf certificate is trusted to avoid user intervention or configuration.


Is this describing certificate pinning? It would be good to mention it by name if it is. Or describe how it differs.

Good to know the term "certificate pinning", which seems to be a commonly used phrase. Thanks.

psafont · 2025-12-12T12:14:15Z

doc/content/design/pool-certificates.md

 * Pool.uninstall_ca_certificate: rename of Pool.certificate_uninstall, removes the certificate from the database.
 * Host.reset_server_certificate: replaces Host.emergency_reset_server_certificate, now it's allowed for role _R_POOL_ADMIN. It adds a record for the generated Default Certificate to the database while removing the previous record, if any.
 * Pool.rotate_internal_certificates: This call generates new Pool certificates, and substitutes the previous certificates with these. See the certificate expiry section for more details.
+* Pool.install_peer_certificate: adds the peer certificate to the database.


It's not clear to me how the "purpose" is injected into these RPC calls, or how the root certificates are modified to make use of it.

Why isn't the WLB being used as a reason as well, for example?

I added the argument list for this.

Why isn't the WLB being used as a reason as well, for example?

Yeah. WLB is a use case as well. WLB supports both CA-signed certificates and self-signed certificates. The "purpose" can be extended with values such as "wlb" as well.
I describe two approaches for the validation: certificate chain validation and certificate pinning. The bundles are updated separately now.

Signed-off-by: Ming Lu <ming.lu@cloud.com>

robhoes · 2025-12-15T16:47:26Z

doc/content/design/trusted-certificates.md

+
+* A new enumeration type "purpose" is introduced to indicate the intended usage of a trusted certificate.
+A new *Certificate* class field "purposes" (a set of values of enumeration type "purpose") will be added to represent all applicable purposes of a trusted certificate.
+By default, this set is empty which corresponds to the existing general "ca" certificates.


I.e. an empty set implies "all" purposes?

I understand the emptiness here actually means "general": a certificate whose "purposes" is an empty set will not be placed into any of the per-purpose directories or bundles.

robhoes · 2025-12-15T16:54:28Z

doc/content/design/trusted-certificates.md

+The new "peer" will represent trusted peer certificates.
+
+* A new enumeration type "purpose" is introduced to indicate the intended usage of a trusted certificate.
+A new *Certificate* class field "purposes" (a set of values of enumeration type "purpose") will be added to represent all applicable purposes of a trusted certificate.


I'd probably use the singular word purpose here and in the parameters below.

robhoes · 2025-12-15T17:06:28Z

doc/content/design/trusted-certificates.md

+| Pool Bundle     | /etc/stunnel/xapi-pool-ca-bundle.pem    | no       | Bundle of certificates that hosts use to verify other hosts on pool communications, this is kept in sync with "Trusted Pool"
+
+For backwards compatibility, when a trusted certificate is being installed via "pool.install_ca_certificate" or "pool.install_peer_certificate" but with empty "purposes",
+the trusted certificate will be stored as "Trusted Default" and "Default Bundle".


But if you mix CA and peer certificates, then it is not clear which verification mode needs to be used (verifying the chain + hostname vs. just the peer cert). Should we not have a separate default bundle for peer certificates?

Or perhaps we should not allow peer certificates without "purpose"?

Yeah, it's better to not allow peer certificates without a purpose, as a general-purpose peer certificate doesn't seem sensible.
This is only for backwards compatibility, as there may already be root CA or peer certificates in the database with type="ca", and they are already stored in "Trusted Default" and "Default Bundle".

How about not allowing the empty purpose for pool.install_peer_certificate?
But pool.install_ca_certificate, to keep its original semantics, would support:

* root CA certificate + empty purpose -> Trusted Default
* root CA certificate + non-empty purpose -> Trusted CA
* peer certificate + empty purpose -> not allowed
* peer certificate + non-empty purpose -> not allowed
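The rules proposed here can be written down as a small decision function. This is only a sketch of the proposal in this comment: the function and type names are hypothetical, and it assumes the caller already knows whether a certificate is a root CA or a peer certificate.

```python
from enum import Enum
from typing import Iterable


class Kind(Enum):
    ROOT_CA = "root CA"
    PEER = "peer"


def store_for_install_ca_certificate(kind: Kind, purposes: Iterable[str]) -> str:
    """Store chosen by pool.install_ca_certificate under the proposed rules."""
    if kind is Kind.PEER:
        # Peer certificates are rejected here regardless of purpose.
        raise ValueError("peer certificates are not allowed in pool.install_ca_certificate")
    # Empty purpose keeps the original semantics; otherwise use the per-purpose store.
    return "Trusted CA" if list(purposes) else "Trusted Default"


def check_install_peer_certificate(purposes: Iterable[str]) -> None:
    """pool.install_peer_certificate would reject an empty purpose."""
    if not list(purposes):
        raise ValueError("a peer certificate needs at least one purpose")
```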

robhoes · 2025-12-15T18:04:29Z

I think we should make the general precedence rules clear in this document: when xapi makes an external connection, then there may be multiple "matching" trusted certificate bundles and it must unambiguously choose one to verify the remote against. This could be resolved, for example, by searching for a certificate bundle in this order:

1. A peer-certificate bundle with explicit and matching purpose
2. A CA-certificate bundle with explicit and matching purpose
3. A CA-certificate bundle with "any" purpose (empty purpose field)

(insert "3. A peer-certificate bundle with any purpose" if we allow that)
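The search order suggested above could be sketched as follows. The `Bundle` type and `choose_bundle` function are hypothetical names used only for illustration; `None` stands in for the empty purpose field.

```python
from typing import Iterable, NamedTuple, Optional


class Bundle(NamedTuple):
    kind: str                # "peer" or "ca"
    purpose: Optional[str]   # None models the empty ("any") purpose


def choose_bundle(available: Iterable[Bundle], purpose: str) -> Optional[Bundle]:
    """Return the single bundle to verify against, or None if nothing matches."""
    existing = set(available)
    search_order = [
        Bundle("peer", purpose),  # 1. peer bundle with matching purpose
        Bundle("ca", purpose),    # 2. CA bundle with matching purpose
        Bundle("ca", None),       # 3. CA bundle with "any" purpose
    ]
    for candidate in search_order:
        if candidate in existing:
            return candidate
    return None
```

Exactly one bundle comes back, so the verification mode (pinning or chain validation) follows unambiguously from the bundle's kind.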

Signed-off-by: Ming Lu <ming.lu@cloud.com>

psafont · 2025-12-16T10:45:40Z

doc/content/design/trusted-certificates.md

+When a trusted certificate is installed, the local endpoint can validate the peer identity during TLS connection establishment.
+Certificate chain validation is a general-purpose, standards-based approach but requires additional steps, such as getting the peer's certificate signed by a CA.
+In contrast, certificate pinning offers a quicker way to set up trust in some cases without the overhead of CA signing.
+For example, in Trust‑On‑First‑Use (TOFU) policy and a self-signed server certificate scenario the first received leaf certificate can be trusted to avoid user intervention or configuration.


This reduces security as well.

If the service the host connects to is known ahead of time (as in the case of licensing), a certificate can be added to the trust root using a package similar to ca-certificates to make it always available.

If it isn't, but the service is aware that it needs to be configured, the installation of the peer certificate can be automated using the API; this is the case for WLB.

Are there any more cases that need to be covered? I strongly suggest staying away from TOFU in this case, and finding alternative solutions that do not reduce security.

I can replace the example with another one. But the trust setup is completely controlled by users via APIs, not by XAPI itself. No TOFU is applied in XAPI.
For the example of WLB, XenCenter will ask the user to trust the WLB's certificate (peer certificate) explicitly. Then XenCenter installs the trusted certificate into XAPI via the API if the user accepts.
Another client which wants to apply TOFU needs to determine whether it's a first use. If yes, it still uses the API to install the trusted certificate.
In both cases above, security is ensured by human intervention or the first-use check.

psafont · 2025-12-16T16:40:10Z

doc/content/design/trusted-certificates.md

+In this design, it is used to install root CA certificates only.
+A new argument "purpose" is appended to specify the purposes of the trusted certificate to be installed. By default it is an empty set.
+* session (ref session_id): reference to a valid session;
+* name (string): the name of the certificate;
+* cert (string): the certificate in PEM format;
+* purpose (string list): the purposes of the certificate


I would propose to add a new API call, leaving the current one for backwards compatibility purposes:

pool.install_root_certificate session cert purposes
where

cert is the certificate in PEM format

purposes is a list of purpose enum values

This gets rid of the name parameter. name was needed when the call was introduced because certificates were not database objects. This meant there was no way to show information about the certificate without downloading it back again, and that there was no way to decide its name in the filesystem.
Because now we can show the fingerprint as well as expiration data to identify certificates, and we can get a unique name that the user does not have to handle, I think removing the name is ideal.

It's great that the existing hidden relation between name and the file name can be removed. But I think the name can be kept as a meaningful identifier.
With this in consideration, new APIs pool.install_root_certificate and pool.install_peer_certificate would have the same argument list. It could be one API with an additional type for "root" or "peer", or just two APIs.
I have no preference. Either is fine with me.

Hi @robhoes May I please have your thoughts here?

I think about it again.
Comparing:

pool.install_ca_certificate

pool.install_root_certificate

pool.install_peer_certificate

with

pool.install_ca_certificate

pool.install_peer_certificate

The latter looks concise. And the appended purpose can keep the backwards compatibility.
But the hidden relation between name and file name can be removed.

psafont · 2025-12-16T16:40:52Z

doc/content/design/trusted-certificates.md

+### pool.install_peer_certificate
+This is a new API introduced in this design with its arguments being defined as:
+* session (ref session_id): reference to a valid session;
+* name (string): the name of the certificate;


I don't think users should be able to decide the filename of these certificates.

Suggested change

* name (string): the name of the certificate;

psafont · 2025-12-16T16:42:12Z

doc/content/design/trusted-certificates.md

+* force (bool): remove the database entry even if the file doesn't exist.
+
+### pool.join
+Since the trusted certificates managed in this design are pool-wide, any existing trusted certificates on the joining host should be removed during pool.join.


I think that instead of removing them, the certificates should be exchanged, like it happens with pool certificates.

I also think that the CA certificates are exchanged currently, am I wrong?

Yeah, the CA certificates are exchanged in pool.join.
I thought the pool's trust set up should not be changed by a new joiner. It has some implications on the trust set up. Would it be clearer that a new joiner accepts the pool's trust set up and drops its own? Just like a joiner drops its other pool-wide configurations during pool join.

I thought the pool's trust set up should not be changed by a new joiner.

I think it's OK to do so, unless there are filename or configuration conflicts (think of different WLB servers being used). On pool join, pools A and B have to trust each other, so they also trust whatever the other was trusting. There's no security-related problem, IMO.

We can be smarter and for each of the purposes see how the related setting is joined, and drop any unused certificates coming from the joiner.

For example:
Pool A uses 3.wlb.example.com as WLB server, stores its peer certificate using wlb as purpose.
Host B uses 1.wlb.example.com as WLB server, stores its peer certificate using wlb as purpose.

When joining, only a single WLB server is used, the configuration of Pool A is used. Host B imports the wlb certificates in Pool A. After the reboot on pool join, Host B checks that now 3.wlb.example.com is used, and the certificate that was used for 1.wlb.example.com can be uninstalled.
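A rough sketch of this "exchange, then prune" idea is below. It is not how pool.join works today; the function names are hypothetical, and certificates are identified by the server they belong to purely to keep the example short.

```python
from typing import Dict, Set

Certs = Dict[str, Set[str]]  # purpose -> certificates (labelled by server here)


def exchange_on_join(pool_certs: Certs, joiner_certs: Certs) -> Certs:
    """Union of both sides' trusted certificates, per purpose."""
    merged: Certs = {}
    for purpose in pool_certs.keys() | joiner_certs.keys():
        merged[purpose] = pool_certs.get(purpose, set()) | joiner_certs.get(purpose, set())
    return merged


def prune_unused(certs: Certs, in_use: Certs) -> Certs:
    """After the join, keep only certificates for servers still configured."""
    return {
        purpose: {c for c in owned if c in in_use.get(purpose, set())}
        for purpose, owned in certs.items()
    }


# The WLB example from the comment above.
pool_a = {"wlb": {"3.wlb.example.com"}}
host_b = {"wlb": {"1.wlb.example.com"}}
after_join = exchange_on_join(pool_a, host_b)
after_prune = prune_unused(after_join, in_use={"wlb": {"3.wlb.example.com"}})
```

Here `after_join` holds both WLB certificates, and `after_prune` keeps only the one for 3.wlb.example.com, matching the sequence described above.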

psafont · 2025-12-16T16:47:08Z

doc/content/design/trusted-certificates.md

+* session (ref session_id): reference to a valid session;
+* name (string): the name of the certificate;
+* cert (string): the certificate in PEM format;
+* purpose (string list): the purposes of the certificate.


Shouldn't this be a list of DNS hostnames? For example [www.example.com; www1.example.com]. This way the relationship between the certificate and the trusted server is very clear.

Otherwise there should be some kind of general mechanism to be able to tell which server names each certificate relates to.

I see. Since it's the peer, it makes sense to know the particular server names.
The purpose is to make it possible to retrieve both the trusted root CA and peer certificates when validating the peer identity for one purpose, without getting other unrelated certificates involved.
How about just using the name to store the server names as this info is only used by API client, rather than XAPI?

I think that makes sense, we need to be careful with the description of the field name, since it has a different meaning depending on the type of certificate.
Also, this means that if a certificate is used for several domains, it seems like it needs to be duplicated.

Maybe we could do some sort of detection of shared certificate and change the name to have a concatenation of domain names: www.example.com,www1.example.com, but this seems fragile

The distinction we are making here between "CA certificates" and "peer certificates" as separate types of trusted certificates is because they will be used a little differently when verifying the remote's cert:

CA: Verify the chain and check the hostname, using the stunnel options verifyChain and checkHost. The hostname needs to be verified, because there may be many certs for different hosts signed by the same CA.

Peer (pinning): Exact match of the certificate, using the stunnel option verifyPeer. The hostname does not need to be checked. This also makes it easier to connect to IP addresses rather than hostnames (as we do for xapi-xapi connections within a pool).

The idea of the purpose is to limit what a trusted certificate is used for, in a way that works for both cases. I think adding hostnames/domains in the mix is making this more complicated for not a lot of benefit?
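The two verification modes described above map onto stunnel service options roughly as in the sketch below. The generator function and its parameters are hypothetical; `client`, `connect`, `CAfile`, `verifyChain`, `checkHost` and `verifyPeer` are stunnel service-level options, but a real section would need more settings (for example `accept`) than shown here.

```python
from typing import Optional


def stunnel_client_section(service: str, remote: str, bundle: str,
                           mode: str, check_host: Optional[str] = None) -> str:
    """Render a partial stunnel client section for 'ca' or 'peer' verification."""
    lines = [f"[{service}]", "client = yes", f"connect = {remote}", f"CAfile = {bundle}"]
    if mode == "ca":
        # Chain validation: the hostname must be checked too, since one CA
        # may have signed certificates for many different hosts.
        if not check_host:
            raise ValueError("CA mode needs a hostname to check")
        lines += ["verifyChain = yes", f"checkHost = {check_host}"]
    elif mode == "peer":
        # Pinning: exact match against a certificate in the bundle, no hostname check.
        lines.append("verifyPeer = yes")
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return "\n".join(lines) + "\n"
```

Because peer mode does not check the hostname, the same section works when `remote` is an IP address.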

With introducing the purpose, the only use pattern of these trusted certificates is to use the bundles for a purpose (a CA bundle or a peer bundle). The server names are not involved in this use pattern.
And the server names are not applicable to CA certificates.
For peer certificates, it looks useful for the clients/users when there are multiple peer certificates installed for the same purpose.
For example, a user may have set up multiple license servers, and for some reason they switch license server back and forth. It would be clear for them to know which peer certificate is for which license server in a quicker way if they can see the server names from the peer certificates.

Maybe we could do some sort of detection of shared certificate ... but this seems fragile

Yeah. I think detection of the sharing is necessary. But updating for sharing seems too complicated.
Say there is a peer certificate for www.example.com installed already. Now the same one, but with a different name www1.example.com, is to be installed. Making it return an error will not impact the use case, as it is the bundle that is used rather than the server name. And it's still clear to the user, as www.example.com is still there.

Otherwise, I would lean towards adding CN and SAN fields to Certificate for complete clarity. But it would not bring a lot of benefit currently.
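The "return an error on a shared certificate" behaviour could look like the sketch below. It assumes duplicates are detected by a SHA-256 fingerprint of the DER encoding; the `PeerStore` class and the exception are hypothetical and only model the database side.

```python
import hashlib
import ssl
from typing import Dict


class DuplicateCertificate(Exception):
    """Raised when the same certificate is already installed under another name."""


def fingerprint(pem: str) -> str:
    """SHA-256 over the DER bytes, so PEM whitespace differences do not matter."""
    der = ssl.PEM_cert_to_DER_cert(pem.strip())
    return hashlib.sha256(der).hexdigest()


class PeerStore:
    def __init__(self) -> None:
        self._name_by_fingerprint: Dict[str, str] = {}

    def install(self, name: str, pem: str) -> str:
        fp = fingerprint(pem)
        existing = self._name_by_fingerprint.get(fp)
        if existing is not None:
            # Same certificate again: refuse rather than rename or merge names.
            raise DuplicateCertificate(f"already installed as {existing!r}")
        self._name_by_fingerprint[fp] = name
        return fp

    def names(self) -> list:
        return sorted(self._name_by_fingerprint.values())
```

Installing the certificate first as www.example.com and then again as www1.example.com raises `DuplicateCertificate`, and the original entry stays visible to the user.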

Peer (pinning): Exact match of the certificate, using the stunnel option verifyPeer. The hostname does not need to be checked. This also makes it easier to connect to IP addresses rather than hostnames (as we do for xapi-xapi connections within a pool).

I agree that during verification the hostname does not need to be matched. However, when the xapi host wants to connect to a server, if it knows which certificate to use it can minimize the number of round trips needed to establish a connection. With the current approach there's an iteration of certificates until one matches. Using server names / IPs, the xapi host can trust a single certificate when connecting, instead of a bundle.

This also means that the xapi host can minimize the number of certificates trusted per connection, and be more granular.

And the server names are not applicable to CA certificates.

I agree, it's part of the negative of my proposal

With the current approach there's an iteration of certificates until one matches.

I think this is usually done by the TLS library, so that we can have a big bundle for everything. And the purpose being proposed here is exactly to make it better.

This also means that the xapi host can minimize the number of certificates trusted per connection, and be more granular.

I would not expect that there will be too many certificates in the bundle reduced by purpose. In most cases there will be only one in it. And use of bundle is also a common practice in TLS.

Given we have to combine CA and peer certificates together, the granularity of purpose looks proper.

The distinction we are making here between "CA certificates" and "peer certificates" as separate types of trusted certificates is because they will be used a little differently when verifying the remote's cert:

CA: Verify the chain and check the hostname, using the stunnel options verifyChain and checkHost. The hostname needs to be verified, because there may be many certs for different hosts signed by the same CA.

Peer (pinning): Exact match of the certificate, using the stunnel option verifyPeer. The hostname does not need to be checked. This also makes it easier to connect to IP addresses rather than hostnames (as we do for xapi-xapi connections within a pool).

The idea of the purpose is to limit what a trusted certificate is used for, in a way that works for both cases. I think adding hostnames/domains in the mix is making this more complicated for not a lot of benefit?

In the case of static IPs, the hostname is not pushed to DNS, so hosts in a pool cannot resolve each other.
"This also makes it easier to connect to IP addresses rather than hostnames" may explain why xapi-xapi connections within a pool work.

psafont · 2025-12-16T16:48:33Z

doc/content/design/trusted-certificates.md

+
+## Precedence order of choosing trust stores
+The "Peer Bundle", "CA Bundle", and "Default Bundle" can be directly used when establishing TLS connections.
+The endpoint to validate the peer's identity must unambiguously choose only one from these bundles with the following precedence order:


How can the endpoint decide which to use? Just try it and, if it fails, continue with the other options?

By the existence of the bundle and the precedence order.
If a peer bundle exists, use that bundle. If it fails, there are no more attempts with other options.
If a peer bundle doesn't exist but a CA bundle exists, use the CA bundle. Again, there are no more attempts with the default CA bundle.

Signed-off-by: Ming Lu <ming.lu@cloud.com>

minglumlu · 2025-12-19T02:38:52Z

Hi @robhoes @psafont
I would appreciate any further thoughts or feedback you may have on this design.
From our discussions so far, the outstanding design questions are:

1. Installation APIs: ca | root | peer vs. ca | peer
2. Certificate server names: whether the server name should be updated when installing the same peer certificate.
3. During pool.join, whether to adopt or drop the joiner's pool-wide trusted certificates.

Are there any other unresolved or unaddressed issues?

psafont · 2025-12-19T15:00:16Z

I strongly prefer deprecating the call pool.install_ca_certificate and being able to drop the name field, because it's leaking the abstraction.
I think using bundles is an OK compromise.
The simplest design might be to adopt. I'm wary of dropping certificates because it's not clear when to uninstall them: doing it before restarting as a new member means some communication might be interrupted. Doing it later means some hosts might have more certificates than others.

minglumlu changed the title ~~[doc] Split peer and root CA for trusted certificates; Add trust purp…~~ [doc] Split peer and root CA for trusted certificates; Add purpose Dec 12, 2025

psafont reviewed Dec 12, 2025


[doc] Improvements on management of trusted certificates

08d29ff

Signed-off-by: Ming Lu <ming.lu@cloud.com>

minglumlu force-pushed the private/mingl/CP-310088 branch from d6a4b85 to 08d29ff Compare December 15, 2025 08:12

robhoes reviewed Dec 15, 2025


fixup! [doc] Improvements on management of trusted certificates

5a5e2a2

Signed-off-by: Ming Lu <ming.lu@cloud.com>

minglumlu force-pushed the private/mingl/CP-310088 branch from 8f1b299 to 5a5e2a2 Compare December 16, 2025 05:44

psafont reviewed Dec 16, 2025


minglumlu force-pushed the private/mingl/CP-310088 branch from a14d9ff to 4f51e45 Compare December 17, 2025 07:55

fixup! fixup! [doc] Improvements on management of trusted certificates

cedf310

Signed-off-by: Ming Lu <ming.lu@cloud.com>

minglumlu force-pushed the private/mingl/CP-310088 branch from 4f51e45 to cedf310 Compare December 17, 2025 08:19

minglumlu requested review from psafont and robhoes December 19, 2025 02:47
