@cgwalters
Collaborator

@cgwalters cgwalters commented Dec 17, 2025

See individual commits.

This is mainly with an eye to reworking our sealing, which requires a "scratch" build. Once we do that, we might as well also consolidate our integration test image and base image.

This was developed iteratively, and I did at least do a build test of each individual stage.

@github-actions github-actions bot added the area/documentation Updates to the documentation label Dec 17, 2025
@bootc-bot bootc-bot bot requested a review from ckyrouac December 17, 2025 20:48
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a significant and beneficial refactoring of the build system. Key changes include moving to a 'from scratch' base image build, consolidating the integration test image into the main build flow, and streamlining the process by building packages separately before injecting them into the final image. These changes greatly improve the structure and maintainability of the build process. My review includes a couple of suggestions for further cleanup, such as removing a now-redundant Containerfile and fixing a minor style issue in a shell script.

@jeckersb
Collaborator

I won't have time to test this, but it passes the eyeball test. I'm especially grateful that you broke it up into individual commits so I can more easily follow your train of thought 👏

@henrywang
Collaborator

Our bot updated the bcvk version in PR #1867.

@cgwalters cgwalters force-pushed the buildsys-more branch 2 times, most recently from ae9169f to f95e7fc December 18, 2025 15:22
@cgwalters cgwalters marked this pull request as draft December 18, 2025 15:36
@github-actions github-actions bot added the area/ostree Issues related to ostree label Dec 18, 2025
@cgwalters cgwalters marked this pull request as ready for review December 18, 2025 16:38
@cgwalters
Collaborator Author

OK, fixed some more things. I ended up adding an env var to shortcut things and skip package builds; it was just too fragile without that. The main thing I overlooked is that the unit/integration test depchains ended up pulling in package builds too.

@cgwalters
Collaborator Author

cgwalters commented Dec 18, 2025

Hum, not sure yet why most but not all of the cfs tests failed.

And now I see this blocks on #1225

Details

# PR #1864 CI Failure Investigation - Composefs Boot Failures

Summary

PR #1864 ("build-sys: Many changes") introduces a "from scratch" image build approach
using bootc-base-imagectl build-rootfs --manifest=standard. This causes composefs
sealed UKI boot failures on certain distros.

Test Results Pattern

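| Distro    | ostree                                                | composefs-sealeduki-sdboot          |
|-----------|-------------------------------------------------------|-------------------------------------|
| Fedora 42 | PARTIAL (upgrade tests fail with ostree.final-diffid) | PASS                                |
| Fedora 43 | PARTIAL                                               | FAIL - SSH timeout, VM doesn't boot |
| Fedora 44 | PARTIAL                                               | FAIL - SSH timeout, VM doesn't boot |
| CentOS 9  | PARTIAL                                               | (excluded from matrix)              |
| CentOS 10 | PARTIAL                                               | FAIL - SSH timeout, VM doesn't boot |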

Key observation: Fedora 42 composefs works, but Fedora 43/44 and CentOS 10 don't.

Main Branch Status

On main branch (run 20343724695), ALL composefs and ostree tests PASS.
This confirms the issue is introduced by PR #1864.

Key Changes in PR #1864

  1. "From scratch" image build - Uses bootc-base-imagectl build-rootfs --manifest=standard
    instead of directly layering on top of the base image. This is to work around
    containers/composefs-rs#132 ("Incorrect mtime when generating EROFS from OCI image").

  2. Cloud-init always enabled - Removed the conditional cloudinit argument from
    provision-derived.sh. Cloud-init is now always installed AND enabled via:

    ln -s ../cloud-init.target /usr/lib/systemd/system/default.target.wants

  3. Dockerfile consolidation - Removed separate hack/Containerfile for integration
    tests, consolidated into main Dockerfile.

Composefs Sealed UKI Build Process

The sealed composefs image is built via Dockerfile.cfsuki:

  1. Compute composefs digest: hack/compute-composefs-digest $input_image
  2. Build UKI with hardcoded cmdline:
    cmdline="composefs=${COMPOSEFS_FSVERITY} console=ttyS0,115200n8 console=hvc0 enforcing=0 rw"
    
  3. Sign UKI with test secureboot keys
  4. Place UKI in /boot/EFI/Linux/

Note: The cmdline is hardcoded and does NOT read from /usr/lib/bootc/kargs.d/.
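
As a rough illustration of step 2 and the note above, the command line could be assembled like this. This is only a sketch: the function name `build_uki_cmdline` is hypothetical, the digest is a placeholder, and only the argument string itself is taken from the text above.

```sh
# Hypothetical helper: assemble the UKI kernel command line from a
# composefs fsverity digest. The argument list is copied from the report;
# nothing is read from /usr/lib/bootc/kargs.d/.
build_uki_cmdline() {
    printf 'composefs=%s console=ttyS0,115200n8 console=hvc0 enforcing=0 rw\n' "$1"
}

# Example with a placeholder digest (not a real fsverity value):
build_uki_cmdline "0123abcd"
```

With this shape, any kernel argument a newer distro needs would have to be added to the string by hand, which is the concern listed under "Missing kargs".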

Failure Details

For failing distros (Fedora 43/44, CentOS 10):

  • VM is created successfully by bcvk
  • Installation completes ("Installation completed successfully!")
  • VM starts but never becomes SSH-accessible
  • SSH timeout after 60 attempts (~3 minutes of waiting)
  • No kernel panic or boot error visible in logs
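
The SSH wait described above amounts to a bounded retry loop. A minimal sketch, assuming a fixed 3-second pause between attempts (inferred from 60 attempts in roughly 3 minutes; bcvk's actual interval and implementation are not shown in this report). The `retry` function and the host name in the comment are hypothetical.

```sh
# Hypothetical helper: run a command up to N times, sleeping between
# failed attempts. Returns 0 on the first success, 1 if every attempt fails.
retry() {
    retry_attempts=$1
    retry_delay=$2
    shift 2
    retry_i=1
    while [ "$retry_i" -le "$retry_attempts" ]; do
        if "$@"; then
            return 0
        fi
        retry_i=$((retry_i + 1))
        sleep "$retry_delay"
    done
    return 1
}

# Intended use (not executed here; "test-vm" is a placeholder host):
#   retry 60 3 ssh -o ConnectTimeout=2 test-vm true

# Harmless demonstration with commands that always succeed or fail:
retry 3 0 true && echo "reachable"
retry 2 0 false || echo "gave up after 2 attempts"
```

A loop like this only reports that the guest never answered; it cannot tell a hung boot apart from a missing network or SSH key, which is why console output is listed under the next steps.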

Potential Causes to Investigate

  1. Initramfs differences - The "from scratch" build may produce different initramfs
    content that doesn't work correctly with composefs sealed boot on newer distros.

  2. Cloud-init interaction - Cloud-init being enabled might interfere with bcvk's
    SSH key injection, though Fedora 42 works with the same cloud-init setup.

  3. Kernel/dracut version differences - Fedora 43/44 and CentOS 10 may have different
    kernel or dracut versions that behave differently with the sealed composefs boot.

  4. Missing kargs - The hardcoded cmdline in Dockerfile.cfsuki may be missing
    required kernel arguments for newer distros.

  5. Network configuration - The "from scratch" build may be missing network
    configuration that the base images normally provide.

Cloud-init Version Differences

  • Fedora 42: cloud-init-24.2-5.fc42 - services: cloud-init.service
  • Fedora 43: cloud-init-25.2-7.fc43 - services: cloud-init-main.service (renamed!)
  • CentOS 10: cloud-init-24.4-6.el10 - services: cloud-init.service

The service rename in cloud-init 25 (Fedora 43) may cause issues if something
expects the old service name, but CentOS 10 uses cloud-init 24.x and still fails.

Files to Examine

  • Dockerfile - New from-scratch build logic
  • Dockerfile.cfsuki - Sealed composefs UKI build
  • hack/build-sealed - Script that builds sealed image
  • hack/provision-derived.sh - Provisioning script (cloud-init changes)
  • hack/compute-composefs-digest - Computes composefs fsverity digest

Next Steps

  1. Check if main branch composefs tests use the same bcvk/test infrastructure
  2. Compare initramfs contents between working (F42) and failing (F43/C10) images
  3. Try to get console output from failing VMs to see where boot hangs
  4. Consider reverting the cloud-init enabling change to test if that's the cause
  5. Test if the "from scratch" build works correctly on failing distros for non-sealed boot
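
For step 2, one simple way to compare two images is to diff sorted file listings. The sketch below assumes each listing has already been reduced to one path per line (for example from `lsinitrd` output; that extraction is not shown). The function name `compare_listings` is hypothetical.

```sh
# Hypothetical helper: print paths that appear in only one of two listings.
# Lines unique to the first file are unindented; lines unique to the
# second file are prefixed with a tab (standard `comm -3` output).
compare_listings() {
    cl_a=$(mktemp)
    cl_b=$(mktemp)
    sort -u "$1" > "$cl_a"
    sort -u "$2" > "$cl_b"
    comm -3 "$cl_a" "$cl_b"
    rm -f "$cl_a" "$cl_b"
}
```

Running it as `compare_listings f42.txt f43.txt` (placeholder file names) would show modules or binaries present in the working initramfs but missing from the failing one, and vice versa.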

@cgwalters
Collaborator Author

OK, and the next problem is that the "from scratch" build is really painful for composefs installs due to containers/composefs-rs#62.

We were previously trying to support a direct `podman/docker build`
*and* injecting externally built packages (for CI).

Reworking this for sealed images showed that approach was too hacky; let's
just accept that a raw `podman build` no longer works. The canonical
entry point for local builds is `just build`, which builds both a package
and a container.

This way CI and local work exactly the same.

Signed-off-by: Colin Walters <walters@verbum.org>

Oops.

Signed-off-by: Colin Walters <walters@verbum.org>

Signed-off-by: Colin Walters <walters@verbum.org>

This changes things so we always run through https://docs.fedoraproject.org/en-US/bootc/building-from-scratch/
in our default builds, which helps work around containers/composefs-rs#132

But it will also help clean up our image building in general
a bit.

Signed-off-by: Colin Walters <walters@verbum.org>

Move all content from the derived test image (hack/Containerfile) into
the main Dockerfile base image. This includes nushell, cloud-init, and
the other testing packages from packages.txt.

This simplifies the build by avoiding the need to juggle multiple images
during testing workflows - the base image now contains everything needed.

Assisted-by: OpenCode (Claude Sonnet 4)
Signed-off-by: Colin Walters <walters@verbum.org>

The previous commit consolidated test content (nushell, cloud-init, etc.)
into the base image. This completes that work by removing the separate
`build-integration-test-image` target and updating all references.

Now `just build` produces the complete test-ready image directly,
simplifying the build pipeline and eliminating the intermediate
`localhost/bootc-integration` image.

Signed-off-by: Colin Walters <walters@verbum.org>

Removing localhost/bootc-pkg at the end of the package target
also deletes the build stage layers, causing subsequent builds
to miss the cache and rebuild the RPMs from scratch.

Keep the image around; use `just clean-local-images` to reclaim space.

Signed-off-by: Colin Walters <walters@verbum.org>

Ensure all RUN instructions after the "external dependency cutoff point"
marker include `--network=none` right after `RUN`. This enforces that
external dependencies are clearly delineated in the early stages of the
Dockerfile.

The check is part of `cargo xtask check-buildsys` and includes unit tests.
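
The rule described above can be approximated in a few lines of shell and awk. This is not the code in `cargo xtask check-buildsys`; it is a sketch that assumes the marker appears as the literal text "external dependency cutoff point" somewhere on a line, and it does not handle `RUN` instructions continued across lines. The function name `check_network_none` and the sample Dockerfile content are hypothetical.

```sh
# Hypothetical approximation of the check: after the marker line, every
# line starting with RUN must have --network=none as its first argument.
# Prints offending lines and returns non-zero if any are found.
check_network_none() {
    awk '
        BEGIN { after = 0; bad = 0 }
        /external dependency cutoff point/ { after = 1; next }
        after && /^RUN[ \t]/ && $0 !~ /^RUN[ \t]+--network=none([ \t]|$)/ {
            printf "line %d: RUN without --network=none: %s\n", NR, $0
            bad = 1
        }
        END { exit bad }
    ' "$1"
}

# Demonstration on a throwaway file (contents are illustrative only):
demo=$(mktemp)
cat > "$demo" <<'EOF'
FROM base
RUN dnf -y install foo
# external dependency cutoff point
RUN --network=none make install
RUN make test
EOF
check_network_none "$demo" || echo "check failed"
rm -f "$demo"
```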

Assisted-by: OpenCode (Sonnet 4)
Signed-off-by: Colin Walters <walters@verbum.org>

Now that we're building a from-scratch image it won't have `/ostree`
in it; this line was always pruning the wrong repo.

Signed-off-by: Colin Walters <walters@verbum.org>

Remove the separate build-from-packages and _build-from-package helper
recipes. The build logic is now inlined directly in the build recipe.

Add BOOTC_SKIP_PACKAGE=1 environment variable support to skip the
package build step when packages are provided externally (e.g. from
CI artifacts). This is used in ci.yml for the test-integration job.
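
In shell terms, the skip behaves roughly like the sketch below. The actual logic lives in the `build` recipe and is not reproduced here; the function name `build_image` is hypothetical and the `echo` lines stand in for the real package and container build steps.

```sh
# Hypothetical wrapper showing the effect of BOOTC_SKIP_PACKAGE=1:
# the package step is skipped, the container step always runs.
build_image() {
    if [ "${BOOTC_SKIP_PACKAGE:-}" = "1" ]; then
        echo "package: skipped (using externally provided packages)"
    else
        echo "package: building"
    fi
    echo "container: building"
}

# As in the CI case described above, with packages supplied externally:
BOOTC_SKIP_PACKAGE=1
build_image
```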

Assisted-by: OpenCode (Sonnet 4)
Signed-off-by: Colin Walters <walters@verbum.org>
@cgwalters cgwalters marked this pull request as draft December 18, 2025 22:35
@cgwalters
Collaborator Author

OK, I think now we're just down to the failures in those selinux-policy tests. Not sure what's up with that.

Labels

area/documentation: Updates to the documentation
area/ostree: Issues related to ostree

5 participants