388 changes: 388 additions & 0 deletions src/coding-guidelines/types-and-traits/gui_UnionPartialInit
@@ -0,0 +1,388 @@
.. SPDX-License-Identifier: MIT OR Apache-2.0
SPDX-FileCopyrightText: The Coding Guidelines Subcommittee Contributors

.. default-domain:: coding-guidelines

.. guideline:: Do not read from union fields that may contain uninitialized bytes
:id: gui_UnionPartialInit
:category: required
:status: draft
:release: 1.85.0
:decidability: undecidable
:scope: expression
:tags: unions, initialization, undefined-behavior

Do not read from a union field unless all bytes of that field have been explicitly
initialized. Partial initialization of a union's composite field leaves some bytes
in an uninitialized state, and reading those bytes is undefined behavior.

When working with unions:

* Initialize all bytes of a field before reading from it
* Do not assume that initializing one variant preserves the initialized state of another
* Do not rely on prior initialization of a union before reassignment
* Use ``MaybeUninit`` with proper initialization patterns rather than custom unions for
managing uninitialized memory
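
As a minimal sketch of the last point, the following assumes a hypothetical helper
``init_bytes`` (not part of this guideline) that fills a ``[MaybeUninit<u8>; 4]`` one
element at a time with ``MaybeUninit::write`` and calls ``assume_init`` only after
every element has been written:

```rust
use std::mem::MaybeUninit;

// Hypothetical helper, for illustration only.
fn init_bytes() -> [u8; 4] {
    // Every slot starts out uninitialized.
    let mut buf: [MaybeUninit<u8>; 4] = [MaybeUninit::uninit(); 4];

    for (i, slot) in buf.iter_mut().enumerate() {
        // 'write' initializes the slot without reading its previous contents.
        slot.write(0xA0 + i as u8);
    }

    // SAFETY: the loop above wrote every element of 'buf'.
    buf.map(|slot| unsafe { slot.assume_init() })
}
```

Because the element type is ``MaybeUninit<u8>``, the type system records that a slot may
be uninitialized until ``assume_init`` is called, which a custom union cannot express.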

You can read a union field other than the one most recently written, provided that:

- All bytes backing the accessed field are initialized.
- Interpreting those bytes complies with the validity requirements of the accessed type
(e.g., no invalid enum discriminants, no invalid pointer values, etc.) [UCG-VALIDITY]_.

For example, reading a ``u32`` field whose bytes were fully written through a ``[u8; 4]`` field is allowed;
reading a ``bool`` field is disallowed unless the underlying byte is ``0`` or ``1``, because not all bit patterns are valid for ``bool``.
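
One way to respect the validity requirement is to inspect the bytes through a field
whose type accepts every bit pattern before reading the stricter field. The sketch
below is illustrative only: the union ``Flag`` and the helper ``checked_bool`` are
hypothetical names, and the helper assumes the byte was initialized by whoever
constructed the union.

```rust
union Flag {
    b: bool,
    x: u8,
}

// Hypothetical helper: returns the bool only if the stored byte is a valid bool.
fn checked_bool(u: &Flag) -> Option<bool> {
    // Assumption: the single byte of 'Flag' is initialized.
    // Every initialized byte is a valid 'u8', so this read is allowed.
    let raw = unsafe { u.x };

    match raw {
        // SAFETY: 0 and 1 are the only valid bit patterns for 'bool'.
        0 | 1 => Some(unsafe { u.b }),
        // Any other value would be an invalid 'bool'; do not read 'b'.
        _ => None,
    }
}
```

With this helper, ``Flag { x: 255 }`` yields ``None`` instead of producing an invalid ``bool``.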

.. rationale::
:id: rat_UnionPartialInitReason
:status: draft

Unions in Rust allow multiple fields to share the same memory. When a union field
is a composite type (tuple, struct, array), writing to only some components leaves
the remaining bytes in an indeterminate state. Reading these uninitialized bytes
is undefined behavior [RUST-REF-UB]_.

This issue is particularly insidious because:

* **Silent data corruption**: The program may appear to work, reading stale or
garbage values that happen to be "reasonable" in testing.

* **Optimization interactions**: The compiler may merge, inline, or deduplicate
functions in ways that change which code paths execute. A function that fully
initializes a union may be merged with one that partially initializes it,
causing UB to appear in previously-safe code paths [LLVM-MERGE]_.

* **Function pointer comparisons**: Relying on function pointer equality to
select code paths is unreliable (see gui_FnPtrEquality). Combined with partial
initialization, this can lead to UB being introduced through seemingly unrelated
optimizations.

* **Reassignment resets initialization**: Assigning a new value to a union
(e.g., ``*u = MyUnion { uninit: () }``) does not preserve the initialized
state of other fields. All fields must be considered uninitialized after
such an assignment.
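
The last point can be sketched as follows, assuming a hypothetical union ``Pair`` and
helper ``reset_then_set_first``. If a reset is required, reassigning through the
composite field initializes every byte, so a later update of a single component
leaves the whole field readable:

```rust
union Pair {
    uninit: (),
    init: (u8, u8),
}

// Hypothetical helper, for illustration only.
fn reset_then_set_first(p: &mut Pair, first: u8) -> (u8, u8) {
    // Reassign through 'init' so that both bytes are initialized.
    *p = Pair { init: (0, 0) };

    unsafe {
        // Updating one component keeps the other component initialized.
        p.init.0 = first;
        // Both bytes were written above, so reading the whole field is allowed.
        p.init
    }
}
```

Had the reset used ``Pair { uninit: () }`` instead, the second component would be
uninitialized and the final read would be undefined behavior.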

The Rust memory model requires that all bytes be initialized before a typed
read occurs. There is no exception for "partial" reads of composite types:
the entire field must be valid.

Unions do resemble C unions in one respect:
any union field may be read, even if a different field was the one most recently written [RUST-REF-UNION]_.
The bytes read must, however, be initialized and form a valid representation for the field's type,
which is not guaranteed if the union contains arbitrary data.

.. non_compliant_example::
:id: non_compl_ex_PartialInit1
:status: draft

This noncompliant example partially initializes a tuple field, leaving the second element uninitialized.

.. code-block:: rust

union MyMaybeUninit {
    uninit: (),
    init: (u8, u8),
}

fn write_first(a: &mut MyMaybeUninit) {
    *a = MyMaybeUninit { uninit: () };
    unsafe { a.init.0 = 1; } // Only initializes the first byte
}

fn main() {
    let mut a = MyMaybeUninit { init: (0, 0) };
    write_first(&mut a);

    // Undefined behavior reading uninitialized value
    println!("{}", unsafe { a.init.1 }); // noncompliant
}

.. non_compliant_example::
:id: non_compl_ex_PartialInit2
:status: draft

This noncompliant example assumes prior initialization is preserved after reassignment.

.. code-block:: rust

union Data {
    uninit: (),
    raw: [u8; 4],
    value: u32,
}

fn partial_update(d: &mut Data) {
    // Reassignment invalidates all prior initialization
    *d = Data { uninit: () };

    // Only update the first two bytes
    unsafe {
        d.raw[0] = 0xAB;
        d.raw[1] = 0xCD;
    }
}

fn main() {
    let mut d = Data { value: 0xFFFFFFFF };
    partial_update(&mut d);

    // 'raw[2]' and 'raw[3]' are uninitialized after reassignment
    println!("{:?}", unsafe { d.raw }); // noncompliant
}

.. non_compliant_example::
:id: non_compl_ex_PartialInit3
:status: draft

This noncompliant example combines function pointer comparison with partial initialization,
creating subtle undefined behavior that may only manifest after optimization.

.. code-block:: rust

union MyMaybeUninit {
    uninit: (),
    init: (u8, u8),
}

#[no_mangle]
fn write_first(a: &mut MyMaybeUninit) {
    *a = MyMaybeUninit { uninit: () };
    unsafe { a.init.0 = 1; }
}

#[no_mangle]
fn write_both(a: &mut MyMaybeUninit) {
    *a = MyMaybeUninit { uninit: () };
    unsafe {
        a.init.0 = 1;
        a.init.1 = 2;
    }
}

fn main() {
    let mut a = MyMaybeUninit { init: (0, 0) };

    // Noncompliant: function pointer comparison is unreliable,
    // and 'write_first' leaves 'a.init.1' uninitialized
    if write_first as usize == write_both as usize {
        write_first(&mut a);
    }

    // UB if the branch was taken (functions may be merged by optimizer)
    println!("{}", unsafe { a.init.1 }); // noncompliant
}

.. compliant_example::
:id: compl_ex_FullInit1
:status: draft

This compliant example initializes all bytes of the field before reading.

.. code-block:: rust

union MyMaybeUninit {
    uninit: (),
    init: (u8, u8),
}

fn write_both(a: &mut MyMaybeUninit) {
    *a = MyMaybeUninit { uninit: () };
    unsafe {
        a.init.0 = 1;
        a.init.1 = 2; // Initialize all bytes
    }
}

fn main() {
    let mut a = MyMaybeUninit { init: (0, 0) };
    write_both(&mut a);

    // Both bytes are initialized
    println!("{}", unsafe { a.init.1 }); // compliant
}

.. compliant_example::
:id: compl_ex_FullInit2
:status: draft

This compliant example uses ``MaybeUninit`` with proper initialization patterns.

.. code-block:: rust

use std::mem::MaybeUninit;

fn init_tuple() -> (u8, u8) {
    let mut data: MaybeUninit<(u8, u8)> = MaybeUninit::uninit();

    unsafe {
        let ptr = data.as_mut_ptr();
        (*ptr).0 = 1;
        (*ptr).1 = 2; // Initialize all fields
        // data is fully initialized before call to 'assume_init'
        data.assume_init()
    }
}

fn main() {
    let result = init_tuple();
    println!("{}, {}", result.0, result.1); // compliant
}

.. compliant_example::
:id: compl_ex_FullInit3
:status: draft

This compliant example initializes through the composite field directly.

.. code-block:: rust

union Data {
    raw: [u8; 4],
    value: u32,
}

fn full_init(d: &mut Data) {
    // Initialize entire field at once
    *d = Data { raw: [0xAB, 0xCD, 0xEF, 0x12] };
}

fn main() {
    let mut d = Data { value: 0 };
    full_init(&mut d);

    // All bytes in 'd' are initialized
    println!("{:?}", unsafe { d.raw }); // compliant
}

.. compliant_example::
:id: compl_ex_FullInit4
:status: draft

This compliant example avoids relying on function pointer comparisons and tracks the initialization level explicitly.

.. code-block:: rust

union MyMaybeUninit {
    uninit: (),
    init: (u8, u8),
}

enum InitLevel {
    Partial,
    Full,
}

fn write_first(a: &mut MyMaybeUninit) {
    *a = MyMaybeUninit { uninit: () };
    unsafe { a.init.0 = 1; }
}

fn write_both(a: &mut MyMaybeUninit) {
    *a = MyMaybeUninit { uninit: () };
    unsafe {
        a.init.0 = 1;
        a.init.1 = 2;
    }
}

fn main() {
    let mut a = MyMaybeUninit { init: (0, 0) };
    let level = InitLevel::Full; // Explicit tracking, not pointer comparison

    match level {
        InitLevel::Full => {
            write_both(&mut a);
            // Compliant: safe to read both fields
            println!("{}", unsafe { a.init.1 });
        }
        InitLevel::Partial => {
            write_first(&mut a);
            // Only read the initialized field
            println!("{}", unsafe { a.init.0 });
        }
    }
}

.. compliant_example::
:id: compl_ex_Ke869nSXuShU
:status: draft

Types such as ``u8``, ``u16``, ``u32``, and ``i128`` allow all possible bit patterns.
Provided the memory is initialized, there is no undefined behavior.

.. rust-example::

union U {
    n: u32,
    bytes: [u8; 4],
}

# fn main() {
let u = U { bytes: [0xFF, 0xEE, 0xDD, 0xCC] };
let n = unsafe { u.n }; // OK: all bit patterns are valid for u32
# }

.. compliant_example::
:id: compl_ex_Ke869nSXuShT
:status: draft

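This compliant example writes the ``u32`` field and then reads the same bytes through the ``f32`` field.
All four bytes are initialized, and every bit pattern is a valid ``f32``: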

.. rust-example::

union U {
    x: u32,
    y: f32,
}

# fn main() {
let u = U { x: 123 }; // write to one field
let f = unsafe { u.y }; // reading the other field is allowed
# }

.. non_compliant_example::
:id: non_compl_ex_Qb5GqYTP6db3
:status: draft

Even though unions allow reads of any field, not all bit patterns are valid for a ``bool``.
Unions do not relax type validity requirements:
the read itself is allowed, but the bytes read must still form a valid ``bool``.

.. rust-example::

union U {
    b: bool,
    x: u8,
}

# fn main() {
let u = U { x: 255 }; // 255 is not a valid bool representation
let b = unsafe { u.b }; // UB: invalid bool
# }

.. bibliography::
:id: bib_UnionFieldValidity
:status: draft

.. list-table::
:header-rows: 0
:widths: auto
:class: bibliography-table

* - .. [RUST-REF-UB]
- The Rust Project Developers. "Behavior Considered Undefined." *The Rust
Reference*, n.d.
https://doc.rust-lang.org/reference/behavior-considered-undefined.html.

* - .. [RUST-REF-UNION]
- The Rust Project Developers. "Unions." *The Rust Reference*, n.d.
https://doc.rust-lang.org/reference/items/unions.html.

* - .. [LLVM-MERGE]
- LLVM Project. "MergeFunctions Pass." *LLVM Documentation*, n.d.
https://llvm.org/docs/MergeFunctions.html.

* - .. [UCG-VALIDITY]
- Rust Unsafe Code Guidelines Working Group. "Validity and Safety
Invariant." *Rust Unsafe Code Guidelines*, n.d.
https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#validity-and-safety-invariant.
1 change: 1 addition & 0 deletions src/coding-guidelines/types-and-traits/index.rst
@@ -7,3 +7,4 @@ Types and Traits
================

.. include:: gui_xztNdXA2oFNC.rst.inc
.. include:: gui_UnionPartialInit