Merged
6 changes: 5 additions & 1 deletion CHANGELOG.md
@@ -5,6 +5,7 @@ CHANGELOG
------

**CHANGES**
- Improve cluster update resiliency by ensuring clustermgtd is started after updates complete successfully, or after failed updates where queue reconfiguration succeeded.
- Add chef attribute `cluster/in_place_update_on_fleet_enabled` to disable in-place updates on compute and login nodes
and mitigate performance impact at scale.
- Upgrade Slurm to version 24.11.7 (from 24.11.6).
@@ -24,10 +25,13 @@ CHANGELOG
- Open MPI: openmpi40-aws-4.1.7-2 and openmpi50-aws-5.0.8-11

**BUG FIXES**
- Fix an issue where cfn-hup enters an endless loop on the head node after a rollback to a cluster state older than 24 hours, caused by cfn-signal failing to signal an expired wait condition handle.
- Fix race condition where compute nodes could deploy the wrong cluster config version after an update failure.
- Prevent cluster readiness check failures due to instances launched while the check is in progress.
- Fix incorrect timestamp parsing for chef-client.log in CloudWatch Agent configuration.
- Reduce EFA installation time for Ubuntu by ~20 minutes by only holding kernel packages for the installed kernel.
- Add GetFunction and GetPolicy permissions to PClusterBuildImageCleanupRole to prevent AccessDenied errors during build image stack deletion.
- Fix validation error messages when `DevSettings` is null or `DevSettings/InstanceTypesData` is missing required fields.
- Fix an issue where cfn-hup enters an endless loop on the head node after a rollback to a cluster state older than 24 hours, caused by cfn-signal failing to signal an expired wait condition handle.

3.14.0
------