Added documentation for how resource limits work and how they are configured (#129)

ArneTR · web-flow · commit 4f2a99e590b9 · 2025-12-15T16:58:49.000+01:00
diff --git a/content/en/docs/cluster/host-resource-reservations.md b/content/en/docs/cluster/host-resource-reservations.md
@@ -0,0 +1,52 @@
+---
+title: "Host Resource Reservations"
+description: "Reserving CPU and memory so that GMT has sufficient compute power to orchestrate measurements"
+date: 2025-12-15T16:20:15+10:00
+weight: 1005
+toc: false
+---
+
+GMT runs on the host system and orchestrates containers. To keep this process running smoothly,
+GMT reserves a share of the available CPUs and the available memory for the host system.
+
+This ensures that:
+
+- The host does not run out of memory (OOM), which would cause the measurement to fail
+- The host does not run into CPU scarcity, so metric providers can reliably capture all metrics
+
+The settings are configured *per machine* in the `config.yml`:
+
+```yml
+# config.yml
+machine:
+  ...
+  host_reserved_cpus: 1
+  host_reserved_memory: 1073741824 # 1 GiB
+
+```
+
+#### Specifications
+
+- `host_reserved_cpus` **[integer]** (Default 1): Value between 1 and CPU_MAX, where CPU_MAX is the number of compute threads available on the system.
+- `host_reserved_memory` **[integer]** (Default 0): Value in bytes between 0 and MEMORY_MAX, where MEMORY_MAX is the amount of physical memory available on the system.
+
+## How to choose good values
+
+### CPU
+
+GMT always reserves at least one core, so setting `host_reserved_cpus` to a value below 1 will fail. Typically GMT does not need more than one core, so you should only raise this value if your system has SMT / Hyper-Threading enabled or your host runs a lot of other operations that should not be disturbed by the orchestrated containers.
+
+For the former, you should reserve as many logical cores as SMT presents per physical core. Typically SMT turns one physical core into two hyper-cores, so you should set the reservation to two cores.
+
+The latter, however, indicates that the machine is a *noisy* measurement machine, and we currently do not see any valid use case for such a measurement machine ... write us an email if you have one :)
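+
+As a minimal sketch, assuming a host with SMT enabled where each physical core presents two threads, the reservation in `config.yml` could look like this:
+
+```yml
+# config.yml (illustrative values)
+machine:
+  host_reserved_cpus: 2 # both threads of one physical core
+```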
+
+### Memory
+
+The default value for the memory reservation in development is 0. This means memory can be overcommitted. In development this is fine, as OOM situations can be resolved manually, and it makes development quicker because containers have more memory to work with.
+
+In a cluster setup the value should be derived as follows:
+
+- Start the GMT `client.py`
+- Wait 30 seconds until the Python process has reserved all the memory it needs
+- Check `free -m` and read the current *used* memory
+- Add 300 MB and round up to the nearest half GiB (e.g. 2.1 GiB used memory plus 300 MB rounds up to 2.5 GiB)
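+
+As an illustration of the steps above, assuming `free -m` reported about 2.1 GiB of *used* memory, the rounded result of 2.5 GiB would be entered in bytes:
+
+```yml
+# config.yml (illustrative values)
+machine:
+  host_reserved_memory: 2684354560 # 2.5 GiB = 2.5 * 1024^3 bytes
+```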
diff --git a/content/en/docs/installation/minimum-system-requirements.md b/content/en/docs/installation/minimum-system-requirements.md
@@ -11,6 +11,8 @@ toc: true
 At least an SSE2-compatible processor is required;
 For macOS a 64bit-compatible Intel processor (Core2Duo or newer) or an M1 ARM or newer is required.
 
+The CPU must have at least **two physical threads** available (with SMT / Hyper-Threading enabled it should be four threads).
+
 ### Memory
 
 - 1 GB
diff --git a/content/en/docs/measuring/configuration.md b/content/en/docs/measuring/configuration.md
@@ -109,9 +109,13 @@ For the rest please see [installation →]({{< relref "/docs/cluster/installatio
 
 ### machine
 
-If you run locally nothing needs to be configured here. But if you run a *cluster* you must set the base temperature values for the accuracy control to work
+If you run locally nothing needs to be configured here.
 
-Please see [cluster installation →]({{< relref "/docs/cluster/installation" >}}) and [accuracy control →]({{< relref "/docs/cluster/accuracy-control" >}})
+But if you run a *cluster*, you must set the base temperature values for the accuracy control to work and configure the host reservations for CPU and memory.
+
+Please see [cluster installation →]({{< relref "/docs/cluster/installation" >}}), [accuracy control →]({{< relref "/docs/cluster/accuracy-control" >}}) and [host resource reservations]({{< relref "/docs/cluster/host-resource-reservations" >}}).
+
+Also see [Resource Limits]({{< relref "/docs/measuring/resource-limits" >}}) to better understand how GMT enforces resource limits on its orchestrated containers.
 
 ### measurement
 
diff --git a/content/en/docs/measuring/resource-limits.md b/content/en/docs/measuring/resource-limits.md
@@ -0,0 +1,45 @@
+---
+title: "Resource Limits"
+description: ""
+date: 2025-12-15T16:48:45+10:00
+weight: 426
+---
+
+Resource limits are an essential part of any container deployed in production.
+
+GMT enables **and** enforces resource limits by:
+
+- Using `Compose Specification` settings on containers like `mem_limit` and `cpus`
+- Auto-assigning values to these two settings in case they are not set
+
+## Understanding Auto-Assignment
+
+When no resource limits are set, GMT determines how many CPUs and how much memory are available on the host system.
+
+### CPUs
+
+GMT will always reserve one core so that measurement processing and the metric providers run smoothly. This means you cannot run GMT on a system with fewer than two cores.
+
+After reserving one core, GMT will assign the remaining cores *in full* to all other containers. This means that CPUs are not dedicated per container but are available to all containers in parallel, and scheduling per CPU is done by the native OS scheduler.
+
+If your machine has SMT / Hyper-Threading enabled, or you feel you need to reserve more than one core for GMT, check out [host resource reservations]({{< relref "/docs/cluster/host-resource-reservations" >}}).
+
+### Memory
+
+By default GMT will not reserve any memory for the host. This is fine for development but can lead to OOM situations in unattended modes like the [cluster mode]({{< relref "/docs/cluster" >}}).
+
+If you want to change that, check out [host resource reservations]({{< relref "/docs/cluster/host-resource-reservations" >}}).
+
+Whether a value is set or left at 0, GMT will calculate the total memory on the host system and deduct the configured memory reservation.
+
+Then all containers that have a manual `mem_limit` set will get their memory assigned.
+
+The remaining memory is then split evenly among the remaining containers in the `usage_scenario.yml`.
+
+If the memory gets exhausted during this process because too much was requested manually through `mem_limit`, GMT will fail with an error.
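+
+As a worked example of the steps above, assume a hypothetical host with 8 GiB of memory and `host_reserved_memory` set to 1 GiB, which leaves 7 GiB for containers. The service names and images below are made up for illustration, and `mem_limit` is assumed to accept a plain byte value as in the Compose Specification:
+
+```yml
+# usage_scenario.yml (hypothetical services)
+services:
+  database:
+    image: postgres
+    mem_limit: 3221225472 # manual limit of 3 GiB, assigned first
+  web:
+    image: nginx
+    # no mem_limit: auto-assigned (7 GiB - 3 GiB) / 2 = 2 GiB
+  worker:
+    image: alpine
+    # no mem_limit: auto-assigned (7 GiB - 3 GiB) / 2 = 2 GiB
+```
+
+In this sketch a manual `mem_limit` above 7 GiB would exhaust the assignable memory, which is the case where GMT errors.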
+
+### Validating
+
+You can see the auto-applied values in the *Containers* tab in the Dashboard.
+
+<center><img style="width: 600px;" src="/img/dashboard-containers-tab.webp" alt="Dashboard Container Tab for GMT Measurements"></center>
diff --git a/static/img/dashboard-containers-tab.webp b/static/img/dashboard-containers-tab.webp