Script worker #50

erthalion · 2025-07-22T13:55:58Z

Script worker allows much more flexible definition of load structure. Few examples:

# short version, only one scenario run
$ strace -f -o /tmp/log berserker -f workloads/example.short.ber
[2025-07-22T13:56:42Z INFO  berserker] Config: WorkloadConfig { restart_interval: 10, per_core: true, workers: 1, workload: Processes { arrival_rate: 10.0, departure_rate: 200.0, random_process: true }, duration: 0 }
; ModuleID = 'main'
source_filename = "main"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"

@name = private unnamed_addr constant [5 x i8] c"stub\00", align 1
@const = private unnamed_addr constant [14 x i8] c"run task stub\00", align 1
@const.1 = private unnamed_addr constant [20 x i8] c"open file /tmp/test\00", align 1
@const.2 = private unnamed_addr constant [10 x i8] c"/tmp/test\00", align 1

declare i64 @task(i64 %0)

declare i64 @debug(i64 %0)

declare i64 @open(i64 %0)

define i64 @main() {
entry:
  %task = call i64 @debug(ptr @const)
  %task1 = call i64 @task(ptr @name)
  %task2 = call i64 @debug(ptr @const.1)
  %task3 = call i64 @open(ptr @const.2)
  ret i64 0
}
[2025-07-22T13:56:42Z INFO  berserker] Child 201315
[2025-07-22T13:56:42Z INFO  berserker] waitpid: 201315
[2025-07-22T13:56:42Z INFO  berserker] 201315 stopped

$ cat /tmp/log | grep -E test\|stub
202305 execve("/path/to/stub", ["stub", "ZYNnTXs"], 0x7ffd68f58ed8 /* 60 vars */ <unfinished ...>
202303 openat(AT_FDCWD, "/tmp/test", O_WRONLY|O_CREAT|O_CLOEXEC, 0666) = 3

# long version, running the scenario with specified distribution
$ berserker -f workloads/example.ber
[2025-07-22T14:00:36Z DEBUG berserker] ARGS Args { flag_c: None, flag_f: Some("workloads/example.ber") }
[...]
[2025-07-22T14:00:36Z INFO  berserker] Child 203681
[2025-07-22T14:00:36Z INFO  berserker] waitpid: 203678
[2025-07-22T14:00:36Z DEBUG berserker::worker::script] run task stub
[2025-07-22T14:00:36Z INFO  berserker::worker::script] Interval 0.0586862540339888, rounded 59
[2025-07-22T14:00:36Z DEBUG berserker::worker::script] open file /tmp/test
[2025-07-22T14:00:36Z INFO  berserker::worker::script] Interval 0.10129037145396365, rounded 101
[2025-07-22T14:00:36Z DEBUG berserker::worker::script] run task stub
[2025-07-22T14:00:36Z DEBUG berserker::worker::script] open file /tmp/test
[2025-07-22T14:00:36Z INFO  berserker::worker::script] Interval 0.04452548013581766, rounded 45
[2025-07-22T14:00:36Z DEBUG berserker::worker::script] run task stub
[2025-07-22T14:00:36Z DEBUG berserker::worker::script] open file /tmp/test
[2025-07-22T14:00:36Z INFO  berserker::worker::script] Interval 0.08633732479159864, rounded 86
[...]

# running with config as before
$ berserker -c workloads/syscalls.toml
[2025-07-22T14:02:43Z DEBUG berserker] ARGS Args { flag_c: Some("workloads/syscalls.toml"), flag_f: None }
[2025-07-22T14:02:43Z INFO  berserker] Config: WorkloadConfig { restart_interval: 10, per_core: true, workers: 1, workload: Syscalls { arrival_rate: 10.0, tight_loop: false, syscall_nr: 39 }, duration: 0 }
[2025-07-22T14:02:43Z INFO  berserker] Child 204588
[2025-07-22T14:02:43Z INFO  berserker] Child 204589
[2025-07-22T14:02:43Z INFO  berserker::worker::syscalls] Process 0 from 0
[2025-07-22T14:02:43Z INFO  berserker::worker::syscalls] Running syscall getpid
[...]

Molter73

Sorry I left so many rust formatted blocks that should've been suggestions instead, the GH UI was making the diff show as only with the last line I selected, so I thought having actual formatting for the comments would be more readable.

Overall looks good, thanks for taking the time implementing this!

Molter73 · 2025-07-25T10:14:56Z

src/script/parser.rs

+    let mut ast = vec![];
+    let pairs = InstructionParser::parse(Rule::file, source)?;
+    for pair in pairs {
+        if let Rule::file = pair.as_rule() {
+            ast.push(build_ast_from_expr(pair.into_inner().next().unwrap()));
+        }
+    }


I think a functional approach might be a bit more readable here, something like this:

let ast = InstructionParser::parse(Rule::file, source)? .filter_map(|p| { if let Rule::file = p.as_rule() { Some(build_ast_from_expr(p.into_inner().next()?)) } else { None } }) .collect::<Result<Vec<_>, _>>()?;

WDYT?

Molter73 · 2025-07-25T10:20:59Z

src/script/parser.rs

+            let mut args: HashMap<String, String> = HashMap::new();
+
+            for arg in args_pair {
+                let mut inner = arg.into_inner();
+                let key = inner.next().unwrap();
+                let value = inner.next().unwrap();
+
+                assert_eq!(key.as_rule(), Rule::ident);
+                assert_eq!(value.as_rule(), Rule::value);
+
+                args.insert(
+                    key.as_span().as_str().to_string(),
+                    value.as_span().as_str().to_string(),
+                );


Similar to the previous comment, this could be done in an iterator so we don't need a mutable hashmap

let args = args_pair .map(|arg| { let mut inner = arg.into_inner(); let key = inner.next().unwrap(); let value = inner.next().unwrap(); assert_eq!(key.as_rule(), Rule::ident); assert_eq!(value.as_rule(), Rule::value); let key = key.as_span().as_str().to_string(); let value = value.as_span().as_str().to_string(); (key, value) }) .collect();

Molter73 · 2025-07-25T10:22:03Z

src/script/parser.rs

+                assert_eq!(key.as_rule(), Rule::ident);
+                assert_eq!(value.as_rule(), Rule::value);


Not something to do in this PR, but we might want to take a closer look at how we are doing error handling across the project, I see a lot of unwraps and asserts.

Molter73 · 2025-07-25T10:25:31Z

src/script/parser.rs

+    Ok(ast)
+}
+
+fn build_ast_from_expr(pair: pest::iterators::Pair<Rule>) -> Node {


Have you considered implementing the From or TryFrom traits instead of having build_* functions? I think this could result in more idiomatic code.

https://doc.rust-lang.org/std/convert/trait.From.html
https://doc.rust-lang.org/std/convert/trait.TryFrom.html

Molter73 · 2025-07-25T10:29:16Z

src/script/parser.rs

+fn build_ast_from_instr(pair: pest::iterators::Pair<Rule>) -> Vec<Instruction> {
+    let mut instr = vec![] as Vec<Instruction>;
+
+    for i in pair.into_inner() {
+        let mut instrs = i.into_inner().next().unwrap().into_inner();
+        let name = instrs.next().unwrap().as_span().as_str().to_string();
+        let args_pair = instrs.next().unwrap().into_inner();
+        let args: Vec<Arg> = args_pair
+            .into_iter()
+            .map(|arg| {
+                let a = arg.into_inner().next().unwrap();
+                match a.as_rule() {
+                    Rule::constant => Arg::Const {
+                        text: a
+                            .into_inner()
+                            .next()
+                            .unwrap()
+                            .as_span()
+                            .as_str()
+                            .to_string(),
+                    },
+                    Rule::ident => Arg::Var {
+                        name: a.as_span().as_str().to_string(),
+                    },
+                    unknown => panic!("Unknown arg type {unknown:?}"),
+                }
+            })
+            .collect();
+
+        instr.push(Instruction::Task { name, args });
+    }
+
+    instr
+}


This function could be a single map operation, like this:

fn build_ast_from_instr(pair: pest::iterators::Pair<Rule>) -> Vec<Instruction> { pair.into_inner() .map(|i| { let mut instrs = i.into_inner().next().unwrap().into_inner(); let name = instrs.next().unwrap().as_span().as_str().to_string(); let args_pair = instrs.next().unwrap().into_inner(); let args: Vec<Arg> = args_pair .into_iter() .map(|arg| { let a = arg.into_inner().next().unwrap(); match a.as_rule() { Rule::constant => Arg::Const { text: a .into_inner() .next() .unwrap() .as_span() .as_str() .to_string(), }, Rule::ident => Arg::Var { name: a.as_span().as_str().to_string(), }, unknown => panic!("Unknown arg type {unknown:?}"), } }) .collect(); Instruction::Task { name, args } }) .collect() }

Molter73 · 2025-07-25T11:01:28Z

src/main.rs

+        parse_instructions(&std::fs::read_to_string(script_path).unwrap())
+            .unwrap();
+
+    ast.iter().for_each(|node| {


I like iterators, but in this particular case a regular for loop might be better

Suggested change

ast.iter().for_each(|node| {

for node in ast {

Molter73 · 2025-07-25T11:07:07Z

src/main.rs

+fn run_script(script_path: String) -> Vec<Option<i32>> {
+    let mut handles = vec![];
+
+    let ast: Vec<Node> =
+        parse_instructions(&std::fs::read_to_string(script_path).unwrap())
+            .unwrap();
+
+    ast.iter().for_each(|node| {
+        debug!("AST NODE: {:?}", node);
+
+        let Node::Work {
+            name: _,
+            args,
+            instructions: _,
+            dist: _,
+        } = node;
+
+        let workers: u32 = args
+            .get("workers")
+            .cloned()
+            .unwrap_or(String::from("0"))
+            .parse()
+            .unwrap();
+        let h: Vec<_> = (0..workers)
+            .map(|_| {
+                let worker = new_script_worker(node.clone());
+
+                match fork() {
+                    Ok(Fork::Parent(child)) => {
+                        info!("Child {}", child);
+                        Some(child)
+                    }
+                    Ok(Fork::Child) => {
+                        worker.run_payload().unwrap();
+                        None
+                    }
+                    Err(e) => {
+                        warn!("Failed: {e:?}");
+                        None
+                    }
+                }
+            })
+            .collect();

-fn main() {
-    let args: Vec<String> = env::args().collect();
-    let default_config = String::from("workload.toml");
-    let config_path = &args.get(1).unwrap_or(&default_config);
-    let duration_timer = SystemTime::now();
+        handles.extend(h);
+    });

-    let config = Config::builder()
-        // Add in `./Settings.toml`
-        .add_source(
-            config::File::with_name("/etc/berserker/workload.toml")
-                .required(false),
-        )
-        .add_source(config::File::with_name(config_path).required(false))
-        // Add in settings from the environment (with a prefix of APP)
-        // Eg.. `BERSERKER__WORKLOAD__ARRIVAL_RATE=1` would set the `arrival_rate` key
-        .add_source(
-            config::Environment::with_prefix("BERSERKER")
-                .try_parsing(true)
-                .separator("__"),
-        )
-        .build()
-        .unwrap()
-        .try_deserialize::<WorkloadConfig>()
-        .unwrap();
+    handles
+}


Ok, you might want to disregard the previous comment, because I know see this entire function could be one large iterator operation like this:

fn run_script(script_path: String) -> Vec<Option<i32>> { parse_instructions(&std::fs::read_to_string(script_path).unwrap()) .unwrap() .iter() .map(|node| { debug!("AST NODE: {:?}", node); let Node::Work { name: _, args, instructions: _, dist: _, } = node; let workers: u32 = args .get("workers") .cloned() .unwrap_or(String::from("0")) .parse() .unwrap(); (0..workers) .map(|_| { let worker = new_script_worker(node.clone()); match fork() { Ok(Fork::Parent(child)) => { info!("Child {}", child); Some(child) } Ok(Fork::Child) => { worker.run_payload().unwrap(); None } Err(e) => { warn!("Failed: {e:?}"); None } } }) .collect::<Vec<_>>() }) .collect() }

Generally speaking, iterators should be used when you are doing something self contained that has no side effects and for loops for when you do want side effects. At least that is what I heard once and I like to stick to it.

Molter73 · 2025-07-25T11:10:14Z

src/main.rs

+        let Node::Work {
+            name: _,
+            args,
+            instructions: _,
+            dist: _,
+        } = node;


This should be more concise if you only want one field

let Node::Work { args, .. } = node;

Molter73 · 2025-07-25T11:12:02Z

src/main.rs

+            .cloned()
+            .unwrap_or(String::from("0"))


Since we are just going to parse the string, I don't think we need to make a copy of it

Suggested change

.cloned()

.unwrap_or(String::from("0"))

.unwrap_or("0")

Molter73 · 2025-07-25T11:16:23Z

src/main.rs

+    let default_config = String::from("workload.toml");
+    let duration_timer = SystemTime::now();
+    let script_path = args.flag_f;
+    let config_path = args.flag_c.unwrap_or(default_config);
+
+    let config = Config::builder()
+        // Add in `./Settings.toml`
+        .add_source(
+            config::File::with_name("/etc/berserker/workload.toml")
+                .required(false),
+        )
+        .add_source(
+            config::File::with_name(config_path.as_str()).required(false),


[nit] You don't actually need a String for default_config and config_path since you end up just calling as_str() on it anyways, you can do this instead:

let default_config = "workload.toml"; let duration_timer = SystemTime::now(); let script_path = args.flag_f; let config_path = args.flag_c.as_ref().unwrap_or(default_config); let config = Config::builder() // Add in `./Settings.toml` .add_source( config::File::with_name("/etc/berserker/workload.toml") .required(false), ) .add_source( config::File::with_name(config_path).required(false),

Add networking instructions. Add functions with different set of arguments.

Add helpers to generate random values for workload. Use BTF to attach fentry/fexit to BPF programs (WIP). Add SIGTERM hook to finish workers processes.

erthalion force-pushed the feature/script branch 6 times, most recently from 9bbc770 to 3de270c Compare July 23, 2025 12:29

Script worker

0e44c3d

erthalion force-pushed the feature/script branch from 3de270c to 0e44c3d Compare July 23, 2025 12:40

erthalion marked this pull request as ready for review July 23, 2025 13:12

Molter73 reviewed Jul 25, 2025

View reviewed changes

Add machine section for global state.

026c376

Add networking instructions. Add functions with different set of arguments.

erthalion force-pushed the feature/script branch from 8b9e023 to 026c376 Compare July 25, 2025 16:21

Misc improvements

32ea55b

Add helpers to generate random values for workload. Use BTF to attach fentry/fexit to BPF programs (WIP). Add SIGTERM hook to finish workers processes.

erthalion force-pushed the feature/script branch from 8dc07d4 to 32ea55b Compare October 8, 2025 14:29

Parser tests

4a2b02f

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Script worker #50

Script worker #50

Uh oh!

erthalion commented Jul 22, 2025 •

edited

Loading

Uh oh!

Molter73 left a comment

Uh oh!

Molter73 Jul 25, 2025

Uh oh!

Molter73 Jul 25, 2025

Uh oh!

Molter73 Jul 25, 2025

Uh oh!

Molter73 Jul 25, 2025

Uh oh!

Molter73 Jul 25, 2025

Uh oh!

Molter73 Jul 25, 2025

Uh oh!

Molter73 Jul 25, 2025

Uh oh!

Molter73 Jul 25, 2025

Uh oh!

Molter73 Jul 25, 2025

Uh oh!

Molter73 Jul 25, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		assert_eq!(key.as_rule(), Rule::ident);
		assert_eq!(value.as_rule(), Rule::value);

Script worker #50

Are you sure you want to change the base?

Script worker #50

Uh oh!

Conversation

erthalion commented Jul 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Molter73 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

erthalion commented Jul 22, 2025 •

edited

Loading