-
Notifications
You must be signed in to change notification settings - Fork 376
[CLI] Move native file locking into workers #2997
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: php-wasm/node/testable-syscall-overrides
Are you sure you want to change the base?
[CLI] Move native file locking into workers #2997
Conversation
…Windows FileLockManager tests
|
We can do this with a FileLockManagerForPosix and a FileLockManagerForWindows and fall back to the FileLockManagerForNode if the native locking API is not available. I've working on the implementation for both. The main things that require care are:
For Posix, we can keep it fairly simple for fcntl() by keeping track of which files a process has locked via fcntl() and then unlocking the entire range via fcntl() when locks need to be released. For Windows, implementing fcntl() semantics is more complicated. We'll have to maintain a collection of which ranges are locked per file in order to be able to unlock those ranges. If a caller wants to unlock part of a range, we'll have to unlock the entire range and then obtain locks for the remaining portions of the original locked range. For shared locks, we can obtain the new ranges before releasing the original range, but for exclusive ranges, we'll have to release the original range before attempting to obtain locks on the remaining ranges. (According to a Google answer about whether Windows allows overlapping exclusive locks by the same process) The good news is that we are already tracking locked ranges in the FileLockManagerForNode. The work for the Windows locks shouldn't be that different. cc @adamziel |
60a6f0b to
6023b7d
Compare
…ire remaining range
|
I roughed out native FileLockManager's for POSIX and Windows, but they are yet untested. Tomorrow, I plan to start by adapting native locking tests and testing these new classes. |
|
The pre-requisites for this one seem to be mostly in place. The CLI spawn handler now creates a new OS process for any spawned PHP subprocess. The request handler still uses multiple PHP instances, but can be tuned down by adding In #3014, I'm exploring a CI stress test to confirm multiple workers are indeed used for handling concurrent requests. |
|
@adamziel, all the FileLockManager tests are passing for the Windows version. I explicitly disabled the tests that assert partial lock/unlock and overlapping locked ranges. Those things aren't supported for Windows, and I'm not sure we'll be able to support that safely on Windows. Regardless, I don't think we'll need it to support SQLite's locking scheme. I think we might be able to test Playground CLI with the Windows lock integration before moving to separate worker processes, so I'm planning to try that yet tonight. It may give us a little more info. Either way, these Windows tests make me think it's worth trying the multi-process approach here. |
|
@adamziel I was able to get Playground CLI booting with native file locking in Windows. With a very simple test it appears to be working well. Tomorrow, I plan to prototype moving Playground CLI from multi-worker to multi-process. This is needed if we want to rely exclusively on native OS file locking because locks are per-fd and per-process. Otherwise the native OS will see locks held by php-wasm workers as locks held by the same process. |
|
Amazing @brandonpayton! |
|
Btw, I thought Playground CLI was already multi-process and we just needed to cap the number of workers to 1? |
The purpose of this is to eliminate worker args that are not serializable as JSON so we can move from workers to separate processes which communicate via JSON-serialized messages rather than via transferred or cloned objects.
@adamziel we've just been multi-worker-thread up to now. Last week, I realized that I'd thought we were multi-process already but was incorrect (mentioned here). But I think we are close to moving to multiple processes. I see how it can work using the Node.js |
|
I have what I think should be working for a Playground CLI process cluster but need to wire up spawning a another PHP worker from the child process (for proc_open() I think). For that case, we probably just want to fork() a child process rather than fork() a process within the cluster. Will see. |
|
You know all that, but I saw |
Thanks for stating this explicitly! 😄 I just discovered cluster.fork() is more like POSIX fork() than child_process.fork(). By default cluster.fork() at least creates a child that runs the main-thread/primary script unless you explicitly override the child script path with In that case, it's like you're saying "actually this other script is the 'primary' you should fork() from". But it's not actually the primary script. It's just the script you want to run in a cluster. 🙃 |
|
@adamziel @mho22 I pushed some WIP changes for running Playground CLI php-wasm workers as separate processes. Some notes:
This is all that is coming to me now, but I'll likely follow up and share more soon. :) In the meantime, I'm working on:
|
|
Actually, the push failed. Fixing... |
The purpose of this is to eliminate worker args that are not serializable as JSON so we can move from workers to separate processes which communicate via JSON-serialized messages rather than via transferred or cloned objects.
|
The multi-process Playground CLI changes are now pushed. |
|
It's kinda wild, but multi-process Playground CLI worked with no modifications on Windows. 🎉 That's great (and unusual)! I instrumented FileLockManagerForWindows to log when locking file byte ranges, and it is being used to lock the SQLite DB. |
|
Currently IPC is JSON-based, but I just learned there is an "advanced" serialization option based on the HTML structured clone algorithm. It may not matter either way, but being able to pass instances of builtin objects is a blessing. |
|
Also, IPC via comlink isn't used much after startup. After that, the workers are just servicing HTTP requests directly. |
|
The Playground CLI file locking tests are passing. There are still other Playground CLI test failures, and I'm looking at those. |
|
The main Playground run-cli tests for Blueprints v1 are passing except for: The biggest thing that was causing failure was that runCLI() was being used with a single worker process which now only has a single PHP instance. Some tests needed more than a single instance. Switching to explicit multi-worker-process fixed them. Ultimately, to land this PR, we'll need to always run multiple php-wasm worker processes. I'm pleased with how well things appear to be going. Tomorrow, I'll continue cleaning this up and fixing tests. |
|
One note: It's a bit more awkward because, with a cluster of workers, there is no one place that can judge which request is the first (so we know the autologin-has-happened cookie cannot be from the current Playground CLI session). |
Sounds good and makes sense. Should we have a lower bound on the number of worker processes?
Maybe doing everything in PHP is not useful here? What do you think about moving some of the logic to |
Yes! In single-worker mode, we have a default maximum of 5 php-wasm instances at a time. Let's start our lower bound at 5 instances and see how it goes. |
The previous logic for clearing the autologin-has-happened cookie was in Every php-wasm worker process in the cluster calls Maybe we can do something with a lock file. Will see :) |
Motivation for the change, related issues
In order to fix native file locking in Windows, we are moving native file locking into workers.
More details coming...
Implementation details
TBD
Testing Instructions (or ideally a Blueprint)
TBD