Skip to content

Conversation

@f-f
Copy link
Member

@f-f f-f commented Dec 9, 2025

WIP, should fix #696, fix #649

@f-f
Copy link
Member Author

f-f commented Dec 9, 2025

Main chunks of work to be done at this point:

  • enqueuing/running matrix jobs, need to settle on a design for this. Current direction is "build plan is fixed at submission time, and we dynamically add jobs for dependent packages after a job is completed"
  • enqueing/running package set jobs, need to figure out auth story for that
  • return jobs from the API; need to spec it more:
    • do we keep backwards compatibility? Spago should keep working
    • what details do we return?
    • when returning a list we should allowing selecting queued/done/in progress jobs

@f-f f-f mentioned this pull request Dec 9, 2025
@fsoikin
Copy link
Contributor

fsoikin commented Dec 9, 2025

But the first two bullets can be implemented as separate changes, right? They're not needed for strict parity.

@f-f
Copy link
Member Author

f-f commented Dec 11, 2025

@fsoikin the overall goal is to merge #669 ASAP, so that we can reupload the whole registry.
That can't be merged until we have guaranteed single job execution (#696, this PR, and #649, not yet in this PR but soon), because jobs would otherwise take too long and likely conflict. Splitting off compilation for different compilers is part of making the jobs faster and support keeping the compiler information up to date.

We could be doing all these changes in separate PRs but I don't see the point since they will all need to end up in the compilers-in-metadata branch since we can't merge to trunk until we have the whole package.

Comment on lines 100 to 105
insertPackageJob :: PackageOperation -> ContT Response (Run _) Response
insertPackageJob operation = do
lift $ Log.info $ "Enqueuing job for package " <> PackageName.print (Operation.packageName operation)
jobId <- newJobId
lift $ Db.insertPackageJob { jobId, payload: operation }
jsonOk V1.jobCreatedResponseCodec { jobId }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old code checked for running jobs before creating new ones:

lift (Db.runningJobForPackage packageName) >>= case _ of
  Right { jobId, jobType: runningJobType } -> ...

Looks like the duplicate checking is no longer there, is that intentional?

Copy link
Member Author

@f-f f-f Dec 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have noticed as well, and I am reshuffling this part of the code to add this back, among other things.

I have been thinking about splitting the "package job" table as well into different tables for publish/unpublish/transfer, because they do require different checking (publishing/unpublishing needs to check for the tuple package name and version, transferring only needs to match the package name). Thoughts?

JOIN ${JOB_INFO_TABLE} info ON job.jobId = info.jobId
WHERE info.finishedAt IS NULL
AND info.startedAt IS NULL
ORDER BY info.createdAt DESC
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why DESC here? Don't we want the oldest job to go out first (FIFO), like it previously was? Same for the other selectNext* functions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I have noticed too, it should be ASC.

And it should probably not be LIMIT 1 either, as we likely want to do some toposorting on the result. I have not figured out yet where we draw the line though, I wouldn't like to be toposorting thousands of matrix jobs over and over, so we might want keep the limit and toposort on insertion. But then we need to ORDER BY something else than creation date when fishing that out. Ideas?

@thomashoneyman
Copy link
Member

Also the tests are failing because of a missing version field in the fixtures, see:
5afab58

@thomashoneyman
Copy link
Member

Main chunks of work to be done at this point:

In addition to your list (some responses to that below), the other point that I'm not seeing here is that we need to ensure that the github issue module only proxies requests over to the server API and writes comments on 'notify' logs but does not actually execute e.g. the publish operation itself. Otherwise we've defeated the purpose of the pull request, as we can't enforce a lock on the git commits anymore.

  • enqueuing/running matrix jobs, need to settle on a design for this. Current direction is "build plan is fixed at submission time, and we dynamically add jobs for dependent packages after a job is completed"

This makes sense to me: start with no-dependency packages, and on completion we queue dependents that are satisfied by this package existing, propagate outwards.

  • enqueing/running package set jobs, need to figure out auth story for that

What do we need to do beyond what we have today? Today, we open an issue, pacchettibotti signs the payload via the GitHub Actions setup, and then the job is executed by the action. With the server, the only difference would be that this payload is submitted to the server API. I'm not sure I'm seeing what the additional auth issues are. We can also have a daily GitHub Action cron job that kicks off the daily package set updater, so as far as the server is concerned it's just receiving an authenticated package set update.

The same approach is used for transfers or unpublishing: open the issue as a purescript/packaging maintainer and it will be auto-signed by pacchettibotti. I like this because it's public.

@f-f
Copy link
Member Author

f-f commented Dec 13, 2025

@thomashoneyman

Also the tests are failing because of a missing version field in the fixtures, see:
5afab58

Yeah, I have not fixed that yet because I was not sure we need the version in there at all. Help me with reminding why we need the version now 😄

Also: I see these are on a different branch? I will cherry-pick them here, and feel free to push new stuff here as well.

the other point that I'm not seeing here is that we need to ensure that the github issue module only proxies requests over to the server API and writes comments on 'notify' logs but does not actually execute e.g. the publish operation itself

Yeah it's planned but I want to thread through the matrix jobs and the package set jobs to the database before going to the github interaction part. It might be a nice spot where we can parallelise work if you want to have a go at it. I will push the current state of my branch so you have the latest.

The same approach is used for transfers or unpublishing: open the issue as a purescript/packaging maintainer and it will be auto-signed by pacchettibotti. I like this because it's public.

Yeah if you think it's straightforward then it probably is 😄 I have not looked at this chunk of code yet so I might be overthinking stuff.

I like the public aspect of the package set updates as well, and I in fact want to make it public-only, in the sense that I'd put the package-set server endpoints behind encryption/authentication (with just a simple shared secret for example) so that only the code that runs on github can hit them and we don't have to worry about things coming to the registry server from other places. No strong feelings about this though if you think it's too convoluted

@thomashoneyman
Copy link
Member

Ah, if you want it to be forced public then we could indeed do a shared secret we send in the HTTP header or something. I’d like transfer and unpublish to be public too — we could potentially remove the ability for pacchettibotti signatures to be valid for those and only the shared secret can be used as a trustee override?

f-f and others added 7 commits December 14, 2025 10:48
the publishCodec requires a version file but the test fixtures weren't
updated to include it
The JS insertMatrixJobImpl expects columns [jobId, packageName,
packageVersion, compilerVersion, payload] but the PureScript types were
missing packageName and packageVersion
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants