feat(rollups): add rollup_contents mapping #519

Carbonhell · 2025-12-22T00:33:13Z

Closes #494.

I wasn't sure whether to use IDs from the pull_request table to represent PRs in the mapping, or just the GitHub PR number - I went with the latter as there would be no record for the rollup at the time of its creation, as that gets populated via GitHub events, if I understood correctly.

I haven't experienced rollups at all as I haven't (yet?) contributed to a project with bors, but I noticed there's no safety check for multiple rollups containing the same PR(s) being opened. I found a related comment here, but I kept it out of this PR to reduce its scope and simplify review. I think such a safety check could be added easily by checking for rollups associated to each candidate PR, and verifying the status of the rollup PR to ignore closed ones. That would still allow one to recreate the same scenario by reopening a previously closed rollup, though.

One thing I'm unsure about is how to handle the error when writing the rollup_contents records in the DB. I think that should only happen either in case of DB connection errors, or if the list of PRs contains duplicates. I assume the former can be propagated to the caller, whereas I'm not sure if the latter is realistic and should be accounted for. I'm thinking we can just ignore the conflicting records in such case - let me know if I should proceed with that.

Edit: sorry, I forgot the migration test files. Will push them tomorrow as soon as possible

…nsert/get logic chore(git): add relevant gitattributes to ensure correct line endings for sqlx files

Kobzol · 2025-12-22T10:24:45Z

Hi, thanks for the PR! I'll take a look, but I don't know when I'll be able to get to it, since it's Holiday season :)

I wasn't sure whether to use IDs from the pull_request table to represent PRs in the mapping, or just the GitHub PR number - I went with the latter as there would be no record for the rollup at the time of its creation, as that gets populated via GitHub events, if I understood correctly.

Yeah, that makes sense.

but I noticed there's no safety check for multiple rollups containing the same PR(s) being opened.

This is not a problem on its own, there are often multiple rollups that contain a single PR, because if a rollup fails, you create a new one, often before the old one is closed. I don't think that we need to think about that right now.

if the list of PRs contains duplicates. I assume the former can be propagated to the caller, whereas I'm not sure if the latter is realistic and should be accounted for.

Also a good point! I'll ignore duplicate PR numbers in rollups.

Kobzol

I think that the main query that we will want to execute on the queue page would look something like this:

SELECT
    rollup_pr_number, member_pr_number
FROM rollup_member
WHERE repository = $1 AND
      rollup_pr_number IN (SELECT number
                           FROM pull_request AS pr
                           WHERE pr.repository = $1
                             AND pr.status IN ('open', 'draft'))
ORDER BY rollup_pr_number;

And we get back something like HashMap<PrNum, HashSet<PrNum>>, which maps rollups to their member PRs. Because we assume that there will be only a couple of open rollups, so it seems easier and faster to do it this way, rather than try to find a corresponding rollup for each PR in the queue.

Kobzol · 2025-12-22T19:37:18Z

.gitattributes

@@ -0,0 +1 @@
+/.sqlx/*.json binary linguist-generated=true


Did you have any specific use-case for this? GitHub already marked them as generated before.

Sorry, yes, it's mostly for the binary part, I have a windows workstation and the sqlx command to generate those files was causing diffs due to line endings. I copied this file from the docs.rs repo to avoid this issue. I can remove it from the PR as it's out of scope, if you prefer.

Oh, I see. Let's keep it then, no real harm in that.

Kobzol · 2025-12-22T19:46:31Z

src/database/client.rs

+    /// Register the contents of a rollup in the DB.
+    /// The contents are stored as rollup PR number - member PR number records, scoped under a specific repository.
+    /// All records are inserted in a single transaction.
+    pub async fn create_rollup(


Suggested change

pub async fn create_rollup(

pub async fn register_rollup_members(

I think that's a bit less misleading.

Kobzol · 2025-12-22T19:47:03Z

migrations/20251221160635_rollup_contents.up.sql

@@ -0,0 +1,11 @@
+CREATE TABLE IF NOT EXISTS rollup_contents


I like the member terminology that you have used, so let's go with it.

Suggested change

CREATE TABLE IF NOT EXISTS rollup_contents

CREATE TABLE IF NOT EXISTS rollup_member

Kobzol · 2025-12-22T19:57:16Z

src/database/client.rs

+    }
+
+    /// Returns the numbers of all the PRs associated with the requested rollup.
+    pub async fn get_rollup_pr_contents(


I think that the main operation that we will want to do is load members information for a set of PRs that will be displayed on the queue page. So I'm thinking something like:

Load non-closed PRs on the queue page endpoint

Do a query that will fetch all members for all rollups in those PRs.

And then in the frontend, we'll build a reverse index, so that we can also associate a PR with its rollup(s).

As an alternative to 1) + 2), we could directly attach a bool whether a given PR is a rollup or not to the return value of 1), and then only ask for members from actual rollups, to reduce the DB traffic slightly. But we'd need to take a look at the query plans to see if it makes sense. And probably perf. will be fine with both approaches.

Kobzol · 2025-12-22T20:18:15Z

migrations/20251221160635_rollup_contents.up.sql

+
+    PRIMARY KEY (repository, rollup_pr_number, member_pr_number)
+);
+CREATE INDEX rollup_contents_member_pr_number_idx ON rollup_contents (repository, member_pr_number);


As per my other comment, I think that it makes more sense to create the index on the rollup_pr_number attribute, as we will be loading members of rollups, and not looking up rollups for individual PRs, at least for the queue page.

I will proceed removing this index, then. For the index on the rollup_pr_attribute I assume you mean including the repository attribute as well, like the query in your review. That should already be covered by the index generated by the primary key, as I've purposefully ordered the columns by putting the member_pr_number attribute last, as per https://www.postgresql.org/docs/current/indexes-multicolumn.html .

Yeah, also including repository. Fair enough, as long as the index is used for the query I posted, it should be fine.

I ran some simulations with 20k PRs and 3k rollup_member rows on a local Postgres instance, and EXPLAIN ANALYZE was still choosing a sequential scan over rollup members anyway. So maybe the index won't be even needed so much. But that shouldn't be an issue for some time anyway.

feat(rollups): add rollup_contents database table, along with basic i…

fda8499

…nsert/get logic chore(git): add relevant gitattributes to ensure correct line endings for sqlx files

Kobzol mentioned this pull request Dec 22, 2025

Filter out duplicate PRs when making a rollup #521

Merged

Kobzol reviewed Dec 22, 2025

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

feat(rollups): add rollup_contents mapping #519

feat(rollups): add rollup_contents mapping #519

Uh oh!

Carbonhell commented Dec 22, 2025 •

edited

Loading

Uh oh!

Kobzol commented Dec 22, 2025 •

edited

Loading

Uh oh!

Kobzol left a comment •

edited

Loading

Uh oh!

Kobzol Dec 22, 2025

Uh oh!

Carbonhell Dec 22, 2025 •

edited

Loading

Uh oh!

Kobzol Dec 22, 2025

Uh oh!

Kobzol Dec 22, 2025

Uh oh!

Kobzol Dec 22, 2025

Uh oh!

Kobzol Dec 22, 2025

Uh oh!

Kobzol Dec 22, 2025

Uh oh!

Carbonhell Dec 22, 2025

Uh oh!

Kobzol Dec 22, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		@@ -0,0 +1 @@
		/.sqlx/*.json binary linguist-generated=true No newline at end of file

	pub async fn create_rollup(
	pub async fn register_rollup_members(

	CREATE TABLE IF NOT EXISTS rollup_contents
	CREATE TABLE IF NOT EXISTS rollup_member

feat(rollups): add rollup_contents mapping #519

Are you sure you want to change the base?

feat(rollups): add rollup_contents mapping #519

Uh oh!

Conversation

Carbonhell commented Dec 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Kobzol commented Dec 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Kobzol left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Carbonhell Dec 22, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Carbonhell commented Dec 22, 2025 •

edited

Loading

Kobzol commented Dec 22, 2025 •

edited

Loading

Kobzol left a comment •

edited

Loading

Carbonhell Dec 22, 2025 •

edited

Loading