Commit 2659e7a

first commit
1 parent 0c743e3 commit 2659e7a

42 files changed, +11825 -0 lines changed

.gitignore

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+_site
+.quarto
+/.quarto/
+posts/etc
+
+/.luarc.json

_brand.yml

Lines changed: 23 additions & 0 deletions
@@ -0,0 +1,23 @@
+color:
+  background: '#1b1b1e'
+  foreground: '#eeeeee'
+  primary: '#4488dd'
+typography:
+  fonts:
+    - source: google
+      family: 'Lato'
+      weight: [100, 300, 400, 700]
+    - source: google
+      family: 'Source Sans Pro'
+  base:
+    family: 'Source Sans Pro'
+  headings:
+    family: 'Lato'
+  monospace:
+    background-color: background
+    color: foreground
+defaults:
+  bootstrap:
+    defaults: # defaults also supports a string as its value
+      link-decoration: underline
+      navbar-bg: "#224488"
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+{
+  "hash": "30f9e2d83b2573bfb44865cc97ce101e",
+  "result": {
+    "engine": "knitr",
+    "markdown": "---\ntitle: 'An early win for `jog` in scoped resolution'\nformat: html\ndate: 2025-01-24\nauthor: Carlos\ncategories:\n - performance\nfilters:\n - ../github-commit.lua\n - ../drop-knitr-stderr.lua\n---\n\n::: {.cell}\n\n:::\n\n\n\nA win from using `jog` for one of our filters:\n\n- Before: []{.github-commit hash=\"53da9da410b2c95d9ae1dca75d71507cff606c2a\"}\n- After: []{.github-commit hash=\"bf5bc5add450aa8a7c911d7162c28e1294f7631e\"}\n\n::: {.cell}\n\n:::\n\n::: {.cell}\n::: {.cell-output .cell-output-stderr}\n\n```\nRows: 5030 Columns: 3\n── Column specification ────────────────────────────────────────────────────────\nDelimiter: \",\"\nchr (2): filter, name\ndbl (1): time\n\nℹ Use `spec()` to retrieve the full column specification for this data.\nℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.\n```\n\n\n:::\n\n::: {.cell-output-display}\n![Runtimes by filter on `quarto-web` before moving to `jog`.](index_files/figure-html/fig-before-1.png){#fig-before width=576}\n:::\n:::\n\n::: {.cell}\n::: {.cell-output .cell-output-stderr}\n\n```\nRows: 4979 Columns: 3\n── Column specification ────────────────────────────────────────────────────────\nDelimiter: \",\"\nchr (2): filter, name\ndbl (1): time\n\nℹ Use `spec()` to retrieve the full column specification for this data.\nℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.\n```\n\n\n:::\n\n::: {.cell-output-display}\n![Runtimes by filter on `quarto-web` after moving to `jog`.](index_files/figure-html/fig-after-1.png){#fig-after width=576}\n:::\n:::\n",
+    "supporting": [
+      "index_files"
+    ],
+    "filters": [
+      "rmarkdown/pagebreak.lua"
+    ],
+    "includes": {},
+    "engineDependencies": {},
+    "preserve": {},
+    "postProcess": true
+  }
+}
Binary file not shown (409 KB)
Binary file not shown (437 KB)
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+{
+  "hash": "f5da63201d5e2ba7a5e9150b27c0e78b",
+  "result": {
+    "engine": "knitr",
+    "markdown": "---\ntitle: Hashing Performance\ndate: 2025-01-26\nauthor: Carlos\ncategories:\n - performance \n - TypeScript\nfilters:\n - ../drop-knitr-stderr.lua\n---\n\n::: {.cell}\n\n:::\n\n\n\nIn the course of 1.7's perf work, we are going to introduce a number of persistent caches\nfor Quarto projects. This will require knowing which hashing functions perform well\nunder what settings. I'm using [this file](./deno-hash-bench.ts) to measure the results.\n\n::: {.cell}\n::: {.cell-output .cell-output-stderr}\n\n```\nRows: 17 Columns: 5\n── Column specification ────────────────────────────────────────────────────────\nDelimiter: \",\"\ndbl (5): size_log2, sha256, md5, djb2, blueimp-md5\n\nℹ Use `spec()` to retrieve the full column specification for this data.\nℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.\n```\n\n\n:::\n\n::: {.cell-output-display}\n![Runtimes of different hashing algorithms in Deno](index_files/figure-html/fig-runtimes-1.png){#fig-runtimes width=672}\n:::\n:::\n\nImportant features:\n\n| algorithm | sync | quality |\n|---------------|------|------------|\n| `djb2` | yes | non-crypto |\n| `blueimp-md5` | yes | meh |\n| `md5` | no | meh |\n| `sha256` | no | good |\n\n### djb2\n\nI've spent some time trying to write a faster version of djb2 and couldn't really make meaningful progress.\nI tried:\n\n- unrolling the loop directly (not enough of a win)\n- operating at 32 bits at a time by converting the string to a buffer first\n\n## Takeaways\n\n- `blueimp-md5` only makes sense if `sync` MD5 calls are necessary: it's slower than `md5` at every range.\n- `md5` only makes sense if the quality improvement over `djb2` is needed, but `sha256` not being required:\n - the DJB2 algorithm gives ~32 bits of hashing space, birthday paradoxes start appearing at 2^16 items, while MD5 gives 128 bits.\n - `md5` is, adversarially, trivially breakable\n\n- `sha256` is async and has a large startup cost, but is the fastest at strings starting at size ~2^14 = 16k, faster even than `djb2`.\n\n## A design for a general-purpose cache?\n\nIf we need cryptographically-safe hashes, then we need to use SHA-256 everywhere. Unfortunately, that incurs ~15ms of overhead per call independently of the size of the string. That's a lot.\n\nIf `djb2` is good enough in terms of quality, then we still need to worry about hash space size. `djb2` has 32 bits of address space. By the birthday paradox, if we want a 1 in a million chance of a hash collision, then the cache size needs to be at most [~100](https://en.wikipedia.org/wiki/Birthday_problem#Probability_table).\n\nHonestly, this number is small enough that I'm wary about using `djb2` at all in Quarto as a substitute for string equality.\n\nIf we could create a 64-bit version of `djb2`, that would likely suffice for Quarto documents: the critical size for such caches to achieve a 1-in-a-million catastrophic failure is ~6 million.\n\n`md5` has 128 bits, and in non-adversarial settings that's plenty.\n\nThe penalty of using `md5` is about 50%, and the requirement for using async:\n\n::: {.cell}\n::: {.cell-output-display}\n![](index_files/figure-html/unnamed-chunk-3-1.png){width=672}\n:::\n:::\n\nThat's a completely acceptable tradeoff.\n\nSo, I think our general-purpose cache is:\n\n- use `md5` or `sha256`, whichever is faster. The breakpoint where `sha256` is clearly it is at string sizes of around 16k or larger.\n\n- this cache will be necessarily async.\n",
+    "supporting": [
+      "index_files"
+    ],
+    "filters": [
+      "rmarkdown/pagebreak.lua"
+    ],
+    "includes": {},
+    "engineDependencies": {},
+    "preserve": {},
+    "postProcess": true
+  }
+}
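
The collision estimates quoted in the frozen "Hashing Performance" post (a cache of at most ~100 entries for a 32-bit hash, ~6 million for a 64-bit hash, at a one-in-a-million collision chance) can be reproduced with the standard birthday-bound approximation p ≈ n² / 2^(b+1). The sketch below is an illustration only and is not code from this commit: `djb2` here is the textbook multiply-by-33 variant, which may differ from whatever `deno-hash-bench.ts` actually benchmarks, and `maxItemsForCollisionProbability` is a hypothetical helper name.

```typescript
// Textbook djb2 (hash * 33 + c), kept in the unsigned 32-bit range.
// Assumption: the implementation benchmarked in the post may differ
// (for example, it could use the xor variant).
function djb2(s: string): number {
  let hash = 5381;
  for (let i = 0; i < s.length; i++) {
    // Math.imul performs a 32-bit multiply; >>> 0 reinterprets the
    // result as an unsigned 32-bit integer.
    hash = (Math.imul(hash, 33) + s.charCodeAt(i)) >>> 0;
  }
  return hash;
}

// Hypothetical helper: largest item count n for which the birthday-bound
// collision probability stays at or below p for a hash that is `bits` wide.
// Uses the approximation p ≈ n^2 / 2^(bits + 1), i.e. n ≈ sqrt(2 * p * 2^bits).
function maxItemsForCollisionProbability(bits: number, p: number): number {
  return Math.floor(Math.sqrt(2 * p * 2 ** bits));
}

// 32-bit hash space: on the order of 100 items.
console.log(maxItemsForCollisionProbability(32, 1e-6));
// 64-bit hash space: roughly 6 million items.
console.log(maxItemsForCollisionProbability(64, 1e-6));
```

Under this approximation, the 32-bit bound is small enough that a project-wide cache keyed only on a 32-bit hash would plausibly exceed it, which matches the caution expressed in the post.
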
Binary file not shown (101 KB)
Binary file not shown (70.3 KB)
Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+{
+  "hash": "c0526d4b2d1c2d7aebe5ac105b33cd28",
+  "result": {
+    "engine": "knitr",
+    "markdown": "---\ntitle: Implementing a general-purpose project cache\ndate: 2025-01-28\nauthor: Carlos\ncategories:\n - performance \n - TypeScript\nfilters:\n - ../drop-knitr-stderr.lua\n---\n\n::: {.cell}\n\n:::\n\n\n\nI've implemented a disk cache and started using it on some of our slow, stable computations (such as the analysis of SCSS files).\n\nThis work is on the `feature/project-cache` branch.",
+    "supporting": [],
+    "filters": [
+      "rmarkdown/pagebreak.lua"
+    ],
+    "includes": {},
+    "engineDependencies": {},
+    "preserve": {},
+    "postProcess": true
+  }
+}

_freeze/site_libs/clipboard/clipboard.min.js

Lines changed: 7 additions & 0 deletions
Generated file, not rendered by default.