|
| 1 | +# Implementation Notes: Table of Contents Metadata |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This document captures lessons learned from implementing auto-generated table of contents (TOC) metadata for Redis documentation pages. These insights should help guide future metadata feature implementations. |
| 6 | + |
| 7 | +## Key Lessons |
| 8 | + |
| 9 | +### 1. Start with Hugo's Built-in Functions |
| 10 | + |
| 11 | +**Lesson**: Always check what Hugo provides before building custom solutions. |
| 12 | + |
| 13 | +**Context**: Initial attempts tried to manually extract headers from page content using custom partials. This was complex, error-prone, and required parsing HTML/Markdown. |
| 14 | + |
| 15 | +**Solution**: Hugo's `.TableOfContents` method already generates an HTML TOC from the page headings. Using it as the source was much simpler and more reliable. |
| 16 | + |
| 17 | +**Takeaway**: For future metadata features, audit Hugo's built-in methods first. They often solve most of the problem with minimal code. |
| 18 | + |
| 19 | +### 2. Regex Substitution for Format Conversion |
| 20 | + |
| 21 | +**Lesson**: When the input is machine-generated and has a predictable shape, simple regex transformations can convert between formats more reliably than complex parsing. |
| 22 | + |
| 23 | +**Context**: Converting HTML to JSON seemed like it would require a full HTML parser or complex state machine. |
| 24 | + |
| 25 | +**Solution**: Break the conversion into small, sequential regex steps: |
| 26 | +1. Remove wrapper elements (`<nav>`, `</nav>`) |
| 27 | +2. Replace structural tags (`<ul>` → `[`, `</ul>` → `]`) |
| 28 | +3. Replace content tags (`<li><a href="#ID">TITLE</a>` → `{"id":"ID","title":"TITLE"`) |
| 29 | +4. Add structural elements (commas, nested arrays) |
| 30 | + |
| 31 | +**Takeaway**: For format conversions, think in terms of sequential substitution patterns rather than parsing. This is often simpler and more maintainable. |
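The four steps above can be sketched in Python as a rough analogue of the substitution chain. This is not the Hugo partial itself. The sample markup, the `toc_html_to_json` name, and the `children` key for nested arrays are assumptions made for illustration, and the patterns assume plain-text heading titles with no inline markup or quotes.

```python
import json
import re

# Hypothetical sample shaped like Hugo's .TableOfContents output.
SAMPLE_TOC = """<nav id="TableOfContents">
  <ul>
    <li><a href="#overview">Overview</a></li>
    <li><a href="#key-lessons">Key Lessons</a>
      <ul>
        <li><a href="#whitespace">Whitespace</a></li>
      </ul>
    </li>
  </ul>
</nav>"""


def toc_html_to_json(html: str) -> list:
    """Convert a TOC HTML fragment to nested data via ordered substitutions."""
    # 1. Remove the wrapper element.
    s = re.sub(r"</?nav[^>]*>", "", html)
    # 2. Replace structural tags with JSON array brackets.
    s = s.replace("<ul>", "[").replace("</ul>", "]")
    # 3. Replace content tags with the start of a JSON object, then close it.
    s = re.sub(
        r'<li><a href="#([^"]*)">([^<"]*)</a>',
        r'{"id":"\1","title":"\2"',
        s,
    )
    s = s.replace("</li>", "}")
    # 4. Add structural elements: a key for nested arrays ("children" is an
    #    assumed name) and commas between sibling objects.
    s = re.sub(r'("title":"[^"]*")\s*\[', r'\1,"children":[', s)
    s = re.sub(r"\}\s*\{", "},{", s)
    # Parsing the result doubles as a check that the output is valid JSON.
    return json.loads(s)


toc = toc_html_to_json(SAMPLE_TOC)
print(json.dumps(toc, indent=2))
```

Each step only has to be correct for the output of the previous one, which is what keeps the individual patterns small. Ending with `json.loads` means a pattern that misses a case fails loudly instead of producing malformed JSON.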
| 32 | + |
| 33 | +### 3. Hugo Template Whitespace Matters |
| 34 | + |
| 35 | +**Lesson**: Hugo template whitespace and comments generate output that affects final formatting. |
| 36 | + |
| 37 | +**Context**: Generated JSON had many blank lines, making it less readable. |
| 38 | + |
| 39 | +**Solution**: Use Hugo's whitespace trimming markers (`{{-` and `-}}`) to prevent unwanted newlines. |
| 40 | + |
| 41 | +**Takeaway**: When generating structured output (JSON, YAML), always consider whitespace. Test the final output, not just the template logic. |
| 42 | + |
| 43 | +### 4. Markdown Templates Have Different Processing Rules |
| 44 | + |
| 45 | +**Lesson**: Hugo's markdown template processor (`.md` files) behaves differently from HTML templates. |
| 46 | + |
| 47 | +**Context**: Initial attempts to include metadata in markdown output failed because the template processor treated code blocks as boundaries. |
| 48 | + |
| 49 | +**Solution**: Place metadata generation in the template itself, not in content blocks. Pipe the result through Hugo's `safeHTML` function to prevent HTML entity escaping. |
| 50 | + |
| 51 | +**Takeaway**: When targeting multiple output formats, test each format separately. Markdown templates have unique constraints that HTML templates don't have. |
| 52 | + |
| 53 | +### 5. Validate Against Schema Early |
| 54 | + |
| 55 | +**Lesson**: Create the schema before implementation, or at the latest alongside it, not as a final step. |
| 56 | + |
| 57 | +**Context**: Schema was created last, after implementation was complete. |
| 58 | + |
| 59 | +**Better approach**: Define the schema first, then implement to match it. This: |
| 60 | +- Clarifies the target structure |
| 61 | +- Enables validation during development |
| 62 | +- Provides documentation for implementers |
| 63 | +- Helps catch structural issues early |
| 64 | + |
| 65 | +**Takeaway**: For future metadata features, write the schema first as a specification. |
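As a minimal sketch of the schema-first approach, the snippet below writes a Draft 7 schema for a nested TOC and validates candidate output against it with Python's `jsonschema` library. The field names (`id`, `title`, `children`), the `TOC_SCHEMA` constant, and the `schema_errors` helper are assumptions for illustration; the real schema may differ.

```python
from jsonschema import Draft7Validator

# Hypothetical schema: the field names are illustrative, not the project's.
TOC_SCHEMA = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "array",
    "items": {"$ref": "#/definitions/entry"},
    "definitions": {
        "entry": {
            "type": "object",
            "required": ["id", "title"],
            "properties": {
                "id": {"type": "string", "minLength": 1},
                "title": {"type": "string", "minLength": 1},
                # Optional: only present when a heading has sub-headings.
                "children": {
                    "type": "array",
                    "items": {"$ref": "#/definitions/entry"},
                },
            },
            "additionalProperties": False,
        }
    },
}


def schema_errors(toc) -> list:
    """Return one readable message per schema violation (empty if valid)."""
    validator = Draft7Validator(TOC_SCHEMA)
    return [
        f"{list(err.absolute_path)}: {err.message}"
        for err in validator.iter_errors(toc)
    ]


# Confirm the schema itself is well-formed before relying on it.
Draft7Validator.check_schema(TOC_SCHEMA)

good = [{"id": "overview", "title": "Overview",
         "children": [{"id": "scope", "title": "Scope"}]}]
bad = [{"id": "overview"}]  # missing the required "title"

print(schema_errors(good))
print(schema_errors(bad))
```

With the schema in place first, a check like this can run against every build while the template is still changing.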
| 66 | + |
| 67 | +### 6. Test Multiple Page Types |
| 68 | + |
| 69 | +**Lesson**: Metadata features must work across different page types with different content. |
| 70 | + |
| 71 | +**Context**: The implementation was tested on data type pages and command pages, which have different metadata fields. |
| 72 | + |
| 73 | +**Takeaway**: Always test on at least 2-3 different page types to ensure the feature is robust and handles optional fields correctly. |
| 74 | + |
| 75 | +## Implementation Checklist for Future Metadata Features |
| 76 | + |
| 77 | +When implementing new metadata features, follow this order: |
| 78 | + |
| 79 | +1. **Define the schema** (`static/schemas/feature-name.json`) |
| 80 | + - Specify required and optional fields |
| 81 | + - Use JSON Schema Draft 7 |
| 82 | + - Include examples |
| 83 | + |
| 84 | +2. **Create documentation** (`build/metadata_docs/FEATURE_NAME_FORMAT.md`) |
| 85 | + - Explain the purpose and structure |
| 86 | + - Show examples |
| 87 | + - Document embedding locations (HTML, Markdown) |
| 88 | + |
| 89 | +3. **Implement the feature** |
| 90 | + - Create/modify Hugo partials |
| 91 | + - Test on multiple page types |
| 92 | + - Verify output in both HTML and Markdown formats |
| 93 | + |
| 94 | +4. **Validate the output** |
| 95 | + - Write validation scripts |
| 96 | + - Test against the schema |
| 97 | + - Check whitespace and formatting |
| 98 | + |
| 99 | +5. **Document implementation notes** |
| 100 | + - Capture lessons learned |
| 101 | + - Note any workarounds or gotchas |
| 102 | + - Provide guidance for future similar features |
| 103 | + |
| 104 | +## Common Gotchas |
| 105 | + |
| 106 | +- **HTML entity escaping**: Use the `safeHTML` function when outputting HTML/JSON in markdown templates |
| 107 | +- **Whitespace in templates**: Use `{{-` and `-}}` to trim whitespace |
| 108 | +- **Nested structures**: Test deeply nested content to ensure regex patterns handle all cases |
| 109 | +- **Optional fields**: Remember that not all pages have all metadata fields |
| 110 | +- **Markdown vs HTML**: Always test both output formats |
| 111 | + |
| 112 | +## Tools and Techniques |
| 113 | + |
| 114 | +- **Hugo functions**: `replaceRE`, `jsonify`, `safeHTML` |
| 115 | +- **Validation**: Python's `jsonschema` library for schema validation |
| 116 | +- **Testing**: Extract metadata from generated files and validate against schema |
| 117 | +- **Debugging**: Use `grep` and `head` to inspect generated output |
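The testing step, extracting metadata from a generated file, might look like the sketch below. How the metadata is embedded is an assumption here: the example guesses at a `<script type="application/json">` element, and `SAMPLE_PAGE`, `tableOfContents`, and `extract_metadata` are hypothetical names. The helper also counts blank lines in the embedded block, since stray template whitespace shows up there first.

```python
import json
import re

# Hypothetical page fragment: this embedding format is an assumption.
SAMPLE_PAGE = """<html><head>
<script type="application/json" data-metadata>
{"title":"Example","tableOfContents":[{"id":"overview","title":"Overview"}]}
</script>
</head><body></body></html>"""

# Non-greedy match so only the first JSON script element is captured.
SCRIPT_RE = re.compile(
    r'<script type="application/json"[^>]*>(.*?)</script>', re.DOTALL
)


def extract_metadata(page: str):
    """Return (parsed metadata, number of blank lines inside the block)."""
    match = SCRIPT_RE.search(page)
    if match is None:
        raise ValueError("no embedded JSON metadata found")
    raw = match.group(1).strip()
    blank_lines = sum(1 for line in raw.splitlines() if not line.strip())
    return json.loads(raw), blank_lines


metadata, blank_lines = extract_metadata(SAMPLE_PAGE)
print(metadata["tableOfContents"], blank_lines)
```

The parsed result can then be handed to a schema validator, and a non-zero blank-line count points at a template that is missing `{{-` or `-}}` trimming.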
| 118 | + |
| 119 | + |