Thoughts .toString()

Shiki: Line numbering for Shiki syntax highlighting


Conventional wisdom states that new bloggers should focus on content instead of trying to build their own platforms, but for a dev blogger, being able to dictate how a block of code looks is a significant quality-of-life matter. If I’m going to spend a lot of time discussing code, it helps to be able to cite line numbers, and to cite sources to give context when code is copied from somewhere. It helps to be able to distinguish between in-code comments and comments used to discuss code for blogging purposes. It helps to label filenames and languages without, again, polluting code comments. All of these issues are why platforms such as Medium are a no-go for me, besides the annoying paywall. There just isn’t the customization I desire.

Before switching to Shiki, I wrote another piece of code to try to get highlight.js to output the format I wanted. A key problem was code with multiple syntaxes in one file, such as Astro or React components. highlight.js struggled here, and I had to add some JavaScript to break code into sub-blocks so the library could interpret them as different languages. I switched to Shiki because it doesn’t have this problem. But Shiki also doesn’t support line numbering, or a lot of other things, out of the box.

The other reason for using Shiki is that it ships with Astro by default, though this is both a pro and a con. The disadvantage is that the integration forces customizations to be applied after the fact. For example, I wouldn’t be able to apply CSS variables at the time a code block is created, before the syntax highlighting starts. This makes solutions that rely purely on CSS impractical, unless I resort to hacky methods such as <style> tags placed right before a code fence in Markdown.

The plan

Let’s start with a vision, a proposal, for how lines of code should generally behave. Below is a block demonstrating what it usually looks like when long lines wrap around.

javascript
if (deeply) {
     if (nested) {
          if (block) {
               if (endIndex < startIndex) {
                    console.error("This is a really long error meant to cause the line to wrap around and around, making many code blocks look like crap.");
               }
          }
     }
}

In the following block, the whitespace preceding each line is in its own inline-block span, causing the wrap-around to start at a more appropriate indent.

javascript
if (deeply) {
    if (nested) {
        if (block) {
            if (endIndex < startIndex) {
                console.error("This is a really long error meant to cause the line to wrap around and around, making many code blocks look like crap.");
            }
        }
    }
}

Then, with a bit of CSS (text-indent: 4ch hanging;), we can create a single indent from the start of the text.

javascript
if (deeply) {
    if (nested) {
        if (block) {
            if (endIndex < startIndex) {
                console.error("This is a really long error meant to cause the line to wrap around and around, making many code blocks look like crap.");
            }
        }
    }
}

This version is what we’re making. What does this have to do with line numbers? Since we’re already breaking each line into parts, we might as well add the line number in the same pass.

Shiki uses the Hypertext Abstract Syntax Tree (hast) to represent HTML elements. Each line is represented by a <span> element, and beneath that is one span for each run of code in a single color. These “token” spans each have exactly one text node as a child. In other words, the tree below represents the line of code that follows it.

plaintext
[Element tagName: span, class: line]
    [Element tagName: span]
        [Text value: console.]
    [Element tagName: span]
        [Text value: error]
    [Element tagName: span]
        [Text value: (]
    [Element tagName: span]
        [Text value: "This is a really long error..."]
    [Element tagName: span]
        [Text value: );]

javascript
console.error("This is a really long error...");
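
To make the tree concrete, here is a minimal sketch of that line written out as a plain object. The `TextNode` and `ElementNode` types are simplified stand-ins I’m assuming for illustration (the real ones come from the `hast` package), and `token` and `textOf` are hypothetical helpers, not part of Shiki.

```typescript
// Simplified stand-ins for the hast `Text` and `Element` types (assumed shapes).
type TextNode = { type: 'text'; value: string };
type ElementNode = {
    type: 'element';
    tagName: string;
    properties: Record<string, unknown>;
    children: Array<ElementNode | TextNode>;
};

// Build a token span holding exactly one text node.
const token = (value: string): ElementNode => ({
    type: 'element',
    tagName: 'span',
    properties: {},
    children: [{ type: 'text', value }],
});

// The line from the tree above: a span with class "line" wrapping five token spans.
const line: ElementNode = {
    type: 'element',
    tagName: 'span',
    properties: { class: 'line' },
    children: [
        token('console.'),
        token('error'),
        token('('),
        token('"This is a really long error..."'),
        token(');'),
    ],
};

// Hypothetical helper: concatenate every text node under a node, in document order.
const textOf = (node: ElementNode | TextNode): string =>
    node.type === 'text' ? node.value : node.children.map(textOf).join('');

console.log(textOf(line));
```

Joining the text nodes back together recovers the original line, which is why a transformer can safely cut and regroup these spans as long as it keeps every text node in order.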

Shiki Transformers

Shiki Transformers were introduced as a feature some time in the past year to allow users to hook into different parts of the syntax highlighting process and inject code to alter the output. For this task, we’ll be hooking into when each line is created. Note that to keep things succinct, we’ll leave out some of the type and error checking.

/src/shiki/transforms/linenumbers.ts
typescript
import type { ShikiTransformer } from 'shiki';
import type { Element, Text } from 'hast';

// Helper function to reduce the number of lines it takes to create an element.
const createElement = (tagName: string): Element => {
    return { type: 'element', tagName, properties: {}, children: [] };
};

const transformer: ShikiTransformer = {
    line(line: Element, index: number) {
        // We leverage the fact that the whitespace at the beginning of a line is
        // always attached to the first color of code. An empty line has no token
        // spans, so fall back to an empty string in that case.
        const firstTextSpan = line.children[0] as Element | undefined;
        const textNode = firstTextSpan?.children[0] as Text | undefined;
        const text: string = textNode?.value ?? '';
        // Match all whitespace that starts from the beginning of the text.
        // The pattern can match the empty string, so the result is never null.
        const match: string = text.match(/^\s*/)![0];
        // Split the whitespace from the element.
        const splitSpan = createElement('span');
        splitSpan.children = [{ type: 'text', value: match }];
        if (firstTextSpan) {
            firstTextSpan.children = [{ type: 'text', value: text.slice(match.length) }];
        }
        // Create divs for the different line parts.
        const lineNumberDiv = createElement('div');
        lineNumberDiv.properties['data-line-number'] = '';
        lineNumberDiv.children = [{ type: 'text', value: index.toString() }];

        const lineWhitespaceDiv = createElement('div');
        lineWhitespaceDiv.properties['data-line-whitespace'] = '';
        lineWhitespaceDiv.children = [splitSpan];

        const lineCodeDiv = createElement('div');
        lineCodeDiv.properties['data-line-code'] = '';
        lineCodeDiv.children = line.children;
        // Place these divs under the line.
        line.properties['data-line'] = '';
        line.children = [lineNumberDiv, lineWhitespaceDiv, lineCodeDiv];
        return line;
    },
    code(code) {
        // The <code> block wraps all the lines. Store how many digits the largest
        // line number needs so the CSS can size the gutter. Count only element
        // children, since newline text nodes may sit between the line spans.
        const numLines = code.children.filter((child) => child.type === 'element').length;
        code.properties['data-line-number-digits'] = numLines.toString().length.toString();
        return code;
    }
};

export default transformer;
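
The whitespace split at the heart of the `line` hook can be tried on its own, without Shiki. Below is a minimal sketch; `splitIndent` is a name I made up for illustration and is not part of Shiki or the transformer above.

```typescript
// Hypothetical helper: split a line's text into its leading whitespace and the rest.
const splitIndent = (text: string): [string, string] => {
    // /^\s*/ also matches the empty string, so match() never returns null here.
    const indent = text.match(/^\s*/)![0];
    return [indent, text.slice(indent.length)];
};

// An indented line: the four spaces are separated from the code.
console.log(splitIndent('    if (nested) {'));
// A tab-indented line: tabs count as whitespace too.
console.log(splitIndent('\tconsole.log(x);'));
// A line with no indent: the whitespace part is simply empty.
console.log(splitIndent('}'));
// An empty line: both parts are empty, and nothing throws.
console.log(splitIndent(''));
```

The last two cases are the ones worth keeping in mind: the whitespace part can be empty, so the styling should not assume that the whitespace div always has content.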

Styling

Complete styling would be too much for one post, but I’ll include the essentials. Astro changes the class on the outer <pre> tag from shiki to astro-code, so I’ll use that.

css
.astro-code {
    & code {
        /* Extend every line to the end, even short lines. */
        display: grid;
        /* Give the line numbers more room depending on how many digits they need. */
        &[data-line-number-digits="1"] [data-line-number],
        &[data-line-number-digits="2"] [data-line-number] {
            width: 1.5rem;
        }
        &[data-line-number-digits="3"] [data-line-number] {
            width: 2.25rem;
        }
        &[data-line-number-digits="4"] [data-line-number] {
            width: 3rem;
        }
    }

    & [data-line] {
        /* Make sure to wrap long lines. */
        display: flex;
        white-space: pre-wrap;
        overflow-x: hidden;
        justify-content: flex-start;
        align-items: start;

        & [data-line-number] {
            /* These make the number appear in the upper right of a line (in case of a wrapped line). */
            display: inline-block;
            height: 100%;
            text-align: right;
            vertical-align: top;
            /* Give a light gray color and a line separating the numbering. Realistically, use @media for dark themes. */
            padding-right: 0.5rem;
            color: #bbb;
            border-right: 1px solid #bbb;
            /* Prevent line numbers from shifting when the browser width is adjusted. */
            flex-grow: 0;
            flex-shrink: 0;
        }

        & [data-line-whitespace] {
            /* Prevent whitespace from collapsing on small screens. */
            display: inline-block;
            vertical-align: top;
            height: 100%;
            flex-shrink: 0;
        }

        & [data-line-code] {
            display: inline-block;
            vertical-align: top;
            height: 100%;
            /* Wrap long lines with a hanging indent. */
            text-wrap: wrap;
            text-indent: 4ch hanging;
        }
    }
}
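
The selectors on `data-line-number-digits` compare against a digit count, so it may help to see how that count relates to the number of lines. This is a small sketch under my own assumptions: `lineNumberDigits` and `gutterWidthRem` are hypothetical names, and the width formula only mirrors the three widths in the CSS above (anything past four digits is an extrapolation, not something the stylesheet covers).

```typescript
// Hypothetical helper: how many digits the largest line number needs.
const lineNumberDigits = (numLines: number): number =>
    Math.max(1, Math.trunc(numLines)).toString().length;

// Hypothetical mapping that mirrors the CSS above: one- and two-digit
// gutters share 1.5rem, then each extra digit adds 0.75rem.
const gutterWidthRem = (digits: number): number =>
    digits <= 2 ? 1.5 : 0.75 * digits;

for (const numLines of [9, 50, 186, 1000]) {
    const digits = lineNumberDigits(numLines);
    console.log(`${numLines} lines: ${digits} digit(s), ${gutterWidthRem(digits)}rem gutter`);
}
```

Keeping the widths in CSS and passing only the digit count through the attribute means the transformer stays free of layout decisions.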