Splitting References Across Multiple Reveal.js Slides

A Python post-render script for Quarto that splits long bibliographies across multiple Reveal.js slides — useful when you need to print your lecture slides.


If you use Quarto to build Reveal.js presentations, you’ve probably run into this: your bibliography ends up on a single slide. On screen that’s fine, because Reveal.js makes the slide scrollable and you move on. But the moment you try to print those slides or export them to PDF, that single references slide becomes a mess. References overflow, get cut off, or just disappear.

I ran into this problem while preparing my lecture slides. I have quite a few references in some of my presentations, and I wanted a PDF version of the slides that actually includes all of them. It took me a while to find a clean solution, so I figured I’d share it here.

Why this happens

When Quarto renders a bibliography in a Reveal.js presentation, it places all your references inside a single <section> element on one slide. Quarto adds the smaller and scrollable classes so you can scroll through them during a talk, but that doesn’t help when you’re printing. What you need is the references split across multiple slides, each with a manageable number of entries.

I looked around for a built-in Quarto option or a Lua filter that could handle this, but didn’t find anything that worked reliably. The issue is that the bibliography HTML is generated at render time, and Reveal.js has its own expectations about how slides are structured. A post-render approach, modifying the final HTML after Quarto is done, turned out to be the most straightforward path.

The approach

The idea is simple:

  1. Find the <section> containing the bibliography (#refs div) in the rendered HTML
  2. Extract each individual reference entry (csl-entry divs)
  3. Chunk them into groups (e.g., 7 per slide)
  4. Replace the original section with multiple new <section> elements, each containing a subset of references
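The chunking in step 3 is plain list slicing. Here is a minimal sketch of just that step; the helper name chunk and the placeholder strings are made up for illustration, and the real script works on HTML snippets instead:

```python
def chunk(items: list[str], size: int) -> list[list[str]]:
    """Split items into consecutive groups of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i : i + size] for i in range(0, len(items), size)]


# Placeholder entries standing in for the extracted reference HTML.
entries = [f"ref-{n}" for n in range(1, 17)]
groups = chunk(entries, 7)
print([len(g) for g in groups])  # 16 entries -> groups of 7, 7 and 2
```

The last group simply holds whatever is left over, so the final slide may be shorter than the others.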

Luckily, Quarto supports post-render scripts – scripts that run automatically after rendering. This is exactly the right hook for this kind of HTML manipulation.

The script

Here’s the full Python script. Save it as split-refs-post.py in your project directory:

#!/usr/bin/env python3
"""
Post-render script for Quarto: splits the bibliography #refs section
across multiple Reveal.js slides in the final HTML output.
"""

import re
import sys
from pathlib import Path

REFS_PER_SLIDE = 7
SLIDE_TITLE_FIRST = "Literatur"
SLIDE_TITLE_CONT = "Literatur (Forts.)"

# Classes applied to every generated slide (hard-coded to mirror the original section)
SECTION_CLASSES = "slide level2 smaller scrollable"


def find_refs_section(html: str):
    """Find the bibliography <section> by its id ("literatur", from the slide heading)."""
    pattern = re.compile(
        r'(<section[^>]*id="literatur"[^>]*>)'
        r"(.*?)"
        r"(</section>)",
        re.DOTALL,
    )
    return pattern.search(html)


def extract_csl_entries(refs_html: str) -> list[str]:
    """Extract individual csl-entry divs from the refs block."""
    pattern = re.compile(
        r'<div\s+id="ref-[^"]*"\s+class="csl-entry"[^>]*role="listitem">'
        r".*?"
        r"</div>",
        re.DOTALL,
    )
    return pattern.findall(refs_html)


def build_slide(title: str, entries: list[str], section_classes: str) -> str:
    """Build a single Reveal.js slide section with bibliography entries."""
    entries_html = "\n".join(entries)
    return (
        f'<section class="{section_classes}">\n'
        f"<h2>{title}</h2>\n"
        f"\n"
        f'<div class="references csl-bib-body hanging-indent"'
        f' data-entry-spacing="0" data-line-spacing="2" role="list">\n'
        f"{entries_html}\n"
        f"</div>\n"
        f"</section>"
    )


def split_refs(html: str) -> str:
    """Main processing: find refs section, split into multiple slides."""
    match = find_refs_section(html)
    if not match:
        print("[split-refs-post] No #literatur section found.", file=sys.stderr)
        # Try a more generic pattern
        pattern = re.compile(
            r"(<section[^>]*>)\s*<h2>[^<]*</h2>\s*"
            r'(<div\s+id="refs"[^>]*>.*?</div>\s*)\s*'
            r"(</section>)",
            re.DOTALL,
        )
        match = pattern.search(html)
        if not match:
            print("[split-refs-post] No section with #refs div found.", file=sys.stderr)
            return html

    section_content = match.group(0)
    entries = extract_csl_entries(section_content)
    print(f"[split-refs-post] Found {len(entries)} bibliography entries.", file=sys.stderr)

    if len(entries) <= REFS_PER_SLIDE:
        print("[split-refs-post] No splitting needed.", file=sys.stderr)
        return html

    chunks = [
        entries[i : i + REFS_PER_SLIDE]
        for i in range(0, len(entries), REFS_PER_SLIDE)
    ]
    print(f"[split-refs-post] Splitting into {len(chunks)} slides.", file=sys.stderr)

    slides = []
    for idx, chunk in enumerate(chunks):
        title = SLIDE_TITLE_FIRST if idx == 0 else SLIDE_TITLE_CONT
        slides.append(build_slide(title, chunk, SECTION_CLASSES))

    replacement = "\n".join(slides)
    return html[: match.start()] + replacement + html[match.end() :]


def main():
    if len(sys.argv) > 1:
        files = [Path(f) for f in sys.argv[1:]]
    else:
        files = list(Path(".").glob("*.html"))

    for filepath in files:
        if filepath.suffix != ".html":
            continue
        if not filepath.exists():
            continue

        print(f"[split-refs-post] Processing {filepath}", file=sys.stderr)
        html = filepath.read_text(encoding="utf-8")
        modified = split_refs(html)

        if modified != html:
            filepath.write_text(modified, encoding="utf-8")
            print(f"[split-refs-post] Split references in {filepath}", file=sys.stderr)
        else:
            print(f"[split-refs-post] No changes made to {filepath}", file=sys.stderr)


if __name__ == "__main__":
    main()

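The key parts: find_refs_section locates the bibliography slide with a regex on the section’s ID. That ID is literatur here because Quarto derives it from the slide heading ("Literatur"); with a heading like "References" it would be references, so adjust the pattern to match. extract_csl_entries pulls out each individual reference. Then split_refs chunks them into groups of REFS_PER_SLIDE (7 by default) and builds new slide sections. If no section with that ID exists, the fallback pattern looks for any section that consists of an <h2> followed by the #refs div.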
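To see what the entry regex does, here is a small sketch that runs the same pattern on hand-written sample HTML. The sample markup and the placeholder entries are assumptions made up for illustration, simplified from the structure the script expects, so check them against your own rendered file:

```python
import re

# Same pattern as extract_csl_entries in the script.
CSL_ENTRY = re.compile(
    r'<div\s+id="ref-[^"]*"\s+class="csl-entry"[^>]*role="listitem">'
    r".*?"
    r"</div>",
    re.DOTALL,
)

# Hand-written sample with placeholder entries (not real Quarto output).
sample = """
<section id="literatur" class="slide level2 smaller scrollable">
<h2>Literatur</h2>
<div id="refs" class="references csl-bib-body hanging-indent" role="list">
<div id="ref-example-a" class="csl-entry" role="listitem">
Placeholder entry A.
</div>
<div id="ref-example-b" class="csl-entry" role="listitem">
Placeholder entry B.
</div>
</div>
</section>
"""

entries = CSL_ENTRY.findall(sample)
print(len(entries))  # two entries found

# Caveat: the pattern stops at the first closing </div>, so an entry
# that contains nested divs is cut short.
nested = (
    '<div id="ref-example-c" class="csl-entry" role="listitem">'
    '<div class="csl-left-margin">[1]</div>'
    '<div class="csl-right-inline">Placeholder entry C.</div>'
    "</div>"
)
truncated = CSL_ENTRY.findall(nested)[0]
print(truncated.endswith("[1]</div>"))  # the entry text is lost
```

Some citation styles do emit nested divs inside an entry (numbered styles, for example). If yours does, an HTML parser is probably a safer choice than a regex for this step.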

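You can adjust REFS_PER_SLIDE to fit your slide layout, and change the title constants to match your language. The defaults are German: "Literatur" means "References" and "Literatur (Forts.)" means "References (cont.)".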
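When picking a value, it helps to know how many slides you will end up with. A quick sketch of the arithmetic, with a hypothetical helper name and an example count of 23 references:

```python
import math


def slides_needed(n_refs: int, refs_per_slide: int = 7) -> int:
    """Number of slides the references occupy after splitting."""
    return math.ceil(n_refs / refs_per_slide)


# Compare a few settings for a deck with 23 references.
for per_slide in (5, 7, 10):
    print(per_slide, slides_needed(23, per_slide))
```

With 23 references this gives 5, 4 and 3 slides respectively.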

Setting it up

To run the script automatically every time you render, register it in your _quarto.yml under the post-render key of the project section. That’s it. Quarto runs the script after each render; called without arguments, it processes every .html file in the current directory. You can also run it manually:

quarto render slides.qmd
python split-refs-post.py slides.html
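If you would rather process only the files Quarto just rendered, Quarto’s documentation on project scripts describes an environment variable, QUARTO_PROJECT_OUTPUT_FILES, that holds a newline-separated list of output files for post-render scripts. Assuming that holds for your Quarto version, a helper along these lines could feed the file list into main; the function name is made up for illustration:

```python
import os
from pathlib import Path


def output_html_files() -> list[Path]:
    """Return the .html paths listed in QUARTO_PROJECT_OUTPUT_FILES.

    Returns an empty list when the variable is unset or empty,
    e.g. when the script is run by hand.
    """
    raw = os.environ.get("QUARTO_PROJECT_OUTPUT_FILES", "")
    paths = [Path(line.strip()) for line in raw.splitlines() if line.strip()]
    return [p for p in paths if p.suffix == ".html"]


print(output_html_files())
```

An empty result is the cue to fall back to the command-line arguments or the *.html glob, as the script already does.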

Wrapping up

It’s a small thing, but it made my PDF slides noticeably better, and students can write down references on the slides. If you’re in a similar situation (lots of references, Reveal.js slides, and a need for decent print output), I hope this saves you some of the time I spent figuring it out.