IBD Imaging Digest

tooling

NLP

literature

obsidian

An automated daily literature monitor for IBD imaging research. Fetches new papers, ranks by semantic relevance against a personal corpus, and delivers a tiered digest into Obsidian.

Published

May 20, 2026

Modified

June 5, 2026

What it does

Keeping up with IBD imaging literature is a daily task that, left unstructured, turns into an hour of tab-opening and half-read abstracts. This pipeline replaces that with a ranked digest that arrives every morning in my Obsidian vault.

The system pulls from four sources: PubMed via E-utilities, and three radiology journals (Radiology, Radiology: AI, European Radiology) via Crossref. Papers are deduplicated by DOI and stored in a local SQLite database. Each new paper is embedded with SPECTER2 and compared by cosine similarity to a seed corpus of papers I have curated in Zotero over the past few years. Each score is then placed in one of three tiers, using cutoffs taken from percentiles of the observed score distribution: Must-read (≥ 0.958), Skim (0.924 up to 0.958), or Archive (below 0.924).
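The deduplication step could look roughly like the sketch below. It is illustrative rather than the pipeline's actual code: the table layout, the `normalize_doi` helper, and the example DOIs are assumptions. The idea is to let a primary key on the normalized DOI reject repeats, so the same paper arriving from PubMed and from Crossref is stored once.

```python
import sqlite3


def normalize_doi(doi: str) -> str:
    """Lowercase a DOI and strip common URL/prefix forms so variants compare equal."""
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi


def store_new_papers(conn: sqlite3.Connection, papers: list) -> list:
    """Insert papers, skipping DOIs already stored; return only the newly added ones."""
    # Hypothetical schema: the real database layout is not described in the post.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS papers (doi TEXT PRIMARY KEY, title TEXT, source TEXT)"
    )
    added = []
    for paper in papers:
        doi = normalize_doi(paper["doi"])
        cur = conn.execute(
            "INSERT OR IGNORE INTO papers (doi, title, source) VALUES (?, ?, ?)",
            (doi, paper["title"], paper["source"]),
        )
        if cur.rowcount == 1:  # 0 means the DOI was already present and the row was ignored
            added.append({**paper, "doi": doi})
    conn.commit()
    return added


conn = sqlite3.connect(":memory:")
batch = [
    {"doi": "10.1000/example.1", "title": "Paper A", "source": "pubmed"},
    {"doi": "https://doi.org/10.1000/EXAMPLE.1", "title": "Paper A", "source": "crossref"},
    {"doi": "10.1000/example.2", "title": "Paper B", "source": "crossref"},
]
new_papers = store_new_papers(conn, batch)
repeat_run = store_new_papers(conn, batch)
```

Only the papers returned by `store_new_papers` would need to be embedded and ranked, which keeps a daily run from re-scoring papers it has already seen.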

The daily digest renders as an Obsidian Markdown file with callout blocks and checkbox conventions. The checkboxes are not cosmetic: they are intended to feed a future ranking improvement loop (Step 6) once there is enough signal from real use.
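As a rough illustration of the rendering step, the sketch below writes each ranked paper as an Obsidian callout with a task-list checkbox. The callout types (`tip`, `info`), the heading, the field names, and the line layout are all assumptions made for the example; the actual digest template is not shown here.

```python
# Assumed mapping from tier to Obsidian callout type; archive papers get no callout.
CALLOUT_BY_TIER = {"must-read": "tip", "skim": "info"}


def render_entry(paper: dict) -> str:
    """Render one paper as a callout block with a checkbox, or a plain line for archive."""
    callout = CALLOUT_BY_TIER.get(paper["tier"])
    if callout is None:
        return f"- {paper['title']} ({paper['score']:.3f})"
    return "\n".join([
        f"> [!{callout}] {paper['title']}",
        f"> score {paper['score']:.3f} | doi:{paper['doi']}",
        "> - [ ] Relevant",
    ])


def render_digest(date: str, papers: list) -> str:
    """Render the whole digest, highest similarity score first."""
    ranked = sorted(papers, key=lambda p: p["score"], reverse=True)
    body = "\n\n".join(render_entry(p) for p in ranked)
    return f"# IBD Imaging Digest {date}\n\n{body}\n"


papers = [
    {"title": "Paper B", "doi": "10.1000/example.2", "score": 0.93, "tier": "skim"},
    {"title": "Paper A", "doi": "10.1000/example.1", "score": 0.97, "tier": "must-read"},
]
digest = render_digest("2026-05-20", papers)
print(digest)
```

Keeping the checkbox on its own line inside the callout makes its state easy to read back later with a line-based parser.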

Update: Source coverage expanded from 4 to 18. Crossref now covers 17 journals across focused radiology, focused IBD, imaging AI, and broader GI and AI venues. The PubMed query gained a second branch not gated on IBD terms, covering abdominal radiology AI and agentic AI more broadly. European Radiology, which was initially fetched via Springer’s RSS feed (returning empty author lists and truncated abstracts), was switched to the Crossref API to match the pattern used for the other journals.
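For reference, a per-journal Crossref request can be built as in the sketch below. It only constructs the URL and makes no network call. The ISSN is a placeholder, and the choice of `rows`, the `select` field list, and the contact address are assumptions; the journal list and query settings actually used are not reproduced here.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

CROSSREF_WORKS = "https://api.crossref.org/journals/{issn}/works"


def crossref_works_url(issn: str, since: str, rows: int = 100,
                       mailto: Optional[str] = None) -> str:
    """Build a Crossref REST API URL for works in one journal published on or after `since`."""
    params = {
        "filter": f"from-pub-date:{since}",      # since is a YYYY-MM-DD date string
        "select": "DOI,title,author,abstract",   # assumed subset of fields for the digest
        "rows": rows,
    }
    if mailto:
        params["mailto"] = mailto  # contact address, which Crossref asks polite clients to send
    return CROSSREF_WORKS.format(issn=issn) + "?" + urlencode(params)


# "0000-0000" is a placeholder ISSN, not a real journal.
url = crossref_works_url("0000-0000", "2026-05-19", mailto="me@example.org")
query = parse_qs(urlsplit(url).query)
```

Using one URL builder for every journal is what makes the sources uniform: each of the Crossref journals differs only in its ISSN.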

A second checkbox was added to must-read and skim papers: Read later. Checked papers route to a persistent rolling note at Inbox/To Read.md, with full abstracts included for offline reading, deduplicated by DOI. The Relevant checkbox is unchanged; the two are strictly independent.
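The routing can be sketched as follows, under assumed conventions: the checkbox line is taken to carry the DOI as `Read later (doi:...)`, and each entry in the rolling note is taken to be a `##` heading ending in the same DOI tag. Both formats are hypothetical, as are the `papers` lookup and the function names. The demo writes to a temporary directory rather than a real vault.

```python
import re
import tempfile
from pathlib import Path

# Hypothetical line formats for the digest checkbox and the reading-list headings.
READ_LATER = re.compile(r"- \[(?P<state>[ xX])\] Read later \(doi:(?P<doi>\S+)\)\s*$")
NOTE_ENTRY = re.compile(r"^## .*\(doi:(?P<doi>\S+)\)\s*$", re.MULTILINE)


def checked_read_later(digest_text: str) -> list:
    """Return the DOIs whose Read later box is ticked, in digest order."""
    dois = []
    for line in digest_text.splitlines():
        match = READ_LATER.search(line)
        if match and match.group("state").lower() == "x":
            dois.append(match.group("doi").lower())
    return dois


def append_to_reading_list(note_path: Path, digest_text: str, papers: dict) -> list:
    """Append ticked papers (with abstracts) to the rolling note, skipping DOIs already there."""
    text = note_path.read_text(encoding="utf-8") if note_path.exists() else "# To Read\n"
    seen = {m.group("doi").lower() for m in NOTE_ENTRY.finditer(text)}
    added = []
    for doi in checked_read_later(digest_text):
        if doi in seen or doi not in papers:
            continue
        paper = papers[doi]
        text += f"\n## {paper['title']} (doi:{doi})\n\n{paper['abstract']}\n"
        seen.add(doi)
        added.append(doi)
    note_path.parent.mkdir(parents=True, exist_ok=True)
    note_path.write_text(text, encoding="utf-8")
    return added


digest = "\n".join([
    "> [!tip] Paper A",
    "> - [x] Relevant",
    "> - [x] Read later (doi:10.1000/example.1)",
    "> [!info] Paper B",
    "> - [ ] Read later (doi:10.1000/example.2)",
])
papers = {
    "10.1000/example.1": {"title": "Paper A", "abstract": "Abstract of paper A."},
    "10.1000/example.2": {"title": "Paper B", "abstract": "Abstract of paper B."},
}
with tempfile.TemporaryDirectory() as vault:
    note = Path(vault) / "Inbox" / "To Read.md"
    first_run = append_to_reading_list(note, digest, papers)
    second_run = append_to_reading_list(note, digest, papers)  # same digest processed twice
    note_text = note.read_text(encoding="utf-8")
```

Because the parser only looks at the Read later line, the state of the Relevant box has no effect on what reaches the reading list, which matches the independence of the two checkboxes.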

Why SPECTER2 and a personal corpus

Generic keyword search returns too much and misses context. SPECTER2 is trained on citation graphs, which means papers that are conceptually related to my seed set score higher even if they use different terminology. The Zotero corpus acts as a standing definition of what “relevant” means for my specific research questions: quantitative MRI in IBD, motility, Bayesian modeling, and clinical AI pipelines. The ranker inherits that definition automatically.
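To make the scoring concrete, here is a small sketch of cosine similarity against a seed set. Toy 3-dimensional vectors stand in for real SPECTER2 embeddings, and taking the maximum similarity over the seed papers is an assumption for illustration; comparing against a corpus centroid or averaging over the seeds would be equally plausible readings of "compared to a seed corpus".

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def relevance(candidate: list, seeds: list) -> float:
    """Score a candidate by its closest seed paper (assumed aggregation: max)."""
    return max(cosine(candidate, seed) for seed in seeds)


# Toy vectors only: real embeddings would come from the SPECTER2 model.
seed_corpus = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
]
on_topic = [0.9, 0.1, 0.0]
off_topic = [0.0, 0.0, 1.0]

on_topic_score = relevance(on_topic, seed_corpus)
off_topic_score = relevance(off_topic, seed_corpus)
```

A candidate pointing in nearly the same direction as any seed paper scores close to 1, regardless of which words its abstract happens to use.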

The percentile-based thresholds are a deliberate choice over fixed cosine cutoffs. Score distributions shift with topic and corpus size. Calibrating to percentiles keeps the tiers stable relative to each other, independent of those shifts.

Update: The thresholds stated above (must-read ≥ 0.958, skim 0.924 to 0.958) are correct, but were not the original values. The starting thresholds were 0.75 and 0.60, set before any real data existed. The first run put 196 papers in must-read, and inspecting the score distribution (min ~0.825, median ~0.924, p90 ~0.958) showed why: even the lowest score cleared the 0.75 cutoff, so every paper landed in the top tier. The thresholds were recalibrated to the observed percentiles, with the median as the skim cutoff and p90 as the must-read cutoff. They are empirically derived, not a priori.
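The recalibration amounts to reading two percentiles off the observed scores and using them as cutoffs. The sketch below shows that arithmetic with the standard library. The sample scores are synthetic, not the real run, and the function names are illustrative.

```python
import statistics


def calibrate(scores: list) -> tuple:
    """Return (skim_cutoff, must_read_cutoff) as the median and 90th percentile of scores."""
    cuts = statistics.quantiles(scores, n=100, method="inclusive")  # 99 percentile cut points
    return cuts[49], cuts[89]  # p50 and p90


def assign_tier(score: float, skim_cutoff: float, must_read_cutoff: float) -> str:
    """Map a similarity score to a tier; the must-read boundary is inclusive."""
    if score >= must_read_cutoff:
        return "must-read"
    if score >= skim_cutoff:
        return "skim"
    return "archive"


# Synthetic scores from 0.80 to 1.00 in steps of 0.01, for illustration only.
sample_scores = [i / 100 for i in range(80, 101)]
skim_cutoff, must_read_cutoff = calibrate(sample_scores)

# With the recalibrated values quoted above:
tiers = [assign_tier(s, 0.924, 0.958) for s in (0.97, 0.93, 0.90)]
# Under the original cutoffs, even the lowest observed score (~0.825) was must-read:
old_tier = assign_tier(0.825, 0.60, 0.75)
```

With cutoffs at the median and p90 of a run, about half of that run's papers fall to Archive and about a tenth reach Must-read, whatever the absolute score range happens to be.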

Infrastructure

The pipeline runs via GitHub Actions on a daily schedule. SPECTER2 is cached on the runner to avoid repeated downloads. The Obsidian vault doubles as the Git repository.

Current state

Steps 1 through 5 are complete. Together they cover fetching, deduplication, SPECTER2 ranking, Obsidian formatting, GitHub Actions scheduling, and threshold recalibration.

The current review workflow is intentionally manual. Each day I pull the repository, read the digest, and check off papers in Obsidian before pushing. That checkpoint is where the real filtering happens for now: the ranker proposes, the reading confirms or overrides. Step 6 will eventually parse those checkbox states to sharpen the ranker automatically, but that requires a few weeks of consistent signal before it is worth building.

Out of scope for v1: LLM abstract summaries, citation counts, Europe PMC / bioRxiv / medRxiv / arXiv, and Telegram or email notifications.

Update: Step 6 is not next in priority. The observation period is over; the decision is to write the first weekly IBD Digest column before building the feedback loop. Writing will reveal whether ranking quality is actually the bottleneck. The feedback loop remains the one unbuilt pipeline step, but it is deliberately deferred.

What is next

A few weeks of daily use will tell me whether the tier thresholds are well calibrated or need adjustment. After that, Step 6 becomes the next meaningful milestone.

Longer term, I plan to publish a weekly IBD Digest on this site: a short roundup of the must-read papers from the previous seven days, with brief commentary on what is interesting and why. That serves two purposes. It forces me to actually read the digest rather than just generate it, and it creates a public record of what is worth paying attention to in this field.

Update: The first weekly column is now the immediate next step, ahead of Step 6. The threshold calibration question has been answered by use; ranking quality is good enough to write from.

Built in one day using Python, SPECTER2, SQLite, and GitHub Actions, with Claude Code.