Downloads PDFs from LibGen (primary) or Anna's Archive API (fallback), converts to markdown via marker_single, and prints to stdout. Includes XDG-compliant caching, nix flake with marker-pdf packaging, and a Claude Code skill for paper-reader integration.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
paper CLI — Design
A CLI tool that downloads academic papers by DOI from Anna's Archive and converts them to markdown.
CLI Interface
paper <DOI>
Single positional argument. Markdown output goes to stdout.
paper 10.1038/nature12373 > paper.md
Download Flow
- Request https://annas-archive.org/scidb/<DOI> with a browser-like User-Agent
- Parse HTML for <iframe> or <embed> with id="pdf"; extract src for direct PDF URL
- Fallback: find any link ending in .pdf
- Download PDF to a temp file
- Exit with clear error if no PDF found
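The selection order in the list above (viewer element first, .pdf link second, error otherwise) can be sketched as follows. This is a minimal illustration using only the standard library, not the tool's actual code: the Candidate struct and the function names scidb_url and pick_pdf_url are hypothetical, and in the real tool the tag, id, and URL values would presumably come from nodes parsed with scraper. Matching the .pdf suffix case-insensitively is also an assumption.

```rust
/// One element of interest from the fetched page. Hypothetical type for
/// this sketch; an HTML parser would supply these fields in practice.
#[derive(Debug, Clone)]
pub struct Candidate {
    pub tag: String,         // e.g. "iframe", "embed", "a"
    pub id: Option<String>,  // value of the id attribute, if present
    pub url: Option<String>, // src for iframe/embed, href for a
}

/// Build the SciDB page URL for a DOI.
pub fn scidb_url(doi: &str) -> String {
    format!("https://annas-archive.org/scidb/{}", doi.trim())
}

/// Prefer an <iframe> or <embed> with id="pdf"; otherwise fall back to the
/// first link whose URL ends in .pdf. Returns None when nothing matches,
/// which the caller should turn into a "no PDF found" error.
pub fn pick_pdf_url(candidates: &[Candidate]) -> Option<String> {
    let primary = candidates
        .iter()
        .filter(|c| (c.tag == "iframe" || c.tag == "embed") && c.id.as_deref() == Some("pdf"))
        .find_map(|c| c.url.clone());

    primary.or_else(|| {
        candidates
            .iter()
            .filter(|c| c.tag == "a")
            .filter_map(|c| c.url.clone())
            // Case-insensitive suffix match is an assumption of this sketch.
            .find(|u| u.to_ascii_lowercase().ends_with(".pdf"))
    })
}
```

Keeping the selection logic separate from fetching and parsing makes the fallback order easy to test without network access.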
Conversion
- Shell out to marker_single <tempfile.pdf> --output_dir <tempdir>
- Read the generated .md file from the output dir
- Print to stdout
- Clean up temp dir
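A minimal sketch of the conversion steps above, using only std::process and std::fs. The function names run_marker, find_markdown, and convert are hypothetical. Searching the output dir recursively is an assumption, made so the sketch does not depend on the exact layout marker_single uses inside it. Cleanup of the temp dir is left to the caller.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::process::{Command, Output};

/// Run `marker_single <pdf> --output_dir <out_dir>` and capture its output.
pub fn run_marker(pdf: &Path, out_dir: &Path) -> io::Result<Output> {
    Command::new("marker_single")
        .arg(pdf)
        .arg("--output_dir")
        .arg(out_dir)
        .output()
}

/// Find the first `.md` file under `dir`, descending into subdirectories.
/// Entries are sorted so the result is deterministic.
pub fn find_markdown(dir: &Path) -> io::Result<Option<PathBuf>> {
    let mut paths: Vec<PathBuf> = fs::read_dir(dir)?
        .collect::<Result<Vec<_>, _>>()?
        .into_iter()
        .map(|entry| entry.path())
        .collect();
    paths.sort();

    for path in paths {
        if path.is_dir() {
            if let Some(found) = find_markdown(&path)? {
                return Ok(Some(found));
            }
        } else if path.extension().and_then(|ext| ext.to_str()) == Some("md") {
            return Ok(Some(path));
        }
    }
    Ok(None)
}

/// Convert a PDF and return the markdown text, or an error carrying
/// marker's stderr when the conversion fails.
pub fn convert(pdf: &Path, out_dir: &Path) -> io::Result<String> {
    let output = run_marker(pdf, out_dir)?;
    if !output.status.success() {
        let stderr = String::from_utf8_lossy(&output.stderr).into_owned();
        return Err(io::Error::new(io::ErrorKind::Other, stderr));
    }
    match find_markdown(out_dir)? {
        Some(md) => fs::read_to_string(md),
        None => Err(io::Error::new(
            io::ErrorKind::NotFound,
            "marker_single produced no .md file",
        )),
    }
}
```

A caller would print the string returned by convert to stdout and then remove the temp dir; with the tempfile crate listed under Dependencies, removal happens when the directory handle is dropped.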
Error Handling
- marker_single not on PATH: tell user to install (pip install marker-pdf)
- Conversion failure: forward marker's stderr
- Network errors: surface reqwest errors clearly
- No PDF found: specific error message with the DOI
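One way the cases above might map to user-facing messages is sketched below with the standard library only. The function names and the exact message wording are hypothetical; the only behaviour relied on is that launching a missing program through std::process::Command fails with io::ErrorKind::NotFound.

```rust
use std::io;

/// Message for a failure to launch marker_single. A NotFound error is
/// treated as "not on PATH" and points the user at the install command.
pub fn describe_spawn_error(err: &io::Error) -> String {
    if err.kind() == io::ErrorKind::NotFound {
        "marker_single not found on PATH; install it with: pip install marker-pdf".to_string()
    } else {
        format!("failed to run marker_single: {}", err)
    }
}

/// Message for a failed conversion: forward marker's stderr unchanged,
/// replacing any invalid UTF-8 rather than failing.
pub fn describe_conversion_failure(stderr: &[u8]) -> String {
    format!("marker_single failed:\n{}", String::from_utf8_lossy(stderr))
}

/// Message for the "no PDF found" case, naming the DOI that was requested.
pub fn no_pdf_error(doi: &str) -> String {
    format!("no PDF found for DOI {}", doi)
}
```

Since anyhow is listed under Dependencies, the real tool likely attaches such messages as error context rather than building strings by hand; plain functions are used here to keep the sketch dependency-free.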
Dependencies
- clap: argument parsing
- reqwest (blocking, rustls-tls): HTTP
- scraper: HTML parsing
- tempfile: temp directory
- anyhow: error handling
Dev Environment
The nix flake provides a Rust nightly toolchain and marker-pdf in the devshell.