doc-build-dev

Verified

Docs from Hugging Face PRs updating official documentation.

Dataset | AI & Machine Learning | 649K/mo | Free
Updated 2026-06-15

What is doc-build-dev?

This dataset contains documentation files extracted from pull requests that modify the official Hugging Face documentation site.

It gives developers and contributors who work on documentation for Hugging Face libraries and models a history of documentation updates.

What you can build with doc-build-dev

Track Documentation Evolution

Analyze how Hugging Face library documentation changes across multiple PRs over time.

Train Documentation Models

Fine-tune NLP models on real doc diffs to suggest improvements or detect outdated sections.

Build Change Monitoring Tools

Create scripts that summarize or alert on documentation modifications from open-source PR activity.
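
A change summary of this kind could be sketched with Python's standard `difflib`. The function name `summarize_change` and the sample strings are hypothetical, and the sketch assumes you already hold the old and new text of a doc page; whether the dataset exposes both versions in that form should be checked against its actual columns.

```python
import difflib


def summarize_change(old_text: str, new_text: str) -> dict:
    """Count lines added and removed between two versions of a doc page."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    # Compare line by line; autojunk=False avoids heuristics that can
    # skew results on long documents with many repeated lines.
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    added = removed = 0
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1
        if tag in ("replace", "insert"):
            added += j2 - j1
    return {"added": added, "removed": removed}


# Hypothetical before/after text for one doc page.
old = "Install with pip.\nRun the script."
new = "Install with pip.\nRun the updated script.\nSee the FAQ."
summary = summarize_change(old, new)
print(summary)
```

An alerting script might call a function like this per page and report only pages whose counts exceed a chosen threshold.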

Load doc-build-dev

Python
from datasets import load_dataset

ds = load_dataset("hf-doc-build/doc-build-dev")
  1. pip install datasets
  2. from datasets import load_dataset
  3. ds = load_dataset('hf-doc-build/doc-build-dev')
  4. Explore splits and filter by PR or doc page
  5. Process diffs with pandas or Hugging Face tokenizers
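
The filtering step above might look like the following minimal sketch. It works on plain dict rows so it runs without downloading anything. The column names `pr_number` and `doc_page`, the helper `filter_records`, and the sample rows are all assumptions for illustration; inspect the loaded dataset to find its real column names before adapting this.

```python
from typing import Any, Dict, Iterable, List, Optional


def filter_records(
    records: Iterable[Dict[str, Any]],
    pr_number: Optional[int] = None,
    page_prefix: Optional[str] = None,
    pr_key: str = "pr_number",   # hypothetical column name
    page_key: str = "doc_page",  # hypothetical column name
) -> List[Dict[str, Any]]:
    """Keep rows matching a PR number and/or a doc page path prefix."""
    selected = []
    for row in records:
        if pr_number is not None and row.get(pr_key) != pr_number:
            continue
        page = str(row.get(page_key, ""))
        if page_prefix is not None and not page.startswith(page_prefix):
            continue
        selected.append(row)
    return selected


# Hypothetical rows standing in for records from one split.
rows = [
    {"pr_number": 101, "doc_page": "transformers/quicktour"},
    {"pr_number": 101, "doc_page": "datasets/loading"},
    {"pr_number": 202, "doc_page": "transformers/installation"},
]
print(filter_records(rows, pr_number=101, page_prefix="transformers/"))
```

Iterating over a single split of a loaded dataset yields dict rows, so the same helper could be applied there once the key names are adjusted.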

doc-build-dev: pros & cons

Pros

  • Automatically updated via GitHub Actions
  • Contains real PR-based documentation changes
  • Spans multiple Hugging Face libraries
  • Structured for direct NLP or analysis use

Cons

  • Scope limited to Hugging Face docs only
  • Data quality tied to original PR content
  • May need extra parsing for complex diffs

Frequently asked questions

A dataset that aggregates documentation changes from pull requests targeting Hugging Face docs.
