Processed Islamic books from the Prophet's Mosque Library in PDF, TXT, and DOCX.
This dataset consists of PDF files from the Prophet's Mosque Library along with TXT and DOCX versions generated via document AI processing.
It supports image-to-text and NLP research involving large-scale Arabic Islamic texts across religious and scholarly topics.
Fine-tune transformer or embedding models on the extracted TXT files for tasks like classical Arabic understanding or Islamic topic classification.
Index the DOCX/TXT content to create retrieval systems that surface passages across 70+ categories of religious literature.
Run corpus-wide statistics, theme extraction, or cross-category comparisons using the 23 million pages of processed Islamic books.
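As a rough illustration of the corpus-wide statistics mentioned above, here is a minimal sketch that counts pages and Arabic tokens per category. It assumes the TXT content has already been read into a plain dictionary mapping a category name to a list of page strings; that layout, the category names, and the helper names `tokenize` and `category_stats` are hypothetical and not part of the dataset itself.

```python
import re
from collections import Counter

# Arabic diacritics (tashkeel) plus the tatweel elongation character.
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0640]")
# Runs of Arabic letters, from hamza (U+0621) through yeh (U+064A).
_ARABIC_WORD = re.compile(r"[\u0621-\u064A]+")


def tokenize(text):
    """Strip diacritics and tatweel, then return the runs of Arabic letters."""
    return _ARABIC_WORD.findall(_DIACRITICS.sub("", text))


def category_stats(pages_by_category, top_n=3):
    """Return page count, token count, and most frequent tokens per category.

    `pages_by_category` is an assumed in-memory layout:
    {category_name: [page_text, ...]}.
    """
    stats = {}
    for category, pages in pages_by_category.items():
        counts = Counter()
        for page in pages:
            counts.update(tokenize(page))
        stats[category] = {
            "pages": len(pages),
            "tokens": sum(counts.values()),
            "top": counts.most_common(top_n),
        }
    return stats


# Hypothetical sample data standing in for pages read from the TXT files.
sample = {
    "fiqh": ["كتاب الصلاة", "باب الصلاة"],
    "hadith": ["حَدَّثَنَا"],
}
stats = category_stats(sample, top_n=1)
```

Removing diacritics before tokenizing keeps vocalized and unvocalized spellings of the same word in one bucket, which matters for texts that mix both. Real use would likely need further normalization (for example alef and hamza variants), which this sketch leaves out.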
from datasets import load_dataset
ds = load_dataset("ieasybooks-org/prophet-mosque-library")

A collection of 70,884 PDFs from 48,717 Islamic books with TXT and DOCX text extracted via Google Document AI, spanning over 70 categories.