Overview and Sub-Projects
The Digital Tolkien Project consists of a wide range of sub-projects at varying levels of activity and progress.
At the highest level, these can be divided into:
- Data Projects involving text preparation, structural markup and citation systems, annotation, and analysis
- Websites and other outputs for navigating, searching, querying and visualizing information
- Software Tools for facilitating all of the above
Underpinning much of this is the Tolkien Linked Open Data project, which aims to improve connectivity between sub-projects and make data more accessible in order to facilitate collaboration.
Data Projects for Tolkien’s Mythopoeic Works
Bibliographic Modelling
Bibliographic information and metadata, and the relationships between editions, impressions, and variants of a text.
Status and Plans
Further expanding our data model and filling it out with edition and impression information.
Text Preparation
We distinguish published works consisting entirely of narrative (e.g. The Lord of the Rings) from those that combine fragments with commentary (e.g. Unfinished Tales or The History of Middle-earth). In the latter case, we both model the published work as a whole and extract individual texts which can then be treated separately (e.g. Quenta Noldorinwa).
Texts are marked up structurally. For commented works, this markup is applied before extraction and distinguishes the source texts from the notes and commentary. The structural markup then forms the basis for a citation system.
Once a citation system exists, the text can be made available on Search Tolkien and Cite Tolkien (see below).
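To illustrate how structural markup can give rise to a citation system, here is a minimal sketch in Python. It is not the project's actual schema or reference format: the element names (work, chapter, p), the attributes, and the dotted reference style are all assumptions made for the example, and the sample text is invented.

```python
import xml.etree.ElementTree as ET

# Invented sample markup; the tag and attribute names are hypothetical.
SAMPLE = """
<work id="demo">
  <chapter n="1">
    <p>First paragraph of chapter one.</p>
    <p>Second paragraph of chapter one.</p>
  </chapter>
  <chapter n="2">
    <p>First paragraph of chapter two.</p>
  </chapter>
</work>
"""


def build_citations(xml_text):
    """Map a reference like 'demo.1.2' to the text of that paragraph."""
    root = ET.fromstring(xml_text.strip())
    work_id = root.get("id")
    citations = {}
    for chapter in root.iter("chapter"):
        # Number paragraphs within each chapter, starting at 1.
        for index, para in enumerate(chapter.findall("p"), start=1):
            ref = f"{work_id}.{chapter.get('n')}.{index}"
            citations[ref] = "".join(para.itertext()).strip()
    return citations


citations = build_citations(SAMPLE)
for ref, text in citations.items():
    print(ref, "->", text)
```

Because each reference is derived from the structure alone, the same reference can be used for lookup, linking, and search results, whatever form the final citation scheme takes.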
Older Blog Posts
- Marking Up The Hobbit in XML
- Punctuation and Structure in Marking Up Direct Speech
- Aligning with the LR Citation System
- The Hobbit Citation System
- Minimal Prefixes to Identify Hobbit Paragraphs
Relevant Videos and Clips
- Text Preparation (base texts, standoff annotations, footnotes, quotation marks)
- HoMe and Túrinsaga Text Preparation
- Great Tales Text Preparation
Status and Plans
Currently working on text preparation of The History of Middle-earth and The Great Tales.
Text Annotation and Analysis
Once a text is prepared, various annotations and analyses can be performed, including:
- word extraction for glossary
- direct speech markup and speaker identification
- name index and name mapping
- linguistic analysis (part-of-speech tagging, etc.)
- alignment with other versions
- scene segmentation
- characters and locations
- dates and times
- narrative features
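As one example of what such an annotation might look like, the sketch below records direct speech as standoff annotations, i.e. character offsets into an unchanged base text. It is an illustration only: the record fields (start, end, speaker) are assumptions, the sample sentence is invented, and the pattern handles only curly double quotation marks. Real editions that use single quotation marks need more care, because apostrophes look the same as closing quotes.

```python
import re

# Matches text between curly double quotation marks (an assumption;
# see the note above about single quotation marks).
QUOTE = re.compile(r"“([^”]*)”")


def speech_spans(text):
    """Return standoff records for each quoted span in the text."""
    return [
        # The speaker is left empty, to be filled in by a later annotation pass.
        {"start": m.start(1), "end": m.end(1), "speaker": None}
        for m in QUOTE.finditer(text)
    ]


text = "“Where are we going?” asked the traveller. “North,” said the guide."
spans = speech_spans(text)
for span in spans:
    print(span, repr(text[span["start"]:span["end"]]))
```

Keeping the offsets separate from the text means speaker identification can be layered on later without altering the base text.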
Relevant Videos and Clips
- Speaker Identification
- Name Index Background
- Name Index Continued
- Silmarillion Name Mapping
- Version Alignment with N-grams (Ainulindalë example)
- Text Reuse Prototype (Grey Annals vs Published Silmarillion)
- The Túrinsaga Alignment
- Diffing Drafts and Versions of the Hobbit
- Hobbit Scene Segmentation and Annotation
Status and Plans
Speaker identification has been completed for The Hobbit, The Lord of the Rings, and The Silmarillion. Currently working on name mapping, characters, locations, and scene segmentation for The Hobbit, The Lord of the Rings, and The Silmarillion. Progress has been made on aligning drafts and multiple editions of The Hobbit. Currently working on aligning various versions of the Túrin story.
As more texts are prepared, words will be extracted for the glossary and other annotation tasks will be performed.
Glossary / Word-level Data
Phonological, etymological, grammatical, and semantic information about individual words in the texts. Published on the Tolkien Glossary site.
Status and Plans
Tagging various semantic domains; expanding coverage to other texts; adding pronunciation, etymology, and part-of-speech; distinguishing speakers and, in the case of commented works, other authors.
Timeline Data
Extraction of events, dates, and times from explicit chronologies and annals as well as from narrative text.
Status and Plans
Eventually support filtering and visualizing on the Timelines website.
Location Data / Maps
Extraction of place names and location information from texts and maps as well as the development of authority lists.
Status and Plans
Currently annotating locations at the paragraph level in The Lord of the Rings and mapping place names to places in The Silmarillion.
Character Data / Genealogy
Extraction of character names and relationships from texts and paratextual material as well as the development of authority lists.
Status and Plans
Currently annotating characters at the paragraph level in The Lord of the Rings and mapping character names to characters in The Silmarillion.
Poetry Data
Chronology, analysis, and citation of Tolkien’s poetry. Published on the Tolkien Poetry website.
Status and Plans
Improving metadata and chronological information, metrical and rhyme analysis.
Audiobook Data
Linking timecodes and passage citations.
Status and Plans
Initial transcriptions of audio files to align with text.
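A link between timecodes and passage citations could be represented as a sorted list of cue points, as in the sketch below. This is a hypothetical data structure, not the project's format: the cue times, the reference strings, and the function names are all made up for illustration.

```python
from bisect import bisect_right

# (start time in seconds, citation reference); all values are invented.
CUES = [
    (0.0, "demo.1.1"),
    (12.5, "demo.1.2"),
    (31.0, "demo.1.3"),
    (58.2, "demo.2.1"),
]
STARTS = [start for start, _ in CUES]


def citation_at(seconds):
    """Return the citation of the passage being read at the given time."""
    index = bisect_right(STARTS, seconds) - 1
    if index < 0:
        raise ValueError("time is before the first cue")
    return CUES[index][1]


def start_of(citation):
    """Return the time at which the given passage begins."""
    for start, ref in CUES:
        if ref == citation:
            return start
    raise KeyError(citation)


print(citation_at(40.0))     # passage playing 40 seconds in
print(start_of("demo.2.1"))  # where to seek for a given passage
```

Storing only start times keeps the table compact; each passage is assumed to run until the next cue begins.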
Other Data Projects
Tolkien’s Life
Chronology, letters, biography, archival records
See Tolkien’s Life on the Tolkien Linked Open Data Project site.
Status and Plans
Annotating the Scull and Hammond Chronology and Carpenter’s Biography.
Secondary Material / Scholarship
Art, music, blogs, podcasts, articles, and monographs
See Linking Secondary Material on the Tolkien Linked Open Data Project site.
Status and Plans
Gathering initial material to link to.
Websites
Main Website
Search Tolkien
Search across works, see the distribution of terms and phrases, look up the citation reference for passages.
https://search.digitaltolkien.com/
Status and Plans
Adding more texts, supporting searching of multiple terms, visualization of results, filtering on speech, etc.
Cite Tolkien
Look up a citation reference, make citation references linkable, and navigate the structure of texts.
https://cite.digitaltolkien.com/
https://cite-draft.digitaltolkien.com/
Relevant Videos and Clips
- How to use Search Tolkien and Cite Tolkien
- Draft Cite Tolkien
- Cite Tolkien ANNOUNCEMENT!
- Cite / Search Improvements
Status and Plans
Adding more texts. Drafts of new citation systems are shown on the DRAFT Cite Tolkien site.
Tolkien Glossary
Presenting word-level information (see above).
https://glossary.digitaltolkien.com/
Status and Plans
New ways of visualizing and searching the data.
Tolkien Poetry
Search, navigate, and visualize information about Tolkien’s poetry (see above).
https://poetry.digitaltolkien.com/
Status and Plans
Full text search and visualization of analysis.
Tolkien Timelines
Upcoming site that will enable visualization and filtering of events on a timeline.
https://timelines.digitaltolkien.com/
Status and Plans
Once the Linked Open Data project has annotated event data, we can start to show it here.
Little Delvings
Standalone visualizations based on the text and annotations of the Digital Tolkien Project.
https://delvings.digitaltolkien.com/
Relevant Videos and Clips
- Delving into The Little Delvings 0001–0006
- Delving into The Little Delvings 0007–0013
- Little Delvings Reboot
Status and Plans
Continue to expand the capabilities of Belladonna, come up with new ideas for things to visualize from existing data, and highlight new analyses coming from ongoing annotation work.
ROP Data site
Search and visualizations of characters, locations, scenes, and dialogue in Amazon Prime’s Rings of Power.
https://rop.digitaltolkien.com/
Status and Plans
Finish S2E8 ingestion. Potentially add more features. Generalize to other adaptations.
LOD Project site
Informational site about the Tolkien Linked Open Data Project.
https://lod.digitaltolkien.com/
Relevant Videos and Clips
- What is Linked Open Data? A Tolkien Example
- Linked Open Data Project Pre-Launch
- Linked Open Data Launch Update
Status and Plans
As the LOD project develops, we will continue to document guidelines and progress on this site.
Software Tools
Belladonna
A Python tool for making beautiful infographics to post to social media. Primarily used for Little Delvings.
Status and Plans
Continuing to add more recipes, clean up the code, and eventually open source.
Rimbë
A web application for crowd-sourced text annotation. We are currently using this for annotating characters, locations, scenes, and narrative features in The Lord of the Rings.
Status and Plans
Continuing to develop features to support the annotation team. May eventually open source.
Arda
An open-source Python library for Tolkien-related data processing. It currently includes some code for Middle-earth calendrical calculations and Elvish syllabification.
http://github.com/digitaltolkien/arda
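To give a flavour of what a Middle-earth calendrical calculation involves, the sketch below converts a day of the year into a Shire Calendar date, following the layout described in Appendix D of The Lord of the Rings: twelve 30-day months, two Yule days, and the Lithe days at mid-year. This is not Arda's API. The function name and signature are hypothetical, and the leap-year flag is passed in rather than computed.

```python
MONTHS = [
    "Afteryule", "Solmath", "Rethe", "Astron", "Thrimidge", "Forelithe",
    "Afterlithe", "Wedmath", "Halimath", "Winterfilth", "Blotmath", "Foreyule",
]


def shire_date(day_of_year, leap=False):
    """Convert a 1-based day of the year to a Shire Calendar date string."""
    year_length = 366 if leap else 365
    if not 1 <= day_of_year <= year_length:
        raise ValueError("day_of_year out of range")
    if day_of_year == 1:
        return "2 Yule"  # first day of the year
    if day_of_year == year_length:
        return "1 Yule"  # last day of the year
    day = day_of_year - 1  # days counted from the day after 2 Yule
    if day <= 180:
        month, offset = divmod(day - 1, 30)
        return f"{offset + 1} {MONTHS[month]}"
    day -= 180
    # Mid-year days; Overlithe is added only in leap years.
    midyear = ["1 Lithe", "Mid-year's Day"]
    if leap:
        midyear.append("Overlithe")
    midyear.append("2 Lithe")
    if day <= len(midyear):
        return midyear[day - 1]
    day -= len(midyear)
    month, offset = divmod(day - 1, 30)
    return f"{offset + 1} {MONTHS[6 + month]}"


print(shire_date(2))
print(shire_date(183))
print(shire_date(184, leap=True))
```

The days outside the months are handled as named special cases, which keeps the month arithmetic a plain division by 30.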
Status and Plans
Continue to improve both the calendrical and linguistic code.
Various Alignment Tools
Various ad-hoc tools built to assist in the alignment of different versions of texts.
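As a rough illustration of one approach to version alignment, the sketch below pairs paragraphs from two versions of a text by the word n-grams they share. It is a simplified stand-in, not the project's tooling: the Jaccard similarity measure, the trigram size, the threshold, and the function names are assumptions chosen for the example, and the sample paragraphs are invented.

```python
def ngrams(text, n=3):
    """Return the set of word n-grams in a lower-cased text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def align(version_a, version_b, n=3, threshold=0.2):
    """Pair each paragraph in version_a with its best match in version_b."""
    grams_b = [ngrams(para, n) for para in version_b]
    pairs = []
    for i, para in enumerate(version_a):
        grams_a = ngrams(para, n)
        scores = [jaccard(grams_a, grams) for grams in grams_b]
        if not scores:
            continue
        best = max(range(len(scores)), key=scores.__getitem__)
        # Keep the pairing only if enough n-grams are shared.
        if scores[best] >= threshold:
            pairs.append((i, best, scores[best]))
    return pairs


draft = [
    "the old road ran north beside the river",
    "a cold wind came down from the hills at dusk",
]
revised = [
    "at dusk a cold wind came down from the high hills",
    "the old road ran north beside the great river",
    "a new paragraph with no counterpart",
]

pairs = align(draft, revised)
for a_index, b_index, score in pairs:
    print(a_index, b_index, round(score, 3))
```

Shared n-grams tolerate reordered paragraphs and small insertions, but heavily rewritten passages may fall below the threshold and need manual alignment.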
Relevant Videos and Clips
- Alignment Tooling and Demo
- Alignment Tool Example
- Version Alignment with N-grams (Ainulindalë example)
- Text Reuse Prototype (Grey Annals vs Published Silmarillion)
- The Túrinsaga Alignment
- Diffing Drafts and Versions of the Hobbit
Status and Plans
Consolidate and build into a more generally usable tool for others.
Documentary Transcript Viewer
An experimental tool for viewing videos with timestamped transcriptions. Originally built as part of a project to annotate Tolkien-related documentaries.
Reading Environments
One of our long-term goals is to develop an online reading environment for viewing texts and annotations.