Embedding Metadata and Other Semantics in Word Processing Documents

Peter Sefton, Ian Barnes, Ron Ward, Jim Downing


This paper describes a technique for embedding document metadata, and potentially other semantic references, inline in word processing documents, which the authors have implemented with the help of a software development team. Several assumptions underlie the approach: it must be available across computing platforms and work with both Microsoft Word (because of its user base) and OpenOffice.org (because of its free availability). Further, the application needs to be acceptable to and usable by users, so the initial implementation covers only a small number of features, which will only be extended after user testing. Within these constraints the system provides a mechanism not only for encoding simple metadata but also for inferring hierarchical relationships between metadata elements from a ‘flat’ word processing file. The paper includes links to open source code implementing the techniques as part of a broader suite of tools for academic writing. This work addresses tools and software, the semantic web and data curation, and the integration of curation into research workflows, and will provide a platform for integrating work on ontologies, vocabularies and folksonomies into word processing tools.


DOI: http://dx.doi.org/10.2218/ijdc.v4i2.96

The International Journal of Digital Curation. ISSN: 1746-8256
The IJDC is published by the University of Edinburgh
and is a publication of the Digital Curation Centre.