Bridging the Gaps between Digital Humanities, Lexicography, and Linguistics: A TEI Dictionary for the Documentation of Mixtepec-Mixtec

Abstract : This paper discusses the digital dictionary component of an ongoing language documentation project for the Mixtepec-Mixtec language (ISO 639-3: mix). Mixtepec-Mixtec (Sa'an Savi 'rain language') is an Oto-Manguean language spoken by roughly 9,000-10,000 people in the Juxtlahuaca district of Oaxaca and in parts of the states of Guerrero and Puebla, Mexico. Creating a digital dictionary for an under-resourced language entails a number of challenges that require nuanced encoding solutions, which must balance the linguistic content, data structure, potential linked resources, and editorial metadata. We demonstrate how we use TEI to create a reusable, extensible, and machine-readable language resource. We emphasize how our solutions, which combine novel and established TEI dictionary structures, address our specific needs for Mixtepec-Mixtec and provide a relevant roadmap for similar under-resourced language projects.

https://hal.inria.fr/hal-01968871
Contributor : Jack Bowers
Submitted on : Thursday, January 3, 2019 - 2:00:30 PM
Last modification on : Saturday, April 20, 2019 - 1:57:46 AM

File

04_RWiP_Bowers-Romary-edited-A...
Files produced by the author(s)

Identifiers

  • HAL Id : hal-01968871, version 1

Citation

Jack Bowers, Laurent Romary. Bridging the Gaps between Digital Humanities, Lexicography, and Linguistics: A TEI Dictionary for the Documentation of Mixtepec-Mixtec. Dictionaries: Journal of the Dictionary Society of North America, Dictionary Society of North America, 2018, 39 (2), pp.79-106. ⟨hal-01968871⟩
