Open Access. Powered by Scholars. Published by Universities.®

Physical Sciences and Mathematics Commons

Artificial Intelligence and Robotics

SelectedWorks

Juan-Manuel Torres-Moreno

Full-Text Articles in Physical Sciences and Mathematics

DiSeg 1.0: The First System For Spanish Discourse Segmentation, Iria Da Cunha, Eric Sanjuan, Juan-Manuel Torres-Moreno, Marina Lloberes, Irene Castellon (Jan 2012)

Discourse parsing is currently a very prominent research topic. However, there is no discourse parser for Spanish texts. The first stage in developing such a tool is discourse segmentation. In this work, we present DiSeg, the first discourse segmenter for Spanish, which uses the framework of Rhetorical Structure Theory and is based on lexical and syntactic rules. We describe the system and evaluate its performance against a gold standard corpus, divided into a medical and a terminological subcorpus. We obtain promising results, which indicates that discourse segmentation is possible using shallow parsing.