Can You See Whose Speech Is Overlapping?

Authors

  • Charles F. Meyer
  • Ed Blanchman
  • Robert A. Morris

Abstract

Linguistics has recently seen increased interest in the analysis of computer corpora: examples of speech and writing distributed in machine-readable form. Computer corpora are typically annotated with markup that indicates such phenomena as paragraph boundaries and titles in written texts, and pauses and speaker turns in spoken texts. As computer corpora become more common in linguistics, linguists need to concern themselves not just with developing standards for the markup they use but also with ensuring that this markup is presented to the user in as readable a format as possible. In our discussion, we focus on overlapping speech, a common characteristic of speech that any annotation system must deal with, and describe software we have developed that not only accurately marks the boundaries of overlaps but also presents them to the user in a very readable format. First, we discuss the types of overlapping speech that any markup system will have to describe. We then critique two types of current systems for marking overlaps: those that stress readability and those that emphasize descriptive adequacy. We describe the problems inherent in each of these systems and conclude by discussing a system we have developed that is based on sophisticated document processing software. This software presents speech overlaps in vertical columns, balancing the need to describe the boundaries of overlaps accurately with the user's need to see this information in as readable a manner as possible.

Published

1994-04-01

Section

Journal Article