Authors | Harper MP, Dorr BJ, Hale J, Roark B, Shafran I, Lease M, Liu Y, Snover M, Yung L, Krasnyanskaya A |
Abstract | This report describes research conducted by the Parsing and Spoken Structural Event Detection (PaSSED) team as part of the 2005 Johns Hopkins Summer Workshop on Language Engineering.
This project investigated the interaction between parsing and the detection of structural metadata in
conversational speech, including sentence boundaries, edits (the reparandum portion of speech repairs),
and fillers. In terms of parsing, we explored alternative methods of exploiting metadata information in
parsing models and measured how varying accuracy in transcription and metadata information affects
parsing accuracy. In the other direction, we similarly considered how syntactic and prosodic knowledge
could be leveraged in metadata detection, measuring how this knowledge impacts metadata detection
accuracy.
As part of this work, we investigated metrics for evaluating parse accuracy in the presence of
transcription and metadata detection errors, and we report on our experience using these metrics with several
parsers and across varying experimental conditions. We evaluated a range of methods for handling edits
during parsing (excision, addition of markup to the input string, and grammar
modification). We also developed a classifier for ToBI prosodic events (ToBI is a prosodic structure
annotation scheme [SBP+92]) and describe its evaluation. Finally, we present methods for effective n-best sentence
boundary candidate generation and reranking using syntactic, prosodic, and other features. These studies
are complemented by a second set of reranking investigations wherein we optimize sentence boundary
detection explicitly to improve parse accuracy.
The PaSSED project has:
• investigated various techniques to enhance parsing of speech given metadata detection on
conversational speech;
• defined metrics for evaluating speech parsing accuracy, implemented them in the publicly
available SParseval software package, and evaluated them under a wide variety of conditions;
• recast sentence unit (SU) detection as an n-best reranking problem with a relatively small n, and with this
approach demonstrated significant improvements over a very strong baseline SU detection system;
• reported on the interaction between parsing and metadata detection and their synergy;
• fostered new collaborations and identified a number of interesting avenues for future work.
|