Andreas Stolcke

Publications

Publications are grouped by research area and ordered by recency. Most papers are available in PostScript format and, where indicated, in HTML and PDF as well.

Research topics:

Some papers are listed under more than one topic.

Language Modeling

W. Wang, A. Stolcke, & J. Zheng (2007), Reranking Machine Translation Hypotheses With Structured and Web-based Language Models. Proc. IEEE Automatic Speech Recognition and Understanding Workshop, pp. 159-164, Kyoto. (PDF)

I. Bulyko, M. Ostendorf, M. Siu, T. Ng, A. Stolcke, & Ö. Çetin (2007), Web resources for language modeling in conversational speech recognition, ACM Transactions on Speech and Language Processing 5(1), Article 1, 25 pages. (PDF, abstract)

M. Creutz, T. Hirsimäki, M. Kurimo, A. Puurula, J. Pylkkönen, V. Siivola, M. Varjokallio, E. Arisoy, M. Saraçlar, & A. Stolcke (2007), Morph-based speech recognition and modeling of out-of-vocabulary words across languages, ACM Transactions on Speech and Language Processing 5(1), Article 3, 29 pages. (PDF, abstract)

W. Wang & A. Stolcke (2007), Integrating MAP, Marginals, and Unsupervised Language Model Adaptation, Proc. Interspeech/Eurospeech, pp. 618-621, Antwerp. (PDF)

G. Tur & A. Stolcke (2007), Unsupervised Language Model Adaptation for Meeting Recognition, Proc. IEEE ICASSP, vol. 4, pp. 173-176, Honolulu, Hawaii. (PDF)

M. Creutz, T. Hirsimäki, M. Kurimo, A. Puurula, J. Pylkkönen, V. Siivola, M. Varjokallio, E. Arisoy, M. Saraçlar, & A. Stolcke (2007), Analysis of Morph-Based Speech Recognition and the Modeling of Out-of-Vocabulary Words Across Languages. Proc. HLT/NAACL, pp. 380-387, Rochester, NY. (PDF)

K. Kirchhoff, D. Vergyri, J. Bilmes, K. Duh, & A. Stolcke (2006), Morphology-based language modeling for conversational Arabic speech recognition, Computer Speech and Language 20(4), 589-608. (PDF, abstract)

D. Vergyri, K. Kirchhoff, K. Duh, & A. Stolcke (2004), Morphology-Based Language Modeling for Arabic Speech Recognition. Proc. Intl. Conf. Spoken Language Processing, pp. 2245-2248, Jeju, Korea. (PDF)

W. Wang, A. Stolcke, & M. P. Harper (2004), The Use of a Linguistically Motivated Language Model in Conversational Speech Recognition. Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing, vol. 1, pp. 261-264, Montreal. (PDF)

I. Bulyko, M. Ostendorf, & A. Stolcke (2003), Class-dependent Interpolation for Estimating Language Models from Multiple Text Sources, Technical Report UWEETR-2003-0003, Dept. of Electrical Engineering, University of Washington, Seattle.
Shorter version appeared as Getting More Mileage from Web Text Sources for Conversational Speech Language Modeling using Class-Dependent Mixtures in Proc. HLT-NAACL Conference, vol. 2, pp. 7-9, Edmonton, Canada, May 2003.

W. Wang, M. P. Harper, & A. Stolcke (2003), The Robustness of an Almost-Parsing Language Model Given Errorful Training Data. Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing, vol. 1, pp. 240-243, Hong Kong. (PDF)

A. Stolcke (2002), SRILM -- An Extensible Language Modeling Toolkit. Proc. Intl. Conf. on Spoken Language Processing, vol. 2, pp. 901-904, Denver. (PDF)

A. Stolcke, K. Ries, N. Coccaro, E. Shriberg, R. Bates, D. Jurafsky, P. Taylor, R. Martin, C. Van Ess-Dykema, & M. Meteer (2000), Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech, Computational Linguistics 26(3), 339-373. (PDF, abstract)

F. Weng, A. Stolcke, & M. Cohen (2000), Language Modelling for Multilingual Speech Translation, Chapter 16 in M. Rayner, D. Carter, P. Bouillon, V. Digalakis, & M. Wirén (eds.), The Spoken Language Translator, pp. 250-264, Cambridge University Press. (PDF)

A. Stolcke (1998), Entropy-based Pruning of Backoff Language Models. Proc. DARPA Broadcast News Transcription and Understanding Workshop, pp. 270-274, Lansdowne, VA. (HTML, PDF)
NOTE: See erratum at end of postscript file.

A. Stolcke (1997), Modeling Linguistic Segment and Turn Boundaries for N-best Rescoring of Spontaneous Speech. Proc. EUROSPEECH, vol. 5, pp. 2779-2782, Rhodes, Greece. (PDF)

C. Chelba, D. Engle, F. Jelinek, V. Jimenez, S. Khudanpur, L. Mangu, H. Printz, E. Ristad, R. Rosenfeld, A. Stolcke, & D. Wu (1997), Structure and Performance of a Dependency Language Model. Proc. EUROSPEECH, vol. 5, pp. 2775-2778, Rhodes, Greece. (PDF)

F. Weng, A. Stolcke, & A. Sankar (1997), Hub4 Language Modeling Using Domain Interpolation and Data Clustering. Proc. DARPA Speech Recognition Workshop, pp. 147-151, Chantilly, VA. (HTML, PDF)

A. Stolcke & E. Shriberg (1996), Statistical language modeling for speech disfluencies. Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing, vol. 1, pp. 405-408, Atlanta, GA. (HTML, PDF)

M. Weintraub, Y. Aksu, S. Dharanipragada, S. Khudanpur, H. Ney, J. Prange, A. Stolcke, F. Jelinek, E. Shriberg (1996), LM95 Project Report: Fast Training and Portability. In 1995 Language Modeling Summer Research Workshop Technical Reports, Research Note 1, Center for Language and Speech Processing, Johns Hopkins University, Baltimore. (PDF)

D. Jurafsky, C. Wooters, J. Segal, A. Stolcke, E. Fosler, G. Tajchman, & N. Morgan (1995), Using a Stochastic Context-Free Grammar as a Language Model for Speech Recognition. Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing, vol. 1, pp. 189-192, Detroit.

A. Stolcke & J. Segal (1994), Precise N-gram Probabilities from Stochastic Context-free Grammars. Proc. ACL, pp. 74-79, Las Cruces, NM. (HTML)

Speaker Recognition

E. Shriberg & A. Stolcke (2008), The Case for Automatic Higher-Level Features in Forensic Speaker Recognition, to appear in Proc. Interspeech, Brisbane, Australia. (PDF)

A. Stolcke, S. Kajarekar, & L. Ferrer (2008), Nonparametric Feature Normalization for SVM-based Speaker Verification, Proc. IEEE ICASSP, pp. 1577-1580, Las Vegas. (PDF)

E. Shriberg, L. Ferrer, S. Kajarekar, N. Scheffer, A. Stolcke, & M. Akbacak (2008), Detecting Nonnative Speech Using Speaker Recognition Approaches. Proc. Odyssey Speaker and Language Recognition Workshop, Stellenbosch, South Africa. (PDF)

A. Stolcke & S. Kajarekar (2008), Recognizing Arabic Speakers with English Phones. Proc. Odyssey Speaker and Language Recognition Workshop, Stellenbosch, South Africa. (PDF)

A. Stolcke, S. Kajarekar, L. Ferrer, & E. Shriberg (2007), Speaker Recognition with Session Variability Normalization Based on MLLR Adaptation Transforms, IEEE Transactions on Audio, Speech, and Language Processing, 15(7), 1987-1998. Special issue on speaker and language recognition. (PDF, abstract)

G. Tur, E. Shriberg, A. Stolcke, & S. Kajarekar (2007), Duration and Pronunciation Conditioned Lexical Modeling for Speaker Verification, Proc. Interspeech/Eurospeech, pp. 2049-2052, Antwerp. (PDF)

S. Kajarekar & A. Stolcke (2007), NAP and WCCN: Comparison of Approaches Using MLLR-SVM Speaker Verification System, Proc. IEEE ICASSP, vol. 4, pp. 249-252, Honolulu, Hawaii. (PDF)

A. Stolcke, E. Shriberg, L. Ferrer, S. Kajarekar, K. Sonmez, & G. Tur (2007), Speech Recognition as Feature Extraction for Speaker Recognition, Proc. SAFE 2007: Workshop on Signal Processing Applications for Public Security and Forensics, pp. 39-43, Washington, D.C. (PDF)

M. Graciarena, S. Kajarekar, A. Stolcke, E. Shriberg (2007), Noise Robust Speaker Identification for Spontaneous Arabic Speech, Proc. IEEE ICASSP, vol. 4, pp. 245-248, Honolulu, Hawaii. (PDF)

A. O. Hatch, S. Kajarekar, & A. Stolcke (2006), Within-Class Covariance Normalization for SVM-based Speaker Recognition. Proc. ICSLP, pp. 1471-1474, Pittsburgh. (PDF)

A. Stolcke, L. Ferrer, & S. Kajarekar (2006), Improvements in MLLR-Transform-based Speaker Recognition. Proc. IEEE Odyssey 2006 Speaker and Language Recognition Workshop, pp. 1-6, San Juan, Puerto Rico. (PDF)

L. Ferrer, E. Shriberg, S. S. Kajarekar, A. Stolcke, K. Sonmez, A. Venkataraman, & H. Bratt (2006), The Contribution of Cepstral and Stylistic Features to SRI's 2005 NIST Speaker Recognition Evaluation System. Proc. IEEE ICASSP, vol. 1, pp. 101-104, Toulouse. (PDF)

A. O. Hatch & A. Stolcke (2006), Generalized Linear Kernels for One-Versus-All Classification: Application to Speaker Recognition. Proc. IEEE ICASSP, vol. 5, pp. 585-588, Toulouse. (PDF)

A. O. Hatch, A. Stolcke, & B. Peskin (2005), Combining Feature Sets with Support Vector Machines: Application to Speaker Recognition. Proc. IEEE Speech Recognition and Understanding Workshop, pp. 75-79, San Juan, Puerto Rico. (PDF)

E. Shriberg, L. Ferrer, S. Kajarekar, A. Venkataraman, & A. Stolcke (2005), Modeling Prosodic Feature Sequences for Speaker Recognition. Speech Communication 46(3-4), 455-472. Special Issue on Quantitative Prosody Modelling for Natural Speech Description and Generation. (abstract)

A. Stolcke, L. Ferrer, S. Kajarekar, E. Shriberg, & A. Venkataraman (2005), MLLR Transforms as Features in Speaker Recognition. Proc. Eurospeech, Lisbon, pp. 2425-2428. (PDF)

S. S. Kajarekar, L. Ferrer, E. Shriberg, K. Sonmez, A. Stolcke, A. Venkataraman, and J. Zheng (2005), SRI's 2004 NIST Speaker Recognition Evaluation System, Proc. IEEE ICASSP, Philadelphia, vol. 1, pp. 173-176. (PDF)

A. O. Hatch, B. Peskin, & A. Stolcke (2005), Improved Phonetic Speaker Recognition Using Lattice Decoding, Proc. IEEE ICASSP, Philadelphia, vol. 1, pp. 169-172. (PDF)

S. Kajarekar, L. Ferrer, K. Sonmez, J. Zheng, E. Shriberg, & A. Stolcke (2004), Modeling NERFs for Speaker Recognition. Proc. Odyssey 04 Speaker and Language Recognition Workshop, pp. 51-56, Toledo, Spain. (PDF)

S. Kajarekar, L. Ferrer, A. Venkataraman, K. Sonmez, E. Shriberg, A. Stolcke, & R. R. Gadde (2003), Speaker Recognition using Prosodic and Lexical Features. Proc. IEEE Speech Recognition and Understanding Workshop, pp. 19-24, St. Thomas, U.S. Virgin Islands. (PDF)

L. Ferrer, H. Bratt, V. R. R. Gadde, S. Kajarekar, E. Shriberg, K. Sonmez, A. Stolcke, & A. Venkataraman (2003), Modeling Duration Patterns for Speaker Recognition. Proc. Eurospeech, pp. 2017-2020, Geneva. (PDF)

S. Kajarekar, K. Sonmez, L. Ferrer, V. Gadde, A. Venkataraman, E. Shriberg, A. Stolcke, & H. Bratt (2003), "TalkPrinting": Improving Speaker Recognition by Modeling Stylistic Features. In Intelligence and Security Informatics: First NSF/NIJ Symposium, ISI 2003, Springer Lecture Notes in Computer Science Series, Volume 2665, H. Chen, R. Miranda, D.D. Zeng, C. Demchak, J. Schroeder, & T. Madhusudan, editors, pp. 350-354. © 2003 Springer-Verlag. (PDF, abstract)

Detecting Emotions and Deception

F. Enos, E. Shriberg, M. Graciarena, J. Hirschberg, & A. Stolcke (2007), Detecting Deception Using Critical Segments, Proc. Interspeech/Eurospeech, pp. 2281-2284, Antwerp. (PDF)

M. Graciarena, E. Shriberg, A. Stolcke, F. Enos, J. Hirschberg, S. Kajarekar (2006), Combining Prosodic, Lexical and Cepstral Systems for Deceptive Speech Detection. Proc. IEEE ICASSP, vol. 1, pp. 1033-1036, Toulouse. (PDF)

J. Hirschberg, S. Benus, J. M. Brenier, F. Enos, S. Friedman, S. Gilman, C. Girand, M. Graciarena, A. Kathol, L. Michaelis, B. Pellom, E. Shriberg, & A. Stolcke (2005), Distinguishing Deceptive from Non-Deceptive Speech. Proc. Eurospeech, Lisbon, pp. 1833-1836. (PDF)

J. Ang, R. Dhillon, A. Krupski, E. Shriberg, & A. Stolcke (2002), Prosody-Based Automatic Detection of Annoyance and Frustration in Human-Computer Dialog. Proc. Intl. Conf. on Spoken Language Processing, vol. 3, pp. 2037-2040, Denver. (PDF)

Multiparty Meeting Recognition and Modeling

A. Stolcke, X. Anguera, K. Boakye, O. Cetin, A. Janin, M. Magimai-Doss, C. Wooters, & J. Zheng (2008), The SRI-ICSI Spring 2007 Meeting and Lecture Recognition System, R. Stiefelhagen, R. Bowers, and J. Fiscus (eds.), Multimodal Technologies for Perception of Humans. International Evaluation Workshops CLEAR 2007 and RT 2007, Springer Lecture Notes in Computer Science 4625, pp. 450-463. (PDF, abstract)

J. Zheng & A. Stolcke (2007), fMPE-MAP: Improved Discriminative Adaptation for Modeling New Domains, Proc. Interspeech/Eurospeech, pp. 1573-1576, Antwerp. (PDF)

G. Tur & A. Stolcke (2007), Unsupervised Language Model Adaptation for Meeting Recognition, Proc. IEEE ICASSP, vol. 4, pp. 173-176, Honolulu, Hawaii. (PDF)

A. Janin, A. Stolcke, X. Anguera, K. Boakye, O. Cetin, J. Frankel, J. Zheng (2006), The ICSI-SRI Spring 2006 Meeting Recognition System. In Machine Learning for Multimodal Interaction: Third International Workshop, MLMI 2006, Springer Lecture Notes in Computer Science Series, S. Renals, S. Bengio, & J. Fiscus, editors, pp. 444-456. © 2006 Springer-Verlag. (PDF, abstract)

K. Boakye & A. Stolcke (2006), Improved Speech Activity Detection Using Cross-Channel Features for Recognition of Multiparty Meetings. Proc. ICSLP, pp. 1962-1965, Pittsburgh. (PDF)

M. Zimmermann, A. Stolcke, & E. Shriberg (2006), Joint Segmentation and Classification of Dialog Acts in Multiparty Meetings. Proc. IEEE ICASSP, vol. 1, pp. 581-584, Toulouse. (PDF)

M. Zimmermann, Y. Liu, E. Shriberg, & A. Stolcke (2005), A* based Joint Segmentation and Classification of Dialog Acts in Multiparty Meetings. Proc. IEEE Speech Recognition and Understanding Workshop, pp. 215-219, San Juan, Puerto Rico. (PDF)

A. Stolcke, X. Anguera, K. Boakye, O. Cetin, F. Grezl, A. Janin, A. Mandal, B. Peskin, C. Wooters, & J. Zheng (2005), Further Progress in Meeting Recognition: The ICSI-SRI Spring 2005 Speech-to-Text Evaluation System. Proc. NIST MLMI Meeting Recognition Workshop, Edinburgh.
Also in Machine Learning for Multimodal Interaction: Second International Workshop, MLMI 2005, Springer Lecture Notes in Computer Science Series, Volume 3869, S. Renals & S. Bengio, editors, pp. 463-475. © 2006 Springer-Verlag. (PDF, abstract)

O. Cetin & A. Stolcke (2005), Language Modeling in the ICSI-SRI Spring 2005 Meeting Speech Recognition Evaluation System. Technical Report TR-05-006, International Computer Science Institute, Berkeley, CA.

M. Zimmermann, Y. Liu, E. Shriberg, & A. Stolcke (2005), Toward Joint Segmentation and Classification of Dialog Acts in Multiparty Meetings, in Machine Learning for Multimodal Interaction: Second International Workshop, MLMI 2005, Springer Lecture Notes in Computer Science Series, Volume 3869, S. Renals & S. Bengio, editors, pp. 187-193. © 2006 Springer-Verlag. (Abstract)

N. Mirghafori, A. Stolcke, C. Wooters, T. Pirinen, I. Bulyko, D. Gelbart, M. Graciarena, S. Otterson, B. Peskin, & M. Ostendorf (2004), From Switchboard to Meetings: Development of the 2004 ICSI-SRI-UW Meeting Recognition System. Proc. Intl. Conf. Spoken Language Processing, pp. 1957-1960, Jeju, Korea. (PDF)

A. Stolcke, C. Wooters, N. Mirghafori, T. Pirinen, I. Bulyko, D. Gelbart, M. Graciarena, S. Otterson, B. Peskin, & M. Ostendorf (2004), Progress in Meeting Recognition: The ICSI-SRI-UW Spring 2004 Evaluation System. NIST ICASSP 2004 Meeting Recognition Workshop, Montreal. (PDF)

N. Morgan, D. Baron, S. Bhagat, H. Carvey, R. Dhillon, J. Edwards, D. Gelbart, A. Janin, A. Krupski, B. Peskin, T. Pfau, E. Shriberg, A. Stolcke, & C. Wooters (2003), Meetings about meetings: research at ICSI on speech in multiparty conversations. Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing, vol. 4, pp. 740-743, Hong Kong.

D. Baron, E. Shriberg, & A. Stolcke (2002), Automatic Punctuation and Disfluency Detection in Multi-Party Meetings Using Prosodic and Lexical Cues. Proc. Intl. Conf. on Spoken Language Processing, vol. 2, pp. 949-952, Denver. (PDF)

T. Pfau, D.P.W. Ellis, & A. Stolcke (2001), Multispeaker Speech Activity Detection for the ICSI Meeting Recorder. Proc. IEEE Automatic Speech Recognition and Understanding Workshop, pp. 107-110, Madonna di Campiglio, Italy. (PDF)

E. Shriberg, A. Stolcke, & D. Baron (2001), Can Prosody Aid the Automatic Processing of Multi-Party Meetings? Evidence from Predicting Punctuation, Disfluencies, and Overlapping Speech. In M. Bacchiani, J. Hirschberg, D. Litman, & M. Ostendorf (eds.), Proc. ISCA Tutorial and Research Workshop on Prosody in Speech Recognition and Understanding, pp. 139-146, Red Bank, NJ. (PDF)

E. Shriberg, A. Stolcke, & D. Baron (2001), Observations on Overlap: Findings and Implications for Automatic Processing of Multi-Party Conversation. Proc. EUROSPEECH, vol. 2, pp. 1359-1362, Aalborg, Denmark. (PDF)

N. Morgan, D. Baron, J. Edwards, D. Ellis, D. Gelbart, A. Janin, T. Pfau, E. Shriberg, & A. Stolcke (2001), The Meeting Project at ICSI, Proc. of HLT 2001, First International Conference on Human Language Technology Research, pp. 246-252, San Diego, CA. (PDF)

Spontaneous Speech, Disfluency, and other Metadata Modeling

Y. Liu, E. Shriberg, A. Stolcke, D. Hillard, M. Ostendorf, & M. Harper (2006), Enriching Speech Recognition with Automatic Detection of Sentence Boundaries and Disfluencies. IEEE Trans. Audio, Speech and Language Processing 14(5), 1526-1540. (PDF, abstract)

Y. Liu, N. V. Chawla, M. P. Harper, E. Shriberg, & A. Stolcke (2006), A study in machine learning from imbalanced data for sentence boundary detection in speech, Computer Speech and Language 20(4), 468-494. (PDF, abstract)

D. Jones, W. Shen, E. Shriberg, A. Stolcke, T. Kamm, & D. Reynolds (2005), Two Experiments Comparing Reading with Listening for Human Processing of Conversational Telephone Speech. Proc. Eurospeech, Lisbon, pp. 1145-1148. (PDF)

Y. Liu, E. Shriberg, A. Stolcke, & M. Harper (2005), Comparing HMM, Maximum Entropy, and Conditional Random Fields for Disfluency Detection. Proc. Eurospeech, Lisbon, pp. 3313-3316. (PDF)

Y. Liu, A. Stolcke, E. Shriberg, & M. Harper (2005), Using Conditional Random Fields for Sentence Boundary Detection in Speech, Proc. ACL, Ann Arbor, MI, pp. 451-458.

Y. Liu, E. Shriberg, A. Stolcke, B. Peskin, D. Hillard, M. Ostendorf, M. Tomalin, P. Woodland, & M. Harper (2005), Structural Metadata Research in the EARS Program, Proc. IEEE ICASSP, Philadelphia, vol. 5, pp. 957-960. (PDF)

Y. Liu, E. Shriberg, A. Stolcke, D. Hillard, M. Ostendorf, B. Peskin, & M. Harper (2004), The ICSI-SRI-UW Metadata Extraction System. Proc. Intl. Conf. on Spoken Language Processing, pp. 577-580, Jeju, Korea. (PDF)

Y. Liu, E. Shriberg, A. Stolcke, & M. Harper (2004), Using Machine Learning to Cope with Imbalanced Classes in Natural Speech: Evidence from Sentence Boundary and Disfluency Detection. Proc. Intl. Conf. on Spoken Language Processing, pp. 1525-1528, Jeju, Korea. (PDF)

Y. Liu, A. Stolcke, E. Shriberg, & M. Harper (2004), Comparing and Combining Generative and Posterior Probability Models: Some Advances in Sentence Boundary Detection in Speech. Proc. Conf. on Empirical Methods in Natural Language Processing, pp. 64-71, Barcelona. (PDF)

D. Hillard, M. Ostendorf, A. Stolcke, Y. Liu, & E. Shriberg (2004), Improving Automatic Sentence Boundary Detection with Confusion Networks. Proc. HLT-NAACL Conference, Short papers, pp. 69-72, Boston. (PDF)

Y. Liu, E. Shriberg, & A. Stolcke (2003), Automatic disfluency identification in conversational speech using multiple knowledge sources. Proc. Eurospeech, pp. 957-960, Geneva. (PDF)

A. Stolcke, E. Shriberg, R. Bates, M. Ostendorf, D. Hakkani, M. Plauche, G. Tur, & Y. Lu (1998), Automatic Detection of Sentence Boundaries and Disfluencies based on Recognized Words. Proc. Intl. Conf. on Spoken Language Processing, vol. 5, pp. 2247-2250, Sydney, Australia. (PDF)

E. Shriberg & A. Stolcke (1998), How far do speakers back up in their repairs? A quantitative model. Proc. Intl. Conf. on Spoken Language Processing, vol. 5, pp. 2183-2186, Sydney, Australia. (PDF)

E. Shriberg, R. Bates, & A. Stolcke (1997), A Prosody-Only Decision-Tree Model for Disfluency Detection. Proc. EUROSPEECH, vol. 5, pp. 2383-2386, Rhodes, Greece. (PDF)

A. Stolcke & E. Shriberg (1996), Statistical language modeling for speech disfluencies. Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing, vol. 1, pp. 405-408, Atlanta, GA. (HTML, PDF)

E. Shriberg & A. Stolcke (1996), Word predictability after filled pauses: A corpus-based study. Proc. Intl. Conf. on Spoken Language Processing, vol. 3, pp. 1868-1871, Philadelphia, PA. (PDF)

A. Stolcke & E. Shriberg (1996), Automatic linguistic segmentation of conversational speech. Proc. Intl. Conf. on Spoken Language Processing, vol. 2, pp. 1005-1008, Philadelphia, PA. (HTML, PDF)

E. Shriberg, D. R. Ladd, J. Terken, & A. Stolcke (1996), Modeling Pitch Range Variation Within and Across Speakers: Predicting F0 Targets when "Speaking Up". Proc. Intl. Conf. on Spoken Language Processing, Addendum, pp. 1-4, Philadelphia, PA. (PDF)

Dialog Modeling

A. Venkataraman, Y. Liu, E. Shriberg, & A. Stolcke (2005), Does Active Learning Help Automatic Dialog Act Tagging in Meeting Data? Proc. Eurospeech, Lisbon, pp. 2777-2780. (PDF)

A. Venkataraman, L. Ferrer, A. Stolcke, & E. Shriberg (2003), Training a Prosody-based Dialog Act Tagger from Unlabeled Data. Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing, vol. 1, pp. 272-275, Hong Kong. (PDF)

A. Venkataraman, A. Stolcke, & E. Shriberg (2002), Automatic Dialog Act Tagging with Minimal Supervision. Proc. 9th Australian International Conference on Speech Science and Technology, Melbourne. (PDF)

A. Stolcke, K. Ries, N. Coccaro, E. Shriberg, R. Bates, D. Jurafsky, P. Taylor, R. Martin, C. Van Ess-Dykema, & M. Meteer (2000), Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech, Computational Linguistics 26(3), 339-373. (PDF, abstract)

E. Shriberg, R. Bates, A. Stolcke, P. Taylor, D. Jurafsky, K. Ries, N. Coccaro, R. Martin, M. Meteer, & C. Van Ess-Dykema (1998), Can Prosody Aid the Automatic Classification of Dialog Acts in Conversational Speech? Language and Speech 41(3-4), 439-487. (PDF, abstract)

A. Stolcke, E. Shriberg, R. Bates, N. Coccaro, D. Jurafsky, R. Martin, M. Meteer, K. Ries, P. Taylor, & C. Van Ess-Dykema (1998), Dialog Act Modeling for Conversational Speech. In Applying Machine Learning to Discourse Processing. Papers from the 1998 AAAI Spring Symposium, Technical Report SS-98-01, pp. 98-105. AAAI Press, Menlo Park, CA. (PDF)

D. Jurafsky, R. Bates, N. Coccaro, R. Martin, M. Meteer, K. Ries, E. Shriberg, A. Stolcke, P. Taylor, & C. Van Ess-Dykema (1997), Automatic Detection of Discourse Structure for Speech Recognition and Understanding. Proc. IEEE Workshop on Speech Recognition and Understanding, pp. 88-95, Santa Barbara, CA. (PDF)

Other Speech Understanding

M. Akbacak, D. Vergyri, & A. Stolcke (2008), Open-Vocabulary Spoken Term Detection Using Graphone-Based Hybrid Recognition Systems, Proc. IEEE ICASSP, pp. 5240-5243, Las Vegas. (PDF)

D. Vergyri, I. Shafran, A. Stolcke, R. R. Gadde, M. Akbacak, B. Roark, & W. Wang (2007), The SRI/OGI 2006 Spoken Term Detection System, Proc. Interspeech/Eurospeech, pp. 2393-2396, Antwerp. (PDF)

D. Gelbart, J. Bryant, A. Stolcke, R. Porzel, M. Baudis, & N. Morgan (2006), SmartKom-English: From Robust Recognition to Felicitous Interaction, SmartKom: Foundations of Multimodal Dialogue Systems, Springer, pp. 453-470. (Abstract)

E. Shriberg & A. Stolcke (2004), Direct Modeling of Prosody: An Overview of Applications to Automatic Speech Processing. Proc. International Conference on Speech Prosody, Nara, Japan. (PDF)

E. Shriberg & A. Stolcke (2004), Prosody Modeling for Automatic Speech Recognition and Understanding. In Mathematical Foundations of Speech and Language Processing, M. Johnson, S. Khudanpur, M. Ostendorf, and R. Rosenfeld (eds.), Volume 138 in IMA Volumes in Mathematics and its Applications, pp. 105-114, Springer-Verlag. (PDF)

L. Ferrer, E. Shriberg, & A. Stolcke (2003), A prosody-based approach to end-of-utterance detection that does not require speech recognition. Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing, vol. 1, pp. 608-611, Hong Kong. (PDF)

L. Ferrer, E. Shriberg, & A. Stolcke (2002), Is the Speaker Done Yet? Faster and More Accurate End-of-Utterance Detection Using Prosody. Proc. Intl. Conf. on Spoken Language Processing, vol. 3, pp. 2061-2064, Denver. (PDF)

A. Stolcke & E. Shriberg (2001), Markovian Combination of Language and Prosodic Models for better Speech Understanding and Recognition. Invited talk at the IEEE Workshop on Speech Recognition and Understanding, Madonna di Campiglio, Italy, December 2001. (PDF)

E. Shriberg & A. Stolcke (2001), Prosody Modeling for Automatic Speech Understanding: An Overview of Recent Research at SRI. In M. Bacchiani, J. Hirschberg, D. Litman, & M. Ostendorf (eds.), Proc. ISCA Tutorial and Research Workshop on Prosody in Speech Recognition and Understanding, pp. 13-16, Red Bank, NJ. (PDF)

G. Tur, D. Hakkani-Tur, A. Stolcke, & E. Shriberg (2001), Integrating Prosodic and Lexical Cues for Automatic Topic Segmentation, Computational Linguistics, 27(1), 31-57. (PDF, abstract)

E. Shriberg, A. Stolcke, D. Hakkani-Tur, & G. Tur (2000), Prosody-Based Automatic Segmentation of Speech into Sentences and Topics, Speech Communication 32(1-2), 127-154 (Special Issue on Accessing Information in Spoken Audio). (PDF, abstract)

D. Hakkani-Tur, G. Tur, A. Stolcke, & E. Shriberg (1999), Combining Words and Prosody for Information Extraction from Speech. Proc. EUROSPEECH, vol. 5, pp. 1991-1994, Budapest. (PDF)

A. Stolcke, E. Shriberg, D. Hakkani-Tur, G. Tur, Z. Rivlin, K. Sonmez (1999), Combining Words and Speech Prosody for Automatic Topic Segmentation. Proc. DARPA Broadcast News Workshop, pp. 61-64, Herndon, VA. (HTML, PDF)

Z. Rivlin, D. Appelt, R. Bolles, A. Cheyer, D. Hakkani-Tur, D. Israel, L. Julia, D. Martin, G. Myers, K. Nitz, B. Sabata, A. Sankar, E. Shriberg, K. Sonmez, A. Stolcke, & G. Tur (2000), MAESTRO: Conductor of Multimedia Analysis Technologies, Communications of the ACM 43(2), 57-63, Special Issue on News on Demand, February 2000.

Other Speech Recognition

D. Vergyri, A. Mandal, W. Wang, A. Stolcke, J. Zheng, M. Graciarena, D. Rybach, C. Gollan, R. Schlüter, K. Kirchhoff, A. Faria, & N. Morgan (2008), Development of the SRI/Nightingale Arabic ASR system, to appear in Proc. Interspeech, Brisbane, Australia. (PDF)

J. Zheng & A. Stolcke (2007), fMPE-MAP: Improved Discriminative Adaptation for Modeling New Domains, Proc. Interspeech/Eurospeech, pp. 1573-1576, Antwerp. (PDF)

J. Zheng, O. Cetin, M.-Y. Hwang, X. Lei, A. Stolcke, & N. Morgan (2007), Combining Discriminative Feature, Transform, and Model Training for Large Vocabulary Speech Recognition, Proc. IEEE ICASSP, vol. 4, pp. 633-636, Honolulu, Hawaii. (PDF)

A. Stolcke, B. Chen, H. Franco, V. R. R. Gadde, M. Graciarena, M.-Y. Hwang, K. Kirchhoff, A. Mandal, N. Morgan, X. Lei, T. Ng, M. Ostendorf, K. Sonmez, A. Venkataraman, D. Vergyri, W. Wang, J. Zheng, & Q. Zhu (2006), Recent Innovations in Speech-to-Text Transcription at SRI-ICSI-UW. IEEE Trans. Audio, Speech and Language Processing 14(5), 1729-1744. (PDF, abstract)

A. Mandal, M. Ostendorf, & A. Stolcke (2006), Speaker Clustered Regression-Class Trees for MLLR Adaptation. Proc. ICSLP, pp. 1133-1136, Pittsburgh. (PDF)

A. Stolcke, F. Grezl, M.-Y. Hwang, X. Lei, N. Morgan, & D. Vergyri (2006), Cross-domain and Cross-language Portability of Acoustic Features Estimated by Multilayer Perceptrons. Proc. IEEE ICASSP, vol. 1, pp. 321-324, Toulouse. (PDF)

N. Morgan, Q. Zhu, A. Stolcke, K. Sonmez, S. Sivadas, T. Shinozaki, M. Ostendorf, P. Jain, H. Hermansky, D. Ellis, G. Doddington, B. Chen, O. Cetin, H. Bourlard and M. Athineos (2005), Pushing the Envelope -- Aside, IEEE Signal Processing Magazine 22(5), 81-88. (PDF, abstract)

Q. Zhu, A. Stolcke, B. Chen, & N. Morgan (2005), Using MLP Features in SRI's Conversational Speech Recognition System. Proc. Eurospeech, Lisbon, pp. 2141-2144. (PDF)

J. Zheng & A. Stolcke (2005), Improved Discriminative Training Using Phone Lattices. Proc. Eurospeech, Lisbon, pp. 2125-2128. (PDF)

A. Mandal, M. Ostendorf, & A. Stolcke (2005), Leveraging Speaker-dependent Variation of Adaptation. Proc. Eurospeech, Lisbon, pp. 1793-1796. (PDF)

D. Vergyri, K. Kirchhoff, R. Gadde, A. Stolcke, & J. Zheng (2005), Development of a Conversational Telephone Speech Recognizer for Levantine Arabic. Proc. Eurospeech, Lisbon, pp. 1613-1616. (PDF)

Q. Zhu, B. Chen, N. Morgan, & A. Stolcke (2005), Tandem Connectionist Feature Extraction for Conversational Speech Recognition, in Machine Learning for Multimodal Interaction. First International Workshop, MLMI-2004, pp. 223-231. (Abstract)

M. Hwang, X. Lei, T. Ng, I. Bulyko, M. Ostendorf, A. Stolcke, W. Wang, J. Zheng, V. R. R. Gadde, M. Graciarena, M. Siu, Y. Huang (2004), Progress on Mandarin Conversational Telephone Speech Recognition. Proc. 4th Intl. Symposium on Chinese Spoken Language Processing, Hong Kong.

Q. Zhu, A. Stolcke, B. Chen, & N. Morgan (2004), Incorporating Tandem/HATs MLP Features into SRI's Conversational Speech Recognition System. Proc. DARPA RT-04F Rich Transcription Workshop, Palisades, New York, November 2004. (PDF)

M. Hwang, X. Lei, T. Ng, M. Ostendorf, A. Stolcke, W. Wang, J. Zheng, & V. Gadde (2004), Porting Decipher from English to Mandarin. Proc. DARPA RT-04F Rich Transcription Workshop, Palisades, New York, November 2004.

J. Zheng, H. Franco, & A. Stolcke (2004), Effective Acoustic Modeling for Rate-of-Speech Variation in Large Vocabulary Conversational Speech Recognition. Proc. Intl. Conf. on Spoken Language Processing, pp. 401-404, Jeju, Korea. (PDF)

A. Venkataraman, A. Stolcke, W. Wang, D. Vergyri, V. R. R. Gadde, & J. Zheng (2004), An Efficient Repair Procedure For Quick Transcriptions. Proc. Intl. Conf. on Spoken Language Processing, pp. 1961-1964, Jeju, Korea. (PDF)

Q. Zhu, B. Chen, N. Morgan, & A. Stolcke (2004), On Using MLP Features in LVCSR. Proc. Intl. Conf. on Spoken Language Processing, pp. 921-924, Jeju, Korea. (PDF)

N. Morgan, B. Y. Chen, Q. Zhu, & A. Stolcke (2004), TRAPping Conversational Speech: Extending TRAP/Tandem approaches to conversational telephone speech recognition. Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing, vol. 1, pp. 536-539, Montreal. (PDF)

M. Graciarena, H. Franco, J. Zheng, D. Vergyri, & A. Stolcke (2004), Voicing Feature Integration in SRI's Decipher LVCSR System. Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing, vol. 1, pp. 921-924, Montreal. (PDF)

J. Zheng, H. Franco, & A. Stolcke (2003), Modeling word-level rate-of-speech variation in large vocabulary conversational speech recognition. Speech Communication 41(2-3), 273-285. (PDF, abstract)

B. Hodjat, H. Franco, H. Bratt, K. Precoda, A. Stolcke, A. Venkataraman, D. Vergyri, & J. Zheng (2003), Iterative statistical language model generation for use with an agent-oriented natural language interface. Proc. 10th International Conference on Human-Computer Interaction, Crete. (PDF)

D. Vergyri, A. Stolcke, V. R. R. Gadde, L. Ferrer, & E. Shriberg (2003), Prosodic Knowledge Sources for Automatic Speech Recognition. Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing, vol. 1, pp. 208-211, Hong Kong. (PDF)

V. R. Rao Gadde, A. Stolcke, D. Vergyri, J. Zheng, K. Sonmez, & A. Venkataraman (2002), Building an ASR System for Noisy Environments: SRI's 2001 SPINE Evaluation System. Proc. Intl. Conf. on Spoken Language Processing, vol. 3, pp. 1577-1580, Denver. (PDF)

A. Sankar, V. R. Rao Gadde, A. Stolcke, & F. Weng (2002), Improved Modeling and Efficiency for Automatic Transcription of Broadcast News, Speech Communication 37(1-2), 133-158. (PDF, abstract)

H. Franco, J. Zheng, J. Butzberger, F. Cesari, M. Frandsen, J. Arnold, R. Rao, A. Stolcke, & V. Abrash (2002), DynaSpeak: SRI's scalable speech recognizer for embedded and mobile systems. Proc. Human Language Technology Conference (HLT-2002), San Diego, CA. (PDF)

J. Zheng, J. Butzberger, H. Franco, & A. Stolcke (2001), Improved Maximum Mutual Information Estimation Training of Continuous Density HMMs. Proc. EUROSPEECH, vol. 2, pp. 679-682, Aalborg, Denmark. (PDF)

J. Zheng, H. Franco, & A. Stolcke (2000), Rate-dependent Acoustic Modeling for Large Vocabulary Conversational Speech Recognition. Proc. ISCA Tutorial and Research Workshop on Automatic Speech Recognition: Challenges for the new Millenium, Paris. (PDF)

J. Zheng, H. Franco, & A. Stolcke (2000), Rate-dependent Acoustic Modeling for Large Vocabulary Conversational Speech Recognition. Proc. NIST Speech Transcription Workshop, College Park, MD. (Preliminary version of paper above, HTML, PDF)

A. Stolcke, H. Bratt, J. Butzberger, H. Franco, V. R. Rao Gadde, M. Plauche, C. Richey, E. Shriberg, K. Sonmez, F. Weng, & J. Zheng (2000), The SRI March 2000 Hub-5 Conversational Speech Transcription System. Proc. NIST Speech Transcription Workshop, College Park, MD. (HTML, PDF)

L. Mangu, E. Brill, & A. Stolcke (2000), Finding consensus in speech recognition: word error minimization and other applications of confusion networks, Computer Speech and Language 14(4), 373-400. (PDF, abstract) [2003 CSL Paper Award]

A. Stolcke, E. Shriberg, D. Hakkani-Tur, & G. Tur (1999), Modeling the Prosody of Hidden Events for Improved Word Recognition. Proc. EUROSPEECH, vol. 1, pp. 307-310, Budapest. (PDF)

L. Mangu, E. Brill, & A. Stolcke (1999), Finding Consensus Among Words: Lattice-based Word Error Minimization. Proc. EUROSPEECH, vol. 1, pp. 495-498, Budapest. (PDF)

F. Weng, A. Stolcke, & A. Sankar (1998), Efficient Lattice Representation and Generation. Proc. Intl. Conf. on Spoken Language Processing, vol. 6, pp. 2531-2534, Sydney, Australia. (PDF)

A. Sankar, F. Weng, Z. Rivlin, A. Stolcke, & R. Rao Gadde (1998), Development of SRI's 1997 Broadcast News Transcription System. Proc. DARPA Broadcast News Transcription and Understanding Workshop, pp. 91-96, Lansdowne, VA. (HTML, PDF)

F. Weng, A. Stolcke, & A. Sankar (1998), New Developments in Lattice-Based Search Strategies in SRI's Hub4 System. Proc. DARPA Broadcast News Transcription and Understanding Workshop, pp. 138-143, Lansdowne, VA. (HTML, PDF)

A. Stolcke (1997), Linguistic Knowledge and Empirical Methods in Speech Recognition. AI Magazine 18(4): Winter 1997, pp. 13-24.

A. Stolcke, Y. Konig, & M. Weintraub (1997), Explicit Word Error Minimization in N-best List Rescoring. Proc. EUROSPEECH, vol. 1, pp. 163-166, Rhodes, Greece. (PDF)

F. Weng, H. Bratt, L. Neumeyer, & A. Stolcke (1997), A Study on Multilingual Speech Recognition. Proc. EUROSPEECH, vol. 1, pp. 359-362, Rhodes, Greece. (PDF)

M. Weintraub, F. Beaufays, Z. Rivlin, Y. Konig, & A. Stolcke (1997), Neural-Network Based Measures of Confidence for Word Recognition. Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Processing, vol. 2, pp. 887-890, Munich. (PDF)

A. Sankar, L. Heck, & A. Stolcke (1997), Acoustic Modeling for the SRI Hub4 Partitioned Evaluation Continuous Speech Recognition System. Proc. DARPA Speech Recognition Workshop, pp. 127-132, Chantilly, VA. (HTML, PDF)

A. Sankar, A. Stolcke, T. Chung, L. Neumeyer, M. Weintraub, H. Franco, & F. Beaufays (1996), Noise-resistant Feature Extraction and Model Training for Robust Speech Recognition. Proc. ARPA Speech Recognition Workshop, Harriman, NY. (PDF)

C. Wooters & A. Stolcke (1994), Multiple-pronunciation Lexical Modeling in a Speaker-independent Speech Understanding System. Proc. Intl. Conf. on Spoken Language Processing, vol. 3, pp. 1363-1366, Yokohama.

D. Jurafsky, C. Wooters, G. Tajchman, J. Segal, A. Stolcke, E. Fosler, & N. Morgan (1994), The Berkeley Restaurant Project. Proc. Intl. Conf. on Spoken Language Processing, vol. 4, pp. 2139-2142, Yokohama.

Grammar Induction

J. Feldman, G. Lakoff, D. Bailey, S. Narayanan, T. Regier, & A. Stolcke (1996), L0 -- The First Five Years of an Automated Language Acquisition Project. Artificial Intelligence Review, 10(1-2), 103-129. Special Volume on Integration of Natural Language and Vision Processing: Grounding Representations, P. McKevitt (ed.). (Abstract)

A. Stolcke (1994), Bayesian Learning of Probabilistic Language Models. Doctoral dissertation, Dept. of Electrical Engineering and Computer Science, University of California at Berkeley.

A. Stolcke & S. Omohundro (1994), Best-first Model Merging for Hidden Markov Model Induction. Technical Report TR-94-003, ICSI, Berkeley, CA.

A. Stolcke & S. Omohundro (1994), Inducing Probabilistic Grammars by Bayesian Model Merging. In Grammatical Inference and Applications, R. C. Carrasco & J. Oncina, editors, Springer, pp. 106-118. (Abstract)

A. Stolcke & S. Omohundro (1992), Hidden Markov Model Induction by Bayesian Model Merging. In Advances in Neural Information Processing Systems 5, S. J. Hanson, J. D. Cowan, & C. L. Giles, editors, Morgan Kaufmann, pp. 11-18.

A. Stolcke (1991), Syntactic Category Formation with Vector Space Grammars. Proc. COGSCI, pp. 908-912, Chicago.

J. A. Feldman, G. Lakoff, A. Stolcke & S. H. Weber (1990), Miniature Language Acquisition: A touchstone for cognitive science. Proc. COGSCI, pp. 686-693, Cambridge, MA.

Other Computational Linguistics

A. Stolcke (1995), An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities. Computational Linguistics 21(2), 165-201. (HTML, PDF, abstract)

A. Stolcke (1993), An Efficient Probabilistic Context-Free Parsing Algorithm that Computes Prefix Probabilities. Technical Report TR-93-065, ICSI, Berkeley, CA. (Extended version of article above.)

A. Stolcke (1990), Gapping and Frame Semantics: A fresh look from a cognitive perspective. Proc. COLING, vol. 2, pp. 341-346, Helsinki.

Structured Connectionist Representations

A. Stolcke & D. Wu (1992), Tree Matching with Recursive Distributed Representations. Technical Report TR-92-025, ICSI, Berkeley, CA. Also in Workshop on Integrating Neural and Symbolic Processes, AAAI, San Jose, CA.

A. Stolcke (1991), Syntactic Category Formation with Vector Space Grammars. Proc. COGSCI, pp. 908-912, Chicago.

A. Stolcke (1989), Unification as Constraint Satisfaction in Structured Connectionist Networks. Neural Computation 1(4), 559-567.

A. Stolcke (1989), Processing Unification-based Grammars in a Connectionist Framework. Proc. COGSCI, pp. 908-915, Ann Arbor, MI.

A. Stolcke (1988), Generation of natural language sentences in unification-based grammars -- A connectionist approach. Diploma thesis (in German). Report FKI-94-88, Computer Science Dept., Technische Universität München.

 

©2006 SRI International, 333 Ravenswood Avenue, Menlo Park, CA 94025-3493
SRI International is an independent, nonprofit corporation.

Last modified Sep 05, 2008