error in discount estimator for order 3

From: "Rebecca Madsen" <rmadsen at ADDRESS HIDDEN>
Date: Thu, 3 Aug 2006 15:02:46 -0600

Is there a reason why duplicating my data would give me the following error:

using ModKneserNey for 3-grams
Kneser-Ney smoothing 3-grams
n1 = 0
n2 = 94762
n3 = 0
n4 = 37773
one of required modified KneserNey count-of-counts is zero
error in discount estimator for order 3

I can build a language model from the normal data using the command
line below, but running it on two concatenated copies of the data gives
me the discount estimator error.

$ /home/tools/srilm/bin/i686/ngram-count -text my_data_doubled.txt \
    -interpolate -kndiscount1 -kndiscount2 -kndiscount3 \
    -lm my_data_doubled.lm

Thanks for your help,
Rebecca
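
A likely explanation, sketched here and not confirmed in the post: concatenating two copies of the data doubles every n-gram count, so no 3-gram can occur exactly once or exactly three times. That would leave n1 = 0 and n3 = 0, which matches the output above. The modified Kneser-Ney discounts in the standard Chen and Goodman formulation (Y = n1 / (n1 + 2*n2), D1 = 1 - 2*Y*n2/n1, D2 = 2 - 3*Y*n3/n2, D3+ = 3 - 4*Y*n4/n3) divide by n1 and n3, so they cannot be estimated in that case. The Python below is a toy illustration only. The function count_of_counts and the three-sentence corpus are hypothetical, and unlike ngram-count it adds no sentence-boundary tags.

```python
from collections import Counter

def count_of_counts(sentences, order=3, max_count=4):
    """Return {r: number of distinct n-grams seen exactly r times} for r = 1..max_count.

    Hypothetical helper for illustration; it is not part of SRILM and
    does not add <s> / </s> boundary tokens the way ngram-count does.
    """
    ngrams = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i in range(len(tokens) - order + 1):
            ngrams[tuple(tokens[i:i + order])] += 1
    freq = Counter(ngrams.values())
    return {r: freq.get(r, 0) for r in range(1, max_count + 1)}

# Toy corpus (made up for this sketch).
corpus = ["a b c d", "a b c e", "x y z"]

single = count_of_counts(corpus)            # counts from one copy of the data
doubled = count_of_counts(corpus + corpus)  # counts from two concatenated copies

print("single :", single)
print("doubled:", doubled)
```

In the doubled corpus every count is even, so the n1 and n3 entries are zero, while the doubled n2 equals the original n1 and the doubled n4 equals the original n2. If the same holds for the real data, the reported n2 = 94762 and n4 = 37773 would be the n1 and n2 of the undoubled corpus.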


Last modified Dec 02, 2008