Re: Perplexity in "ngram"

From: Andreas Stolcke <stolcke at ADDRESS HIDDEN>
Date: Fri, 01 Jun 2007 12:39:08 -0700

Mats Svenson wrote:
> Hi,
>  I have tried to use "ngram" to compute the perplexity of my
> LMs. However, I am not sure how the SRILM
> implementation treats OOVs when computing
> perplexity. Is it that "log P(<unk>|history) != 0", or
> are OOVs just ignored? If a model with a higher number
SRILM excludes words with zero probability from the perplexity
computation and reports their tally separately.  That includes OOV
words when the LM doesn't contain an unknown word (<unk>) token.
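
To make the effect of that exclusion concrete, here is a minimal sketch,
not SRILM code.  It assumes per-word log10 probabilities are already
available, marks zero-probability words with -inf, and ignores
end-of-sentence tokens for simplicity.  The function name and the toy
numbers are made up for illustration.

```python
import math

def perplexity_excluding_zeroprobs(log10_probs):
    """Return (perplexity, n_excluded) for per-word log10 probabilities.

    Zero-probability words are represented as -inf (an assumption of this
    sketch). They are left out of the average and counted separately.
    """
    scored = [lp for lp in log10_probs if lp != float("-inf")]
    n_excluded = len(log10_probs) - len(scored)
    if not scored:
        raise ValueError("no words with non-zero probability")
    avg_log10 = sum(scored) / len(scored)
    return 10 ** (-avg_log10), n_excluded

# Toy data: the same four-word text under two hypothetical LMs.
# LM A has no <unk>, so the rare third word gets zero probability.
lm_a = [-1.0, -2.0, float("-inf"), -1.0]
# LM B has <unk> and assigns the rare word a small probability.
lm_b = [-1.0, -2.0, -5.0, -1.0]

ppl_a, skipped_a = perplexity_excluding_zeroprobs(lm_a)
ppl_b, skipped_b = perplexity_excluding_zeroprobs(lm_b)
print(f"LM A: ppl={ppl_a:.1f}, excluded={skipped_a}")
print(f"LM B: ppl={ppl_b:.1f}, excluded={skipped_b}")
```

LM A ends up with the lower perplexity only because its hardest word
was never scored, which is why the excluded tally has to be read
alongside the perplexity.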

> of OOVs has a lower perplexity than another LM, does
> that mean it is "better" under this -ppl
> implementation?
Possibly.  You should not compare perplexities of LMs with different
vocabularies.

> Second, in some discussions, I have heard about a -ppl1
> option, but the current version does not seem to have
> it. How does -ppl1 differ from -ppl?
There is no -ppl1 option.  -ppl reports a statistic labeled "ppl1",
which is explained in the ngram man page.

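As a rough illustration of the two normalizations, the sketch below
recomputes both statistics from the summary counts that -ppl prints.
It rests on my reading of the SRILM documentation: "ppl" divides the
total log10 probability by the number of scored words plus
end-of-sentence tokens, and "ppl1" leaves the end-of-sentence tokens
out.  The function name is hypothetical and the numbers are invented;
check the ngram man page before relying on the exact denominators.

```python
import math

def ppl_and_ppl1(logprob, words, sentences, oovs=0, zeroprobs=0):
    """Recompute (ppl, ppl1) from -ppl style summary counts.

    logprob is the total log10 probability of the scored tokens.
    Assumed denominators (verify against the ngram man page):
      ppl  : words - oovs - zeroprobs + sentences
      ppl1 : words - oovs - zeroprobs
    """
    scored_words = words - oovs - zeroprobs
    if scored_words <= 0:
        raise ValueError("no scored words")
    ppl = 10 ** (-logprob / (scored_words + sentences))
    ppl1 = 10 ** (-logprob / scored_words)
    return ppl, ppl1

# Invented counts: 2 sentences, 10 words, no OOVs, logprob = -30.
ppl, ppl1 = ppl_and_ppl1(logprob=-30.0, words=10, sentences=2)
print(f"ppl={ppl:.2f} ppl1={ppl1:.2f}")
```

Because ppl1 spreads the same log probability over fewer tokens, it is
never smaller than ppl for the same text.
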
> Third, is there a way to meaningfully compute
> perplexity for a hidden event LM? Or another way
> to evaluate hidden event LM quality?
Hidden event LMs are LMs, so you can compute a word-based perplexity just
like for any other LM.  If the goal of the HE-LM is to decode hidden events
(like sentence boundaries) then you can obviously evaluate that task as
well.

Andreas


Last modified Dec 02, 2008