nlp - N-gram model and smoothing algorithm
Which smoothing algorithm is easy to implement and still effective?
A hex dump of my training corpus looks like this:
64 fa eb 63 31 d2 62 22 19 bd 64 b5 63 17 4f 48 62 a8 64 11 0f 62 15 9b 64 9b 1f e1 63 62 63
I built 2-, 3-, 4-, and 5-gram language models on it and now need smoothing. Which smoothing algorithm is suitable and easy to implement in this case?
Laplace (add-one) smoothing should be easy to implement. When it comes to robustness, the common n-gram toolkits (KenLM, SRILM, ...) use or support Kneser-Ney smoothing, which is the usual choice.
For an overview of how different smoothing techniques perform, see http://www.aclweb.org/anthology/p/p96/p96-1041.pdf
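To show how little code add-one smoothing needs, here is a minimal sketch in Python. It assumes each byte of the hex dump is one symbol, so the vocabulary size is 256; that is a guess about the data, not something stated in the question. The function names train_ngram_counts and laplace_prob are made up for this example. The estimate used is P(w | context) = (count(context, w) + 1) / (count(context) + V).

```python
from collections import Counter


def train_ngram_counts(data: bytes, n: int):
    """Count every n-gram and its (n-1)-symbol context in a byte sequence."""
    ngrams = Counter()
    contexts = Counter()
    for i in range(len(data) - n + 1):
        gram = data[i:i + n]
        ngrams[gram] += 1
        contexts[gram[:-1]] += 1  # context = the n-gram without its last symbol
    return ngrams, contexts


def laplace_prob(gram: bytes, ngrams, contexts, vocab_size: int = 256) -> float:
    """Add-one estimate: (c(context, w) + 1) / (c(context) + V).

    vocab_size=256 is an assumption (one symbol per byte).
    """
    return (ngrams[gram] + 1) / (contexts[gram[:-1]] + vocab_size)


# The hex dump from the question, treated as raw bytes.
corpus = bytes.fromhex(
    "64 fa eb 63 31 d2 62 22 19 bd 64 b5 63 17 4f 48 "
    "62 a8 64 11 0f 62 15 9b 64 9b 1f e1 63 62 63"
)

ngrams, contexts = train_ngram_counts(corpus, 2)

seen = laplace_prob(bytes([0x64, 0xFA]), ngrams, contexts)    # bigram in the corpus
unseen = laplace_prob(bytes([0x64, 0x00]), ngrams, contexts)  # bigram never observed
print(seen, unseen)
```

The same two functions work for n = 3, 4, 5 by changing the second argument of train_ngram_counts. With so little data and 256 possible symbols, add-one smoothing moves most of the probability mass to unseen events, which is one reason Kneser-Ney tends to be preferred for the higher orders.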