George Mason University
George Mason University Mason
George Mason University

Suffix Trees for Universal Source Modeling with Applications

by Bernd-Peter Paris

Publication Details MORE LESS

  • Published Date: March 23, 2011

Abstract

Suffix trees are shown to reveal significant information about the internal structure of an individ- ual sequence S. Specifically it is shown how the number of occurrences of any subsequence of S, the recurrence period for any subsequence that occurs at least twice in S, and the longest subsequence that occurs at least twice in S are determined from the sequence's suffix tree. Since suffix trees can be constructed with low, O(N) computational complexity, they provide a pow- erful tool for information-theoretic sequence analysis. We further demonstrate the utility of suffix trees by using the collected statistics for two common problems in information theory: variable order source modeling and entropy estimation. Index Terms—Suffix tree, finite state machine, vari- able order Markov model, entropy estimation

Other Contributors

Jay Gibble
Expertise