Jewels of Stringology

Maxime Crochemore

Language: English

Pages: 320

ISBN: 9810248970

Format: PDF / Kindle (mobi) / ePub


The term “stringology” is a popular nickname for text algorithms, or algorithms on strings. This book deals with the most basic algorithms in the area. Most of them can be viewed as “algorithmic jewels” and deserve reader-friendly presentation. One of the main aims of the book is to present several of the most celebrated algorithms in a simple way by omitting obscuring details and separating algorithmic structure from combinatorial theoretical background. The book reflects the relationships between applications of text-algorithmic techniques and the classification of algorithms according to the measures of complexity considered. The text can be viewed as a parade of algorithms in which the main purpose is to discuss the foundations of the algorithms and their interconnections. One can partition the algorithmic problems discussed into practical and theoretical problems. Certainly, string matching and data compression are in the former class, while most problems related to symmetries and repetitions in texts are in the latter. However, all the problems are interesting from an algorithmic point of view and enable the reader to appreciate the importance of combinatorics on words as a tool in the design of efficient text algorithms.

In most textbooks on algorithms and data structures, the presentation of efficient algorithms on words is quite short compared to issues in graph theory, sorting, searching, and some other areas. At the same time, there are many presentations of interesting algorithms on words accessible only in journals and in a form directed mainly at specialists. This book fills the gap in the book literature on algorithms on words, and brings together the many results presently dispersed in the masses of journal articles. The presentation is reader-friendly; many examples and about two hundred figures nicely illustrate the behaviour of otherwise very complex algorithms.

Information and, no doubt, at this moment many computers are solving this problem as a frequently used operation in some application system. Pattern matching is comparable in this sense to sorting, or to basic arithmetic operations. Consider the problem of a reader of the French dictionary "Grand Larousse," who wants all entries related to the name "Marie-Curie-Sklodowska." This is an example of a pattern matching problem, or string matching. In this case, the name.
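As a point of reference for the string-matching problem described in this excerpt, here is a minimal sketch of the naive approach: try every starting position in the text and compare. The function name find_occurrences and the sample strings are illustrative assumptions, not taken from the book, and this is not one of the efficient algorithms the book presents.

```python
def find_occurrences(pat: str, text: str) -> list[int]:
    """Return every 0-based position where pat occurs in text.

    Naive method: test each candidate start position by direct
    comparison, which costs O(len(pat) * len(text)) in the worst case.
    """
    m, n = len(pat), len(text)
    return [i for i in range(n - m + 1) if text[i:i + m] == pat]


if __name__ == "__main__":
    # Hypothetical sample data, chosen only to show overlapping matches.
    print(find_occurrences("aba", "ababa"))
```

The slice comparison hides the inner character-by-character loop; algorithms such as MP avoid re-examining text characters after a mismatch.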

Contradiction. Assume that in some earlier iteration we scan the "forbidden" part of the text (shaded in Figure 3.5). Let q be the end position of the match in this iteration. Then q is not a critical position, and this match is contained completely in the current match (its overlap with the current match is shorter than k and q lies too far from the beginning of the current match). By the same argument, the rightmost critical position in the current match is to the right of q. Hence, we have.

Text, we have parse1(text) = 2·first1(text) − 1; and for the third text, we have parse1(text) = first1(text). It turns out that, as a general rule, only these cases are possible. Lemma 8.2 Let x be a palstar, then parse1(x) ∈ {first1(x), 2·first1(x) − 1, 2·first1(x) + 1}. Proof. The proof is similar to the proof of the preceding lemma. In fact, the two special cases (2·first1(text) ± 1) are caused by the irregularity implied at critical points by considering odd and even palindromes together.
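Lemma 8.2 can be checked by brute force on short words. The sketch below rests on assumptions, because the excerpt does not include the definitions: a palstar is taken to be a concatenation of palindromes of length at least 2, first1(x) is taken to be the length of the shortest such palindrome that is a prefix of x, and parse1(x) the length of the shortest such prefix whose remaining suffix is again a palstar. The function names mirror the quantities in the lemma and are otherwise hypothetical.

```python
from functools import lru_cache
from itertools import product


def is_pal(s: str) -> bool:
    """Nontrivial palindrome: length at least 2 (assumed definition)."""
    return len(s) >= 2 and s == s[::-1]


@lru_cache(maxsize=None)
def is_palstar(s: str) -> bool:
    """True if s is a concatenation of nontrivial palindromes (empty allowed)."""
    if s == "":
        return True
    return any(is_pal(s[:k]) and is_palstar(s[k:]) for k in range(2, len(s) + 1))


def first1(s: str):
    """Length of the shortest nontrivial palindromic prefix, or None."""
    return next((k for k in range(2, len(s) + 1) if is_pal(s[:k])), None)


def parse1(s: str):
    """Length of the shortest palindromic prefix whose remainder is a palstar."""
    return next(
        (k for k in range(2, len(s) + 1) if is_pal(s[:k]) and is_palstar(s[k:])),
        None,
    )


def lemma_counterexamples(max_len: int, alphabet: str = "ab") -> list[str]:
    """Non-empty palstars up to max_len that violate the three-case rule."""
    bad = []
    for n in range(2, max_len + 1):
        for letters in product(alphabet, repeat=n):
            x = "".join(letters)
            if is_palstar(x):
                f, p = first1(x), parse1(x)
                if p not in (f, 2 * f - 1, 2 * f + 1):
                    bad.append(x)
    return bad
```

Under these assumed definitions, "aaa" gives first1 = 2 and parse1 = 3 (the 2·first1 − 1 case), "aabaa" gives 2 and 5 (the 2·first1 + 1 case), and "abaaba" gives 3 and 3.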

Space and linear time with MP algorithm modified by the shift function Shift. If the pattern pat is not GS-good, we consider its GS-decomposition (u,v). Searching for the whole pattern is done by the previous search for its part v, together with naive tests for the prefix part u. Since |u| < 2·period(v), comparisons for the latter tests are no more than |text|. This gives the following informal description of Galil-Seiferas algorithm. Observe that it is quite similar to MaxSuffix-Matching, where.

Automaton SMA(Π) with a failure function. The advantage in doing so is to represent the automaton within space O(statesize(Tree(Π))), a quantity that is independent of the alphabet. The search then becomes analogous to the MP algorithm of Chapter 3. Continuing the analogy, a function Bord related to Π can then be defined as follows. For a non-empty word u, Bord(u) = longest proper suffix of u that is a prefix of some pattern in Π. We also denote by Bord the failure table defined on nodes of Tree.
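The definition of Bord can be made concrete with a small sketch. The first function below follows the definition literally; the second builds a failure table over the nodes of the pattern trie, identifying each node with the prefix it spells, by following the failure link of the parent in the style of the MP border computation. The names bord and build_failure_table, the representation of nodes as strings, and the sample pattern set are illustrative assumptions, not the book's code.

```python
def bord(u: str, patterns: set[str]) -> str:
    """Longest proper suffix of non-empty u that is a prefix of some pattern."""
    for k in range(1, len(u) + 1):
        suffix = u[k:]  # proper suffixes, longest first; k == len(u) gives ""
        if any(p.startswith(suffix) for p in patterns):
            return suffix
    raise ValueError("patterns must be non-empty")


def build_failure_table(patterns: set[str]) -> dict[str, str]:
    """Failure table on trie nodes, each node named by the prefix it spells."""
    nodes = {p[:i] for p in patterns for i in range(len(p) + 1)}
    fail: dict[str, str] = {}
    # Process nodes by increasing depth so that shorter nodes are ready first.
    for node in sorted(nodes, key=len):
        if node == "":
            continue  # the root has no failure link
        parent, letter = node[:-1], node[-1]
        f = fail.get(parent)  # None when the parent is the root
        while f is not None and f + letter not in nodes:
            f = fail.get(f)  # becomes None once we fall off the root
        fail[node] = "" if f is None else f + letter
    return fail


if __name__ == "__main__":
    pats = {"he", "she", "his", "hers"}  # hypothetical pattern set
    table = build_failure_table(pats)
    print(table["she"], repr(table["hers"]))
```

Storing nodes as strings keeps the sketch short but uses more memory than a real trie with integer node identifiers; the table itself has one entry per node, which is the alphabet-independent size the excerpt refers to.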
