General definitions
An ordered alphabet Σ is a set of symbols {σ1, σ2,...,σℓ} such that σ1 < σ2 < ... < σℓ.
A word w = w0w1..wn-1 of length |w| = n on alphabet Σ is the concatenation of n symbols wk (0 ≤ k < n) drawn from Σ. The empty word of length 0 is denoted by ε.
Given a word w, if w can be written as the concatenation of three possibly empty words u, v and z (w = uvz), then:
- u is called a prefix of w
- v is called a factor of w
- z is called a suffix of w
A prefix, factor or suffix is said to be proper iff it is different from w itself.
The set of all the prefixes (resp. factors, suffixes) of a word w is denoted by Pref(w) (resp. Fact(w), Suf(w)).
The prefix (resp. suffix) of length k of a word w is denoted by Prefk(w) (resp. Sufk(w)).
Given a word w, any word in Pref(w) ∩ Suf(w) is called a border of w. The set of all the borders of w is denoted by Bord(w).
A word w of length n has period p, 0 < p < n, iff, ∀ 0 ≤ k < n - p, wk = wk+p.
If no such period exists, the word is said to be aperiodic.
A word is primitive iff there does not exist any p < n dividing n such that p is a period of w.
If a word w can be written as the concatenation of two words u and v (w = uv), then the word vu is called a conjugate of w.
The set of all the conjugates of a word w is denoted by Conj(w) and is called the conjugacy class of w.
The lexicographic order <lex (lexorder for short) is defined as follows. Given two words w and w' on an ordered alphabet Σ, then w < lex w' iff one of these conditions is verified:
- w is a proper prefix of w'
- ∃ k ≥ 0 such that Prefk(w) = Prefk(w') and wk < w'k
Lyndon words
A word is a Lyndon word iff it is strictly less (in lexorder) than any of its proper suffixes.
Example:
- ananas is a Lyndon word since ananas <lex anas <lex as <lex nanas <lex nas <lex s
- banana is not a Lyndon word since a <lex ana <lex anana <lex banana <lex na <lex nana
A word is a Lyndon word iff it is the only smallest (in lexorder) element of its conjugacy class.
Example:
- ananas is a Lyndon word since ananas <lex anasan <lex asanan <lex nanasa <lex nasana <lex sanana
- banana is not a Lyndon word since abanan <lex anaban <lex ananab <lex banana <lex nabana <lex nanaba
A word w of length n is a Lyndon word iff ∀ 0 < k < n, Prefk(w) <lex Sufk(w).
Example:
- ananas is a Lyndon word since a <lex s, an<lex as, ana <lex nas, anan <lex anas, anana <lex nanas
- banana is not a Lyndon word since a <lex b...
- A Lyndon word is primitive.
- A Lyndon word is aperiodic.
- Given a Lyndon word w of length n > 1, there always exist two Lyndon words u and v such that w=uv and u <lex v.
Lyndon array
Given a word w of length n, the Lyndon array of w, denoted by Lw is defined as: ∀ 0 ≤ k < n, Lw[k] is the length of the longest prefix of wk..wn-1 that is a Lyndon word.
Example:
k | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
w | a | b | a | a | b | b | a | b | b | a | a | a | b | b | a |
Lw | 2 | 1 | 7 | 3 | 1 | 1 | 3 | 1 | 1 | 5 | 4 | 3 | 1 | 1 | 1 |