General definitions
An ordered alphabet Σ is a set of symbols {σ_{1}, σ_{2},...,σ_{ℓ}} such that σ_{1} < σ_{2} < ... < σ_{ℓ}.
A word w = w_{0}w_{1}..w_{n-1} of length |w| = n on alphabet Σ is the concatenation of n symbols w_{k} (0 ≤ k < n) drawn from Σ. The empty word of length 0 is denoted by ε.
Given a word w, if w can be written as the concatenation of three possibly empty words u, v and z (w = uvz), then:
- u is called a prefix of w
- v is called a factor of w
- z is called a suffix of w
A prefix, factor or suffix is said to be proper iff it is different from w itself.
The set of all the prefixes (resp. factors, suffixes) of a word w is denoted by Pref(w) (resp. Fact(w), Suf(w)).
The prefix (resp. suffix) of length k of a word w is denoted by Pref_{k}(w) (resp. Suf_{k}(w)).
Given a word w, any word in Pref(w) ∩ Suf(w) is called a border of w. The set of all the borders of w is denoted by Bord(w).
A word w of length n has period p, 0 < p < n, iff, ∀ 0 ≤ k < n - p, w_{k} = w_{k+p}.
If no such period exists, the word is said to be aperiodic.
A word is primitive iff there does not exist any p < n dividing n such that p is a period of w.
If a word w can be written as the concatenation of two words u and v (w = uv), then the word vu is called a conjugate of w.
The set of all the conjugates of a word w is denoted by Conj(w) and is called the conjugacy class of w.
The lexicographic order <_{lex} (lexorder for short) is defined as follows. Given two words w and w' on an ordered alphabet Σ, then w < _{lex} w' iff one of these conditions is verified:
- w is a proper prefix of w'
- ∃ k ≥ 0 such that Pref_{k}(w) = Pref_{k}(w') and w_{k} < w'_{k}
Lyndon words
A word is a Lyndon word iff it is strictly less (in lexorder) than any of its proper suffixes.
Example:
- ananas is a Lyndon word since ananas <_{lex} anas <_{lex} as <_{lex} nanas <_{lex} nas <_{lex} s
- banana is not a Lyndon word since a <_{lex} ana <_{lex} anana <_{lex} banana <_{lex} na <_{lex} nana
A word is a Lyndon word iff it is the only smallest (in lexorder) element of its conjugacy class.
Example:
- ananas is a Lyndon word since ananas <_{lex} anasan <_{lex} asanan <_{lex} nanasa <_{lex} nasana <_{lex} sanana
- banana is not a Lyndon word since abanan <_{lex} anaban <_{lex} ananab <_{lex} banana <_{lex} nabana <_{lex} nanaba
A word w of length n is a Lyndon word iff ∀ 0 < k < n, Pref_{k}(w) <_{lex} Suf_{k}(w).
Example:
- ananas is a Lyndon word since a <_{lex} s, an<_{lex} as, ana <_{lex} nas, anan <_{lex} anas, anana <_{lex} nanas
- banana is not a Lyndon word since a <_{lex} b...
- A Lyndon word is primitive.
- A Lyndon word is aperiodic.
- Given a Lyndon word w of length n > 1, there always exist two Lyndon words u and v such that w=uv and u <_{lex} v.
Lyndon array
Given a word w of length n, the Lyndon array of w, denoted by L_{w} is defined as: ∀ 0 ≤ k < n, L_{w}[k] is the length of the longest prefix of w_{k}..w_{n-1} that is a Lyndon word.
Example:
k | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
w | a | b | a | a | b | b | a | b | b | a | a | a | b | b | a |
L_{w} | 2 | 1 | 7 | 3 | 1 | 1 | 3 | 1 | 1 | 5 | 4 | 3 | 1 | 1 | 1 |