From 446ef2e4cc8746a59270ba461d8c4c122be08005 Mon Sep 17 00:00:00 2001
From: George Datseris
Date: Wed, 13 Aug 2025 14:14:55 +0100
Subject: [PATCH] Update lempel_ziv.jl with binarization tip

---
 src/complexity_measures/lempel_ziv.jl | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/complexity_measures/lempel_ziv.jl b/src/complexity_measures/lempel_ziv.jl
index 4642f42c5..813eff557 100644
--- a/src/complexity_measures/lempel_ziv.jl
+++ b/src/complexity_measures/lempel_ziv.jl
@@ -5,7 +5,8 @@ export LempelZiv76
     LempelZiv76()
 
 The Lempel-Ziv, or `LempelZiv76`, complexity measure [LempelZiv1976](@cite),
-which is used with [`complexity`](@ref) and [`complexity_normalized`](@ref).
+which is used with [`complexity`](@ref) and [`complexity_normalized`](@ref)
+along with timeseries/vector input data.
 
 For results to be comparable across sequences with different length, use the normalized
 version. Normalized `LempelZiv76`-complexity is implemented as given in [Amigó2004](@citet).
@@ -18,6 +19,9 @@ two-element alphabet (precisely two distinct outcomes). For performance optimiza
 we do not check the number of unique elements in the input. If your input sequence is not
 binary, you must [`encode`](@ref) it first using one of the implemented [`Encoding`](@ref) schemes
 (or encode your data manually).
+
+A common binarization done on a timeseries is to transform each value to 0 if it is
+less than the mean, or 1 if it is higher than the mean.
 """
 struct LempelZiv76 <: ComplexityEstimator end
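
The binarization tip added to the docstring can be sketched in a few lines of Julia. This is only an illustration: `binarize_by_mean` is a hypothetical helper name that does not appear in the patch, and the handling of values exactly equal to the mean (mapped to 0 here) is an assumption, since the docstring does not cover that case.

```julia
using Statistics: mean

# Hypothetical helper (not part of the patch): map each value to 1 if it is
# strictly above the mean of `x`, and to 0 otherwise.
# Assumption: values exactly equal to the mean map to 0.
binarize_by_mean(x::AbstractVector{<:Real}) = Int.(x .> mean(x))

x = [0.2, 1.5, 0.7, 3.1, 0.1]   # mean is 1.12
b = binarize_by_mean(x)          # [0, 1, 0, 1, 0]

# With ComplexityMeasures.jl loaded, `b` could presumably be passed on as
# `complexity(LempelZiv76(), b)`; that call is an assumption based on the
# docstring and is not executed here.
```

The result is a plain `Vector{Int}` with exactly two distinct outcomes, which is the kind of input the docstring says `LempelZiv76` expects.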