The linguistic summarization of time series data (TSD) has been studied extensively because the extracted knowledge, expressed as summary sentences in natural language, is readily interpretable by human users. Existing extraction methods rely on manually designed fuzzy partitions of the value domains, so the semantics of the words depend on the subjective opinions of the designers. Moreover, the number of linguistic words with fuzzy set-based computational semantics used to describe the TSD, the quantifier, and the summarizer is usually limited to 7±2. This cardinality is not rich enough to describe the special characteristics of a particular period in the TSD. In this paper, enlarged hedge algebras are applied to create a mathematical formalism for automatically designing interpretable and scalable multi-level semantic structures for the value domains of the linguistic variables; these structures can be extended arbitrarily as needed. The objectives of the applied genetic algorithm are also adjusted to better match the optimization goals. Experimental results on patient admission data show that the proposed methods perform well in terms of accuracy, conciseness, and coverage.
Copyrights © 2026