CPM 2021

32nd Annual Symposium on Combinatorial Pattern Matching

Wrocław, Poland, July 5–7, 2021

Conference

Papers and proceedings

Local information

Travel

Other

Keynote Speakers

Hideo Bannai
Tokyo Medical and Dental University, Japan

Repetitions in strings: a "constant" problem

Repeating structures in strings is one of the most fundamental characteristics of strings, and has been an important topic in the field of combinatorics on words and combinatorial pattern matching since its beginnings. In this talk, I will focus on squares and maximal repetitions, and review the "runs" theorem and related results which address the two main questions: how many can be contained in a string of given length, and algorithms for computing them.

Michal Koucký
Charles University, Czech Republic

Computing edit distance

The edit distance (or Levenshtein distance) between two strings x, y is the minimum number of character insertions, deletions, and substitutions needed to convert x into y. It has numerous applications in various fields from text processing to bioinformatics so algorithms for edit distance computation attract lot of attention. In this talk I will survey recent progress on computational aspects of edit distance in several contexts: computing edit distance approximately, sketching and computing it in streaming model, exchanging strings in communication complexity model, and building error correcting codes for edit distance. I will point out many problems that are still open in those areas.

Nadia Pisanti
University of Pisa, Italy & Erable Team INRIA, France

On-line Algorithms and Applications for Pattern Matching on Degenerate Texts

The "Elastic Degenerate String Matching" (EDSM) problem was defined in CPM 2017 as that of finding an occurrence of a pattern P of length m in an ED-text T. A D-text (degenerate text) is a string that actually represents a set of similar and aligned strings by collapsing common fragments into a standard linear string, and representing variants with sets of alternative substrings. When such substrings are not bound to have the same size, then we talk about elastic D-strings (ED-strings). In CPM 2017 we gave an O(nm2+N) time on-line algorithm for EDSM, where n is the length of T and N is its size, defined as the total number of letters. A fundamental toolkit of our algorithm is the O(m2+N) time solution of the (later called) "Active Prefixes" (AP) problem". In CPM 2018, Aoyama et al. gave a O(m1.5 √log m+N) solution for AP leading to a O(nm1.5 √log m+N) time solution for EDSM using our CPM 2017 algorithmic framework. The natural open problem thus because whether the 1.5 exponent could furtherly be decreased. In our ICALP 2019, we prove several properties that answer this and other questions. First, we give a conditional O(nm1.5+N) lower bound for EDSM, proving that a combinatorial algorithm solving EDSM in O(nm1.5-ϵ +N) time implies to break the Boolean Matrix Multiplication (BMM) conjecture, which states that there is no truly sub-cubic combinatorial algorithm solving BMM. Second, we use this result as a hint to devise a non-combinatorial algorithm that solves EDSM in O(nm1.381+N) time. We do so by succesfully combining Fast Fourier Transform and properties of string periodicity.