CodeForces 427D Match & Catch 后缀数组

简介: CodeForces 427D Match & Catch 后缀数组

Police headquarter is monitoring signal on different frequency levels. They have got two suspiciously encoded strings s1 and s2 from two different frequencies as signals. They are suspecting that these two strings are from two different criminals and they are planning to do some evil task.


Now they are trying to find a common substring of minimum length between these two strings. The substring must occur only once in the first string, and also it must occur only once in the second string.


Given two strings s1 and s2 consist of lowercase Latin letters, find the smallest (by length) common substring p of both s1 and s2, where p is a unique substring in s1 and also in s2. See notes for formal definition of substring and uniqueness.


Input

The first line of input contains s1 and the second line contains s2 (1 ≤ |s1|, |s2| ≤ 5000). Both strings consist of lowercase Latin letters.


Output

Print the length of the smallest common unique substring of s1 and s2. If there are no common unique substrings of s1 and s2 print -1.

题目大意:

就是对于两个字符串s1和s2, 求出他们最短的一个公共字串t,满足t在s1中只出现1次, t在s2中也只出现一次, 输出t最小的长度, 如果这样的t不存在, 输出-1


思路:将这两个字符串拼成一个串,中间用一个没出现过的字符隔开(我用的空格),对合成的串求后缀数组,这样两个数组的公共子串所在的后缀在sa中应该是相邻的。通过观察,只有当这两个sa中相邻的后缀sa[i]和sa[i-1]左右的后缀都不含这个子串时,才能保证子串在两个串都只出现一次。


具体来说就是

(1)后缀sa[i]和sa[i-1]分别属于两个字符串

(2)height[i-1] < height[i] && height[i+1] < height[i]

(3)这个最小公共子串长度就是max(height[i-1], height[i+1]) + 1


整体扫过一遍height数组统计最小值即可,没有满足的情况时输出-1

#include <iostream>
#include <stdio.h>
#include <string.h>
using namespace std;
const int maxn = 2e6 + 5;
char s1[maxn], s2[maxn], s[maxn];
int n, p1, p2, m, p;
int tx[maxn], sa[maxn], rak[maxn];
int height[maxn], num1[maxn];
const int inf = 1e9;
struct node {
  int x, y, id;
}a[maxn], b[maxn];
void rsort() {
  for (int i = 1; i <= m; i++) {
    tx[i] = 0;
  }
  for (int i = 1; i <= n; i++) {
    tx[a[i].y]++;
  }
  for (int i = 1; i <= m; i++) {
    tx[i] += tx[i - 1];
  }
  for (int i = 1; i <= n; i++) {
    b[tx[a[i].y]--] = a[i];
  }
  for (int i = 1; i <= m; i++) {
    tx[i] = 0;
  }
  for (int i = 1; i <= n; i++) {
    tx[b[i].x]++;
  }
  for (int i = 1; i <= m; i++) {
    tx[i] += tx[i - 1];
  }
  for (int i = n; i >= 1; i--) {
    a[tx[b[i].x]--] = b[i];
  }
}
void solve() {
  rsort();
  p = 0;
  for (int i = 1; i <= n; i++) {
    if (a[i].x == a[i - 1].x && a[i].y == a[i - 1].y) {
      rak[a[i].id] = p;
    } else {
      rak[a[i].id] = ++p;
    }
  }
  for (int i = 1; i <= n; i++) {
    a[i].x = rak[i];
    a[i].id = sa[rak[i]] = i;
    a[i].y = 0;
  }
  m = p;
}
void ssort() {
  m = 127;
  for (int i = 1; i <= n; i++) {
    a[i].x = a[i].y = s[i];
    a[i].id = i;
  }
  solve();
  for (int j = 1; j <= n; j <<= 1) {
    for (int i = 1; i + j <= n; i++) {
      a[i].y = a[i + j].x;
    }
    solve();
    if (p == n) {
      break;
    }
  }
} 
void get_Height() {
  int k = 0;
  for (int i = 1; i <= n; i++) {
    if (k) {
      k--;
    }
    int j = sa[rak[i] - 1];
    while (s[i + k] == s[j + k]) {
      k++;
    }
    height[rak[i]] = k;
  }
}
int main() {
  scanf("%s", s1 + 1);
  scanf("%s", s2 + 1);
  p1 = strlen(s1 + 1);
  p2 = strlen(s2 + 1);
  for (int i = 1; i <= p1; i++) {
    s[++n] = s1[i];
    num1[n] = 2;
  }
  s[++n] = '#';
  for (int i = 1; i <= p2; i++) {
    s[++n] = s2[i];
    num1[n] = 1;
  }
  ssort();
  get_Height();
  int ans = inf;
  for (int i = 2; i <= n; i++) {
      if (num1[sa[i]] + num1[sa[i - 1]] == 3 && height[i - 1] < height[i] && height[i] > height[i + 1]) {
        ans = min(ans, max(height[i - 1], height[i + 1]) + 1);
    }
  }
  printf("%d\n", ans == inf ? -1 : ans);
  return 0;
}
相关文章
|
算法
light oj 1255 - Substring Frequency (KMP)
输入两个字符串,计算二串在一串中出现的次数。 裸裸的KMP,参考刘汝佳《算法竞赛入门经典训练指南》 P212 或数据结构。
27 1
|
机器学习/深度学习
CF1304B Longest Palindrome(可逆转符号,数学分析,回文串)
CF1304B Longest Palindrome(可逆转符号,数学分析,回文串)
45 0
|
算法 Java 程序员
【手绘算法】力扣 3 无重复的最长字符串(Longest Substring Without Repeating Characters)
Hi,大家好,我是Tom。一个美术生转到Java开发的程序员。今天给大家分享的是力扣题第3题,无重复的最长字符串。在解题过程中,也会手写一些伪代码。当然,如果想要完整的源码的话,可以到我的个人主页简介中获取。 这道题呢属于中等难度,评估为四颗星,它的通过率只有39%。
91 0
|
算法 Java 程序员
【手绘算法】力扣 20 有效的括号(Valid Parentheses)
Hi,大家好,我是Tom。一个美术生转到Java开发的程序员。今天给大家分享的是力扣题第20题,有效的括号。在解题过程中,也会手写一些伪代码。当然,如果想要完整的源码的话,可以到我的个人主页简介中获取。 这道题呢比较简单,但是曾经成为B站的算法面试题,而且通过率只有44.5%。
84 0
|
Java 程序员 开发者
LeetCode第三题(Longest Substring Without Repeating Characters)三部曲之一:解题思路
本文回顾了笔者对LeetCode第三题(Longest Substring Without Repeating Characters)的解题和优化思路,希望有兴趣的读者来一起广泛讨论
109 1
LeetCode第三题(Longest Substring Without Repeating Characters)三部曲之一:解题思路
|
Java 程序员 开发者
LeetCode第三题(Longest Substring Without Repeating Characters)三部曲之二:编码实现
本文是《LeetCode第三题(Longest Substring Without Repeating Characters)三部曲》的第二篇,前一篇文章已经列出了完整的解题思路,今天来将此思路转化为具体的Java代码
103 0
LeetCode第三题(Longest Substring Without Repeating Characters)三部曲之二:编码实现
|
人工智能 JavaScript BI
AtCoder Beginner Contest 222 D - Between Two Arrays(前缀和优化dp)
AtCoder Beginner Contest 222 D - Between Two Arrays(前缀和优化dp)
106 0
|
人工智能
UPC-Match Matching(完全背包dp+字符串)
UPC-Match Matching(完全背包dp+字符串)
80 0
HDOJ/HDU 2163 Palindromes(判断回文串~)
HDOJ/HDU 2163 Palindromes(判断回文串~)
94 0
HDOJ/HDU 1062 Text Reverse(字符串翻转~)
HDOJ/HDU 1062 Text Reverse(字符串翻转~)
113 0