30.串联所有单词的子串

30. 串联所有单词的子串

鲁迅曾说，人生就是一道道关，闯过去了，你就进阶了，闯不过去，你就原地踏

步。年后这段时间不仅是春招、日常实习、社招都在同步进行，大家一定要抓紧

时间。就算是离秋招，25 届也没有多少时间了，顶多 5 个月。每天一道二哥的

LeetCode 刷题笔记，继续进步。

题意

给定一个字符串 s 和一个字符串数组 words 。words 中所有字符串长度相同。

s 中的串联子串是指一个包含 words 中所有字符串以任意顺序排列连接起来的子串。

例如，如果 words = ["ab","cd","ef"]，那么 “abcdef”， “abefcd”，“cdabef”，

“cdefab”，“efabcd”，和 “efcdab” 都是串联子串。 “acdbef” 不是串联子串，因为他不是任

何 words 排列的连接。

返回所有串联子串在 s 中的开始索引。你可以以任意顺序返回答案。

难度

困难

示例

输入：s = "barfoothefoobarman", words = ["foo","bar"]

输出：[0,9]

解释：因为 words.length == 2 同时 words[i].length == 3，连接的子字符串的长度必须为

6。

子串 "barfoo" 开始位置是 0。它是 words 中以 ["bar","foo"] 顺序排列的连接。

子串 "foobar" 开始位置是 9。它是 words 中以 ["foo","bar"] 顺序排列的连接。

输出顺序无关紧要。返回 [9,0] 也是可以的。

输入：s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]

输出：[]

解释：因为 words.length == 4 并且 words[i].length == 4，所以串联子串的长度必须为

16。

s 中没有子串长度为 16 并且等于 words 的任何顺序排列的连接。

所以我们返回一个空数组。

输入：s = "barfoofoobarthefoobarman", words = ["bar","foo","the"]

输出：[6,9,12]

解释：因为 words.length == 3 并且 words[i].length == 3，所以串联子串的长度必须为

9。

子串 "foobarthe" 开始位置是 6。它是 words 中以 ["foo","bar","the"] 顺序排列的连接。

子串 "barthefoo" 开始位置是 9。它是 words 中以 ["bar","the","foo"] 顺序排列的连接。

子串 "thefoobar" 开始位置是 12。它是 words 中以 ["the","foo","bar"] 顺序排列的连接。

分析 1

这道题，说实话，一开始看上去是有点懵逼的，因为描述了一大堆，有点抓不到重点。

以前版本的 LeetCode 描述中，例子也没有，我第一次看上去，确实有点摸不着头脑。

最新的 LeetCode 版本中加入了例子和解释，就清晰多了。

首先我们来搞清楚一个概念，什么是串联子串？

比如说一个字符串 s = abc，words = ["a","b","c"]。

words 所能组成的串联子串有：

• abc

• bac

• cab

• acb

• bca

• cba

然后，我们再来判断这些串联子串是否在 s 中，如果在的话，就返回它的起始位置。

那么显然，串联子串 abc 在 s 中，起始位置是 0。别的串联子串就不在。

那么很容易就能想到一种暴力解法，先排列组合出所有的串联子串，然后在 s 中查找这些

子串，如果找到了，就记录它的起始位置。

①、我们使用 Guava 的 Collections2.permutations 方法来生成所有 words 的排列。借助

API 可以帮我们减轻排列组合的工作量，当然了，也可以直接使用回溯算法来完成。

// 生成 words 的所有排列

private List<String> generatePermutations(String[] words) {

List<String> permutations = new ArrayList<>();

// 把数组转成集合

List<String> list = new ArrayList<>();

Collections.addAll(list, words);

// 生成所有排列

for (List<String> permutation : Collections2.permutations(list)) {

StringBuilder sb = new StringBuilder();

for (String word : permutation) {

sb.append(word);

}

permutations.add(sb.toString());

}

return permutations;

}

比如说，我们有一个 words = [“foo”,“bar”]，那么我们可以生成所有的排列：

• foobar

• barfoo

当然了，这里也可以使用回溯算法来完成，代码如下：

class PermutationsGenerator {

// 生成所有排列的方法

public List<String> generatePermutations(String[] words) {

List<String> results = new ArrayList<>();

permute(words, 0, results);

return results;

}

// 辅助方法：递归地生成排列

private void permute(String[] array, int start, List<String> result) {

if (start >= array.length) {

// 将当前排列转换为字符串并添加到结果列表

result.add(String.join("", array));

} else {

for (int i = start; i < array.length; i++) {

swap(array, start, i); // 交换元素

permute(array, start + 1, result); // 递归地生成剩余元素的排列

swap(array, start, i); // 撤销交换

}

// 交换数组中的两个元素

private void swap(String[] array, int i, int j) {

String temp = array[i];

array[i] = array[j];

array[j] = temp;

}

public static void main(String[] args) {

PermutationsGenerator generator = new PermutationsGenerator();

String[] words = {"a", "b", "c"};

List<String> permutations = generator.generatePermutations(words);

System.out.println("排列组合: " + permutations);

}

②、然后我们再来检查每个排列是否为 s 的子串。判断子串就比较简单了，直接使用

indexOf 方法就可以了。

当然了，一次 indexOf 是不够的，容易漏掉所有的情况，所以要使用 while 循环配合

indexOf 的重载方法 index = s.indexOf(permutation, index + 1)来完成。

public List<Integer> findSubstring(String s, String[] words) {

// 这是 LeetCode 的第 30 题

List<Integer> result = new ArrayList<>();

// 生成所有 words 的排列

List<String> permutations = generatePermutations(words);

// 检查每个排列是否为 s 的子串

for (String permutation : permutations) {

int index = s.indexOf(permutation);

while (index != -1) {

// 如果找到，添加到结果列表中

if (!result.contains(index)) {

result.add(index);

}

// 继续查找下一个匹配的子串

index = s.indexOf(permutation, index + 1);

}

return result;

}

不过，由于这个暴力解法的效率很低，我们就直接使用 ACM 的模式在 Intellij IDEA 中进

行尝试了。这样方便帮助大家理解整个题解的过程。

class Main03001 {

public static void main(String[] args) {

Solution03001 solution = new Solution03001();

String s = "barfoobarfoothefoobarman";

String[] words = {"foo", "bar"};

List<Integer> result = solution.findSubstring(s, words);

System.out.println(result);

}

class Solution03001 {

public List<Integer> findSubstring(String s, String[] words) {

// 这是 LeetCode 的第 30 题

List<Integer> result = new ArrayList<>();

// 生成所有 words 的排列

List<String> permutations = generatePermutations(words);

// 检查每个排列是否为 s 的子串

for (String permutation : permutations) {

int index = s.indexOf(permutation);

while (index != -1) {

// 如果找到，添加到结果列表中

if (!result.contains(index)) {

result.add(index);

}

// 继续查找下一个匹配的子串

index = s.indexOf(permutation, index + 1);

}

return result;

}

// 生成 words 的所有排列

private List<String> generatePermutations(String[] words) {

List<String> permutations = new ArrayList<>();

// 把数组转成集合

List<String> list = new ArrayList<>();

Collections.addAll(list, words);

// 生成所有排列

for (List<String> permutation : Collections2.permutations(list)) {

StringBuilder sb = new StringBuilder();

for (String word : permutation) {

sb.append(word);

}

permutations.add(sb.toString());

}

return permutations;

}

来看一下运行结果：

[3, 15, 0, 6]

也是完全可以找出来的，不过，很遗憾，如果直接在 LeetCode 上用这种方法来完成的

话，会超出时间限制。

030.%E4%B8%B2%E8%81%94%E6%89%80%E6%9C%89%E5%8D%95%E8%AF%8D%E7

%9A%84%E5%AD%90%E4%B8%B2-20240227102310.png

但是，刷 LeetCode 其实并不只是为了刷题，更重要的是要学会分析问题，尤其是对于算

法基础比较薄弱的球友，如果一上来就搞效率很高的解法，但看又看不懂，那就没有意义

了。

分析 2

那我们接着来分析，看看能不能在暴力的基础上进一步优化，这次我们就不排列组合了，

因为排列组合确实会比较花费时间，尤其是当 words 数组长度比较大的时候。

这次采用逆向思维。

030.%E4%B8%B2%E8%81%94%E6%89%80%E6%9C%89%E5%8D%95%E8%AF%8D%E7

%9A%84%E5%AD%90%E4%B8%B2-20240227104117.png

第一步，我们直接在 s 中先截取一个长度为 words.length * words[0].length 的子串

（words 数组中的字符串长度都相等），那么这个子串要么是串联子串，要么就不是，对

吧？

第二步，再从这个子串中截取长度为 words[0].length 的子串，然后判断这个子串是否在

words 数组中，如果在的话，就继续截取下一个子串，直到截取完毕。

我们借助了两个哈希表来帮我们完成这项工作：

①、一个哈希表 cnt 用于记录 words 中每个单词应出现的次数，比如说：

• words = ["foo","bar"]，那么 cnt = {foo=1, bar=1}。

• words = ["foo","bar","foo"]，那么 cnt = {foo=2, bar=1}。

并且哈希表 cnt 的 key 还可以帮我们判断第二步中当前截取的子串是否在 words 数组中。

030.%E4%B8%B2%E8%81%94%E6%89%80%E6%9C%89%E5%8D%95%E8%AF%8D%E7

%9A%84%E5%AD%90%E4%B8%B2-20240227104931.png

不在的话，就可以直接跳过这次循环了。

②、一个哈希表 com 用于记录当前考察的子串中各单词的出现次数

030.%E4%B8%B2%E8%81%94%E6%89%80%E6%9C%89%E5%8D%95%E8%AF%8D%E7

%9A%84%E5%AD%90%E4%B8%B2-20240227105313.png

如果 com 中统计到的单词出现次数超过了它在 words 中应有的次数，那么说明当前子串

不符合条件，立即跳出。

最后，如果 com 中统计到的和 cnt 完全一样，那么说明当前子串是符合条件的，将其起

始索引添加到结果列表中。

我们来看一下完整的题解代码：

class Solution {

public List<Integer> findSubstring(String s, String[] words) {

List<Integer> res = new ArrayList<>();

if (s == null || words == null || words.length == 0) return res;

// 单词的长度，假设所有单词长度相同

int wordLength = words[0].length();

// 所有单词串联起来的总长度

int allWordsLength = wordLength * words.length;

// 使用 cnt 哈希表来记录 words 中每个单词应出现的次数

Map<String, Integer> cnt = new HashMap<>();

for (String word : words) {

cnt.put(word, cnt.getOrDefault(word, 0) + 1);

}

// 遍历字符串 s，考虑每一个可能的起始位置

for (int i = 0; i <= s.length() - allWordsLength; i++) {

// com 哈希表用于记录当前考察的子串中各单词的出现次数

Map<String, Integer> com = new HashMap<>();

// 提取从 i 开始长度为 allWordsLength 的子串

String temp = s.substring(i, i + allWordsLength);

boolean isValid = true; // 标记当前子串是否有效

// 再次遍历子串，每次移动一个单词的长度

for (int j = 0; j < temp.length(); j += wordLength) {

String t = temp.substring(j, j + wordLength);

// 如果 t 不在 cnt 中，说明当前子串不符合条件，立即跳出

if (!cnt.containsKey(t)) {

isValid = false;

break;

}

com.put(t, com.getOrDefault(t, 0) + 1);

// 如果当前单词 t 的出现次数超过了它在 words 中应有的次数，同样说明当前子串

不符合条件

if (com.get(t) > cnt.get(t)) {

isValid = false;

break;

}

// 如果子串有效，将其起始索引添加到结果列表中

if (isValid && com.equals(cnt)) {

res.add(i);

}

return res;

}

虽然题解效率并不乐观，但很好理解，对吧？

030.%E4%B8%B2%E8%81%94%E6%89%80%E6%9C%89%E5%8D%95%E8%AF%8D%E7

%9A%84%E5%AD%90%E4%B8%B2-20240227110005.png

分析 3

我们来观察一下分析 2 中的过程：

030.%E4%B8%B2%E8%81%94%E6%89%80%E6%9C%89%E5%8D%95%E8%AF%8D%E7

%9A%84%E5%AD%90%E4%B8%B2-20240227110113.png

对于每个长度为 6 的子串（words = [“foo”,“bar”]），我们都要检验一次，但是可以注意

到，第一次和第四次，本质上就是去掉了一个 bar 加上一个 the，第二次和第五次，本质

上就是去掉了一个 arf 加上一个 hef，第三次和第六次，本质上就是去掉了一个 rfo 加上

了一个 efo。

所以……我们可以考虑将他们分成单词长度组来看看他们的性质？

030.%E4%B8%B2%E8%81%94%E6%89%80%E6%9C%89%E5%8D%95%E8%AF%8D%E7

%9A%84%E5%AD%90%E4%B8%B2-20240227110241.png

分组之后，每次只需要添加一个单词，再删去一个单词就可以得到了下一次的状态，这样

子就不用费劲的重新构造这个字符串了。

每次添加一个单词和删去一个单词的代价也不高！

其实这样子就相当于设立了一个窗口，在字符串 s 上，以单词长度/每次的速度在移动，所

以这也经常被称作滑动窗口。

我们在 003.无重复字符的最长子串中也曾详细讲过滑动窗口，其核心原理就在于利用滑

动的窗口减少不必要的计算。

030.%E4%B8%B2%E8%81%94%E6%89%80%E6%9C%89%E5%8D%95%E8%AF%8D%E7

%9A%84%E5%AD%90%E4%B8%B2-20240227115834.png

但其实思路和分析 2 的差不多，只是在实现上有所不同。

第一步，同样是遍历 s，准备一个滑动窗口，对窗口内的子串进行验证。

for (int i = 0; i < wordLength; ++i) {

// 处理一个窗口内的所有单词

}

因为是滑动窗口，所以第一个 for 循环只需要遍历 0 到 wordLen - 1 这个范围，见下图的

三个红色框。如果 wordLen = 3，那么就是遍历 0、1、2 这三个位置，后续的处理交给滑

动窗口。

030.%E4%B8%B2%E8%81%94%E6%89%80%E6%9C%89%E5%8D%95%E8%AF%8D%E7

%9A%84%E5%AD%90%E4%B8%B2-20240227141910.png

第二步，在窗口内部，每次滑动的步长为一个单词的长度 j += wordLength;start +=

wordLength;，滑动窗口的宽度为所有单词的总长度 s.length() - windowLength。

for (int j = i; j <= s.length() - windowLength; j += wordLength) {

while (start < end) {

// 处理一个窗口内的所有单词

// 左窗口右移一个单词的长度

start += wordLength;

}

如果窗口内的单词正好匹配 words 数组中的所有单词，则添加到结果中。

if (start == end) {

result.add(j);

}

来看完整的题解：

class Solution {

public List<Integer> findSubstring(String s, String[] words) {

List<Integer> result = new ArrayList<>();

if (s == null || words == null || words.length == 0) return result;

// 创建一个 HashMap 来存储 words 数组中每个单词的出现次数

Map<String, Integer> wordCount = new HashMap<>();

for (String word : words) {

wordCount.put(word, wordCount.getOrDefault(word, 0) + 1);

}

// 单个单词的长度和窗口的长度（所有单词的总长度）

int wordLength = words[0].length();

int windowLength = words.length * wordLength;

// 遍历字符串 s，每次增加的步长为一个单词的长度

for (int i = 0; i < wordLength; ++i) {

for (int j = i; j <= s.length() - windowLength; j += wordLength) {

// 用于记录当前窗口中单词的出现情况

Map<String, Integer> seenWords = new HashMap<>();

int start = j;

int end = j + windowLength;

// 在窗口内部，每次也是按照单词的长度来切分和检查

while (start < end) {

String currentWord = s.substring(start, start + wordLength);

seenWords.put(currentWord, seenWords.getOrDefault(currentWord,

0) + 1);

// 如果当前单词不在 words 中，或者出现的次数超过了 words 中该单词的次数，

则跳出循环

if (!wordCount.containsKey(currentWord) ||

seenWords.get(currentWord) > wordCount.get(currentWord)) {

break;

}

start += wordLength;

}

// 如果窗口内的单词正好匹配 words 数组中的所有单词，则添加到结果中

if (start == end) {

result.add(j);

}

return result;

}

在这个题解中，同样借助了两个 HashMap 来帮助我们完成这项工作。

①、一个 HashMap wordCount 用于记录 words 数组中每个单词应出现的次数。

②、一个 HashMap seenWords 用于记录当前窗口中单词的出现情况。

也很好理解，大家看完代码注释就明白为什么了，来看一下题解效率：

030.%E4%B8%B2%E8%81%94%E6%89%80%E6%9C%89%E5%8D%95%E8%AF%8D%E7

%9A%84%E5%AD%90%E4%B8%B2-20240227154314.png

显然比之前的效率提高了不少，对吧？

分析 4

我们再来看一种解法，这种解法是在分析 3 的基础上进一步优化的。

class Solution {

public List<Integer> findSubstring(String s, String[] words) {

List<Integer> res = new ArrayList<>();

if (s == null || words == null || s.length() == 0 || words.length == 0) return

res;

int wordLen = words[0].length(); // 单词的长度

int wordCount = words.length; // 单词的数量

// 统计 words 中每个单词应该出现的次数

Map<String, Integer> wordMap = new HashMap<>();

for (String word : words) {

wordMap.put(word, wordMap.getOrDefault(word, 0) + 1);

}

// 遍历所有的可能起始位置，0 到 wordLen - 1

for (int i = 0; i < wordLen; i++) {

int left = i; // 窗口左边界

int count = 0; // 窗口中符合条件的单词数量

Map<String, Integer> windowMap = new HashMap<>(); // 窗口中单词的计

数

// 移动窗口的右边界

for (int j = i; j <= s.length() - wordLen; j += wordLen) {

String word = s.substring(j, j + wordLen);

// 如果 word 是有效的

if (wordMap.containsKey(word)) {

windowMap.put(word, windowMap.getOrDefault(word, 0) + 1);

count++;

// 如果窗口中该单词的数量超过了应有的数量，需要调整窗口的左边界

while (windowMap.get(word) > wordMap.get(word)) {

String leftWord = s.substring(left, left + wordLen);

windowMap.put(leftWord, windowMap.get(leftWord) - 1);

count--;

left += wordLen;

}

// 如果窗口中的单词数量正好等于 words 中的单词数量，记录起始位置

if (count == wordCount) {

res.add(left);

// 移动左边界以寻找新的匹配

String leftWord = s.substring(left, left + wordLen);

windowMap.put(leftWord, windowMap.get(leftWord) - 1);

count--;

left += wordLen;

}

} else {

// 如果 word 不是有效的，则重置窗口

windowMap.clear();

count = 0;

left = j + wordLen;

}

return res;

}

分析 4 和分析 3 在解决问题的思路上都是一样，都是通过滑动窗口来遍历 s，然后在窗口

内部，每次按照单词的长度来切分和检查。

但分析 4 采用了更细粒度的窗口移动策略，当遇到不匹配的单词时，它不是直接跳过整个

窗口，而是逐个单词地调整窗口的左边界，直到窗口中的单词再次满足条件，或者窗口被

完全重置。

这种策略的优势在于，它可以更快地找到下一个匹配的窗口，而不是直接跳过整个窗口。

所以效率也有了不少的提升，但实现比分析 3 复杂了不少。

030.%E4%B8%B2%E8%81%94%E6%89%80%E6%9C%89%E5%8D%95%E8%AF%8D%E7

%9A%84%E5%AD%90%E4%B8%B2-20240227160050.png

总结

本题是一道 hard，但通过分析 1、分析 2、分析 3、分析 4 这么一套循序渐进的方式讲解

下来，相信大家也都能够掌握了。

分析 1 非常直观，排列组合然后在 s 中查找，但效率不高。因为排列组合就需要花费很多

时间。

分析 2 采用了逆向思维，先在 s 中截取一个长度为 words.length * words[0].length 的子

串，然后再截取长度为 words[0].length 的子串，判断这个子串是否在 words 数组中，如

果在的话，就继续截取下一个子串，直到截取完毕。这种方法效率就比分析 1 提高了一

些。

分析 3 采用了滑动窗口的思想，也非常直观和容易掌握，效率也比分析 2 提高了不少。

分析 4 在分析 3 的基础上进一步优化，采用了更细粒度的窗口移动策略，效率提高了一大

截，但代码量也多了不少。

好，我们来总结一下题解过程中所遇到的 Java 基础知识：

①、HashMap 的使用，包括 put、get、getOrDefault、clear 等方法的使用：HashMap 的

深度解析

②、字符串的截取：Java 字符串截取

③、滑动窗口的使用：滑动窗口算法详解

力扣链接：https://leetcode.cn/problems/substring-with-concatenation-of-all-

words/

30.串联所有单词的子串

开通会员本次下载免费

开通网站会员享专属特权

热门资源

推荐资源

30.串联所有单词的子串

开通会员 本次下载免费

开通网站会员 享专属特权

热门资源

推荐资源

开通会员本次下载免费

开通网站会员享专属特权