这两天正好在做有关字典树的题,写篇文章记录一下
简介
字典树又称单词查找树,Trie树,是一种树形结构,是一种哈希树的变种。典型应用是用于统计,排序和保存大量的字符串(但不仅限于字符串),所以经常被搜索引擎系统用于文本词频统计。它的优点是:利用字符串的公共前缀来减少查询时间,最大限度地减少无谓的字符串比较,查询效率比哈希树高。
上面这段话cv自百度百科,简单来说,字典树就是一颗像字典一样的树。
下面放张图感受一下。
字典树核心是用 边
来代表有无字符,使用 点
来记录是否是 单词结尾
以及 其在字符串的后续字符是什么
。这样从根结点到树上某一结点的路径就代表了一个字符串。
举个🌰:如上图路径 1 -> 4 -> 8 -> 12 -> 5
就表示了 caaa
这个字符串。
实现方法
1、二维数组
一个朴素的想法是用 二维数组
来实现 Trie树 。
- 使用二维数组
son[][]
来存储字符串出现的所有字符、 - 使用
idx
来记录当前以及使用掉了多少格子(就是给当前的字符进行编号)。 - 使用
cnt[]
数组来标记以某字符结尾的单词个数(例如 $cnt[idx] = n$ 表示以编号idx
的字符结尾的单词个数为n
个。
代码实现
public class Trie {
private static final int N = 100010;
private int[][] son;
private int[] cnt;
private int idx;
public Trie() {
this.son = new int[N][26];
this.cnt = new int[N];
this.idx = 0;
}
public void insert(String word) {
int p = 0;
for (int i = 0; i < word.length(); i++) {
int u = word.charAt(i) - 'a';
if (son[p][u] == 0) son[p][u] = ++idx;
p = son[p][u];
}
cnt[p]++;
}
public boolean search(String word) {
int p = 0;
for (int i = 0; i < word.length(); i++) {
int u = word.charAt(i) - 'a';
if (son[p][u] == 0) return false;
p = son[p][u];
}
return cnt[p] > 0;
}
public boolean startsWith(String prefix) {
int p = 0;
for (int i = 0; i < prefix.length(); i++) {
int u = prefix.charAt(i) - 'a';
if (son[p][u] == 0) return false;
p = son[p][u];
}
return true;
}
}
2、Node
相比于二维数组,Java中更加常规的做法是使用 Node
结点。
每个 Node
结点包含以下信息:
private static class Node {
public Node[] son;
public int endCount;
public Node() {
this.son = new Node[26];
this.endCount = 0;
}
}
代码实现
public class Trie {
private static class Node {
public Node[] son;
public int endCount;
public Node() {
this.son = new Node[26];
this.endCount = 0;
}
}
private Node root;
public Trie() {
this.root = new Node();
}
public void insert(String word) {
Node p = root;
for (int i = 0; i < word.length(); i++) {
int u = word.charAt(i) - 'a';
if (p.son[u] == null) p.son[u] = new Node();
p = p.son[u];
}
p.endCount++;
}
public boolean search(String word) {
Node p = root;
for (int i = 0; i < word.length(); i++) {
int u = word.charAt(i) - 'a';
if (p.son[u] == null) return false;
p = p.son[u];
}
return p.endCount > 0;
}
public boolean startsWith(String prefix) {
Node p = root;
for (int i = 0; i < prefix.length(); i++) {
int u = prefix.charAt(i) - 'a';
if (p.son[u] == null) return false;
p = p.son[u];
}
return true;
}
}
字典树例题
1、实现 Trie (前缀树)
纯模板题,链接:https://leetcode-cn.com/problems/implement-trie-prefix-tree/
2、添加与搜索单词 - 数据结构设计
Trie + DFS,链接:https://leetcode-cn.com/problems/design-add-and-search-words-data-structure/
代码
class WordDictionary {
private static class Node {
public Node[] son;
private int endCount;
public Node() {
this.son = new Node[26];
this.endCount = 0;
}
}
private Node root;
public WordDictionary() {
root = new Node();
}
public void addWord(String word) {
Node p = root;
for (int i = 0; i < word.length(); i++) {
int u = word.charAt(i) - 'a';
if (p.son[u] == null) p.son[u] = new Node();
p = p.son[u];
}
p.endCount++;
}
private boolean dfs(String word, Node x, int u) {
if (u == word.length() && x.endCount > 0) return true;
if (u == word.length()) return false;
if (word.charAt(u) != '.') {
if (x.son[word.charAt(u) - 'a'] == null) return false;
return dfs(word, x.son[word.charAt(u) - 'a'], u + 1);
}
for (int i = 0; i < 26; i++)
if (x.son[i] != null)
if (dfs(word, x.son[i], u + 1))
return true;
return false;
}
public boolean search(String word) {
return dfs(word, root, 0);
}
}
3、单词搜索 II
Trie + dfs剪枝,链接:https://leetcode-cn.com/problems/word-search-ii/
代码
class Solution {
private static class Node {
public Node[] son;
private int endCount;
private boolean marked;
public Node() {
this.son = new Node[26];
this.endCount = 0;
this.marked = false;
}
}
private Node root;
private int n, m, maxL;
private int[] dx = {0, 1, 0, -1}, dy = {1, 0, -1, 0};
private List<String> res;
private void insert(String word) {
Node p = root;
for (int i = 0; i < word.length(); i++) {
int u = word.charAt(i) - 'a';
if (p.son[u] == null) p.son[u] = new Node();
p = p.son[u];
}
p.endCount++;
}
private boolean search(String word) {
Node p = root;
for (int i = 0; i < word.length(); i++) {
int u = word.charAt(i) - 'a';
if (p.son[u] == null) return false;
p = p.son[u];
}
if (p.endCount > 0 && !p.marked) {
p.marked = true;
return true;
}
return false;
}
private void dfs(char[][] board, boolean[][] marked, String s, int x, int y) {
if (s.length() > maxL) return;
if (search(s)) res.add(s);
for (int i = 0; i < 4; i++) {
int a = x + dx[i], b = y + dy[i];
if (a >= 0 && a < n && b >= 0 && b < m && !marked[a][b]) {
marked[a][b] = true;
dfs(board, marked, s + board[a][b], a, b);
marked[a][b] = false;
}
}
}
public List<String> findWords(char[][] board, String[] words) {
root = new Node();
res = new ArrayList<>();
n = board.length;
m = board[0].length;
for (var word : words) {
maxL = Math.max(maxL, word.length());
insert(word);
}
for (int i = 0; i < n; i++)
for (int j = 0; j < m; j++) {
boolean[][] marked = new boolean[n][m];
marked[i][j] = true;
dfs(board, marked, String.valueOf(board[i][j]), i, j);
}
return res;
}
}
4、数组中两个数的最大异或值
Trie + 位运算,链接:https://leetcode-cn.com/problems/maximum-xor-of-two-numbers-in-an-array/
代码
class Solution {
private static class Node {
public Node[] son;
private int endCount;
public Node() {
this.son = new Node[2];
this.endCount = 0;
}
}
private Node root = new Node();
private void insert(int x) {
Node p = root;
for (int i = 30; i >= 0; i--) {
int u = (x >>> i) & 1;
if (p.son[u] == null) p.son[u] = new Node();
p = p.son[u];
}
p.endCount++;
}
private int search(int x) {
Node p = root;
int res = 0;
for (int i = 30; i >= 0; i--) {
int u = (x >>> i) & 1;
if (p.son[u ^ 1] != null) {
res += (1 << i);
p = p.son[u ^ 1];
} else p = p.son[u];
}
return res;
}
public int findMaximumXOR(int[] nums) {
int res = 0;
for (int i = 0; i < nums.length; i++) insert(nums[i]);
for (int i = 0; i < nums.length; i++)
res = Math.max(res, search(nums[i]));
return res;
}
}
5、键值映射
Trie + dfs,链接:https://leetcode-cn.com/problems/map-sum-pairs/
代码
class MapSum {
private static class Node {
public Node[] son;
private int endCount;
private int val;
public Node() {
this.son = new Node[26];
this.endCount = 0;
}
}
private Node root;
public MapSum() {
root = new Node();
}
public void insert(String key, int val) {
Node p = root;
for (int i = 0; i < key.length(); i++) {
int u = key.charAt(i) - 'a';
if (p.son[u] == null) p.son[u] = new Node();
p = p.son[u];
}
p.endCount++;
p.val = val;
}
private int dfs(Node x) {
int res = 0;
if (x.endCount > 0) res += x.val;
for (int i = 0; i < 26; i++)
if (x.son[i] != null)
res += dfs(x.son[i]);
return res;
}
public int sum(String prefix) {
Node p = root;
for (int i = 0; i < prefix.length(); i++) {
int u = prefix.charAt(i) - 'a';
if (p.son[u] == null) return 0;
p = p.son[u];
}
return dfs(p);
}
}
6、与数组中元素的最大异或值
Trie + 排序,链接:https://leetcode-cn.com/problems/maximum-xor-with-an-element-from-array/
代码
class Solution {
private static class Node {
public Node[] son;
private int endCount;
public Node() {
this.son = new Node[2];
this.endCount = 0;
}
}
private Node root = new Node();
private void insert(int val) {
Node p = root;
for (int i = 30; i >= 0; i--) {
int u = (val >>> i) & 1;
if (p.son[u] == null) p.son[u] = new Node();
p = p.son[u];
}
p.endCount++;
}
private int search(int val) {
int res = 0;
Node p = root;
for (int i = 30; i >= 0; i--) {
int u = (val >>> i) & 1;
if (p.son[u ^ 1] != null) {
res += (1 << i);
p = p.son[u ^ 1];
} else p = p.son[u];
}
return res;
}
public int[] maximizeXor(int[] nums, int[][] qs) {
int n = nums.length, m = qs.length;
Map<int[], Integer> hash = new HashMap<>();
for (int i = 0; i < m; i++) hash.put(qs[i], i);
Arrays.sort(nums);
Arrays.sort(qs, (a, b) -> a[1] - b[1]);
int[] res = new int[m];
int pos = 0;
for (var q : qs) {
while (pos < n && nums[pos] <= q[1]) insert(nums[pos++]);
if (pos == 0) res[hash.get(q)] = -1;
else res[hash.get(q)] = search(q[0]);
}
return res;
}
}
评论区