The thumbnail was generated with this site. Go check it out!
The name
The Trie takes its name from its main operation, the reTRIEval, which is more a bad pun than an actual operation. It’s also known as “Prefix Tree” because of the way it is structured. A Trie is a tree built to store a large amount of strings as well as integer numbers. Actually, it can store any sequence of symbols, given that the alphabet of those symbols is finite and known. More specifically, a trie is d-ary tree, where d is the size of the alphabet of the symbols used.
Each node of a Trie can contain at most d sons, one for each symbol. Each node represents a certain symbol, except for the root of the tree which represents nothing, or the beginning of the string.
The structure
We need three fields for each node of the Trie:
- a pointer to the father node
- a fixed length array of sons (in this case, exactly 26)
- a boolean to tell us if a given node represents the end of a word
Notice that we don’t need a variable to hold the symbol, because we can extract it from the index of the node in its father’s array.
typedef struct _trie {
struct _trie *father;
struct _trie* sons[26];
bool end_of_word;
} TrieNode;
The behavior
In the following functions, i intentionally omit NULL checks on the input and the return of malloc
to make the code shorter and more readable.
init
This functions creates a new empty node with no sons.
TrieNode* init() {
TrieNode* node = (TrieNode*) malloc(sizeof(TrieNode));
for (int i=0; i<26; i++) {
node->sons[i] = NULL;
}
node->father = NULL;
node->end_of_word = false;
return node;
}
insert
This function adds the given string str
to the Trie rooted at the given root
. If str
is already present, no changes are made to the Trie.
void insert(TrieNode *root, char* str) {
TrieNode *tmp = root;
const int len = strlen(str);
for (int i=0; i<len; i++) {
const int index = str[i] - 'a';
if (tmp->sons[index] == NULL) {
tmp->sons[index] = init();
tmp->sons[index]->father = tmp;
}
tmp = tmp->sons[index];
}
tmp->end_of_word = true;
}
search
This function looks for the given string str
in the Trie rooted at the given root
. It returns true
if str
was present, false
otherwise.
bool search(TrieNode *root, char* str) {
TrieNode *tmp = root;
const int len = strlen(str);
int i = 0;
while (tmp != NULL && i < len) {
tmp = tmp->sons[str[i] - 'a'];
i++;
}
return tmp != NULL && tmp->end_of_word;
}
delete
This function deletes the given string str
from the Trie rooted at the given root
. If str
was not present, no changes are made to the Trie. The deletion of a string from the Trie is actually setting the boolean end_of_word
to false. In this implementation, however, if the “deleted” node has no sons, it is deallocated and this procedure continues on all father nodes until one with more than 1 son (or the root) is found.
void trie_delete (TrieNode *root) {
if (root == NULL) return;
for(int i=0; i<26; i++) {
if( root->sons[i] != NULL ) trie_delete(root->sons[i]);
}
free(root);
}
NOTE: A personal implementation of the Trie can be found on my GitHub.