A trie is a tree-like data structure in which every node stores a character. A trie can store one or multiple words. If a trie stores all suffixes of a word, it is called a suffix trie. After you build a suffix trie, substrings can be retrieved easily. This post is about suffix trie implementation, search, and retrieval of a substring.

suffix trie implementation

All substrings in a word are all contiguous sequences of characters within a string with all possible lengths. For example, a word “ana”, its substrings include “a”, “an”, “ana”, “n”, “na”, total 5 of them. Note, that all substrings are different from all suffixes of a word. In the example of “ana”, its suffixes are “ana”, “na”, “a”, total of 3 of them. A suffix trie holds all substrings, including suffixes, of a word.

This implementation uses a hash map (a dictionary in Python) to store children, which uses a character as a key to find the node holding the next character in this branch. The alternative solution uses an array to store children. The implementation using a hash map is more space-efficient than the array implementation.

Table of Content


Map of suffix trie implementations

Part 1 – Suffix trie implementation using array

you are here

Part 2 – Suffix trie implementation using hash map


Define a SuffixTrieNode class

A SuffixTrieNode is a class to create nodes in a suffix trie. It has 2 variables. children is a hash map data structure, in which the key is a character, and the value is the node that stores the next character in a suffix. indices is a list data structure that stores the start position of a substring in a word.

Java

Javascript

Python


Build a suffix trie

A SuffixTrie class has one variable root. All operations, such as insertion, search, or traversal, start from the root. In a suffix trie, only the root has multiple children. Other nodes except leaves have one child, which is the node of the next character in the suffix. You can initialize the root in the constructor.

To build a suffix trie from a word, you use the word as the input of the constructor. You subtract the suffix and call the insertSuffix method with the root, a suffix, and the index which is the start position of the suffix in the word as input.

insertSuffix method is a recursive function. The first thing is to save the index in the input node’s indices. It records the position of the substring inside the input node. It will be used in search. If the input string is null or its length is 0, it is a terminal condition. All recursive calls return to calling functions.

Otherwise, you retrieve the first character of the input string. If the character is one of the keys in the input node’s children, you get the next node by using the character as the key. If the character doesn’t exist in the key set, a child node is created. You save the new entry with the character as the key, and the new node as the value in children. Then the method calls itself with the new node, the substring after the first character, and the index.

Java

Javascript

Python

Search is to find a substring in the word. The search method is a wrapper method of searchHelper. The start node is the root. searchHelper is a recursive method. When the input string is null or the length is 0, it is a terminal condition. You return with the result which is stored in the node’s indices. Otherwise, you take the first character of the input string. If the character is one of the keys in the input node’s children, get the next node by using the character as the key. Then the method calls itself to search for the next character. If the first character doesn’t exist in the children‘s key set, return null.

Java

Javascript

Python


Print all nodes in trie

Now you can print all nodes in a trie, you use recursion again. When the input node is null, it is a terminal condition, you can return. Otherwise, you print the input string, which is the substring from the root to the current node. Then you check the entries in the input node’s children. If an entry is not null, the method calls itself with the entry’s key and value to print the next substring in this branch.

Java

Javascript

Python


Free download

Download suffix trie implementation using map in Java, JavaScript and Python
Trie implmentation using map
Data structures introduction PDF