liblevenshtein 4.0.0
A library for generating Finite State Transducers based on Levenshtein Automata.
|
A specific type of Dawg that is constructed over lexicographically sorted terms. More...
#include <sorted_dawg.h>
Public Member Functions | |
~SortedDawg () override | |
Cleans up any residual intermediate members used to construct the Dawg. | |
auto | add (const std::string &term) -> bool override |
Adds a new term to this dictionary. | |
auto | remove (const std::string &term) -> bool override |
Removes a term from this dictionary. | |
void | init () |
Prepares this dictionary for initialization. | |
void | clean_up () |
Cleans up any intermediate members used to initialize this Dawg. | |
void | finish () |
Specifies the dictionary has been fully initialized. | |
void | minimize (std::size_t lower_bound) |
Minimizes the suffixes of the unchecked paths until there are no more than lower_bound unchecked transitions. | |
auto | minimized_node (DawgNode *key) const -> DawgNode * |
Returns a DawgNode whose following suffix is equivalent to the following suffix of the given DawgNode, or nullptr if none exist. | |
Dawg (DawgNode *root, std::size_t size) | |
Constructs a new DAWG with the given root node and number of dictionary terms (size). | |
Dawg () | |
Constructs an empty DAWG with a default root node (there are no outgoing edges or reachable terms from the root, not even the empty string). | |
auto | contains (const std::string &term) const -> bool |
Determines whether the given term is contained within this dictionary. | |
auto | root () const -> DawgNode * |
Returns a pointer to the root node of this dictionary from which all its terms may be determined. | |
auto | size () const -> size_t |
Returns the number of terms in this dictionary. | |
auto | begin () const -> DawgIterator |
Returns an iterator pointing to the first term in this dictionary, from which all terms may be iterated over. | |
auto | end () const -> DawgIterator |
Returns an iterator representing the boundary following the final term in this dictionary. | |
auto | operator== (const Dawg &other) const -> bool |
Determines whether another DAWG is equivalent to this one. | |
Protected Member Functions | |
auto | all_nodes () const -> std::unordered_set< DawgNode * > |
Returns a set of all nodes reachable from the root, including the root. | |
Protected Attributes | |
DawgNode * | _root |
Root node of this DAWG from which all its terms may be reached. | |
std::size_t | _size |
Number of terms reachable from the root node. | |
Private Attributes | |
std::stack< Transition > * | unchecked_transitions = nullptr |
Transitions whose redundancies have not been checked. | |
std::unordered_map< DawgNode, DawgNode * > * | minimized_nodes = nullptr |
Mapping of suffixes to DawgNodes that have been minimized. | |
std::unordered_set< DawgNode * > * | floating_nodes = nullptr |
Nodes that have not been added to this Dawg. | |
std::string | _prev_term |
Previous term inserted into this Dawg. | |
A specific type of Dawg that is constructed over lexicographically sorted terms.
The precondition that the terms are sorted, lexigraphically, in ascending order is a requirement by the algorithm used to construct this type of Dawg.
Definition at line 20 of file sorted_dawg.h.
|
override |
Cleans up any residual intermediate members used to construct the Dawg.
The super destructor is subsequently called to clean up the rest of its nodes.
Definition at line 11 of file sorted_dawg.cpp.
References clean_up().
Adds a new term to this dictionary.
The term must be lexicographically larger than the previously added term.
term | A term to add to this dictionary. |
Implements liblevenshtein::Dawg.
Definition at line 15 of file sorted_dawg.cpp.
References liblevenshtein::DawgNode::is_final(), and query().
Returns a set of all nodes reachable from the root, including the root.
Definition at line 28 of file dawg.cpp.
References liblevenshtein::Dawg::_root, and query().
Referenced by clean_up(), and liblevenshtein::Dawg::~Dawg().
|
inherited |
Returns an iterator pointing to the first term in this dictionary, from which all terms may be iterated over.
Definition at line 59 of file dawg.cpp.
References liblevenshtein::Dawg::_root.
void liblevenshtein::SortedDawg::clean_up | ( | ) |
Cleans up any intermediate members used to initialize this Dawg.
Definition at line 98 of file sorted_dawg.cpp.
References liblevenshtein::Dawg::all_nodes(), floating_nodes, minimized_nodes, and unchecked_transitions.
Referenced by ~SortedDawg().
Determines whether the given term is contained within this dictionary.
term | Query term whose membership in this dictionary must be determined. |
Definition at line 43 of file dawg.cpp.
References liblevenshtein::DawgNode::is_final(), query(), and liblevenshtein::DawgNode::transition().
Referenced by liblevenshtein::from_protobuf().
liblevenshtein::Dawg::Dawg | ( | ) |
liblevenshtein::Dawg::Dawg | ( | DawgNode * | root, |
std::size_t | size ) |
Constructs a new DAWG with the given root node and number of dictionary terms (size).
The number of terms reachable by combining the subsequent outgoing edges from the root must be equal to the size.
root | Node from which all dictionary terms may be reached by subsequently traversing its outgoing edges. |
size | Number of terms in the dictionary. |
Definition at line 35 of file dawg.cpp.
References query().
|
inherited |
Returns an iterator representing the boundary following the final term in this dictionary.
The value of the iterator is not a term, but represent a boundary that must not be crossed.
Definition at line 63 of file dawg.cpp.
References liblevenshtein::Dawg::_size.
void liblevenshtein::SortedDawg::finish | ( | ) |
Specifies the dictionary has been fully initialized.
This is only called during initialization.
Definition at line 57 of file sorted_dawg.cpp.
References minimize().
void liblevenshtein::SortedDawg::init | ( | ) |
Prepares this dictionary for initialization.
This function is only called during initialization.
Definition at line 92 of file sorted_dawg.cpp.
References floating_nodes, minimized_nodes, and unchecked_transitions.
void liblevenshtein::SortedDawg::minimize | ( | std::size_t | lower_bound | ) |
Minimizes the suffixes of the unchecked paths until there are no more than lower_bound unchecked transitions.
This is only called during initialization.
lower_bound | Desired size of remaining unchecked transitions once the other have been minimized. |
Definition at line 69 of file sorted_dawg.cpp.
References liblevenshtein::DawgNode::add_edge(), liblevenshtein::Transition::label(), minimized_node(), query(), liblevenshtein::Transition::source(), liblevenshtein::Transition::target(), and unchecked_transitions.
Referenced by finish().
Returns a DawgNode whose following suffix is equivalent to the following suffix of the given DawgNode, or nullptr if none exist.
This is used to determine whether the given DawgNode is redundant and may be discarded.
key | DawgNode whose redundancy must be determined. |
Definition at line 61 of file sorted_dawg.cpp.
References query().
Referenced by minimize().
Removes a term from this dictionary.
Currently, this is just a placeholder and performs no action.
The | term to remove from this dictionary. |
Implements liblevenshtein::Dawg.
Definition at line 88 of file sorted_dawg.cpp.
Returns a pointer to the root node of this dictionary from which all its terms may be determined.
Definition at line 51 of file dawg.cpp.
References liblevenshtein::Dawg::_root.
Referenced by query().
Returns the number of terms in this dictionary.
Definition at line 55 of file dawg.cpp.
References liblevenshtein::Dawg::_size.
|
private |
Previous term inserted into this Dawg.
Definition at line 98 of file sorted_dawg.h.
|
protectedinherited |
Root node of this DAWG from which all its terms may be reached.
Definition at line 126 of file dawg.h.
Referenced by liblevenshtein::Dawg::all_nodes(), liblevenshtein::Dawg::begin(), and liblevenshtein::Dawg::root().
|
protectedinherited |
Number of terms reachable from the root node.
Definition at line 129 of file dawg.h.
Referenced by liblevenshtein::Dawg::end(), and liblevenshtein::Dawg::size().
Nodes that have not been added to this Dawg.
Definition at line 95 of file sorted_dawg.h.
Referenced by clean_up(), and init().
|
private |
Mapping of suffixes to DawgNodes that have been minimized.
Definition at line 92 of file sorted_dawg.h.
Referenced by clean_up(), and init().
|
private |
Transitions whose redundancies have not been checked.
Definition at line 89 of file sorted_dawg.h.
Referenced by clean_up(), init(), and minimize().