liblevenshtein 4.0.0
A library for generating Finite State Transducers based on Levenshtein Automata.
|
Various utilities regarding Levenshtein transducers. More...
Namespaces | |
namespace | demo |
namespace | distance |
Memoized, recursive distance metrics typically used to evaluate the correctness of Levenshtein automata. | |
Classes | |
class | Dawg |
A Directed Acyclic Word Graph (DAWG) maps sequences of characters to form words; the collection of words is known as a dictionary. More... | |
class | DawgIterator |
Iterates over all the terms in a DAWG dictionary. More... | |
class | DawgNode |
Represents a position within one or more terms of a DAWG dictionary. More... | |
class | Intersection |
Represents an Intersection between a dictionary automaton and Levenshtein automaton, guided by the query term. More... | |
class | LazyIterator |
Lazily traverses the intersection between the dictionary automaton and Levenshtein automaton, guided by the query term, and yields each spelling candidate as it is matched. More... | |
class | LazyQuery |
Lazily traverses the intersection between the dictionary automaton and Levenshtein automaton, guided by the query term, and yields each spelling candidate as it is matched. More... | |
class | Position |
Represents a location within the Levenshtein automaton. More... | |
class | Prefix |
Represents the prefix of a dictionary term. More... | |
class | SortedDawg |
A specific type of Dawg that is constructed over lexicographically sorted terms. More... | |
class | State |
Consists of a closure of Position nodes within the Levenshtein automaton. More... | |
class | StateIterator |
Iterates over the Position nodes in the linked-list of a Levenshtein State. More... | |
class | StateTransition |
Transitions Levenshtein States given a characteristic vector. More... | |
class | Transducer |
Constructs a Levenshtein Transducer that, when given a dictionary automaton, query term, and maximum edit distance, n , is able to match all spelling candidates in the dictionary automaton for the query term whose edit distances are no larger than n . More... | |
class | Transition |
Represents an edge from one DawgNode to another, annotated with a character label from the current position of the dictionary term. More... | |
class | UnsubsumeFn |
Removes (unsubsumes) all Positions from a State that are subsumed by another Position within the same State. More... | |
Typedefs | |
using | TransitionFn = std::function<State *(State *, std::vector<bool> &)> |
Returns the Levenshtein State mapped-to by another State and characteristic vector. | |
using | DistanceFn = std::function<std::size_t(State *, std::size_t)> |
Infers the Levenshtein distance from a final Levenshtein State and the length of the query term. | |
using | Comparator = std::function<int(Position *, Position *)> |
using | PositionTransitionFn |
Transitions Position nodes. | |
using | CompareFn = std::function<int(Position *, Position *)> |
Compares Position nodes. | |
using | MergeFn = std::function<void(State *, std::vector<Position *>)> |
Merges Position nodes into a State. | |
using | Candidate = std::pair<std::string, std::size_t> |
Spelling candidate that includes both the dictionary term and its edit distance from the query term. | |
using | SubsumesFn = std::function<bool(Position *, Position *, std::size_t)> |
Determines whether one Position subsumes another given the maximum edit distance to consider. | |
Enumerations | |
enum class | Algorithm { STANDARD , TRANSPOSITION , MERGE_AND_SPLIT } |
Enumerates the available Levenshtein distance algorithms. More... | |
Functions | |
auto | operator<< (std::ostream &out, const Dawg &dawg) -> std::ostream & |
auto | operator<< (std::ostream &out, const DawgNode &node) -> std::ostream & |
auto | operator<< (std::ostream &out, const Prefix &prefix) -> std::ostream & |
template<class IterType > | |
auto | sorted_dawg (IterType iter, IterType end) -> SortedDawg * |
Factory function that initializes a new SortedDawg using the terms yielded from an iterator. | |
template auto | sorted_dawg (std::list< std::string >::iterator iter, std::list< std::string >::iterator end) -> SortedDawg * |
template auto | sorted_dawg (std::vector< std::string >::iterator iter, std::vector< std::string >::iterator end) -> SortedDawg * |
template auto | sorted_dawg (std::set< std::string >::iterator iter, std::set< std::string >::iterator end) -> SortedDawg * |
auto | serialize_protobuf (Dawg *dawg, const fs::path &path) -> bool |
Serializes the given Dawg to protobuf at the given path. | |
auto | serialize_protobuf (Dawg *dawg, const std::string &path) -> bool |
Serializes the given Dawg to protobuf at the given path. | |
auto | serialize_protobuf (Dawg *dawg, const char *path) -> bool |
Serializes the given Dawg to protobuf at the given path. | |
auto | serialize_protobuf (Dawg *dawg, std::ostream &output) -> bool |
Serializes the given Dawg to protobuf into the given output stream. | |
auto | deserialize_protobuf (const fs::path &path) -> Dawg * |
Deserializes the protobuf containing a Dawg at the given path or returns nullptr if none exists. | |
auto | deserialize_protobuf (const std::string &path) -> Dawg * |
Deserializes the protobuf containing a Dawg at the given path or returns nullptr if none exists. | |
auto | deserialize_protobuf (const char *path) -> Dawg * |
Deserializes the protobuf containing a Dawg at the given path or returns nullptr if none exists. | |
auto | deserialize_protobuf (std::istream &input) -> Dawg * |
Deserializes the protobuf containing a Dawg from the given input stream or returns nullptr if none exists. | |
void | collect_nodes (DawgNode *source, std::set< uint64_t > &node_ids, std::set< uint64_t > &final_node_ids) |
Collects the DawgNode IDs and final DawgNode IDs of all nodes reachable from the source. | |
void | collect_edges (DawgNode *source, std::map< std::pair< uint64_t, char >, uint64_t > &edges) |
Collects the transitions from each source to its destination, and the respective character labels. | |
auto | to_protobuf (Dawg *dawg) -> llp::Dictionary * |
Serializes a Dawg to its protobuf equivalent. | |
auto | from_protobuf (const llp::Dictionary &dict_proto) -> Dawg * |
Deserializes a Dawg from its protobuf equivalent. | |
template<> | |
auto | compare< Algorithm::STANDARD > (Position *lhs, Position *rhs) -> int |
Compares two Positions for the standard Levenshtein transducer. | |
template<> | |
auto | compare< Algorithm::TRANSPOSITION > (Position *lhs, Position *rhs) -> int |
Compares two Positions for the standard Levenshtein transducer extended with transposition. | |
template<> | |
auto | compare< Algorithm::MERGE_AND_SPLIT > (Position *lhs, Position *rhs) -> int |
Compares two Positions for the standard Levenshtein transducer extended with merge and split. | |
template<Algorithm Type> | |
auto | compare (Position *lhs, Position *rhs) -> int |
Compares two Positions within the Levenshtein transducer. | |
template<> | |
auto | compare< Algorithm::STANDARD > (Position *lhs, Position *rhs) -> int |
Compares two Positions for the standard Levenshtein transducer. | |
template<> | |
auto | compare< Algorithm::TRANSPOSITION > (Position *lhs, Position *rhs) -> int |
Compares two Positions for the standard Levenshtein transducer extended with transposition. | |
template<> | |
auto | compare< Algorithm::MERGE_AND_SPLIT > (Position *lhs, Position *rhs) -> int |
Compares two Positions for the standard Levenshtein transducer extended with merge and split. | |
template<> | |
auto | distance< Algorithm::STANDARD > (State *state, std::size_t query_length) -> std::size_t |
Infers the Levenshtein distance from the given State and query length for the standard Levenshtein distance, which includes the elementary operations of insertion, deletion, and substitution. | |
template<> | |
auto | distance< Algorithm::TRANSPOSITION > (State *state, std::size_t query_length) -> std::size_t |
Infers the Levenshtein distance from the given State and query length for the standard Levenshtein distance extended with the elementary operation of transposition. | |
template<> | |
auto | distance< Algorithm::MERGE_AND_SPLIT > (State *state, std::size_t query_length) -> std::size_t |
Infers the Levenshtein distance from the given State and query length for the standard Levenshtein distance extended with the elementary operations of merge and split. | |
template<Algorithm Type> | |
auto | distance (State *state, std::size_t query_length) -> std::size_t |
Infers the Levenshtein distance from the given State and query length. | |
template<> | |
auto | distance< Algorithm::STANDARD > (State *state, std::size_t query_length) -> std::size_t |
Infers the Levenshtein distance from the given State and query length for the standard Levenshtein distance, which includes the elementary operations of insertion, deletion, and substitution. | |
template<> | |
auto | distance< Algorithm::TRANSPOSITION > (State *state, std::size_t query_length) -> std::size_t |
Infers the Levenshtein distance from the given State and query length for the standard Levenshtein distance extended with the elementary operation of transposition. | |
template<> | |
auto | distance< Algorithm::MERGE_AND_SPLIT > (State *state, std::size_t query_length) -> std::size_t |
Infers the Levenshtein distance from the given State and query length for the standard Levenshtein distance extended with the elementary operations of merge and split. | |
auto | operator<< (std::ostream &out, const Intersection &intersection) -> std::ostream & |
void | insert_after (State *state, Position *curr, Position *next) |
Inserts one Position after another. | |
template<> | |
void | merge< Algorithm::STANDARD > (State *state, const std::vector< Position * > &positions) |
template<> | |
void | merge< Algorithm::TRANSPOSITION > (State *state, const std::vector< Position * > &positions) |
template<> | |
void | merge< Algorithm::MERGE_AND_SPLIT > (State *state, const std::vector< Position * > &positions) |
template<Algorithm Type> | |
void | merge (State *state, const std::vector< Position * > &positions) |
Merges a list of Positions into the Levenshtein state. | |
auto | index_of (std::vector< bool > &characteristic_vector, std::size_t k, std::size_t i) -> std::size_t |
Returns the first index of the desired character in the characteristic vector, beginning at the offset \(i\). | |
template<> | |
auto | position_transition< Algorithm::STANDARD > (std::size_t n, Position *position, std::vector< bool > &characteristic_vector, std::size_t offset) -> std::vector< Position * > |
Returns the closure over possible next Positions that are reachable from the current Position by advancing the location in the dictionary by one. | |
template<> | |
auto | position_transition< Algorithm::TRANSPOSITION > (std::size_t n, Position *position, std::vector< bool > &characteristic_vector, std::size_t offset) -> std::vector< Position * > |
Returns the closure over possible next Positions that are reachable from the current Position by advancing the location in the dictionary by one. | |
template<> | |
auto | position_transition< Algorithm::MERGE_AND_SPLIT > (std::size_t n, Position *position, std::vector< bool > &characteristic_vector, std::size_t offset) -> std::vector< Position * > |
Returns the closure over possible next Positions that are reachable from the current Position by advancing the location in the dictionary by one. | |
template<Algorithm Type> | |
auto | position_transition (std::size_t n, Position *position, std::vector< bool > &characteristic_vector, std::size_t offset) -> std::vector< Position * > |
Returns the closure over possible next Positions that are reachable from the current Position by advancing the location in the dictionary by one. | |
template<> | |
auto | position_transition< Algorithm::STANDARD > (std::size_t n, Position *position, std::vector< bool > &characteristic_vector, std::size_t offset) -> std::vector< Position * > |
Returns the closure over possible next Positions that are reachable from the current Position by advancing the location in the dictionary by one. | |
template<> | |
auto | position_transition< Algorithm::TRANSPOSITION > (std::size_t n, Position *position, std::vector< bool > &characteristic_vector, std::size_t offset) -> std::vector< Position * > |
Returns the closure over possible next Positions that are reachable from the current Position by advancing the location in the dictionary by one. | |
template<> | |
auto | position_transition< Algorithm::MERGE_AND_SPLIT > (std::size_t n, Position *position, std::vector< bool > &characteristic_vector, std::size_t offset) -> std::vector< Position * > |
Returns the closure over possible next Positions that are reachable from the current Position by advancing the location in the dictionary by one. | |
template<> | |
auto | subsumes< Algorithm::STANDARD > (Position *lhs, Position *rhs, std::size_t n) -> bool |
Returns whether lhs subsumes rhs according to the subsumption rules of standard Levenshtein distance, which consists of the elementary operations of insertion, deletion, and substitution. | |
template<> | |
auto | subsumes< Algorithm::TRANSPOSITION > (Position *lhs, Position *rhs, std::size_t n) -> bool |
Returns whether lhs subsumes rhs according to the subsumption rules of standard Levenshtein distance extended with the elementary operation of transposition. | |
template<> | |
auto | subsumes< Algorithm::MERGE_AND_SPLIT > (Position *lhs, Position *rhs, std::size_t n) -> bool |
Returns whether lhs subsumes rhs according to the subsumption rules of standard Levenshtein distance extended with the elementary operations of merge and split. | |
template<Algorithm Type> | |
auto | subsumes (Position *lhs, Position *rhs, std::size_t n) -> bool |
Returns whether Position lhs subsumes Position rhs , given the maximum edit distance n . | |
template<> | |
auto | subsumes< Algorithm::STANDARD > (Position *lhs, Position *rhs, std::size_t n) -> bool |
Returns whether lhs subsumes rhs according to the subsumption rules of standard Levenshtein distance, which consists of the elementary operations of insertion, deletion, and substitution. | |
template<> | |
auto | subsumes< Algorithm::TRANSPOSITION > (Position *lhs, Position *rhs, std::size_t n) -> bool |
Returns whether lhs subsumes rhs according to the subsumption rules of standard Levenshtein distance extended with the elementary operation of transposition. | |
template<> | |
auto | subsumes< Algorithm::MERGE_AND_SPLIT > (Position *lhs, Position *rhs, std::size_t n) -> bool |
Returns whether lhs subsumes rhs according to the subsumption rules of standard Levenshtein distance extended with the elementary operations of merge and split. | |
Various utilities regarding Levenshtein transducers.
The classes of interest, to most, will be the various dictionaries, transducer implementations, and serializers. The remaining classes are likely of little interest to anyone but maintainers and the curiously minded.
using liblevenshtein::Candidate = std::pair<std::string, std::size_t> |
Spelling candidate that includes both the dictionary term and its edit distance from the query term.
Definition at line 17 of file transducer.h.
using liblevenshtein::Comparator = std::function<int(Position *, Position *)> |
using liblevenshtein::CompareFn = std::function<int(Position *, Position *)> |
Compares Position nodes.
Definition at line 20 of file state_transition.h.
using liblevenshtein::DistanceFn = std::function<std::size_t(State *, std::size_t)> |
Infers the Levenshtein distance from a final Levenshtein State and the length of the query term.
Definition at line 23 of file lazy_query.h.
using liblevenshtein::MergeFn = std::function<void(State *, std::vector<Position *>)> |
Merges Position nodes into a State.
Definition at line 23 of file state_transition.h.
Transitions Position nodes.
Definition at line 15 of file state_transition.h.
using liblevenshtein::SubsumesFn = std::function<bool(Position *, Position *, std::size_t)> |
Determines whether one Position subsumes another given the maximum edit distance to consider.
Definition at line 14 of file unsubsume.h.
using liblevenshtein::TransitionFn = std::function<State *(State *, std::vector<bool> &)> |
Returns the Levenshtein State mapped-to by another State and characteristic vector.
Definition at line 19 of file lazy_query.h.
|
strong |
Enumerates the available Levenshtein distance algorithms.
Definition at line 9 of file algorithm.h.
void liblevenshtein::collect_edges | ( | DawgNode * | source, |
std::map< std::pair< uint64_t, char >, uint64_t > & | edges ) |
Collects the transitions from each source to its destination, and the respective character labels.
source | Initial node for the transition. |
edges | Transition mapping of source DawgNode IDs and labels to their respective target DawgNode IDs. |
Definition at line 82 of file serializer.cpp.
References collect_edges(), liblevenshtein::DawgNode::for_each_edge(), and query().
Referenced by collect_edges(), and to_protobuf().
void liblevenshtein::collect_nodes | ( | DawgNode * | source, |
std::set< uint64_t > & | node_ids, | ||
std::set< uint64_t > & | final_node_ids ) |
Collects the DawgNode IDs and final DawgNode IDs of all nodes reachable from the source.
source | DawgNode whose ID is to be collected. |
node_ids | DawgNode IDs reachable from the source. |
final_node_ids | Final DawgNode IDs reachable from the source. |
Definition at line 68 of file serializer.cpp.
References collect_nodes(), liblevenshtein::DawgNode::for_each_edge(), liblevenshtein::DawgNode::is_final(), and query().
Referenced by collect_nodes(), and to_protobuf().
Compares two Positions within the Levenshtein transducer.
Referenced by liblevenshtein::State::merge(), liblevenshtein::State::merge_sort(), liblevenshtein::StateTransition::operator()(), and liblevenshtein::State::sort().
auto liblevenshtein::compare< Algorithm::MERGE_AND_SPLIT > | ( | Position * | lhs, |
Position * | rhs ) -> int |
Compares two Positions for the standard Levenshtein transducer extended with merge and split.
Definition at line 53 of file comparator.cpp.
References query().
auto liblevenshtein::compare< Algorithm::MERGE_AND_SPLIT > | ( | Position * | lhs, |
Position * | rhs ) -> int |
Compares two Positions for the standard Levenshtein transducer extended with merge and split.
Definition at line 53 of file comparator.cpp.
References query().
auto liblevenshtein::compare< Algorithm::STANDARD > | ( | Position * | lhs, |
Position * | rhs ) -> int |
Compares two Positions for the standard Levenshtein transducer.
Definition at line 10 of file comparator.cpp.
References query().
auto liblevenshtein::compare< Algorithm::STANDARD > | ( | Position * | lhs, |
Position * | rhs ) -> int |
Compares two Positions for the standard Levenshtein transducer.
Definition at line 10 of file comparator.cpp.
References query().
auto liblevenshtein::compare< Algorithm::TRANSPOSITION > | ( | Position * | lhs, |
Position * | rhs ) -> int |
Compares two Positions for the standard Levenshtein transducer extended with transposition.
Definition at line 31 of file comparator.cpp.
References query().
auto liblevenshtein::compare< Algorithm::TRANSPOSITION > | ( | Position * | lhs, |
Position * | rhs ) -> int |
Compares two Positions for the standard Levenshtein transducer extended with transposition.
Definition at line 31 of file comparator.cpp.
References query().
Deserializes the protobuf containing a Dawg at the given path or returns nullptr if none exists.
path | Path to the serialization file. |
Definition at line 52 of file serializer.cpp.
References deserialize_protobuf(), and query().
Deserializes the protobuf containing a Dawg at the given path or returns nullptr if none exists.
path | Path to the serialization file. |
Definition at line 44 of file serializer.cpp.
References deserialize_protobuf(), and query().
Referenced by deserialize_protobuf(), deserialize_protobuf(), deserialize_protobuf(), and main().
Deserializes the protobuf containing a Dawg at the given path or returns nullptr if none exists.
path | Path to the serialization file. |
Definition at line 48 of file serializer.cpp.
References deserialize_protobuf(), and query().
Deserializes the protobuf containing a Dawg from the given input stream or returns nullptr if none exists.
path | input Input stream containing the serialized Dawg. |
Definition at line 59 of file serializer.cpp.
References from_protobuf(), and query().
auto liblevenshtein::distance | ( | State * | state, |
std::size_t | query_length ) -> std::size_t |
Infers the Levenshtein distance from the given State and query length.
state | Levenshtein transducer final State from which we may infer the respective distance. |
query_length | Length of the query term which is necessary for inferring the distance from the given State. |
References distance().
Referenced by liblevenshtein::LazyQuery< Result >::advance(), liblevenshtein::distance::MergeAndSplitDistance::between(), liblevenshtein::distance::StandardDistance::between(), liblevenshtein::distance::TranspositionDistance::between(), distance(), distance< Algorithm::MERGE_AND_SPLIT >(), distance< Algorithm::STANDARD >(), distance< Algorithm::TRANSPOSITION >(), liblevenshtein::distance::MemoizedDistance::get(), liblevenshtein::distance::MemoizedDistance::set(), and liblevenshtein::LazyQuery< Result >::update_candidate().
auto liblevenshtein::distance< Algorithm::MERGE_AND_SPLIT > | ( | State * | state, |
std::size_t | query_length ) -> std::size_t |
Infers the Levenshtein distance from the given State and query length for the standard Levenshtein distance extended with the elementary operations of merge and split.
state | Levenshtein transducer final State from which we may infer the respective distance. |
query_length | Length of the query term which is necessary for inferring the distance from the given State. |
Definition at line 42 of file distance.cpp.
References distance(), and query().
auto liblevenshtein::distance< Algorithm::MERGE_AND_SPLIT > | ( | State * | state, |
std::size_t | query_length ) -> std::size_t |
Infers the Levenshtein distance from the given State and query length for the standard Levenshtein distance extended with the elementary operations of merge and split.
state | Levenshtein transducer final State from which we may infer the respective distance. |
query_length | Length of the query term which is necessary for inferring the distance from the given State. |
Definition at line 42 of file distance.cpp.
References distance(), and query().
auto liblevenshtein::distance< Algorithm::STANDARD > | ( | State * | state, |
std::size_t | query_length ) -> std::size_t |
Infers the Levenshtein distance from the given State and query length for the standard Levenshtein distance, which includes the elementary operations of insertion, deletion, and substitution.
state | Levenshtein transducer final State from which we may infer the respective distance. |
query_length | Length of the query term which is necessary for inferring the distance from the given State. |
Definition at line 10 of file distance.cpp.
References distance(), and query().
auto liblevenshtein::distance< Algorithm::STANDARD > | ( | State * | state, |
std::size_t | query_length ) -> std::size_t |
Infers the Levenshtein distance from the given State and query length for the standard Levenshtein distance, which includes the elementary operations of insertion, deletion, and substitution.
state | Levenshtein transducer final State from which we may infer the respective distance. |
query_length | Length of the query term which is necessary for inferring the distance from the given State. |
Definition at line 10 of file distance.cpp.
References distance(), and query().
auto liblevenshtein::distance< Algorithm::TRANSPOSITION > | ( | State * | state, |
std::size_t | query_length ) -> std::size_t |
Infers the Levenshtein distance from the given State and query length for the standard Levenshtein distance extended with the elementary operation of transposition.
state | Levenshtein transducer final State from which we may infer the respective distance. |
query_length | Length of the query term which is necessary for inferring the distance from the given State. |
Definition at line 25 of file distance.cpp.
References distance(), and query().
auto liblevenshtein::distance< Algorithm::TRANSPOSITION > | ( | State * | state, |
std::size_t | query_length ) -> std::size_t |
Infers the Levenshtein distance from the given State and query length for the standard Levenshtein distance extended with the elementary operation of transposition.
state | Levenshtein transducer final State from which we may infer the respective distance. |
query_length | Length of the query term which is necessary for inferring the distance from the given State. |
Definition at line 25 of file distance.cpp.
References distance(), and query().
Deserializes a Dawg from its protobuf equivalent.
dict_proto | Protobuf containing the serialized information necessary for reconstructing the Dawg. |
Definition at line 128 of file serializer.cpp.
References liblevenshtein::DawgNode::add_edge(), liblevenshtein::Dawg::contains(), and query().
Referenced by deserialize_protobuf().
auto liblevenshtein::index_of | ( | std::vector< bool > & | characteristic_vector, |
std::size_t | k, | ||
std::size_t | i ) -> std::size_t |
Returns the first index of the desired character in the characteristic vector, beginning at the offset \(i\).
The characteristic vector represents a sort of moving window over the characters of the query term, and flags the locations within that window whose corresponding query term characters match the desired character from the dictionary term.
characteristic_vector | Moving window over the which query term characters match the desired one from the dictionary term. |
k | Upper bound of the characteristic vector. |
i | Beginning index (offset) to seek the desired character. |
Definition at line 6 of file position_transition.cpp.
References query().
Referenced by position_transition< Algorithm::STANDARD >(), and position_transition< Algorithm::TRANSPOSITION >().
Inserts one Position after another.
state | Levenshtein State whose positions are being merged. |
curr | Position whose follow Position is being updated. |
next | Position to insert after curr. |
Definition at line 8 of file merge.cpp.
References liblevenshtein::State::head(), liblevenshtein::State::insert_after(), and query().
Referenced by merge< Algorithm::MERGE_AND_SPLIT >(), merge< Algorithm::STANDARD >(), and merge< Algorithm::TRANSPOSITION >().
void liblevenshtein::merge | ( | State * | state, |
const std::vector< Position * > & | positions ) |
Merges a list of Positions into the Levenshtein state.
This operation greatly speeds up the matching of spelling candidates.
state | Levenshtein State whose positions are to be merged with the others. |
positions | Positions to merge into the State. |
Referenced by liblevenshtein::State::merge_sort(), and liblevenshtein::StateTransition::operator()().
void liblevenshtein::merge< Algorithm::MERGE_AND_SPLIT > | ( | State * | state, |
const std::vector< Position * > & | positions ) |
Definition at line 108 of file merge.cpp.
References liblevenshtein::State::begin(), liblevenshtein::State::end(), insert_after(), query(), and liblevenshtein::Position::term_index().
void liblevenshtein::merge< Algorithm::STANDARD > | ( | State * | state, |
const std::vector< Position * > & | positions ) |
Definition at line 18 of file merge.cpp.
References liblevenshtein::State::begin(), liblevenshtein::State::end(), insert_after(), query(), and liblevenshtein::Position::term_index().
void liblevenshtein::merge< Algorithm::TRANSPOSITION > | ( | State * | state, |
const std::vector< Position * > & | positions ) |
Definition at line 60 of file merge.cpp.
References liblevenshtein::State::begin(), liblevenshtein::State::end(), insert_after(), query(), and liblevenshtein::Position::term_index().
Definition at line 77 of file dawg_node.cpp.
References query().
auto liblevenshtein::operator<< | ( | std::ostream & | out, |
const Intersection & | intersection ) -> std::ostream & |
Definition at line 31 of file intersection.cpp.
auto liblevenshtein::position_transition | ( | std::size_t | n, |
Position * | position, | ||
std::vector< bool > & | characteristic_vector, | ||
std::size_t | offset ) -> std::vector< Position * > |
Returns the closure over possible next Positions that are reachable from the current Position by advancing the location in the dictionary by one.
n | Maximum edit distance to consider when matching spelling candidates. |
position | Current Position within the Levenshtein automaton. |
characteristic_vector | Moving window over matching characters of the query term. |
offset | Where to begin examining the chracteristic_vector. |
auto liblevenshtein::position_transition< Algorithm::MERGE_AND_SPLIT > | ( | std::size_t | n, |
Position * | position, | ||
std::vector< bool > & | characteristic_vector, | ||
std::size_t | offset ) -> std::vector< Position * > |
Returns the closure over possible next Positions that are reachable from the current Position by advancing the location in the dictionary by one.
This covers transitions within the standard Levenshtein distance, extended with the elementary operations of merge and split.
n | Maximum edit distance to consider when matching spelling candidates. |
position | Current Position within the Levenshtein automaton. |
characteristic_vector | Moving window over matching characters of the query term. |
offset | Where to begin examining the chracteristic_vector. |
Definition at line 232 of file position_transition.cpp.
References query(), and liblevenshtein::Position::term_index().
auto liblevenshtein::position_transition< Algorithm::MERGE_AND_SPLIT > | ( | std::size_t | n, |
Position * | position, | ||
std::vector< bool > & | characteristic_vector, | ||
std::size_t | offset ) -> std::vector< Position * > |
Returns the closure over possible next Positions that are reachable from the current Position by advancing the location in the dictionary by one.
This covers transitions within the standard Levenshtein distance, extended with the elementary operations of merge and split.
n | Maximum edit distance to consider when matching spelling candidates. |
position | Current Position within the Levenshtein automaton. |
characteristic_vector | Moving window over matching characters of the query term. |
offset | Where to begin examining the chracteristic_vector. |
Definition at line 232 of file position_transition.cpp.
References query(), and liblevenshtein::Position::term_index().
auto liblevenshtein::position_transition< Algorithm::STANDARD > | ( | std::size_t | n, |
Position * | position, | ||
std::vector< bool > & | characteristic_vector, | ||
std::size_t | offset ) -> std::vector< Position * > |
Returns the closure over possible next Positions that are reachable from the current Position by advancing the location in the dictionary by one.
This covers transitions within the standard Levenshtein distance, which includes the elementary operations of insertion, deletion, and substituion.
n | Maximum edit distance to consider when matching spelling candidates. |
position | Current Position within the Levenshtein automaton. |
characteristic_vector | Moving window over matching characters of the query term. |
offset | Where to begin examining the chracteristic_vector. |
Definition at line 17 of file position_transition.cpp.
References index_of(), and query().
auto liblevenshtein::position_transition< Algorithm::STANDARD > | ( | std::size_t | n, |
Position * | position, | ||
std::vector< bool > & | characteristic_vector, | ||
std::size_t | offset ) -> std::vector< Position * > |
Returns the closure over possible next Positions that are reachable from the current Position by advancing the location in the dictionary by one.
This covers transitions within the standard Levenshtein distance, which includes the elementary operations of insertion, deletion, and substituion.
n | Maximum edit distance to consider when matching spelling candidates. |
position | Current Position within the Levenshtein automaton. |
characteristic_vector | Moving window over matching characters of the query term. |
offset | Where to begin examining the chracteristic_vector. |
Definition at line 17 of file position_transition.cpp.
References index_of(), and query().
auto liblevenshtein::position_transition< Algorithm::TRANSPOSITION > | ( | std::size_t | n, |
Position * | position, | ||
std::vector< bool > & | characteristic_vector, | ||
std::size_t | offset ) -> std::vector< Position * > |
Returns the closure over possible next Positions that are reachable from the current Position by advancing the location in the dictionary by one.
This covers transitions within the standard Levenshtein distance extended with the elementary operation, transposition.
n | Maximum edit distance to consider when matching spelling candidates. |
position | Current Position within the Levenshtein automaton. |
characteristic_vector | Moving window over matching characters of the query term. |
offset | Where to begin examining the chracteristic_vector. |
Definition at line 84 of file position_transition.cpp.
References index_of(), query(), and liblevenshtein::Position::term_index().
auto liblevenshtein::position_transition< Algorithm::TRANSPOSITION > | ( | std::size_t | n, |
Position * | position, | ||
std::vector< bool > & | characteristic_vector, | ||
std::size_t | offset ) -> std::vector< Position * > |
Returns the closure over possible next Positions that are reachable from the current Position by advancing the location in the dictionary by one.
This covers transitions within the standard Levenshtein distance extended with the elementary operation, transposition.
n | Maximum edit distance to consider when matching spelling candidates. |
position | Current Position within the Levenshtein automaton. |
characteristic_vector | Moving window over matching characters of the query term. |
offset | Where to begin examining the chracteristic_vector. |
Definition at line 84 of file position_transition.cpp.
References index_of(), query(), and liblevenshtein::Position::term_index().
Serializes the given Dawg to protobuf at the given path.
dawg | Dawg to serialize. |
path | Path to the serialization file. |
Definition at line 29 of file serializer.cpp.
References query(), and serialize_protobuf().
Serializes the given Dawg to protobuf at the given path.
dawg | Dawg to serialize. |
path | Path to the serialization file. |
Definition at line 21 of file serializer.cpp.
References query(), and serialize_protobuf().
Referenced by serialize_protobuf(), serialize_protobuf(), and serialize_protobuf().
Serializes the given Dawg to protobuf at the given path.
dawg | Dawg to serialize. |
path | Path to the serialization file. |
Definition at line 25 of file serializer.cpp.
References query(), and serialize_protobuf().
Serializes the given Dawg to protobuf into the given output stream.
dawg | Dawg to serialize. |
output | Output stream to write the serialized protobuf. |
Definition at line 37 of file serializer.cpp.
References query(), and to_protobuf().
auto liblevenshtein::sorted_dawg | ( | IterType | iter, |
IterType | end ) -> SortedDawg * |
Factory function that initializes a new SortedDawg using the terms yielded from an iterator.
iter | Iterator that yields lexicographically sorted terms. |
end | Iterator representing the end of the terms. |
Definition at line 124 of file sorted_dawg.cpp.
References query().
Referenced by main().
template auto liblevenshtein::sorted_dawg | ( | std::list< std::string >::iterator | iter, |
std::list< std::string >::iterator | end ) -> SortedDawg * |
template auto liblevenshtein::sorted_dawg | ( | std::set< std::string >::iterator | iter, |
std::set< std::string >::iterator | end ) -> SortedDawg * |
template auto liblevenshtein::sorted_dawg | ( | std::vector< std::string >::iterator | iter, |
std::vector< std::string >::iterator | end ) -> SortedDawg * |
auto liblevenshtein::subsumes | ( | Position * | lhs, |
Position * | rhs, | ||
std::size_t | n ) -> bool |
Returns whether Position lhs
subsumes Position rhs
, given the maximum edit distance n
.
We say that lhs
subsumes rhs
if the set of spelling candidates reachable from rhs
is a subset of the spelling candidates reachable from lhs
. As such, if lhs
subsumes rhs
, then rhs
is redundant and may be removed from the set of candidate Positions. In fact, it must be removed for the current set of algorithms to perform correctly.
lhs | Position to compare against rhs . |
rhs | Position to compare against lhs . |
n | Maximum edit distance to consider when matching spelling candidates. |
lhs
subsumes rhs
, given the maximum edit distance n
. auto liblevenshtein::subsumes< Algorithm::MERGE_AND_SPLIT > | ( | Position * | lhs, |
Position * | rhs, | ||
std::size_t | n ) -> bool |
Returns whether lhs
subsumes rhs
according to the subsumption rules of standard Levenshtein distance extended with the elementary operations of merge and split.
lhs | Position to compare against rhs . |
rhs | Position to compare against lhs . |
n | Maximum edit distance to consider when matching spelling candidates. |
lhs
subsumes rhs
, given the maximum edit distance n
. Definition at line 42 of file subsumes.cpp.
References query().
auto liblevenshtein::subsumes< Algorithm::MERGE_AND_SPLIT > | ( | Position * | lhs, |
Position * | rhs, | ||
std::size_t | n ) -> bool |
Returns whether lhs
subsumes rhs
according to the subsumption rules of standard Levenshtein distance extended with the elementary operations of merge and split.
lhs | Position to compare against rhs . |
rhs | Position to compare against lhs . |
n | Maximum edit distance to consider when matching spelling candidates. |
lhs
subsumes rhs
, given the maximum edit distance n
. Definition at line 42 of file subsumes.cpp.
References query().
auto liblevenshtein::subsumes< Algorithm::STANDARD > | ( | Position * | lhs, |
Position * | rhs, | ||
std::size_t | n ) -> bool |
Returns whether lhs
subsumes rhs
according to the subsumption rules of standard Levenshtein distance, which consists of the elementary operations of insertion, deletion, and substitution.
lhs | Position to compare against rhs . |
rhs | Position to compare against lhs . |
n | Maximum edit distance to consider when matching spelling candidates. |
lhs
subsumes rhs
, given the maximum edit distance n
. Definition at line 6 of file subsumes.cpp.
References query().
auto liblevenshtein::subsumes< Algorithm::STANDARD > | ( | Position * | lhs, |
Position * | rhs, | ||
std::size_t | n ) -> bool |
Returns whether lhs
subsumes rhs
according to the subsumption rules of standard Levenshtein distance, which consists of the elementary operations of insertion, deletion, and substitution.
lhs | Position to compare against rhs . |
rhs | Position to compare against lhs . |
n | Maximum edit distance to consider when matching spelling candidates. |
lhs
subsumes rhs
, given the maximum edit distance n
. Definition at line 6 of file subsumes.cpp.
References query().
auto liblevenshtein::subsumes< Algorithm::TRANSPOSITION > | ( | Position * | lhs, |
Position * | rhs, | ||
std::size_t | n ) -> bool |
Returns whether lhs
subsumes rhs
according to the subsumption rules of standard Levenshtein distance extended with the elementary operation of transposition.
lhs | Position to compare against rhs . |
rhs | Position to compare against lhs . |
n | Maximum edit distance to consider when matching spelling candidates. |
lhs
subsumes rhs
, given the maximum edit distance n
. Definition at line 16 of file subsumes.cpp.
References query().
auto liblevenshtein::subsumes< Algorithm::TRANSPOSITION > | ( | Position * | lhs, |
Position * | rhs, | ||
std::size_t | n ) -> bool |
Returns whether lhs
subsumes rhs
according to the subsumption rules of standard Levenshtein distance extended with the elementary operation of transposition.
lhs | Position to compare against rhs . |
rhs | Position to compare against lhs . |
n | Maximum edit distance to consider when matching spelling candidates. |
lhs
subsumes rhs
, given the maximum edit distance n
. Definition at line 16 of file subsumes.cpp.
References query().
Serializes a Dawg to its protobuf equivalent.
dawg | Dawg to serialize. |
Definition at line 95 of file serializer.cpp.
References collect_edges(), collect_nodes(), and query().
Referenced by serialize_protobuf().