liblevenshtein 4.0.0
A library for generating Finite State Transducers based on Levenshtein Automata.
Loading...
Searching...
No Matches
liblevenshtein::LazyQuery< Result > Class Template Reference

Lazily traverses the intersection between the dictionary automaton and Levenshtein automaton, guided by the query term, and yields each spelling candidate as it is matched. More...

#include <lazy_query.h>

Collaboration diagram for liblevenshtein::LazyQuery< Result >:

Public Member Functions

 LazyQuery (const std::string &term, std::size_t max_distance, Intersection *intersection, TransitionFn transition, DistanceFn min_distance)
 Constructs a new LazyQuery which yields spelling candidates as they are matched while traversing the intersection between the dictionary automaton and Levenshtein automaton, guided by the query term.
 
 LazyQuery ()=default
 Constructs a LazyQuery that represents the end of the spelling candidates.
 
 ~LazyQuery ()
 Frees any owned members.
 
auto operator++ () -> LazyQuery &
 Advances to the next spelling candidate.
 
auto operator* () const -> const Result &
 Returns the current spelling candidate.
 
auto operator== (const LazyQuery &other) const -> bool
 Returns whether this LazyQuery has matched all the spelling candidates.
 

Private Member Functions

void advance ()
 Advances to the next spelling candidate.
 
void update_candidate (std::string &term, std::size_t distance)
 Sets the current spelling candidate according to the desired Result type.
 
auto build_intersection (char label, DawgNode *node, State *state, Intersection *parent) -> Intersection *
 Factory function that constructs and records a new Intersection instance.
 
auto characteristic_vector (char x, std::string &term, std::size_t k, std::size_t i) -> std::vector< bool >
 Flags all instances of the given character in the query term within a window of consecutive characters.
 
void update_candidate (std::string &term, std::size_t distance)
 
void update_candidate (std::string &term, std::size_t distance)
 

Private Attributes

std::queue< Intersection * > _pending
 Intermediate Intersection instances representing paths along the intersection between the dictionary automaton and Levenshtein automaton, guided by the query term.
 
std::queue< std::pair< char, DawgNode * > > _edges
 Outgoing dictionary node edges that are pending traversal along the intersection between the dictionary automaton and Levenshtein automaton, guided by the query term.
 
std::vector< Intersection * > _intersections
 Tracks Intersection allocations so they may be freed.
 
bool _is_complete = false
 Whether all the spelling candidates have been matched.
 
Intersection_intersection = nullptr
 Current whose respective spelling candidates are being considered.
 
Result _candidate
 Presently matched spelling candidate.
 
std::string _term
 Query term whose spelling candidates are to be matched.
 
std::size_t _max_distance
 Maximum edit distance to consider when matching spelling candidates.
 
TransitionFn transition
 Maps one Levenshtein State to another given the State and a characteristic vector.
 
DistanceFn min_distance
 Infers the Levenshtein distance between the query term and a spelling candidate.
 
std::size_t _a
 Intermediate value used to compare edit distances without overflowing.
 
std::size_t _k
 Intermediate value used to compare edit distances without overflowing.
 
std::size_t _i
 Intermediate value used to compare edit distances without overflowing.
 

Detailed Description

template<class Result>
class liblevenshtein::LazyQuery< Result >

Lazily traverses the intersection between the dictionary automaton and Levenshtein automaton, guided by the query term, and yields each spelling candidate as it is matched.

Definition at line 30 of file lazy_query.h.

Constructor & Destructor Documentation

◆ LazyQuery() [1/2]

template<class Result >
liblevenshtein::LazyQuery< Result >::LazyQuery ( const std::string & term,
std::size_t max_distance,
Intersection * intersection,
TransitionFn transition,
DistanceFn min_distance )

Constructs a new LazyQuery which yields spelling candidates as they are matched while traversing the intersection between the dictionary automaton and Levenshtein automaton, guided by the query term.

Parameters
termQuery term whose spelling candidates are to be matched.
max_distanceMaximum edit distance to consider when matching spelling candidates.
intersectionInitial Intersection between the root node of the dictionary and the initial Levenshtein State.
transitionFunction that maps one Levenshtein State to another with input from a characteristic vector.
min_distanceInfers the Levenshtein distance between the query term and a spelling candidate.

Definition at line 9 of file lazy_query.cpp.

12 : _term(term), _max_distance(max_distance),
13 transition(std::move(transition)), min_distance(std::move(min_distance)),
14 _a(max_distance < (SIZE_MAX - 1) >> 1 ? (max_distance << 1) + 1
15 : SIZE_MAX) {
16 DawgNode *root = intersection->node();
18 if (root->is_final() && term.length() <= max_distance) { // special case
19 std::string candidate;
21 } else {
22 advance();
23 }
24}
void advance()
Advances to the next spelling candidate.
std::size_t _a
Intermediate value used to compare edit distances without overflowing.
Definition lazy_query.h:123
TransitionFn transition
Maps one Levenshtein State to another given the State and a characteristic vector.
Definition lazy_query.h:116
std::queue< Intersection * > _pending
Intermediate Intersection instances representing paths along the intersection between the dictionary ...
Definition lazy_query.h:89
std::size_t _max_distance
Maximum edit distance to consider when matching spelling candidates.
Definition lazy_query.h:112
std::string _term
Query term whose spelling candidates are to be matched.
Definition lazy_query.h:109
DistanceFn min_distance
Infers the Levenshtein distance between the query term and a spelling candidate.
Definition lazy_query.h:120
void update_candidate(std::string &term, std::size_t distance)
Sets the current spelling candidate according to the desired Result type.
void query(ll::Dawg *dawg, const std::string &query_term, std::size_t max_distance)
Definition main.cpp:25

References liblevenshtein::LazyQuery< Result >::_pending, liblevenshtein::LazyQuery< Result >::advance(), liblevenshtein::DawgNode::is_final(), query(), and liblevenshtein::LazyQuery< Result >::update_candidate().

Here is the call graph for this function:

◆ LazyQuery() [2/2]

template<class Result >
liblevenshtein::LazyQuery< Result >::LazyQuery ( )
default

Constructs a LazyQuery that represents the end of the spelling candidates.

◆ ~LazyQuery()

Frees any owned members.

Definition at line 26 of file lazy_query.cpp.

26 {
27 for (Intersection *intersection : _intersections) {
28 delete intersection;
29 }
30}
std::vector< Intersection * > _intersections
Tracks Intersection allocations so they may be freed.
Definition lazy_query.h:97

References query().

Here is the call graph for this function:

Member Function Documentation

◆ advance()

template<class Result >
void liblevenshtein::LazyQuery< Result >::advance ( )
private

Advances to the next spelling candidate.

Definition at line 33 of file lazy_query.cpp.

33 {
34 if (!_edges.empty() || !_pending.empty()) {
35 _is_complete = true;
36 do {
37 if (_edges.empty()) {
38 _intersection = _pending.front();
39 _pending.pop();
40 DawgNode *node = _intersection->node();
41 State *state = _intersection->state();
42 _i = state->head()->term_index();
43 std::size_t b = _term.length() - _i;
44 _k = (_a < b) ? _a : b;
45 node->for_each_edge([&](char label, DawgNode *target) {
46 _edges.emplace(label, target);
47 });
48 } else {
49 DawgNode *node = _intersection->node();
50 State *state = _intersection->state();
51 std::pair<char, DawgNode *> edge = _edges.front();
52 _edges.pop();
53 char next_label = edge.first;
54 DawgNode *next_node = edge.second;
55 std::vector<bool> characteristic_vector =
56 this->characteristic_vector(next_label, _term, _k, _i);
58 if (next_state != nullptr) {
62 if (next_node->is_final()) {
63 std::size_t distance = min_distance(next_state, _term.length());
64 if (distance <= _max_distance) {
65 std::string term = next_intersection->str();
66 update_candidate(term, distance);
67 _is_complete = false;
68 break;
69 }
70 }
71 }
72 }
73 } while (!_edges.empty() || !_pending.empty());
74 } else {
75 _is_complete = true;
76 }
77}
auto state() const -> State *
Returns the Levenshtein State from the Intersection.
auto node() const -> DawgNode *
Returns the dictionary DawgNode from the Intersection.
std::size_t _k
Intermediate value used to compare edit distances without overflowing.
Definition lazy_query.h:126
Intersection * _intersection
Current whose respective spelling candidates are being considered.
Definition lazy_query.h:103
auto characteristic_vector(char x, std::string &term, std::size_t k, std::size_t i) -> std::vector< bool >
Flags all instances of the given character in the query term within a window of consecutive character...
auto build_intersection(char label, DawgNode *node, State *state, Intersection *parent) -> Intersection *
Factory function that constructs and records a new Intersection instance.
std::queue< std::pair< char, DawgNode * > > _edges
Outgoing dictionary node edges that are pending traversal along the intersection between the dictiona...
Definition lazy_query.h:94
bool _is_complete
Whether all the spelling candidates have been matched.
Definition lazy_query.h:100
std::size_t _i
Intermediate value used to compare edit distances without overflowing.
Definition lazy_query.h:129
auto distance(State *state, std::size_t query_length) -> std::size_t
Infers the Levenshtein distance from the given State and query length.

References liblevenshtein::distance(), liblevenshtein::DawgNode::for_each_edge(), liblevenshtein::State::head(), and query().

Referenced by liblevenshtein::LazyQuery< Result >::LazyQuery().

Here is the call graph for this function:
Here is the caller graph for this function:

◆ build_intersection()

template<class Result >
auto liblevenshtein::LazyQuery< Result >::build_intersection ( char label,
DawgNode * node,
State * state,
Intersection * parent ) -> Intersection *
private

Factory function that constructs and records a new Intersection instance.

Parameters
labelThe character annotating the incoming edge of the current dictionary node in the traversal.
nodeThe current dictionary node in the traversal.
stateThe current Levenshtein State in the traversal.
parentThe immediately preceding Intersection node used to reconstruct the spelling candidate.
Returns
A new Intersection with the given parameters.

Definition at line 80 of file lazy_query.cpp.

82 {
83 auto *intersection = new Intersection(label, node, state, parent);
85 return intersection;
86}

References query().

Here is the call graph for this function:

◆ characteristic_vector()

template<class Result >
auto liblevenshtein::LazyQuery< Result >::characteristic_vector ( char x,
std::string & term,
std::size_t k,
std::size_t i ) -> std::vector<bool>
private

Flags all instances of the given character in the query term within a window of consecutive characters.

Parameters
xCharacter from the dictionary automaton to match within the window over the characters in the query term.
termQuery term.
kWindow size of the characteristic vector.
iBase offset of the window for the characteristic vector.
Returns
The characteristic vector containing matched characters within the desired window over the consecutive characters of the query term.

Definition at line 119 of file lazy_query.cpp.

121 {
122 std::vector<bool> characteristic_vector(k);
123 for (int j = 0; j < k; j += 1) {
124 characteristic_vector[j] = (x == term[i + j]);
125 }
127}

References query().

Here is the call graph for this function:

◆ operator*()

Returns the current spelling candidate.

Returns
The current spelling candidate.

Definition at line 108 of file lazy_query.cpp.

108 {
109 return _candidate;
110}
Result _candidate
Presently matched spelling candidate.
Definition lazy_query.h:106

◆ operator++()

Advances to the next spelling candidate.

Returns
An advanced LazyQuery that points to the next spelling candidate.

Definition at line 89 of file lazy_query.cpp.

89 {
90 advance();
91 return *this;
92}

◆ operator==()

Returns whether this LazyQuery has matched all the spelling candidates.

Parameters
otherAn ignored instance of LazyQuery.
Returns
Whether all the spelling candidates have been matched.

Definition at line 113 of file lazy_query.cpp.

114 {
115 return _is_complete;
116}

◆ update_candidate() [1/3]

void liblevenshtein::LazyQuery< std::string >::update_candidate ( std::string & term,
std::size_t distance )
private

Definition at line 95 of file lazy_query.cpp.

96 {
98}

References query().

Here is the call graph for this function:

◆ update_candidate() [2/3]

void liblevenshtein::LazyQuery< std::pair< std::string, std::size_t > >::update_candidate ( std::string & term,
std::size_t distance )
private

Definition at line 101 of file lazy_query.cpp.

102 {
103 _candidate.first = term;
104 _candidate.second = distance;
105}

References liblevenshtein::distance(), and query().

Here is the call graph for this function:

◆ update_candidate() [3/3]

template<class Result >
void liblevenshtein::LazyQuery< Result >::update_candidate ( std::string & term,
std::size_t distance )
private

Sets the current spelling candidate according to the desired Result type.

Parameters
termSpelling candidate.
distanceLevenshtein distance between the query term and the spelling candidate.

Referenced by liblevenshtein::LazyQuery< Result >::LazyQuery().

Here is the caller graph for this function:

Member Data Documentation

◆ _a

template<class Result >
std::size_t liblevenshtein::LazyQuery< Result >::_a
private

Intermediate value used to compare edit distances without overflowing.

Definition at line 123 of file lazy_query.h.

◆ _candidate

template<class Result >
Result liblevenshtein::LazyQuery< Result >::_candidate
private

Presently matched spelling candidate.

Definition at line 106 of file lazy_query.h.

◆ _edges

template<class Result >
std::queue<std::pair<char, DawgNode *> > liblevenshtein::LazyQuery< Result >::_edges
private

Outgoing dictionary node edges that are pending traversal along the intersection between the dictionary automaton and Levenshtein automaton, guided by the query term.

Definition at line 94 of file lazy_query.h.

◆ _i

template<class Result >
std::size_t liblevenshtein::LazyQuery< Result >::_i
private

Intermediate value used to compare edit distances without overflowing.

Definition at line 129 of file lazy_query.h.

◆ _intersection

template<class Result >
Intersection* liblevenshtein::LazyQuery< Result >::_intersection = nullptr
private

Current whose respective spelling candidates are being considered.

Definition at line 103 of file lazy_query.h.

◆ _intersections

template<class Result >
std::vector<Intersection *> liblevenshtein::LazyQuery< Result >::_intersections
private

Tracks Intersection allocations so they may be freed.

Definition at line 97 of file lazy_query.h.

◆ _is_complete

template<class Result >
bool liblevenshtein::LazyQuery< Result >::_is_complete = false
private

Whether all the spelling candidates have been matched.

Definition at line 100 of file lazy_query.h.

◆ _k

template<class Result >
std::size_t liblevenshtein::LazyQuery< Result >::_k
private

Intermediate value used to compare edit distances without overflowing.

Definition at line 126 of file lazy_query.h.

◆ _max_distance

template<class Result >
std::size_t liblevenshtein::LazyQuery< Result >::_max_distance
private

Maximum edit distance to consider when matching spelling candidates.

Definition at line 112 of file lazy_query.h.

◆ _pending

template<class Result >
std::queue<Intersection *> liblevenshtein::LazyQuery< Result >::_pending
private

Intermediate Intersection instances representing paths along the intersection between the dictionary automaton and Levenshtein automaton, guided by the query term.

Definition at line 89 of file lazy_query.h.

Referenced by liblevenshtein::LazyQuery< Result >::LazyQuery().

◆ _term

template<class Result >
std::string liblevenshtein::LazyQuery< Result >::_term
private

Query term whose spelling candidates are to be matched.

Definition at line 109 of file lazy_query.h.

◆ min_distance

template<class Result >
DistanceFn liblevenshtein::LazyQuery< Result >::min_distance
private

Infers the Levenshtein distance between the query term and a spelling candidate.

Definition at line 120 of file lazy_query.h.

◆ transition

template<class Result >
TransitionFn liblevenshtein::LazyQuery< Result >::transition
private

Maps one Levenshtein State to another given the State and a characteristic vector.

Definition at line 116 of file lazy_query.h.


The documentation for this class was generated from the following files: