reglibcpp
1.0.0
(Naïve) C++ implementation of models for regular languages
Represents formal regular expressions.
#include <expression.h>
Classes
struct | literals
Token literals as used in Introduction to Automata Theory, Languages, and Computation by Hopcroft, Motwani and Ullman.
struct | parser
Parses regular expressions.
Public Types
enum | operation { empty, symbol, kleene, concatenation, alternation }
The different purposes an RE may fulfill.
typedef std::shared_ptr< expression const > | exptr
This is the type used to handle regular expressions.
Public Member Functions
size_t | size () const
Reports the size of this RE's tree representation.
operation | getOperation () const
Reports this RE's function.
bool | operator== (expression const &r) const
Checks whether this RE is semantically equivalent to another one.
bool | operator!= (expression const &r) const
Checks whether this RE is semantically different from another one.
char32_t | extractSymbol () const
Reports this symbol expression's UTF-32-encoded symbol.
std::string | extractUtf8Symbol () const
Reports this symbol expression's UTF-8-encoded symbol.
std::u32string | to_u32string () const
Describes this RE in UTF-32-encoded human-readable form.
std::string | to_string () const
Describes this RE in UTF-8-encoded human-readable form.
std::vector< exptr >::const_iterator | begin () const
Returns an iterator pointing to this RE's first subexpression.
std::vector< exptr >::const_iterator | end () const
Returns an iterator pointing behind this RE's last subexpression.
Static Public Member Functions
static exptr const & | spawnEmptySet ()
Gives an RE representing the empty set ∅.
static exptr const & | spawnEmptyString ()
Gives an RE representing the empty string ε.
static exptr const & | spawnSymbol (char32_t symbol)
Gives an RE representing the given UTF-32-encoded symbol.
static exptr const & | spawnSymbol (std::string utf8Symbol)
Same as above for a UTF-8-encoded symbol.
static exptr | spawnKleene (exptr const &b, bool optimized=true, bool aggressive=false)
Gives an RE representing the Kleene closure of a given RE.
static exptr | spawnConcatenation (exptr const &l, exptr const &r, bool optimized=true, bool aggressive=false)
Gives an RE representing the concatenation of two given REs.
static exptr | spawnAlternation (exptr const &l, exptr const &r, bool optimized=true, bool aggressive=false)
Gives an RE representing the alternation of two given REs.
static exptr | spawnFromString (std::u32string const &re, literals lits=literals(), bool optimized=false, bool aggressive=false)
Gives an RE encoded in a given string.
static exptr | spawnFromString (std::string const &utf8Re, literals lits=literals(), bool optimized=false, bool aggressive=false)
Same as above for a UTF-8-encoded string.
Static Public Attributes
static std::unique_ptr< std::wstring_convert< std::codecvt_utf8< char32_t >, char32_t > > const | converter
Converts between UTF-8-encoded and UTF-32-encoded strings.
Represents formal regular expressions.
One should never need to handle such an object directly, however, much less copy or move it; the copy and move constructors are therefore deleted.
To work with regular expressions, one should use expression::exptr, which aliases a shared_ptr to an actual object and can be copied and moved freely. To access member functions, one might dereference an exptr temporarily or, better yet, use the arrow operator (->).
Definition at line 31 of file expression.h.
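The handle pattern described above can be sketched without the library. The following is only an illustration under assumptions: the class node, its alias node::ptr, and the factory node::make are hypothetical stand-ins and not part of reglibcpp. It shows a type whose copy and move constructors are deleted and which is only ever reached through a shared_ptr to a const object.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical stand-in for a class that must not be copied or moved.
class node {
 public:
  node(node const&) = delete;
  node(node&&) = delete;
  node& operator=(node const&) = delete;
  node& operator=(node&&) = delete;

  // Handle type, analogous in spirit to expression::exptr.
  using ptr = std::shared_ptr<node const>;

  // Factory function; objects are only ever created behind a handle.
  static ptr make(int value) { return ptr(new node(value)); }

  int value() const { return value_; }

 private:
  explicit node(int value) : value_(value) {}
  int value_;
};

// The object itself can be neither copied nor moved ...
static_assert(!std::is_copy_constructible<node>::value, "node must not be copyable");
static_assert(!std::is_move_constructible<node>::value, "node must not be movable");
// ... but the handle can be copied and moved freely.
static_assert(std::is_copy_constructible<node::ptr>::value, "handle is copyable");
static_assert(std::is_move_constructible<node::ptr>::value, "handle is movable");
```

With such a handle h, both (*h).value() and h->value() reach the member function; the arrow form avoids naming the underlying object at all.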
typedef std::shared_ptr<expression const> reg::expression::exptr |
This is the type used to handle regular expressions.
Every method works on shared_ptrs to the actual regular expressions, which helps with basic comparisons and saves memory.
For example, every symbol's regular expression (as well as the empty string's and the empty set's) is instantiated only once and then pointed to by as many exptrs as one likes.
Definition at line 43 of file expression.h.
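One way to get the "instantiated only once" behaviour is a cache keyed by symbol. The sketch below is an assumption about how such sharing could be done, not a description of the library's internals; sym, symptr and internSymbol are hypothetical names.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Hypothetical stand-in for a symbol expression.
struct sym {
  char32_t c;
};
using symptr = std::shared_ptr<sym const>;

// Returns the single shared instance for a symbol, creating it on first use.
// References to std::map elements stay valid across later insertions,
// so handing out a reference to the cached handle is safe.
inline symptr const& internSymbol(char32_t c) {
  static std::map<char32_t, symptr> cache;
  auto it = cache.find(c);
  if (it == cache.end()) {
    it = cache.emplace(c, std::make_shared<sym>(sym{c})).first;
  }
  return it->second;
}
```

Because equal symbols share one object, comparing two handles with == (a pointer comparison) is enough to tell whether they denote the same symbol, and no memory is spent on duplicates.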
enum reg::expression::operation | strong
The different purposes an RE may fulfill.
Definition at line 84 of file expression.h.
vector< expression::exptr >::const_iterator reg::expression::begin | ( | ) | const |
Returns an iterator pointing to this RE's first subexpression.
Definition at line 302 of file expression.cpp.
vector< expression::exptr >::const_iterator reg::expression::end | ( | ) | const |
Returns an iterator pointing behind this RE's last subexpression.
Definition at line 307 of file expression.cpp.
char32_t reg::expression::extractSymbol | ( | ) | const |
Reports this symbol expression's UTF-32-encoded symbol.
This method should only be called on an object whose function is confirmed to be that of a symbol!
Returns: the char32_t encoded within this symbol expression
Definition at line 229 of file expression.cpp.
string reg::expression::extractUtf8Symbol | ( | ) | const |
Reports this symbol expression's UTF-8-encoded symbol.
This method should only be called on an object whose function is confirmed to be that of a symbol!
Returns: the UTF-8-encoded character within this symbol expression
Definition at line 249 of file expression.cpp.
expression::operation reg::expression::getOperation | ( | ) | const |
Reports this RE's function.
Note that the empty string's function is technically that of a symbol.
Definition at line 195 of file expression.cpp.
bool reg::expression::operator!= | ( | expression const & | r | ) | const |
Checks whether this RE is semantically different from another one.
Returns: false if this RE's language is exactly the same as the other one's, true otherwise
Definition at line 219 of file expression.cpp.
bool reg::expression::operator== | ( | expression const & | r | ) | const |
Checks whether this RE is semantically equivalent to another one.
Returns: true if this RE's language is exactly the same as the other one's, false otherwise
Definition at line 204 of file expression.cpp.
size_t reg::expression::size | ( | ) | const |
Reports the size of this RE's tree representation.
In this context, an RE's size is defined recursively as follows:
∅.size() = 0
ε.size() = 1
<symbol>.size() = 1
(l+r).size() = 1 + l.size() + r.size()
(lr).size() = 1 + l.size() + r.size()
(b*).size() = 1 + b.size()
Definition at line 181 of file expression.cpp.
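The recursive definition can be sketched on a toy syntax tree. This is an illustration only: op, tree, treeptr, makeTree and treeSize are hypothetical names, and the sketch assumes that the empty set contributes 0 while every other node (including the empty string, which counts as a symbol) contributes 1 plus the sizes of its subexpressions.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins; these are not the library's own types.
enum class op { empty, symbol, kleene, concatenation, alternation };

struct tree;
using treeptr = std::shared_ptr<tree const>;

struct tree {
  op operation;
  std::vector<treeptr> children;  // 0, 1 or 2 subexpressions
};

// Builds a node with the given operation and subexpressions.
inline treeptr makeTree(op o, std::vector<treeptr> children = {}) {
  return std::make_shared<tree>(tree{o, std::move(children)});
}

// Size of the tree: 0 for the empty set, otherwise 1 plus the children's sizes.
inline std::size_t treeSize(tree const& t) {
  std::size_t s = (t.operation == op::empty) ? 0 : 1;
  for (treeptr const& child : t.children) {
    s += treeSize(*child);
  }
  return s;
}
```

For instance, a tree for (a+b)* has one kleene node over one alternation node over two symbol nodes, so its size under this definition is 1 + (1 + 1 + 1) = 4.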
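static exptr reg::expression::spawnAlternation (exptr const &l, exptr const &r, bool optimized=true, bool aggressive=false)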
Gives an RE representing the alternation of two given REs.
More formally, the RE's language will be L(l+r) = L(l) ∪ L(r).
l | exptr to one of the REs |
r | exptr to the other RE |
optimized | whether simplifications on the syntax level should be applied |
aggressive | whether the simplifications should check the semantic level |
Returns: exptr to the RE representing the alternation of l and r
Definition at line 117 of file expression.cpp.
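static exptr reg::expression::spawnConcatenation (exptr const &l, exptr const &r, bool optimized=true, bool aggressive=false)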
Gives an RE representing the concatenation of two given REs.
More formally, the RE's language will be L(lr) = L(l) • L(r).
l | exptr to the first RE |
r | exptr to the second RE |
optimized | whether simplifications on the syntax level should be applied |
aggressive | whether the simplifications should check the semantic level |
Returns: exptr to the RE representing the concatenation of l and r
Definition at line 81 of file expression.cpp.
static exptr const & reg::expression::spawnEmptySet ()
Gives an RE representing the empty set ∅.
More formally, the RE's language will be {}.
Returns: exptr to the RE representing the empty set ∅
Definition at line 40 of file expression.cpp.
static exptr const & reg::expression::spawnEmptyString ()
Gives an RE representing the empty string ε.
More formally, the RE's language will be {ε}.
Returns: exptr to the RE representing the empty string ε
Definition at line 50 of file expression.cpp.
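static exptr reg::expression::spawnFromString (std::u32string const &re, literals lits=literals(), bool optimized=false, bool aggressive=false)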
Gives an RE encoded in a given string.
re | the RE in text form |
lits | literals of operators in re |
optimized | whether simplifications on the syntax level should be applied |
aggressive | whether the simplifications should check the semantic level |
Returns: exptr to the RE represented by the given string, or to nullptr if it is invalid
Definition at line 613 of file expression.cpp.
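static exptr reg::expression::spawnFromString (std::string const &utf8Re, literals lits=literals(), bool optimized=false, bool aggressive=false)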
Same as above for a UTF-8-encoded string.
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
Definition at line 631 of file expression.cpp.
static exptr reg::expression::spawnKleene (exptr const &b, bool optimized=true, bool aggressive=false)
Gives an RE representing the Kleene closure of a given RE.
More formally, the RE's language will be L(b*) = L(b)*.
b | exptr to the RE |
optimized | whether simplifications on the syntax level should be applied |
aggressive | whether the simplifications should check the semantic level |
Returns: exptr to the RE representing the Kleene closure of b
Definition at line 152 of file expression.cpp.
static exptr const & reg::expression::spawnSymbol (char32_t symbol)
Gives an RE representing the given UTF-32-encoded symbol.
More formally, the RE's language will be {<symbol>}.
symbol | the symbol the RE should represent or "" for the empty string ε |
Returns: exptr to the RE representing the symbol
Definition at line 61 of file expression.cpp.
static exptr const & reg::expression::spawnSymbol (std::string utf8Symbol)
Same as above for a UTF-8-encoded symbol.
This is an overloaded member function, provided for convenience. It differs from the above function only in what argument(s) it accepts.
Definition at line 66 of file expression.cpp.
string reg::expression::to_string | ( | ) | const |
Describes this RE in UTF-8-encoded human-readable form.
Definition at line 297 of file expression.cpp.
u32string reg::expression::to_u32string | ( | ) | const |
Describes this RE in UTF-32-encoded human-readable form.
Definition at line 259 of file expression.cpp.
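static std::unique_ptr< std::wstring_convert< std::codecvt_utf8< char32_t >, char32_t > > const reg::expression::converter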
Converts between UTF-8-encoded and UTF-32-encoded strings.
Definition at line 97 of file expression.h.
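The converter's type is a standard-library std::wstring_convert over std::codecvt_utf8<char32_t>. The sketch below shows how that standard type converts in both directions; the alias utf_converter and the helpers toUtf32 and toUtf8 are hypothetical and do not use the library's own converter object. Note that std::wstring_convert and <codecvt> are deprecated since C++17, so compilers may emit deprecation warnings.

```cpp
#include <cassert>
#include <codecvt>
#include <locale>
#include <string>

// Same converter type as in the attribute's declaration above.
using utf_converter =
    std::wstring_convert<std::codecvt_utf8<char32_t>, char32_t>;

// Decodes a UTF-8 byte string into UTF-32 code points.
inline std::u32string toUtf32(std::string const& utf8) {
  utf_converter conv;
  return conv.from_bytes(utf8);
}

// Encodes UTF-32 code points as a UTF-8 byte string.
inline std::string toUtf8(std::u32string const& utf32) {
  utf_converter conv;
  return conv.to_bytes(utf32);
}
```

As an example, the two UTF-8 bytes 0xCE 0xB5 decode to the single code point U+03B5 (ε), and encoding that code point yields the same two bytes again.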