reglibcpp
2.0.0
A C++ implementation of models for regular languages
Represents formal regular expressions.
#include <expression.h>
Classes
struct | parser
Parses regular expressions.
Public Types
enum | operation { empty, symbol, kleene, concatenation, alternation }
The different purposes an RE may fulfill.
typedef std::shared_ptr< expression const > | exptr
This is the type used to handle regular expressions.
Public Member Functions
size_t | size () const
Reports the size of this RE's tree representation.
operation | getOperation () const
Reports this RE's function.
operator nfa const & () const
Returns an NFA accepting the language that this RE describes.
bool | operator== (nfa const &other) const
Checks whether this RE describes the same regular language as another object.
bool | operator!= (nfa const &other) const
Checks whether this RE describes a different regular language than another object.
std::string | extractSymbol () const
Reports this symbol expression's UTF-8-encoded symbol.
char32_t | extractSymbol_ () const
Reports this symbol expression's UTF-32-encoded symbol.
std::u32string | to_u32string () const
Describes this RE in UTF-32-encoded human-readable form.
std::string | to_string () const
Describes this RE in UTF-8-encoded human-readable form.
std::vector< exptr >::const_iterator | begin () const
Returns an iterator pointing to this RE's first subexpression.
std::vector< exptr >::const_iterator | end () const
Returns an iterator pointing behind this RE's last subexpression.
Static Public Member Functions
static void | reset ()
Resets the symbols used for RE operators to their defaults.
static exptr const & | spawnEmptySet ()
Gives an RE representing the empty set ∅.
static exptr const & | spawnEmptyString ()
Gives an RE representing the empty string ε.
static exptr const & | spawnSymbol (std::string const &symbol)
Gives an RE representing the given UTF-8-encoded symbol.
static exptr const & | spawnSymbol_ (char32_t u32symbol)
Gives an RE representing the given UTF-32-encoded symbol.
static exptr | spawnKleene (exptr const &b, bool optimized=true, bool aggressive=false)
Gives an RE representing the Kleene closure of a given RE.
static exptr | spawnConcatenation (exptr const &l, exptr const &r, bool optimized=true, bool aggressive=false)
Gives an RE representing the concatenation of two given REs.
static exptr | spawnAlternation (exptr const &l, exptr const &r, bool optimized=true, bool aggressive=false)
Gives an RE representing the alternation of two given REs.
static exptr | spawnFromString (std::string const &re, bool optimized=false, bool aggressive=false)
Gives an RE encoded in a given UTF-8 string.
static exptr | spawnFromString_ (std::u32string const &u32re, bool optimized=false, bool aggressive=false)
Gives an RE encoded in a given UTF-32 string.
Static Public Attributes
static char32_t | L
The symbol used to represent the Left parenthesis in a regular expression.
static char32_t | R
The symbol used to represent the Right parenthesis in a regular expression.
static char32_t | K
The symbol used to represent the Kleene star in a regular expression.
static char32_t | A
The symbol used to represent the Alternation in a regular expression.
static char32_t | E
The symbol used to represent the Empty string in a regular expression.
static char32_t | N
The symbol used to represent the Null/empty set in a regular expression.
Represents formal regular expressions.
One should never need to handle such an object directly, much less copy or move it; the copy and move constructors are therefore deleted.
To work with regular expressions, one should use expression::exptr, which aliases a shared_ptr to an actual object and can be copied and moved to one's heart's content. To access member functions, one might dereference exptrs temporarily or, better yet, use the arrow operator ->.
Definition at line 46 of file expression.h.
typedef std::shared_ptr<expression const> reg::expression::exptr |
This is the type used to handle regular expressions.
Every method works on shared_ptrs to the actual regular expressions, to help with basic comparisons and to save memory.
For example, every symbol's (and the empty string's and the empty set's) regular expression is only instantiated once and then pointed to by as many exptrs as one likes.
Definition at line 58 of file expression.h.
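The instantiate-once behaviour described above can be illustrated with a small interning cache. This is only a sketch of the general technique, not reglibcpp's implementation: Node, NodePtr and internSymbol are hypothetical names made up for this example.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Hypothetical stand-in for a symbol expression; NOT the reglibcpp class.
struct Node {
    char32_t symbol;
    explicit Node(char32_t s) : symbol(s) {}
    Node(Node const&) = delete;             // objects are never copied,
    Node& operator=(Node const&) = delete;  // only the handles are
};

// Handle type in the spirit of exptr: a shared pointer to an immutable node.
using NodePtr = std::shared_ptr<Node const>;

// Returns the one shared node for a symbol, creating it on first request.
inline NodePtr const& internSymbol(char32_t s) {
    static std::map<char32_t, NodePtr> cache;
    auto it = cache.find(s);
    if (it == cache.end()) {
        it = cache.emplace(s, std::make_shared<Node>(s)).first;
    }
    assert(it->second != nullptr);
    return it->second;
}
```

Because every request for the same symbol yields the same pointer, two handles can be compared with a plain pointer comparison, and each symbol costs memory only once.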
enum reg::expression::operation | strong |
The different purposes an RE may fulfill.
Definition at line 84 of file expression.h.
vector< expression::exptr >::const_iterator reg::expression::begin | ( | ) | const |
Returns an iterator pointing to this RE's first subexpression.
Definition at line 378 of file expression.cpp.
vector< expression::exptr >::const_iterator reg::expression::end | ( | ) | const |
Returns an iterator pointing behind this RE's last subexpression.
Definition at line 383 of file expression.cpp.
string reg::expression::extractSymbol | ( | ) | const |
Reports this symbol expression's UTF-8-encoded symbol.
Returns the symbol encoded within this symbol expression, "" for an empty string
Exceptions
std::logic_error | if this expression's purpose is not that of a symbol |
Definition at line 329 of file expression.cpp.
char32_t reg::expression::extractSymbol_ | ( | ) | const |
Reports this symbol expression's UTF-32-encoded symbol.
Returns the char32_t encoded within this symbol expression, U'\0' for an empty string
Exceptions
std::logic_error | if this expression's purpose is not that of a symbol |
Definition at line 309 of file expression.cpp.
expression::operation reg::expression::getOperation | ( | ) | const |
Reports this RE's function.
Note that the empty string's function is technically that of a symbol.
Definition at line 269 of file expression.cpp.
reg::expression::operator nfa const & | ( | ) | const |
Returns an NFA accepting the language that this RE describes.
bool reg::expression::operator!= | ( | nfa const & | other | ) | const |
Checks whether this RE describes a different regular language than another object.
Returns false if this RE's language is exactly the same as the other object's, true otherwise
Definition at line 296 of file expression.cpp.
bool reg::expression::operator== | ( | nfa const & | other | ) | const |
Checks whether this RE describes the same regular language as another object.
Returns true if this RE's language is exactly the same as the other object's, false otherwise
Definition at line 286 of file expression.cpp.
void reg::expression::reset | ( | ) | static |
Resets the symbols used for RE operators to their defaults.
Definition at line 66 of file expression.cpp.
size_t reg::expression::size | ( | ) | const |
Reports the size of this RE's tree representation.
In this context, an RE's size is defined recursively as follows:
∅.size() = 1
ε.size() = 1
<symbol>.size() = 1
(l+r).size() = 1 + l.size() + r.size()
(lr).size() = 1 + l.size() + r.size()
(b*).size() = 1 + b.size()
Definition at line 255 of file expression.cpp.
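The recursive size definition above amounts to counting the nodes of the expression tree. The following sketch shows that recursion on a generic tree; Tree, TreePtr, leaf, node and treeSize are hypothetical names for this illustration and do not reflect how expression.cpp computes size().

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical tree node; NOT the reglibcpp class.
// A node without children stands for ∅, ε or a single symbol.
struct Tree {
    std::vector<std::shared_ptr<Tree const>> children;
};
using TreePtr = std::shared_ptr<Tree const>;

// Builds a leaf (size 1).
inline TreePtr leaf() { return std::make_shared<Tree>(); }

// Builds an inner node: one child for b*, two for l+r or lr.
inline TreePtr node(std::vector<TreePtr> kids) {
    auto t = std::make_shared<Tree>();
    t->children = std::move(kids);
    return t;
}

// Size = 1 for the node itself plus the sizes of all subexpressions.
inline std::size_t treeSize(TreePtr const& t) {
    assert(t != nullptr);
    std::size_t total = 1;
    for (auto const& child : t->children) {
        total += treeSize(child);
    }
    return total;
}
```

Under this definition, an expression shaped like (a+b)* has size 4: one node for the star, one for the alternation and one for each symbol.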
expression::exptr reg::expression::spawnAlternation | ( | exptr const & | l, | exptr const & | r, | bool | optimized = true, | bool | aggressive = false | ) | static |
Gives an RE representing the alternation of two given REs.
More formally, the RE's language will be L(l+r) = L(l) ∪ L(r).
Parameters
l | exptr to one of the REs |
r | exptr to the other RE |
optimized | whether simplifications on the syntax level should be applied |
aggressive | whether the simplifications should check the semantic level |
Returns exptr to the RE representing the alternation of l and r
Definition at line 178 of file expression.cpp.
expression::exptr reg::expression::spawnConcatenation | ( | exptr const & | l, | exptr const & | r, | bool | optimized = true, | bool | aggressive = false | ) | static |
Gives an RE representing the concatenation of two given REs.
More formally, the RE's language will be L(lr) = L(l) • L(r).
Parameters
l | exptr to the first RE |
r | exptr to the second RE |
optimized | whether simplifications on the syntax level should be applied |
aggressive | whether the simplifications should check the semantic level |
Returns exptr to the RE representing the concatenation of l and r
Definition at line 135 of file expression.cpp.
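The two language equations given for alternation and concatenation can be checked by hand on finite languages. The sketch below models a language as a set of strings; Language, unite, concatenate and checkFiniteLanguages are hypothetical helpers written for this illustration and are not part of reglibcpp.

```cpp
#include <cassert>
#include <set>
#include <string>

// A finite language modelled as a set of words (illustration only).
using Language = std::set<std::string>;

// L(l+r) = L(l) ∪ L(r): every word of either language.
inline Language unite(Language const& l, Language const& r) {
    Language out = l;
    out.insert(r.begin(), r.end());
    return out;
}

// L(lr) = L(l) • L(r): every word of l followed by every word of r.
inline Language concatenate(Language const& l, Language const& r) {
    Language out;
    for (auto const& u : l) {
        for (auto const& v : r) {
            out.insert(u + v);
        }
    }
    return out;
}

// Small self-check of both definitions on {a, b} and {c}.
inline void checkFiniteLanguages() {
    Language const l{"a", "b"};
    Language const r{"c"};
    assert((unite(l, r) == Language{"a", "b", "c"}));
    assert((concatenate(l, r) == Language{"ac", "bc"}));
}
```

Note that concatenating with the empty language yields the empty language, while concatenating with {ε} leaves a language unchanged, which mirrors the difference between the empty set ∅ and the empty string ε.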
expression::exptr const & reg::expression::spawnEmptySet | ( | ) | static |
Gives an RE representing the empty set ∅.
More formally, the RE's language will be {}.
Returns exptr to the RE representing the empty set ∅
Definition at line 80 of file expression.cpp.
expression::exptr const & reg::expression::spawnEmptyString | ( | ) | static |
Gives an RE representing the empty string ε.
More formally, the RE's language will be {ε}.
Returns exptr to the RE representing the empty string ε
Definition at line 89 of file expression.cpp.
expression::exptr reg::expression::spawnFromString | ( | std::string const & | re, | bool | optimized = false, | bool | aggressive = false | ) | static |
Gives an RE encoded in a given UTF-8 string.
This converts re to UTF-32 and calls spawnFromString_(std::u32string const&,bool,bool). If you don't want that overhead and already have a std::u32string on your hands, call that method directly.
Parameters
re | the RE in text form |
optimized | whether simplifications on the syntax level should be applied |
aggressive | whether the simplifications should check the semantic level |
Returns exptr to the RE represented by the given string
Exceptions
std::invalid_argument | if the re string is malformed |
Definition at line 746 of file expression.cpp.
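The conversion overhead mentioned above comes from decoding each UTF-8 byte sequence into one UTF-32 code point before parsing. The sketch below shows what such a decoding step involves; decodeUtf8 is a hypothetical helper for illustration only, the source does not say which converter reglibcpp uses, and this simplified version does not reject overlong encodings or surrogate code points.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Decodes a UTF-8 string into UTF-32 (illustration only).
// Throws std::invalid_argument on invalid lead or continuation bytes
// and on truncated sequences.
inline std::u32string decodeUtf8(std::string const& in) {
    std::u32string out;
    std::size_t i = 0;
    while (i < in.size()) {
        unsigned char const lead = static_cast<unsigned char>(in[i]);
        std::size_t extra = 0;  // number of continuation bytes expected
        char32_t cp = 0;        // code point under construction
        if (lead < 0x80)                { extra = 0; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { extra = 1; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { extra = 2; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { extra = 3; cp = lead & 0x07; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");
        if (i + extra >= in.size()) {
            throw std::invalid_argument("truncated UTF-8 sequence");
        }
        for (std::size_t k = 1; k <= extra; ++k) {
            unsigned char const c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) {
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            }
            cp = (cp << 6) | (c & 0x3F);  // append the 6 payload bits
        }
        out.push_back(cp);
        i += extra + 1;
    }
    assert(out.size() <= in.size());  // each code point uses at least one byte
    return out;
}
```

A caller that already holds UTF-32 text skips this pass entirely, which is the saving the UTF-32 overload offers.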
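expression::exptr reg::expression::spawnFromString_ | ( | std::u32string const & | u32re, | bool | optimized = false, | bool | aggressive = false | ) | static |
Gives an RE encoded in a given UTF-32 string.
Parameters
u32re | the RE in text form |
optimized | whether simplifications on the syntax level should be applied |
aggressive | whether the simplifications should check the semantic level |
Returns exptr to the RE represented by the given string
Exceptions
std::invalid_argument | if the u32re string is malformed |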
Definition at line 720 of file expression.cpp.
expression::exptr reg::expression::spawnKleene | ( | exptr const & | b, | bool | optimized = true, | bool | aggressive = false | ) | static |
Gives an RE representing the Kleene closure of a given RE.
More formally, the RE's language will be L(b*) = L(b)*.
Parameters
b | exptr to the RE |
optimized | whether simplifications on the syntax level should be applied |
aggressive | whether the simplifications should check the semantic level |
Returns exptr to the RE representing the Kleene closure of b
Definition at line 224 of file expression.cpp.
expression::exptr const & reg::expression::spawnSymbol | ( | std::string const & | symbol | ) | static |
Gives an RE representing the given UTF-8-encoded symbol.
This converts symbol to UTF-32 and calls spawnSymbol_(char32_t) with the std::u32string's first char32_t. If you don't want that overhead and already have a char32_t on your hands, call that method directly.
More formally, the RE's language will be {<symbol>}.
Parameters
symbol | the symbol the RE should represent or "" for the empty string ε |
Returns exptr to the RE representing the symbol
Definition at line 116 of file expression.cpp.
expression::exptr const & reg::expression::spawnSymbol_ | ( | char32_t | u32symbol | ) | static |
Gives an RE representing the given UTF-32-encoded symbol.
More formally, the RE's language will be {<symbol>}.
Parameters
u32symbol | the symbol the RE should represent or U'\0' for the empty string ε |
Returns exptr to the RE representing the symbol
Definition at line 100 of file expression.cpp.
string reg::expression::to_string | ( | ) | const |
Describes this RE in UTF-8-encoded human-readable form.
Definition at line 373 of file expression.cpp.
u32string reg::expression::to_u32string | ( | ) | const |
Describes this RE in UTF-32-encoded human-readable form.
Definition at line 335 of file expression.cpp.
char32_t reg::expression::A | static |
The symbol used to represent the Alternation in a regular expression.
Definition at line 59 of file expression.h.
char32_t reg::expression::E | static |
The symbol used to represent the Empty string in a regular expression.
Definition at line 59 of file expression.h.
char32_t reg::expression::K | static |
The symbol used to represent the Kleene star in a regular expression.
Definition at line 59 of file expression.h.
char32_t reg::expression::L | static |
The symbol used to represent the Left parenthesis in a regular expression.
Definition at line 59 of file expression.h.
char32_t reg::expression::N | static |
The symbol used to represent the Null/empty set in a regular expression.
Definition at line 59 of file expression.h.
char32_t reg::expression::R | static |
The symbol used to represent the Right parenthesis in a regular expression.
Definition at line 59 of file expression.h.