reglibcpp  2.0.0
A C++ implementation of models for regular languages
reg::expression Class Reference

Represents formal regular expressions. More...

#include <expression.h>

Classes

struct  parser
 Parses regular expressions.
 

Public Types

enum  operation {
  empty, symbol, kleene, concatenation,
  alternation
}
 The different purposes an RE may fulfill.
 
typedef std::shared_ptr< expression const > exptr
 This is the type used to handle regular expressions.
 

Public Member Functions

size_t size () const
 Reports the size of this RE's tree representation.
 
operation getOperation () const
 Reports this RE's function.
 
 operator nfa const & () const
 Returns an NFA accepting the language that this RE describes.
 
bool operator== (nfa const &other) const
 Checks whether this RE describes the same regular language as another object.
 
bool operator!= (nfa const &other) const
 Checks whether this RE describes a different regular language than another object.
 
std::string extractSymbol () const
 Reports this symbol expression's UTF-8-encoded symbol.
 
char32_t extractSymbol_ () const
 Reports this symbol expression's UTF-32-encoded symbol.
 
std::u32string to_u32string () const
 Describes this RE in UTF-32-encoded human-readable form.
 
std::string to_string () const
 Describes this RE in UTF-8-encoded human-readable form.
 
std::vector< exptr >::const_iterator begin () const
 Returns an iterator pointing to this RE's first subexpression.
 
std::vector< exptr >::const_iterator end () const
 Returns an iterator pointing behind this RE's last subexpression.
 

Static Public Member Functions

static void reset ()
 Resets the symbols used for RE operators to their defaults.
 
static exptr const & spawnEmptySet ()
 Gives an RE representing the empty set ∅.
 
static exptr const & spawnEmptyString ()
 Gives an RE representing the empty string ε.
 
static exptr const & spawnSymbol (std::string const &symbol)
 Gives an RE representing the given UTF-8-encoded symbol.
 
static exptr const & spawnSymbol_ (char32_t u32symbol)
 Gives an RE representing the given UTF-32-encoded symbol.
 
static exptr spawnKleene (exptr const &b, bool optimized=true, bool aggressive=false)
 Gives an RE representing the Kleene closure of a given RE.
 
static exptr spawnConcatenation (exptr const &l, exptr const &r, bool optimized=true, bool aggressive=false)
 Gives an RE representing the concatenation of two given REs.
 
static exptr spawnAlternation (exptr const &l, exptr const &r, bool optimized=true, bool aggressive=false)
 Gives an RE representing the alternation of two given REs.
 
static exptr spawnFromString (std::string const &re, bool optimized=false, bool aggressive=false)
 Gives an RE encoded in a given UTF-8 string.
 
static exptr spawnFromString_ (std::u32string const &u32re, bool optimized=false, bool aggressive=false)
 Gives an RE encoded in a given UTF-32 string.
 

Static Public Attributes

static char32_t L
 The symbol used to represent the Left parenthesis in a regular expression.
 
static char32_t R
 The symbol used to represent the Right parenthesis in a regular expression.
 
static char32_t K
 The symbol used to represent the Kleene star in a regular expression.
 
static char32_t A
 The symbol used to represent the Alternation in a regular expression.
 
static char32_t E
 The symbol used to represent the Empty string in a regular expression.
 
static char32_t N
 The symbol used to represent the Null/empty set in a regular expression.
 

Detailed Description

Represents formal regular expressions.

One should never need to handle such an object directly, much less copy or move it; the copy and move constructors are therefore deleted.

To work with regular expressions, one should use expression::exptr, which aliases a shared_ptr to an actual object and can be copied and moved to one's heart's content. To access member functions, one might dereference exptrs temporarily or, better yet, use the arrow -> operator.
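
The handle pattern described above can be sketched in isolation. This is not the reglibcpp source: Node, Node::Handle, Node::make and labelViaCopy are hypothetical names chosen for illustration, with Handle playing the role of exptr.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a class that is only ever used through handles.
class Node {
 public:
  using Handle = std::shared_ptr<Node const>;  // plays the role of exptr

  // Factory: the only way to obtain a Node is as a shared handle.
  static Handle make(std::string label) {
    return Handle(new Node(std::move(label)));
  }

  // The object itself can be neither copied nor moved.
  Node(Node const&) = delete;
  Node(Node&&) = delete;
  Node& operator=(Node const&) = delete;
  Node& operator=(Node&&) = delete;

  std::string const& label() const { return label_; }

 private:
  explicit Node(std::string label) : label_(std::move(label)) {}
  std::string label_;
};

// Copies a handle and reads through it with the arrow operator.
inline std::string labelViaCopy(Node::Handle const& h) {
  Node::Handle copy = h;  // cheap: only the reference count changes
  return copy->label();   // equivalent to (*copy).label()
}
```

Copying the handle leaves the underlying object untouched, which is why the handles can be passed around freely even though the object's own copy and move operations are deleted.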

See also
expression::exptr

Definition at line 46 of file expression.h.

Member Typedef Documentation

◆ exptr

This is the type used to handle regular expressions.

Every method works on shared_ptrs to the actual regular expressions, to help with basic comparisons and to save memory.

For example, every symbol's (and the empty string's and the empty set's) regular expression is only instantiated once and then pointed to by as many exptrs as one likes.
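
One way to get "instantiated once, pointed to many times" is a cache keyed by code point. The following is a minimal sketch under that assumption; Sym, SymPtr and internSymbol are hypothetical names, and the library's actual bookkeeping may differ.

```cpp
#include <map>
#include <memory>

// Hypothetical symbol node; only the code point matters for this sketch.
struct Sym {
  char32_t value;
};
using SymPtr = std::shared_ptr<Sym const>;

// Returns the one shared node for a code point, creating it on first use.
// Returning a reference is safe because std::map never relocates its elements.
inline SymPtr const& internSymbol(char32_t c) {
  static std::map<char32_t, SymPtr> cache;
  auto it = cache.find(c);
  if (it == cache.end()) {
    it = cache.emplace(c, std::make_shared<Sym>(Sym{c})).first;
  }
  return it->second;
}
```

With such a cache, two handles for the same symbol compare equal by pointer, which is the kind of basic comparison the shared handles make cheap.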

Definition at line 58 of file expression.h.

Member Enumeration Documentation

◆ operation

The different purposes an RE may fulfill.

See also
spawnEmptySet
spawnEmptyString
spawnSymbol
spawnKleene
spawnConcatenation
spawnAlternation

Definition at line 84 of file expression.h.

Member Function Documentation

◆ begin()

vector< expression::exptr >::const_iterator reg::expression::begin ( ) const

Returns an iterator pointing to this RE's first subexpression.

Definition at line 378 of file expression.cpp.

◆ end()

vector< expression::exptr >::const_iterator reg::expression::end ( ) const

Returns an iterator pointing behind this RE's last subexpression.

Definition at line 383 of file expression.cpp.

◆ extractSymbol()

string reg::expression::extractSymbol ( ) const

Reports this symbol expression's UTF-8-encoded symbol.

Returns
the character encoded within this symbol expression, "" for an empty string
Exceptions
std::logic_error  if this expression's purpose is not that of a symbol

Definition at line 329 of file expression.cpp.

◆ extractSymbol_()

char32_t reg::expression::extractSymbol_ ( ) const

Reports this symbol expression's UTF-32-encoded symbol.

Returns
the char32_t encoded within this symbol expression, U'\0' for an empty string
Exceptions
std::logic_error  if this expression's purpose is not that of a symbol

Definition at line 309 of file expression.cpp.

◆ getOperation()

expression::operation reg::expression::getOperation ( ) const

Reports this RE's function.

Note that the empty string's function is technically that of a symbol.

Returns
the expression::operation best describing this RE's purpose

Definition at line 269 of file expression.cpp.

◆ operator nfa const &()

reg::expression::operator nfa const & ( ) const

Returns an NFA accepting the language that this RE describes.

◆ operator!=()

bool reg::expression::operator!= ( nfa const &  other) const

Checks whether this RE describes a different regular language than another object.

Returns
false if this RE's language is exactly the same as the other object's, true otherwise

Definition at line 296 of file expression.cpp.

◆ operator==()

bool reg::expression::operator== ( nfa const &  other) const

Checks whether this RE describes the same regular language as another object.

Returns
true if this RE's language is exactly the same as the other object's, false otherwise

Definition at line 286 of file expression.cpp.

◆ reset()

void reg::expression::reset ( )
static

Resets the symbols used for RE operators to their defaults.

Definition at line 66 of file expression.cpp.
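
As an illustration of what resetting to defaults amounts to, here is a minimal sketch with a hypothetical OperatorSymbols struct. The default values are the initial values listed for the static attributes L, R, K, A, E and N in this reference; everything else is an assumption.

```cpp
// Hypothetical holder for operator symbols, not the library's class.
struct OperatorSymbols {
  static char32_t L, R, K, A, E, N;

  // Restores every symbol to its documented default.
  static void reset() {
    L = U'(';
    R = U')';
    K = U'*';
    A = U'+';
    E = U'\u03B5';  // ε, the empty string
    N = U'\u2205';  // ∅, the empty set
  }
};

// Out-of-class definitions carrying the same defaults.
char32_t OperatorSymbols::L = U'(';
char32_t OperatorSymbols::R = U')';
char32_t OperatorSymbols::K = U'*';
char32_t OperatorSymbols::A = U'+';
char32_t OperatorSymbols::E = U'\u03B5';
char32_t OperatorSymbols::N = U'\u2205';
```

A caller could, for instance, set the alternation symbol to U'|' for one parsing job and call reset() afterwards to return to U'+'.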

◆ size()

size_t reg::expression::size ( ) const

Reports the size of this RE's tree representation.

In this context, an RE's size will be defined recursively as follows:

  • ∅.size() = 1
  • ε.size() = 1
  • <symbol>.size() = 1
  • (l+r).size() = 1 + l.size() + r.size()
  • (lr).size() = 1 + l.size() + r.size()
  • (b*).size() = 1 + b.size()

Returns
a measure of how many subexpressions this RE consists of
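
The recursive definition above can be sketched on a stand-alone tree type. Tree, TreePtr, node and treeSize are hypothetical names for illustration only; the sketch assumes that leaves (∅, ε, symbols) have no children, a Kleene star has one, and alternation and concatenation have two.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical expression tree: a node is fully described by its children.
struct Tree {
  std::vector<std::shared_ptr<Tree const>> children;
};
using TreePtr = std::shared_ptr<Tree const>;

// Builds a node from zero, one, or two subexpressions.
inline TreePtr node(std::vector<TreePtr> children) {
  auto t = std::make_shared<Tree>();
  t->children = std::move(children);
  return t;
}

// One for the node itself plus the sizes of all of its subexpressions.
inline std::size_t treeSize(Tree const& t) {
  std::size_t n = 1;
  for (auto const& child : t.children) n += treeSize(*child);
  return n;
}
```

For (a+b)*, built as node({node({node({}), node({})})}), the rules give 1 + (1 + 1 + 1) = 4.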

Definition at line 255 of file expression.cpp.

◆ spawnAlternation()

expression::exptr reg::expression::spawnAlternation ( expression::exptr const &  l,
expression::exptr const &  r,
bool  optimized = true,
bool  aggressive = false 
)
static

Gives an RE representing the alternation of two given REs.

More formally, the RE's language will be L(l+r) = L(l) ∪ L(r).

Parameters
l           exptr to one of the REs
r           exptr to the other RE
optimized   whether simplifications on the syntax level should be applied
aggressive  whether the simplifications should check the semantic level
Returns
exptr to the RE representing the alternation of l and r

Definition at line 178 of file expression.cpp.
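
To make L(l+r) = L(l) ∪ L(r) concrete, the following sketch models languages as finite sets of strings. Language and unite are hypothetical names; real regular languages may be infinite, so this only illustrates the set operation, not how the library represents anything.

```cpp
#include <set>
#include <string>

using Language = std::set<std::string>;  // a finite language, for illustration only

// L(l+r) = L(l) ∪ L(r): every word of either operand, without duplicates.
inline Language unite(Language const& l, Language const& r) {
  Language result = l;
  result.insert(r.begin(), r.end());
  return result;
}
```

For example, uniting {a, ab} with {ab, b} yields {a, ab, b}.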

◆ spawnConcatenation()

expression::exptr reg::expression::spawnConcatenation ( expression::exptr const &  l,
expression::exptr const &  r,
bool  optimized = true,
bool  aggressive = false 
)
static

Gives an RE representing the concatenation of two given REs.

More formally, the RE's language will be L(lr) = L(l) • L(r).

Parameters
l           exptr to the first RE
r           exptr to the second RE
optimized   whether simplifications on the syntax level should be applied
aggressive  whether the simplifications should check the semantic level
Returns
exptr to the RE representing the concatenation of l and r

Definition at line 135 of file expression.cpp.
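
The language equation L(lr) = L(l) • L(r) can be illustrated on finite sets of strings. Language and concatenate are hypothetical names used only for this sketch; it says nothing about the library's internal representation.

```cpp
#include <set>
#include <string>

using Language = std::set<std::string>;  // a finite language, for illustration only

// L(l) • L(r): every word of l followed by every word of r.
inline Language concatenate(Language const& l, Language const& r) {
  Language result;
  for (auto const& u : l) {
    for (auto const& v : r) result.insert(u + v);
  }
  return result;
}
```

Concatenating {a, b} with {ε, c} yields {a, ac, b, bc}, and concatenating anything with the empty set yields the empty set.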

◆ spawnEmptySet()

expression::exptr const & reg::expression::spawnEmptySet ( )
static

Gives an RE representing the empty set ∅.

More formally, the RE's language will be {}.

Returns
exptr to the RE representing the empty set ∅

Definition at line 80 of file expression.cpp.

◆ spawnEmptyString()

expression::exptr const & reg::expression::spawnEmptyString ( )
static

Gives an RE representing the empty string ε.

More formally, the RE's language will be {ε}.

Returns
exptr to the RE representing the empty string ε

Definition at line 89 of file expression.cpp.

◆ spawnFromString()

expression::exptr reg::expression::spawnFromString ( std::string const &  re,
bool  optimized = false,
bool  aggressive = false 
)
static

Gives an RE encoded in a given UTF-8 string.

This converts re to UTF-32 and calls spawnFromString_(std::u32string const&,bool,bool). If you don't want that overhead and already have a std::u32string on your hands, use that method.
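
The UTF-8 to UTF-32 conversion step mentioned above could look roughly like the following. This is a minimal sketch, not the library's converter: decodeUtf8 is a hypothetical helper that assumes well-formed input and performs no validation.

```cpp
#include <cstddef>
#include <string>

// Decodes UTF-8 into UTF-32 code points. Assumes well-formed input;
// overlong forms, surrogates and truncated sequences are not rejected.
inline std::u32string decodeUtf8(std::string const& in) {
  std::u32string out;
  for (std::size_t i = 0; i < in.size();) {
    unsigned char lead = static_cast<unsigned char>(in[i]);
    // Number of continuation bytes announced by the lead byte.
    std::size_t extra = lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
    // Payload bits of the lead byte: 7, 5, 4 or 3 of them.
    char32_t cp = extra == 0 ? lead : lead & (0x3F >> extra);
    for (std::size_t k = 1; k <= extra && i + k < in.size(); ++k) {
      // Each continuation byte contributes its low six bits.
      cp = (cp << 6) | (static_cast<unsigned char>(in[i + k]) & 0x3F);
    }
    out.push_back(cp);
    i += extra + 1;
  }
  return out;
}
```

The per-call cost of such a pass is linear in the length of the input, which is the overhead a caller avoids by supplying a std::u32string directly.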

Parameters
re          the RE in text form
optimized   whether simplifications on the syntax level should be applied
aggressive  whether the simplifications should check the semantic level
Returns
exptr to the RE represented by the given string
Exceptions
std::invalid_argument  if the re string is malformed

Definition at line 746 of file expression.cpp.

◆ spawnFromString_()

expression::exptr reg::expression::spawnFromString_ ( std::u32string const &  u32re,
bool  optimized = false,
bool  aggressive = false 
)
static

Gives an RE encoded in a given UTF-32 string.

Parameters
u32re       the RE in text form
optimized   whether simplifications on the syntax level should be applied
aggressive  whether the simplifications should check the semantic level
Returns
exptr to the RE represented by the given string
Exceptions
std::invalid_argument  if the u32re string is malformed

Definition at line 720 of file expression.cpp.

◆ spawnKleene()

expression::exptr reg::expression::spawnKleene ( expression::exptr const &  b,
bool  optimized = true,
bool  aggressive = false 
)
static

Gives an RE representing the Kleene closure of a given RE.

More formally, the RE's language will be L(b*) = L(b)*.

Parameters
b           exptr to the RE
optimized   whether simplifications on the syntax level should be applied
aggressive  whether the simplifications should check the semantic level
Returns
exptr to the RE representing the Kleene closure of b

Definition at line 224 of file expression.cpp.
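
Since L(b)* is infinite whenever L(b) contains a non-empty word, it can only be illustrated up to a length bound. The sketch below does that on finite sets of strings; Language and kleeneUpTo are hypothetical names, and the bound maxLen is an assumption of the example, not a library concept.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <utility>

using Language = std::set<std::string>;  // a finite language, for illustration only

// All words of L(b)* that are at most maxLen characters long.
inline Language kleeneUpTo(Language const& b, std::size_t maxLen) {
  Language result{""};    // ε is always part of the closure
  Language frontier{""};  // words found in the previous round
  while (!frontier.empty()) {
    Language next;
    for (auto const& u : frontier) {
      for (auto const& v : b) {
        std::string w = u + v;
        // Keep only new words within the bound; they seed the next round.
        if (w.size() <= maxLen && result.insert(w).second) next.insert(w);
      }
    }
    frontier = std::move(next);
  }
  return result;
}
```

For L(b) = {ab} and a bound of 4, the result is {ε, ab, abab}; for the empty set the closure is {ε}.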

◆ spawnSymbol()

expression::exptr const & reg::expression::spawnSymbol ( std::string const &  symbol)
static

Gives an RE representing the given UTF-8-encoded symbol.

This converts symbol to UTF-32 and calls spawnSymbol_(char32_t) with the std::u32string's first char32_t. If you don't want that overhead and already have a char32_t on your hands, use that method.

More formally, the RE's language will be {<symbol>}.

Parameters
symbol  the symbol the RE should represent or "" for the empty string ε
Returns
exptr to the RE representing the symbol

Definition at line 116 of file expression.cpp.

◆ spawnSymbol_()

expression::exptr const & reg::expression::spawnSymbol_ ( char32_t  symbol)
static

Gives an RE representing the given UTF-32-encoded symbol.

More formally, the RE's language will be {<symbol>}.

Parameters
symbol  the symbol the RE should represent or U'\0' for the empty string ε
Returns
exptr to the RE representing the symbol

Definition at line 100 of file expression.cpp.

◆ to_string()

string reg::expression::to_string ( ) const

Describes this RE in UTF-8-encoded human-readable form.

Definition at line 373 of file expression.cpp.

◆ to_u32string()

u32string reg::expression::to_u32string ( ) const

Describes this RE in UTF-32-encoded human-readable form.

Definition at line 335 of file expression.cpp.

Member Data Documentation

◆ A

char32_t reg::expression::A
static
Initial value: U'+'

The symbol used to represent the Alternation in a regular expression.

Definition at line 59 of file expression.h.

◆ E

char32_t reg::expression::E
static
Initial value: U'ε'

The symbol used to represent the Empty string in a regular expression.

Definition at line 59 of file expression.h.

◆ K

char32_t reg::expression::K
static
Initial value: U'*'

The symbol used to represent the Kleene star in a regular expression.

Definition at line 59 of file expression.h.

◆ L

char32_t reg::expression::L
static
Initial value: U'('

The symbol used to represent the Left parenthesis in a regular expression.

Definition at line 59 of file expression.h.

◆ N

char32_t reg::expression::N
static
Initial value: U'∅'

The symbol used to represent the Null/empty set in a regular expression.

Definition at line 59 of file expression.h.

◆ R

char32_t reg::expression::R
static
Initial value: U')'

The symbol used to represent the Right parenthesis in a regular expression.

Definition at line 59 of file expression.h.


The documentation for this class was generated from the following files: