<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
|
||
<html>
|
||
<head>
|
||
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
|
||
<title>User's Guide</title>
|
||
<link rel="stylesheet" href="../../../doc/src/boostbook.css" type="text/css">
|
||
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
|
||
<link rel="home" href="../index.html" title="The Boost C++ Libraries BoostBook Documentation Subset">
|
||
<link rel="up" href="../xpressive.html" title="Chapter 47. Boost.Xpressive">
|
||
<link rel="prev" href="../xpressive.html" title="Chapter 47. Boost.Xpressive">
|
||
<link rel="next" href="reference.html" title="Reference">
|
||
</head>
|
||
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
|
||
<table cellpadding="2" width="100%"><tr>
|
||
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../boost.png"></td>
|
||
<td align="center"><a href="../../../index.html">Home</a></td>
|
||
<td align="center"><a href="../../../libs/libraries.htm">Libraries</a></td>
|
||
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
|
||
<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
|
||
<td align="center"><a href="../../../more/index.htm">More</a></td>
|
||
</tr></table>
|
||
<hr>
|
||
<div class="spirit-nav">
|
||
<a accesskey="p" href="../xpressive.html"><img src="../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../xpressive.html"><img src="../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="reference.html"><img src="../../../doc/src/images/next.png" alt="Next"></a>
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
|
||
<a name="xpressive.user_s_guide"></a><a class="link" href="user_s_guide.html" title="User's Guide">User's Guide</a>
|
||
</h2></div></div></div>
|
||
<div class="toc"><dl class="toc">
|
||
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.introduction">Introduction</a></span></dt>
|
||
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive">Installing
|
||
xpressive</a></span></dt>
|
||
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start">Quick Start</a></span></dt>
|
||
<dt><span class="section"><a href="user_s_guide.html#xpressive.user_s_guide.creating_a_regex_object">Creating
|
||
a Regex Object</a></span></dt>
|
||
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching">Matching
|
||
and Searching</a></span></dt>
|
||
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results">Accessing
|
||
Results</a></span></dt>
|
||
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions">String
|
||
Substitutions</a></span></dt>
|
||
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization">String
|
||
Splitting and Tokenization</a></span></dt>
|
||
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures">Named Captures</a></span></dt>
|
||
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches">Grammars
|
||
and Nested Matches</a></span></dt>
|
||
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions">Semantic
|
||
Actions and User-Defined Assertions</a></span></dt>
|
||
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes">Symbol
|
||
Tables and Attributes</a></span></dt>
|
||
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits">Localization
|
||
and Regex Traits</a></span></dt>
|
||
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks">Tips 'N Tricks</a></span></dt>
|
||
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.concepts">Concepts</a></span></dt>
|
||
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.examples">Examples</a></span></dt>
|
||
</dl></div>
|
||
<p>
|
||
This section describes how to use xpressive to accomplish text manipulation
|
||
and parsing tasks. If you are looking for detailed information regarding specific
|
||
components in xpressive, check the <a class="link" href="reference.html" title="Reference">Reference</a>
|
||
section.
|
||
</p>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="boost_xpressive.user_s_guide.introduction"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.introduction" title="Introduction">Introduction</a>
|
||
</h3></div></div></div>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.introduction.h0"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.introduction.what_is_xpressive_"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.introduction.what_is_xpressive_">What
|
||
is xpressive?</a>
|
||
</h3>
|
||
<p>
|
||
xpressive is a regular expression template library. Regular expressions (regexes)
|
||
can be written as strings that are parsed dynamically at runtime (dynamic
|
||
regexes), or as <span class="emphasis"><em>expression templates</em></span><a href="#ftn.boost_xpressive.user_s_guide.introduction.f0" class="footnote" name="boost_xpressive.user_s_guide.introduction.f0"><sup class="footnote">[36]</sup></a> that are parsed at compile-time (static regexes). Dynamic regexes
|
||
have the advantage that they can be accepted from the user as input at runtime
|
||
or read from an initialization file. Static regexes have several advantages.
|
||
Since they are C++ expressions instead of strings, they can be syntax-checked
|
||
at compile-time. Also, they can naturally refer to code and data elsewhere
|
||
in your program, giving you the ability to call back into your code from
|
||
within a regex match. Finally, since they are statically bound, the compiler
|
||
can generate faster code for static regexes.
|
||
</p>
|
||
<p>
|
||
xpressive's dual nature is unique and powerful. Static xpressive is a bit
|
||
like the <a href="http://spirit.sourceforge.net" target="_top">Spirit Parser Framework</a>.
|
||
As with <a href="http://spirit.sourceforge.net" target="_top">Spirit</a>, you can build
|
||
grammars with static regexes using expression templates. (Unlike <a href="http://spirit.sourceforge.net" target="_top">Spirit</a>,
|
||
xpressive does exhaustive backtracking, trying every possibility to find
|
||
a match for your pattern.) Dynamic xpressive is a bit like <a href="../../../libs/regex" target="_top">Boost.Regex</a>.
|
||
In fact, xpressive's interface should be familiar to anyone who has used
|
||
<a href="../../../libs/regex" target="_top">Boost.Regex</a>. xpressive's innovation
|
||
comes from allowing you to mix and match static and dynamic regexes in the
|
||
same program, and even in the same expression! You can embed a dynamic regex
|
||
in a static regex, or <span class="emphasis"><em>vice versa</em></span>, and the embedded regex
|
||
will participate fully in the search, back-tracking as needed to make the
|
||
match succeed.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.introduction.h1"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.introduction.hello__world_"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.introduction.hello__world_">Hello,
|
||
world!</a>
|
||
</h3>
|
||
<p>
|
||
Enough theory. Let's have a look at <span class="emphasis"><em>Hello World</em></span>, xpressive
|
||
style:
|
||
</p>
|
||
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>

<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">hello</span><span class="special">(</span> <span class="string">"hello world!"</span> <span class="special">);</span>

    <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\w+) (\\w+)!"</span> <span class="special">);</span>
    <span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>

    <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">hello</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rex</span> <span class="special">)</span> <span class="special">)</span>
    <span class="special">{</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// whole match</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// first capture</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">2</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// second capture</span>
    <span class="special">}</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
|
||
<p>
|
||
This program outputs the following:
|
||
</p>
|
||
<pre class="programlisting">hello world!
hello
world
</pre>
|
||
<p>
|
||
The first thing you'll notice about the code is that all the types in xpressive
|
||
live in the <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span></code> namespace.
|
||
</p>
|
||
<div class="note"><table border="0" summary="Note">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
|
||
<th align="left">Note</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
Most of the rest of the examples in this document will leave off the <code class="computeroutput"><span class="keyword">using</span> <span class="keyword">namespace</span>
|
||
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span></code>
|
||
directive. Just pretend it's there.
|
||
</p></td></tr>
|
||
</table></div>
|
||
<p>
|
||
Next, you'll notice the type of the regular expression object is <code class="computeroutput"><span class="identifier">sregex</span></code>. If you are familiar with <a href="../../../libs/regex" target="_top">Boost.Regex</a>, this is different from what you
|
||
are used to. The "<code class="computeroutput"><span class="identifier">s</span></code>"
|
||
in "<code class="computeroutput"><span class="identifier">sregex</span></code>" stands
|
||
for "<code class="computeroutput"><span class="identifier">string</span></code>", indicating
|
||
that this regex can be used to find patterns in <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>
|
||
objects. I'll discuss this difference and its implications in detail later.
|
||
</p>
|
||
<p>
|
||
Notice how the regex object is initialized:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\w+) (\\w+)!"</span> <span class="special">);</span>
</pre>
|
||
<p>
|
||
To create a regular expression object from a string, you must call a factory
|
||
method such as <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html#id-1_3_48_5_18_2_1_1_9_1-bb">basic_regex<>::compile()</a></code></code>.
|
||
This is another area in which xpressive differs from other object-oriented
|
||
regular expression libraries. Other libraries encourage you to think of a
|
||
regular expression as a kind of string on steroids. In xpressive, regular
|
||
expressions are not strings; they are little programs in a domain-specific
|
||
language. Strings are only one <span class="emphasis"><em>representation</em></span> of that
|
||
language. Another representation is an expression template. For example,
|
||
the above line of code is equivalent to the following:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="char">' '</span> <span class="special">>></span> <span class="special">(</span><span class="identifier">s2</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="char">'!'</span><span class="special">;</span>
</pre>
|
||
<p>
|
||
This describes the same regular expression, except it uses the domain-specific
|
||
embedded language defined by static xpressive.
|
||
</p>
|
||
<p>
|
||
As you can see, static regexes have a syntax that is noticeably different
from standard Perl syntax. That is because we are constrained by C++'s syntax.
|
||
The biggest difference is the use of <code class="computeroutput"><span class="special">>></span></code>
|
||
to mean "followed by". For instance, in Perl you can just put sub-expressions
|
||
next to each other:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">abc</span>
</pre>
|
||
<p>
|
||
But in C++, there must be an operator separating sub-expressions:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">a</span> <span class="special">>></span> <span class="identifier">b</span> <span class="special">>></span> <span class="identifier">c</span>
</pre>
|
||
<p>
|
||
In Perl, parentheses <code class="computeroutput"><span class="special">()</span></code> have
|
||
special meaning. They group, but as a side-effect they also create back-references
|
||
like <code class="literal">$1</code> and <code class="literal">$2</code>. In C++, there is no
|
||
way to overload parentheses to give them side-effects. To get the same effect,
|
||
we use the special <code class="computeroutput"><span class="identifier">s1</span></code>, <code class="computeroutput"><span class="identifier">s2</span></code>, etc. tokens. Assign to one to create
|
||
a back-reference (known as a sub-match in xpressive).
|
||
</p>
|
||
<p>
|
||
You'll also notice that the one-or-more repetition operator <code class="computeroutput"><span class="special">+</span></code> has moved from postfix to prefix position.
|
||
That's because C++ doesn't have a postfix <code class="computeroutput"><span class="special">+</span></code>
|
||
operator. So:
|
||
</p>
|
||
<pre class="programlisting"><span class="string">"\\w+"</span>
</pre>
|
||
<p>
|
||
is the same as:
|
||
</p>
|
||
<pre class="programlisting"><span class="special">+</span><span class="identifier">_w</span>
</pre>
|
||
<p>
|
||
We'll cover all the other differences <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes" title="Static Regexes">later</a>.
|
||
</p>
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="boost_xpressive.user_s_guide.installing_xpressive"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive" title="Installing xpressive">Installing
|
||
xpressive</a>
|
||
</h3></div></div></div>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.installing_xpressive.h0"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.installing_xpressive.getting_xpressive"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive.getting_xpressive">Getting
|
||
xpressive</a>
|
||
</h3>
|
||
<p>
|
||
There are two ways to get xpressive. The first and simplest is to download
|
||
the latest version of Boost. Just go to <a href="http://sf.net/projects/boost" target="_top">http://sf.net/projects/boost</a>
|
||
and follow the <span class="quote">“<span class="quote">Download</span>”</span> link.
|
||
</p>
|
||
<p>
|
||
The second way is by directly accessing the Boost Subversion repository.
|
||
Just go to <a href="http://svn.boost.org/trac/boost/" target="_top">http://svn.boost.org/trac/boost/</a>
|
||
and follow the instructions there for anonymous Subversion access. The version
|
||
in Boost Subversion is unstable.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.installing_xpressive.h1"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.installing_xpressive.building_with_xpressive"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive.building_with_xpressive">Building
|
||
with xpressive</a>
|
||
</h3>
|
||
<p>
|
||
Xpressive is a header-only template library, which means you don't need to
|
||
alter your build scripts or link to any separate lib file to use it. All
|
||
you need to do is <code class="computeroutput"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span></code>.
|
||
If you are using only static regexes, you can improve compile times by including
only <code class="computeroutput"><span class="identifier">xpressive_static</span><span class="special">.</span><span class="identifier">hpp</span></code>. Likewise,
|
||
you can include <code class="computeroutput"><span class="identifier">xpressive_dynamic</span><span class="special">.</span><span class="identifier">hpp</span></code> if
|
||
you only plan on using dynamic regexes.
|
||
</p>
|
||
<p>
|
||
If you would also like to use semantic actions or custom assertions with
|
||
your static regexes, you will need to additionally include <code class="computeroutput"><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span></code>.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.installing_xpressive.h2"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.installing_xpressive.requirements"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive.requirements">Requirements</a>
|
||
</h3>
|
||
<p>
|
||
Xpressive requires Boost version 1.34.1 or higher.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.installing_xpressive.h3"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.installing_xpressive.supported_compilers"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.installing_xpressive.supported_compilers">Supported
|
||
Compilers</a>
|
||
</h3>
|
||
<p>
|
||
Currently, Boost.Xpressive is known to work on the following compilers:
|
||
</p>
|
||
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
|
||
<li class="listitem">
|
||
Visual C++ 7.1 and higher
|
||
</li>
|
||
<li class="listitem">
|
||
GNU C++ 3.4 and higher
|
||
</li>
|
||
<li class="listitem">
|
||
Intel for Linux 8.1 and higher
|
||
</li>
|
||
<li class="listitem">
|
||
Intel for Windows 10 and higher
|
||
</li>
|
||
<li class="listitem">
|
||
tru64cxx 71 and higher
|
||
</li>
|
||
<li class="listitem">
|
||
MinGW 3.4 and higher
|
||
</li>
|
||
<li class="listitem">
|
||
HP C/aC++ A.06.14
|
||
</li>
|
||
</ul></div>
|
||
<p>
|
||
Check the latest test results at Boost's <a href="http://beta.boost.org/development/tests/trunk/developer/xpressive.html" target="_top">Regression
|
||
Results Page</a>.
|
||
</p>
|
||
<div class="note"><table border="0" summary="Note">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
|
||
<th align="left">Note</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
Please send any questions, comments and bug reports to eric &lt;at&gt;
boost-consulting &lt;dot&gt; com.
|
||
</p></td></tr>
|
||
</table></div>
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="boost_xpressive.user_s_guide.quick_start"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start" title="Quick Start">Quick Start</a>
|
||
</h3></div></div></div>
|
||
<p>
|
||
You don't need to know much to start being productive with xpressive. Let's
|
||
begin with the nickel tour of the types and algorithms xpressive provides.
|
||
</p>
|
||
<div class="table">
|
||
<a name="boost_xpressive.user_s_guide.quick_start.t0"></a><p class="title"><b>Table 47.1. xpressive's Tool-Box</b></p>
|
||
<div class="table-contents"><table class="table" summary="xpressive's Tool-Box">
|
||
<colgroup>
|
||
<col>
|
||
<col>
|
||
</colgroup>
|
||
<thead><tr>
|
||
<th>
|
||
<p>
|
||
Tool
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
Description
|
||
</p>
|
||
</th>
|
||
</tr></thead>
|
||
<tbody>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Contains a compiled regular expression. <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
|
||
is the most important type in xpressive. Everything you do with
|
||
xpressive will begin with creating an object of type <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>,
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
contains the results of a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
|
||
operation. It acts like a vector of <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
|
||
objects. A <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
|
||
object contains a marked sub-expression (also known as a back-reference
|
||
in Perl). It is basically just a pair of iterators representing
|
||
the begin and end of the marked sub-expression.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Checks to see if a string matches a regex. For <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
to succeed, the <span class="emphasis"><em>whole string</em></span> must match the
|
||
regex, from beginning to end. If you give <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>,
|
||
it will write into it any marked sub-expressions it finds.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Searches a string to find a sub-string that matches the regex.
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
|
||
will try to find a match at every position in the string, starting
|
||
at the beginning, and stopping when it finds a match or when the
|
||
string is exhausted. As with <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>,
|
||
if you give <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
|
||
a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>,
|
||
it will write into it any marked sub-expressions it finds.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Given an input string, a regex, and a substitution string, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
|
||
builds a new string by replacing those parts of the input string
|
||
that match the regex with the substitution string. The substitution
|
||
string can contain references to marked sub-expressions.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
An STL-compatible iterator that makes it easy to find all the places
|
||
in a string that match a regex. Dereferencing a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>
|
||
returns a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>.
|
||
Incrementing a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>
|
||
finds the next match.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Like <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>,
|
||
except dereferencing a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
|
||
returns a string. By default, it will return the whole sub-string
|
||
that the regex matched, but it can be configured to return any
|
||
or all of the marked sub-expressions one at a time, or even the
|
||
parts of the string that <span class="emphasis"><em>didn't</em></span> match the
|
||
regex.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
A factory for <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>
objects. It "compiles" a string into a regular expression.
You will not usually have to deal directly with <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>
because the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>
class has a factory method that uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>
internally. But if you need to do anything fancy like create a
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>
object with a different <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>,
you will need to use a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler&lt;&gt;</a></code></code>
explicitly.
</p>
|
||
</td>
|
||
</tr>
|
||
</tbody>
|
||
</table></div>
|
||
</div>
|
||
<br class="table-break"><p>
Now that you know a bit about the tools xpressive provides, you can pick
the right tool for you by answering the following two questions:
</p>
<div class="orderedlist"><ol class="orderedlist" type="1">
<li class="listitem">
What <span class="emphasis"><em>iterator</em></span> type will you use to traverse your
data?
</li>
<li class="listitem">
What do you want to <span class="emphasis"><em>do</em></span> to your data?
</li>
</ol></div>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.quick_start.h0"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.quick_start.know_your_iterator_type"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start.know_your_iterator_type">Know
|
||
Your Iterator Type</a>
|
||
</h3>
|
||
<p>
Most of the classes in xpressive are templates that are parameterized on
the iterator type. xpressive defines some common typedefs to make the job
of choosing the right types easier. You can use the table below to find the
right types based on the type of your iterator.
</p>
|
||
<div class="table">
|
||
<a name="boost_xpressive.user_s_guide.quick_start.t1"></a><p class="title"><b>Table 47.2. xpressive Typedefs vs. Iterator Types</b></p>
|
||
<div class="table-contents"><table class="table" summary="xpressive Typedefs vs. Iterator Types">
|
||
<colgroup>
|
||
<col>
|
||
<col>
|
||
<col>
|
||
<col>
|
||
<col>
|
||
</colgroup>
|
||
<thead><tr>
|
||
<th>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
std::string::const_iterator
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
char const *
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
std::wstring::const_iterator
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
wchar_t const *
|
||
</p>
|
||
</th>
|
||
</tr></thead>
|
||
<tbody>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">sregex</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">cregex</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">wsregex</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">wcregex</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">smatch</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">cmatch</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">wsmatch</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">wcmatch</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">sregex_compiler</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">cregex_compiler</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">wsregex_compiler</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">wcregex_compiler</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">sregex_iterator</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">cregex_iterator</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">wsregex_iterator</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">wcregex_iterator</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">sregex_token_iterator</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">cregex_token_iterator</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">wsregex_token_iterator</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">wcregex_token_iterator</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
</tbody>
|
||
</table></div>
|
||
</div>
|
||
<br class="table-break"><p>
You should notice the systematic naming convention. Many of these types are
used together, so the naming convention helps you to use them consistently.
For instance, if you have a <code class="computeroutput"><span class="identifier">sregex</span></code>,
you should also be using a <code class="computeroutput"><span class="identifier">smatch</span></code>.
</p>
<p>
If you are not using one of those four iterator types, then you can use the
templates directly and specify your iterator type.
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.quick_start.h1"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.quick_start.know_your_task"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start.know_your_task">Know Your
|
||
Task</a>
|
||
</h3>
|
||
<p>
Do you want to find a pattern once? Many times? Search and replace? xpressive
has tools for all that and more. Below is a quick reference:
</p>
|
||
<div class="table">
|
||
<a name="boost_xpressive.user_s_guide.quick_start.t2"></a><p class="title"><b>Table 47.3. Tasks and Tools</b></p>
|
||
<div class="table-contents"><table class="table" summary="Tasks and Tools">
|
||
<colgroup>
|
||
<col>
|
||
<col>
|
||
</colgroup>
|
||
<thead><tr>
|
||
<th>
|
||
<p>
|
||
To do this ...
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
Use this ...
|
||
</p>
|
||
</th>
|
||
</tr></thead>
|
||
<tbody>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_whole_string_matches_a_regex">See
|
||
if a whole string matches a regex</a>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
algorithm
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_string_contains_a_sub_string_that_matches_a_regex">See
|
||
if a string contains a sub-string that matches a regex</a>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
|
||
algorithm
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.replace_all_sub_strings_that_match_a_regex">Replace
|
||
all sub-strings that match a regex</a>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
|
||
algorithm
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.find_all_the_sub_strings_that_match_a_regex_and_step_through_them_one_at_a_time">Find
|
||
all the sub-strings that match a regex and step through them one
|
||
at a time</a>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>
|
||
class
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.split_a_string_into_tokens_that_each_match_a_regex">Split
|
||
a string into tokens that each match a regex</a>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
|
||
class
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<span class="inlinemediaobject"><img src="../images/tip.png" alt="tip"></span> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.split_a_string_using_a_regex_as_a_delimiter">Split
|
||
a string using a regex as a delimiter</a>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
|
||
class
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
</tbody>
|
||
</table></div>
|
||
</div>
|
||
<br class="table-break"><p>
These algorithms and classes are described in excruciating detail in the
Reference section.
</p>
<div class="tip"><table border="0" summary="Tip">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td>
<th align="left">Tip</th>
</tr>
<tr><td align="left" valign="top"><p>
Try clicking on a task in the table above to see a complete example program
that uses xpressive to solve that particular task.
</p></td></tr>
</table></div>
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="xpressive.user_s_guide.creating_a_regex_object"></a><a class="link" href="user_s_guide.html#xpressive.user_s_guide.creating_a_regex_object" title="Creating a Regex Object">Creating
|
||
a Regex Object</a>
|
||
</h3></div></div></div>
|
||
<div class="toc"><dl class="toc">
|
||
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes">Static
|
||
Regexes</a></span></dt>
|
||
<dt><span class="section"><a href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes">Dynamic
|
||
Regexes</a></span></dt>
|
||
</dl></div>
|
||
<p>
When using xpressive, the first thing you'll do is create a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>
object. This section goes over the nuts and bolts of building a regular expression
in the two dialects xpressive supports: static and dynamic.
</p>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h4 class="title">
|
||
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes" title="Static Regexes">Static
|
||
Regexes</a>
|
||
</h4></div></div></div>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h0"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.overview">Overview</a>
|
||
</h3>
|
||
<p>
The feature that really sets xpressive apart from other C/C++ regular expression
libraries is the ability to author a regular expression using C++ expressions.
xpressive achieves this through operator overloading, using a technique
called <span class="emphasis"><em>expression templates</em></span> to embed a mini-language
dedicated to pattern matching within C++. These "static regexes"
have many advantages over their string-based brethren. In particular, static
regexes:
</p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
are syntax-checked at compile-time; they will never fail at run-time
due to a syntax error.
</li>
<li class="listitem">
can naturally refer to other C++ data and code, including other regexes,
making it simple to build grammars out of regular expressions and bind
user-defined actions that execute when parts of your regex match.
</li>
<li class="listitem">
are statically bound for better inlining and optimization. Static regexes
require no state tables, virtual functions, byte-code or calls through
function pointers that cannot be resolved at compile time.
</li>
<li class="listitem">
are not limited to searching for patterns in strings. You can declare
a static regex that finds patterns in an array of integers, for instance.
</li>
</ul></div>
|
||
<p>
Since we compose static regexes using C++ expressions, we are constrained
by the rules for legal C++ expressions. Unfortunately, that means that
"classic" regular expression syntax cannot always be mapped cleanly
into C++. Rather, we map the regex <span class="emphasis"><em>constructs</em></span>, picking
new syntax that is legal C++.
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h1"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.construction_and_assignment"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.construction_and_assignment">Construction
|
||
and Assignment</a>
|
||
</h3>
|
||
<p>
You create a static regex by assigning one to an object of type <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex&lt;&gt;</a></code></code>.
For instance, the following defines a regex that can be used to find patterns
in objects of type <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="char">'$'</span> <span class="special">&gt;&gt;</span> <span class="special">+</span><span class="identifier">_d</span> <span class="special">&gt;&gt;</span> <span class="char">'.'</span> <span class="special">&gt;&gt;</span> <span class="identifier">_d</span> <span class="special">&gt;&gt;</span> <span class="identifier">_d</span><span class="special">;</span>
</pre>
<p>
Assignment works similarly.
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h2"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.character_and_string_literals"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.character_and_string_literals">Character
|
||
and String Literals</a>
|
||
</h3>
|
||
<p>
In static regexes, character and string literals match themselves. For
instance, in the regex above, <code class="computeroutput"><span class="char">'$'</span></code>
and <code class="computeroutput"><span class="char">'.'</span></code> match the characters
<code class="computeroutput"><span class="char">'$'</span></code> and <code class="computeroutput"><span class="char">'.'</span></code>
respectively. Don't be confused by the fact that <code class="literal">$</code> and
<code class="literal">.</code> are meta-characters in Perl. In xpressive, literals
always represent themselves.
</p>
<p>
When using literals in static regexes, you must take care that at least
one operand is not a literal. For instance, the following are <span class="emphasis"><em>not</em></span>
valid regexes:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re1</span> <span class="special">=</span> <span class="char">'a'</span> <span class="special">&gt;&gt;</span> <span class="char">'b'</span><span class="special">;</span> <span class="comment">// ERROR!</span>
<span class="identifier">sregex</span> <span class="identifier">re2</span> <span class="special">=</span> <span class="special">+</span><span class="char">'a'</span><span class="special">;</span> <span class="comment">// ERROR!</span>
</pre>
<p>
The two operands to the binary <code class="computeroutput"><span class="special">&gt;&gt;</span></code>
operator are both literals, and the operand of the unary <code class="computeroutput"><span class="special">+</span></code> operator is also a literal, so these statements
will call the native C++ binary right-shift and unary plus operators, respectively.
That's not what we want. To get operator overloading to kick in, at least
one operand must be a user-defined type. We can use xpressive's <code class="computeroutput"><span class="identifier">as_xpr</span><span class="special">()</span></code>
helper function to "taint" an expression with regex-ness, forcing
operator overloading to find the correct operators. The two regexes above
should be written as:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re1</span> <span class="special">=</span> <span class="identifier">as_xpr</span><span class="special">(</span><span class="char">'a'</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">'b'</span><span class="special">;</span> <span class="comment">// OK</span>
<span class="identifier">sregex</span> <span class="identifier">re2</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">as_xpr</span><span class="special">(</span><span class="char">'a'</span><span class="special">);</span> <span class="comment">// OK</span>
</pre>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h3"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.sequencing_and_alternation"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.sequencing_and_alternation">Sequencing
|
||
and Alternation</a>
|
||
</h3>
|
||
<p>
As you've probably already noticed, sub-expressions in static regexes must
be separated by the sequencing operator, <code class="computeroutput"><span class="special">&gt;&gt;</span></code>.
You can read this operator as "followed by".
</p>
<pre class="programlisting"><span class="comment">// Match an 'a' followed by a digit</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="char">'a'</span> <span class="special">&gt;&gt;</span> <span class="identifier">_d</span><span class="special">;</span>
</pre>
<p>
Alternation works just as it does in Perl with the <code class="computeroutput"><span class="special">|</span></code>
operator. You can read this operator as "or". For example:
</p>
<pre class="programlisting"><span class="comment">// match a digit character or a word character one or more times</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">|</span> <span class="identifier">_w</span> <span class="special">);</span>
</pre>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h4"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.grouping_and_captures"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.grouping_and_captures">Grouping
|
||
and Captures</a>
|
||
</h3>
|
||
<p>
In Perl, parentheses <code class="computeroutput"><span class="special">()</span></code> have
special meaning. They group, but as a side-effect they also create back-references
like <code class="literal">$1</code> and <code class="literal">$2</code>. In C++, parentheses
only group -- there is no way to give them side-effects. To get the same
effect, we use the special <code class="computeroutput"><span class="identifier">s1</span></code>,
<code class="computeroutput"><span class="identifier">s2</span></code>, etc. tokens. Assigning
to one creates a back-reference. You can then use the back-reference later
in your expression, like using <code class="literal">\1</code> and <code class="literal">\2</code>
in Perl. For example, consider the following regex, which finds matching
HTML tags:
</p>
<pre class="programlisting"><span class="string">"&lt;(\\w+)&gt;.*?&lt;/\\1&gt;"</span>
</pre>
<p>
In static xpressive, this would be:
</p>
<pre class="programlisting"><span class="char">'&lt;'</span> <span class="special">&gt;&gt;</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">'&gt;'</span> <span class="special">&gt;&gt;</span> <span class="special">-*</span><span class="identifier">_</span> <span class="special">&gt;&gt;</span> <span class="string">"&lt;/"</span> <span class="special">&gt;&gt;</span> <span class="identifier">s1</span> <span class="special">&gt;&gt;</span> <span class="char">'&gt;'</span>
</pre>
<p>
Notice how you capture a back-reference by assigning to <code class="computeroutput"><span class="identifier">s1</span></code>,
and then you use <code class="computeroutput"><span class="identifier">s1</span></code> later
in the pattern to find the matching end tag.
</p>
|
||
<div class="tip"><table border="0" summary="Tip">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td>
<th align="left">Tip</th>
</tr>
<tr><td align="left" valign="top"><p>
<span class="bold"><strong>Grouping without capturing a back-reference</strong></span>
<br> <br> In xpressive, if you just want grouping without capturing
a back-reference, you can just use <code class="computeroutput"><span class="special">()</span></code>
without <code class="computeroutput"><span class="identifier">s1</span></code>. That is the
equivalent of Perl's <code class="literal">(?:)</code> non-capturing grouping construct.
</p></td></tr>
</table></div>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h5"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.case_insensitivity_and_internationalization"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.case_insensitivity_and_internationalization">Case-Insensitivity
|
||
and Internationalization</a>
|
||
</h3>
|
||
<p>
Perl lets you make part of your regular expression case-insensitive by
using the <code class="literal">(?i:)</code> pattern modifier. xpressive also has
a case-insensitivity pattern modifier, called <code class="computeroutput"><span class="identifier">icase</span></code>.
You can use it as follows:
</p>
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="string">"this"</span> <span class="special">&gt;&gt;</span> <span class="identifier">icase</span><span class="special">(</span> <span class="string">"that"</span> <span class="special">);</span>
</pre>
<p>
In this regular expression, <code class="computeroutput"><span class="string">"this"</span></code>
will be matched exactly, but <code class="computeroutput"><span class="string">"that"</span></code>
will be matched irrespective of case.
</p>
|
||
<p>
Case-insensitive regular expressions raise the issue of internationalization:
how should case-insensitive character comparisons be evaluated? Also, many
character classes are locale-specific. Which characters are matched by
<code class="computeroutput"><span class="identifier">digit</span></code> and which are matched
by <code class="computeroutput"><span class="identifier">alpha</span></code>? The answer depends
on the <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code> object the regular expression
object is using. By default, all regular expression objects use the global
locale. You can override the default by using the <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code> pattern modifier, as follows:
</p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span> <span class="identifier">my_locale</span> <span class="special">=</span> <span class="comment">/* initialize a std::locale object */</span><span class="special">;</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span> <span class="identifier">my_locale</span> <span class="special">)(</span> <span class="special">+</span><span class="identifier">alpha</span> <span class="special">&gt;&gt;</span> <span class="special">+</span><span class="identifier">digit</span> <span class="special">);</span>
</pre>
<p>
This regular expression will evaluate <code class="computeroutput"><span class="identifier">alpha</span></code>
and <code class="computeroutput"><span class="identifier">digit</span></code> according to
<code class="computeroutput"><span class="identifier">my_locale</span></code>. See the section
on <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits" title="Localization and Regex Traits">Localization
and Regex Traits</a> for more information about how to customize the
behavior of your regexes.
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.h6"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.static_xpressive_syntax_cheat_sheet"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.static_xpressive_syntax_cheat_sheet">Static
|
||
xpressive Syntax Cheat Sheet</a>
|
||
</h3>
|
||
<p>
The table below lists the familiar regex constructs and their equivalents
in static xpressive.
</p>
|
||
<div class="table">
|
||
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.static_regexes.t0"></a><p class="title"><b>Table 47.4. Perl syntax vs. Static xpressive syntax</b></p>
|
||
<div class="table-contents"><table class="table" summary="Perl syntax vs. Static xpressive syntax">
|
||
<colgroup>
|
||
<col>
|
||
<col>
|
||
<col>
|
||
</colgroup>
|
||
<thead><tr>
|
||
<th>
|
||
<p>
|
||
Perl
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
Static xpressive
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
Meaning
|
||
</p>
|
||
</th>
|
||
</tr></thead>
|
||
<tbody>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">.</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/_.html" title="Global _">_</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
any character (assuming Perl's /s modifier).
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">ab</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">a</span> <span class="special">>></span>
|
||
<span class="identifier">b</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
sequencing of <code class="literal">a</code> and <code class="literal">b</code> sub-expressions.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">a|b</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">a</span> <span class="special">|</span>
|
||
<span class="identifier">b</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
alternation of <code class="literal">a</code> and <code class="literal">b</code>
|
||
sub-expressions.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">(a)</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">(</span><a class="link" href="../boost/xpressive/s1.html" title="Global s1">s1</a><span class="special">=</span> <span class="identifier">a</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
group and capture a back-reference.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">(?:a)</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">(</span><span class="identifier">a</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
group and do not capture a back-reference.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\1</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/s1.html" title="Global s1">s1</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
a previously captured back-reference.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">a*</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">*</span><span class="identifier">a</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
zero or more times, greedy.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">a+</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">+</span><span class="identifier">a</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
one or more times, greedy.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">a?</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">!</span><span class="identifier">a</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
zero or one time, greedy.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">a{n,m}</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/repeat.html" title="Function repeat">repeat</a><span class="special"><</span><span class="identifier">n</span><span class="special">,</span><span class="identifier">m</span><span class="special">>(</span><span class="identifier">a</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
between <code class="literal">n</code> and <code class="literal">m</code> times,
|
||
greedy.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">a*?</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">-*</span><span class="identifier">a</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
zero or more times, non-greedy.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">a+?</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">-+</span><span class="identifier">a</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
one or more times, non-greedy.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">a??</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">-!</span><span class="identifier">a</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
zero or one time, non-greedy.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">a{n,m}?</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">-</span><a class="link" href="../boost/xpressive/repeat.html" title="Function repeat">repeat</a><span class="special"><</span><span class="identifier">n</span><span class="special">,</span><span class="identifier">m</span><span class="special">>(</span><span class="identifier">a</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
between <code class="literal">n</code> and <code class="literal">m</code> times,
|
||
non-greedy.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">^</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/bos.html" title="Global bos">bos</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
beginning of sequence assertion.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">$</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/eos.html" title="Global eos">eos</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
end of sequence assertion.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\b</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/_b.html" title="Global _b">_b</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
word boundary assertion.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\B</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_b.html" title="Global _b">_b</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
not word boundary assertion.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\n</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/_n.html" title="Global _n">_n</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
literal newline.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">.</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_n.html" title="Global _n">_n</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
any character except a literal newline (without Perl's /s modifier).
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\r?\n|\r</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/_ln.html" title="Global _ln">_ln</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
logical newline.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[^\r\n]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_ln.html" title="Global _ln">_ln</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
any single character not a logical newline.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\w</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/_w.html" title="Global _w">_w</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
                  a word character, equivalent to <code class="computeroutput">set[alnum | '_']</code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\W</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_w.html" title="Global _w">_w</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
                  not a word character, equivalent to <code class="computeroutput">~set[alnum | '_']</code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\d</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/_d.html" title="Global _d">_d</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
a digit character.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\D</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_d.html" title="Global _d">_d</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
not a digit character.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\s</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/_s.html" title="Global _s">_s</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
                  a white-space character.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\S</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/_s.html" title="Global _s">_s</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
                  not a white-space character.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[:alnum:]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/alnum.html" title="Global alnum">alnum</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
an alpha-numeric character.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[:alpha:]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/alpha.html" title="Global alpha">alpha</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
an alphabetic character.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[:blank:]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/blank.html" title="Global blank">blank</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
a horizontal white-space character.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[:cntrl:]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/cntrl.html" title="Global cntrl">cntrl</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
a control character.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[:digit:]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/digit.html" title="Global digit">digit</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
a digit character.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[:graph:]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/graph.html" title="Global graph">graph</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
                  a graphical character (printable, excluding the space character).
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[:lower:]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/lower.html" title="Global lower">lower</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
a lower-case character.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[:print:]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/print.html" title="Global print">print</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
a printing character.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[:punct:]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/punct.html" title="Global punct">punct</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
a punctuation character.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[:space:]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/space.html" title="Global space">space</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
a white-space character.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[:upper:]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/upper.html" title="Global upper">upper</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
an upper-case character.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[:xdigit:]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/xdigit.html" title="Global xdigit">xdigit</a></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
a hexadecimal digit character.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[0-9]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/range.html" title="Function template range">range</a><span class="special">(</span><span class="char">'0'</span><span class="special">,</span><span class="char">'9'</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
characters in range <code class="computeroutput"><span class="char">'0'</span></code>
|
||
through <code class="computeroutput"><span class="char">'9'</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[abc]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
                  <code class="computeroutput"><span class="identifier">as_xpr</span><span class="special">(</span><span class="char">'a'</span><span class="special">)</span> <span class="special">|</span> <span class="char">'b'</span> <span class="special">|</span> <span class="char">'c'</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
characters <code class="computeroutput"><span class="char">'a'</span></code>, <code class="computeroutput"><span class="char">'b'</span></code>, or <code class="computeroutput"><span class="char">'c'</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[abc]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">(</span><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">=</span> <span class="char">'a'</span><span class="special">,</span><span class="char">'b'</span><span class="special">,</span><span class="char">'c'</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<span class="emphasis"><em>same as above</em></span>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[0-9abc]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">[</span> <a class="link" href="../boost/xpressive/range.html" title="Function template range">range</a><span class="special">(</span><span class="char">'0'</span><span class="special">,</span><span class="char">'9'</span><span class="special">)</span> <span class="special">|</span>
|
||
<span class="char">'a'</span> <span class="special">|</span>
|
||
<span class="char">'b'</span> <span class="special">|</span>
|
||
<span class="char">'c'</span> <span class="special">]</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
characters <code class="computeroutput"><span class="char">'a'</span></code>, <code class="computeroutput"><span class="char">'b'</span></code>, <code class="computeroutput"><span class="char">'c'</span></code>
|
||
or in range <code class="computeroutput"><span class="char">'0'</span></code> through
|
||
<code class="computeroutput"><span class="char">'9'</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[0-9abc]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">[</span> <a class="link" href="../boost/xpressive/range.html" title="Function template range">range</a><span class="special">(</span><span class="char">'0'</span><span class="special">,</span><span class="char">'9'</span><span class="special">)</span> <span class="special">|</span>
|
||
<span class="special">(</span><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">=</span> <span class="char">'a'</span><span class="special">,</span><span class="char">'b'</span><span class="special">,</span><span class="char">'c'</span><span class="special">)</span> <span class="special">]</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<span class="emphasis"><em>same as above</em></span>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">[^abc]</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">~(</span><a class="link" href="../boost/xpressive/set.html" title="Global set">set</a><span class="special">=</span> <span class="char">'a'</span><span class="special">,</span><span class="char">'b'</span><span class="special">,</span><span class="char">'c'</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
not characters <code class="computeroutput"><span class="char">'a'</span></code>,
|
||
<code class="computeroutput"><span class="char">'b'</span></code>, or <code class="computeroutput"><span class="char">'c'</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">(?i:<span class="emphasis"><em>stuff</em></span>)</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/icase.html" title="Function template icase">icase</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
match <span class="emphasis"><em>stuff</em></span> disregarding case.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">(?><span class="emphasis"><em>stuff</em></span>)</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/keep.html" title="Function template keep">keep</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
independent sub-expression, match <span class="emphasis"><em>stuff</em></span>
|
||
and turn off backtracking.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">(?=<span class="emphasis"><em>stuff</em></span>)</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/before.html" title="Function template before">before</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
positive look-ahead assertion, match if before <span class="emphasis"><em>stuff</em></span>
|
||
but don't include <span class="emphasis"><em>stuff</em></span> in the match.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">(?!<span class="emphasis"><em>stuff</em></span>)</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/before.html" title="Function template before">before</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
negative look-ahead assertion, match if not before <span class="emphasis"><em>stuff</em></span>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">(?<=<span class="emphasis"><em>stuff</em></span>)</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><a class="link" href="../boost/xpressive/after.html" title="Function template after">after</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
positive look-behind assertion, match if after <span class="emphasis"><em>stuff</em></span>
|
||
but don't include <span class="emphasis"><em>stuff</em></span> in the match. (<span class="emphasis"><em>stuff</em></span>
|
||
must be constant-width.)
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">(?<!<span class="emphasis"><em>stuff</em></span>)</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="special">~</span><a class="link" href="../boost/xpressive/after.html" title="Function template after">after</a><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
negative look-behind assertion, match if not after <span class="emphasis"><em>stuff</em></span>.
|
||
(<span class="emphasis"><em>stuff</em></span> must be constant-width.)
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">(?P<<span class="emphasis"><em>name</em></span>><span class="emphasis"><em>stuff</em></span>)</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><code class="literal"><a class="link" href="../boost/xpressive/mark_tag.html" title="Struct mark_tag">mark_tag</a></code>
|
||
</code><code class="literal"><span class="emphasis"><em>name</em></span></code><code class="computeroutput"><span class="special">(</span></code><span class="emphasis"><em>n</em></span><code class="computeroutput"><span class="special">);</span></code><br> ...<br> <code class="computeroutput"><span class="special">(</span></code><code class="literal"><span class="emphasis"><em>name</em></span></code><code class="computeroutput"><span class="special">=</span> </code><code class="literal"><span class="emphasis"><em>stuff</em></span></code><code class="computeroutput"><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Create a named capture.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">(?P=<span class="emphasis"><em>name</em></span>)</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><code class="literal"><a class="link" href="../boost/xpressive/mark_tag.html" title="Struct mark_tag">mark_tag</a></code>
|
||
</code><code class="literal"><span class="emphasis"><em>name</em></span></code><code class="computeroutput"><span class="special">(</span></code><span class="emphasis"><em>n</em></span><code class="computeroutput"><span class="special">);</span></code><br> ...<br> <code class="literal"><span class="emphasis"><em>name</em></span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Refer back to a previously created named capture.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
</tbody>
|
||
</table></div>
|
||
</div>
|
||
<br class="table-break"><p>
|
||
<br>
|
||
</p>
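      <p>
        As a rough illustration of how the rows in the table combine, the sketch
        below writes the Perl pattern <code class="literal">^(\w+)\s*=\s*(\d+)$</code>
        as a static regex. The helper names <code class="computeroutput">make_assignment_regex</code>
        and <code class="computeroutput">parse_assignment</code> are hypothetical,
        introduced only for this example; everything else comes from the table
        above.
      </p>
<pre class="programlisting">#include &lt;string&gt;
#include &lt;boost/xpressive/xpressive_static.hpp&gt;

namespace xp = boost::xpressive;

// Static equivalent of the Perl pattern  ^(\w+)\s*=\s*(\d+)$
// (hypothetical helper, not part of xpressive).
inline xp::sregex make_assignment_regex()
{
    using namespace xp;
    sregex rx = bos &gt;&gt; (s1 = +_w)        // ^(\w+)
                    &gt;&gt; *_s &gt;&gt; '=' &gt;&gt; *_s  // \s*=\s*
                    &gt;&gt; (s2 = +_d)        // (\d+)
                    &gt;&gt; eos;              // $
    return rx;
}

// Returns true and fills key/value when line looks like "name = 123".
inline bool parse_assignment(const std::string&amp; line,
                             std::string&amp; key, std::string&amp; value)
{
    static const xp::sregex rx = make_assignment_regex();
    xp::smatch what;
    if (!xp::regex_search(line, what, rx))
        return false;
    key = what[1].str();    // first capture, s1
    value = what[2].str();  // second capture, s2
    return true;
}
</pre>
      <p>
        With these definitions, calling <code class="computeroutput">parse_assignment("width = 42", key, value)</code>
        should return <code class="computeroutput">true</code> with <code class="computeroutput">key</code>
        set to <code class="computeroutput">"width"</code> and <code class="computeroutput">value</code>
        set to <code class="computeroutput">"42"</code>.
      </p>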
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h4 class="title">
|
||
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes" title="Dynamic Regexes">Dynamic
|
||
Regexes</a>
|
||
</h4></div></div></div>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.h0"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.overview">Overview</a>
|
||
</h3>
|
||
<p>
|
||
Static regexes are dandy, but sometimes you need something a bit more ...
|
||
dynamic. Imagine you are developing a text editor with a regex search/replace
|
||
feature. You need to accept a regular expression from the end user as input
|
||
at run-time. There should be a way to parse a string into a regular expression.
|
||
That's what xpressive's dynamic regexes are for. They are built from the
|
||
same core components as their static counterparts, but they are late-bound
|
||
so you can specify them at run-time.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.h1"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.construction_and_assignment"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.construction_and_assignment">Construction
|
||
and Assignment</a>
|
||
</h3>
|
||
<p>
|
||
There are two ways to create a dynamic regex: with the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html#id-1_3_48_5_18_2_1_1_9_1-bb">basic_regex<>::compile()</a></code></code>
|
||
function or with the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
|
||
class template. Use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html#id-1_3_48_5_18_2_1_1_9_1-bb">basic_regex<>::compile()</a></code></code>
|
||
if you want the default locale. Use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
|
||
if you need to specify a different locale. In the section on <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches" title="Grammars and Nested Matches">regex
|
||
grammars</a>, we'll see another use for <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>.
|
||
</p>
|
||
<p>
|
||
Here is an example of using <code class="computeroutput"><span class="identifier">basic_regex</span><span class="special"><>::</span><span class="identifier">compile</span><span class="special">()</span></code>:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"this|that"</span><span class="special">,</span> <span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">icase</span> <span class="special">);</span>
|
||
</pre>
|
||
<p>
|
||
Here is the same example using <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">sregex_compiler</span> <span class="identifier">compiler</span><span class="special">;</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"this|that"</span><span class="special">,</span> <span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">icase</span> <span class="special">);</span>
|
||
</pre>
|
||
<p>
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html#id-1_3_48_5_18_2_1_1_9_1-bb">basic_regex<>::compile()</a></code></code>
|
||
is implemented in terms of <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>.
|
||
</p>
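      <p>
        To show a compiled dynamic regex in use, here is a minimal sketch in the
        spirit of the text-editor scenario above: it compiles a pattern received
        at run-time and searches a string with it. The function name <code class="computeroutput">contains_pattern</code>
        is hypothetical, and treating a malformed pattern as "no match" is a
        simplification made for this example.
      </p>
<pre class="programlisting">#include &lt;string&gt;
#include &lt;boost/xpressive/xpressive_dynamic.hpp&gt;

namespace xp = boost::xpressive;

// Compile a pattern supplied at run-time and report whether it occurs in text,
// ignoring case. A malformed pattern makes compile() throw xp::regex_error;
// this sketch simply reports that case as "no match".
inline bool contains_pattern(const std::string&amp; text, const std::string&amp; pattern)
{
    try
    {
        xp::sregex re = xp::sregex::compile(pattern, xp::regex_constants::icase);
        return xp::regex_search(text, re);
    }
    catch (const xp::regex_error&amp;)
    {
        return false;
    }
}
</pre>
      <p>
        A real editor would more likely surface the <code class="computeroutput">regex_error</code>
        message to the user than swallow it.
      </p>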
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.h2"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.dynamic_xpressive_syntax"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.dynamic_xpressive_syntax">Dynamic
|
||
xpressive Syntax</a>
|
||
</h3>
|
||
<p>
|
||
Since the dynamic syntax is not constrained by the rules for valid C++
|
||
expressions, we are free to use familiar syntax for dynamic regexes. For
|
||
this reason, the syntax used by xpressive for dynamic regexes follows the
|
||
lead set by John Maddock's <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1429.htm" target="_top">proposal</a>
|
||
to add regular expressions to the Standard Library. It is essentially the
|
||
syntax standardized by <a href="http://www.ecma-international.org/publications/files/ECMA-ST/Ecma-262.pdf" target="_top">ECMAScript</a>,
|
||
with minor changes in support of internationalization.
|
||
</p>
|
||
<p>
|
||
Since the syntax is documented exhaustively elsewhere, I will simply refer
|
||
you to the existing standards, rather than duplicate the specification
|
||
here.
|
||
</p>
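      <p>
        As a small taste of the dynamic syntax, the sketch below compiles a pattern
        that uses quantifiers and the <code class="literal">(?P&lt;<span class="emphasis"><em>name</em></span>&gt;<span class="emphasis"><em>stuff</em></span>)</code>
        named-capture form from the cheat sheet, then reads a capture back by
        name. The function name <code class="computeroutput">extract_year</code>
        and the date format are assumptions made for illustration only.
      </p>
<pre class="programlisting">#include &lt;string&gt;
#include &lt;boost/xpressive/xpressive_dynamic.hpp&gt;

namespace xp = boost::xpressive;

// Finds the first YYYY-MM-DD date in text and copies its year into 'year'.
// Backslashes are doubled because the pattern is a C++ string literal.
inline bool extract_year(const std::string&amp; text, std::string&amp; year)
{
    static const xp::sregex date = xp::sregex::compile(
        "(?P&lt;year&gt;\\d{4})-(?P&lt;month&gt;\\d{2})-(?P&lt;day&gt;\\d{2})");
    xp::smatch what;
    if (!xp::regex_search(text, what, date))
        return false;
    year = what["year"].str();  // look the capture up by its name
    return true;
}
</pre>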
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.h3"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.internationalization"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.creating_a_regex_object.dynamic_regexes.internationalization">Internationalization</a>
|
||
</h3>
|
||
<p>
|
||
As with static regexes, dynamic regexes support internationalization by
|
||
allowing you to specify a different <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>.
|
||
To do this, you must use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>.
|
||
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
|
||
class has an <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code>
|
||
function. After you have imbued a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
|
||
object with a custom <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>,
|
||
all regex objects compiled by that <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
|
||
will use that locale. For example:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span> <span class="identifier">my_locale</span> <span class="special">=</span> <span class="comment">/* initialize your locale object here */</span><span class="special">;</span>
<span class="identifier">sregex_compiler</span> <span class="identifier">compiler</span><span class="special">;</span>
<span class="identifier">compiler</span><span class="special">.</span><span class="identifier">imbue</span><span class="special">(</span> <span class="identifier">my_locale</span> <span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"\\w+|\\d+"</span> <span class="special">);</span>
</pre>
|
||
<p>
|
||
This regex will use <code class="computeroutput"><span class="identifier">my_locale</span></code>
|
||
when evaluating the intrinsic character sets <code class="computeroutput"><span class="string">"\\w"</span></code>
|
||
and <code class="computeroutput"><span class="string">"\\d"</span></code>.
|
||
</p>
|
||
</div>
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="boost_xpressive.user_s_guide.matching_and_searching"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching" title="Matching and Searching">Matching
|
||
and Searching</a>
|
||
</h3></div></div></div>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.matching_and_searching.h0"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.matching_and_searching.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching.overview">Overview</a>
|
||
</h3>
|
||
<p>
|
||
Once you have created a regex object, you can use the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
and <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
|
||
algorithms to find patterns in strings. This page covers the basics of regex
|
||
matching and searching. In all cases, if you are familiar with how <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
and <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
|
||
in the <a href="../../../libs/regex" target="_top">Boost.Regex</a> library work, xpressive's
|
||
versions work the same way.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.matching_and_searching.h1"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.matching_and_searching.seeing_if_a_string_matches_a_regex"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching.seeing_if_a_string_matches_a_regex">Seeing
|
||
if a String Matches a Regex</a>
|
||
</h3>
|
||
<p>
|
||
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
algorithm checks to see if a regex matches a given input.
|
||
</p>
|
||
<div class="warning"><table border="0" summary="Warning">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Warning]" src="../../../doc/src/images/warning.png"></td>
|
||
<th align="left">Warning</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
algorithm will only report success if the regex matches the <span class="emphasis"><em>whole
|
||
input</em></span>, from beginning to end. If the regex matches only a part
|
||
of the input, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
will return false. If you want to search through the string looking for
|
||
sub-strings that the regex matches, use the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
|
||
algorithm.
|
||
</p></td></tr>
|
||
</table></div>
|
||
<p>
|
||
The input can be a bidirectional range such as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>,
|
||
a C-style null-terminated string, or a pair of iterators. In all cases, the
|
||
type of the iterator used to traverse the input sequence must match the iterator
|
||
type used to declare the regex object. (You can use the table in the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.quick_start.know_your_iterator_type">Quick
|
||
Start</a> to find the correct regex type for your iterator.)
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">cregex</span> <span class="identifier">cre</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">;</span> <span class="comment">// this regex can match C-style strings</span>
<span class="identifier">sregex</span> <span class="identifier">sre</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">;</span> <span class="comment">// this regex can match std::strings</span>

<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="string">"hello"</span><span class="special">,</span> <span class="identifier">cre</span> <span class="special">)</span> <span class="special">)</span> <span class="comment">// OK</span>
    <span class="special">{</span> <span class="comment">/*...*/</span> <span class="special">}</span>

<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">(</span><span class="string">"hello"</span><span class="special">),</span> <span class="identifier">sre</span> <span class="special">)</span> <span class="special">)</span> <span class="comment">// OK</span>
    <span class="special">{</span> <span class="comment">/*...*/</span> <span class="special">}</span>

<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="string">"hello"</span><span class="special">,</span> <span class="identifier">sre</span> <span class="special">)</span> <span class="special">)</span> <span class="comment">// ERROR! iterator mis-match!</span>
    <span class="special">{</span> <span class="comment">/*...*/</span> <span class="special">}</span>
</pre>
|
||
<p>
|
||
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
algorithm optionally accepts a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
struct as an out parameter. If given, the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
algorithm fills in the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
struct with information about which parts of the regex matched which parts
|
||
of the input.
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">cmatch</span> <span class="identifier">what</span><span class="special">;</span>
<span class="identifier">cregex</span> <span class="identifier">cre</span> <span class="special">=</span> <span class="special">+(</span><span class="identifier">s1</span><span class="special">=</span> <span class="identifier">_w</span><span class="special">);</span>

<span class="comment">// store the results of the regex_match in "what"</span>
<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="string">"hello"</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">cre</span> <span class="special">)</span> <span class="special">)</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// prints "o", the last character captured by s1</span>
<span class="special">}</span>
</pre>
|
||
<p>
|
||
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
algorithm also optionally accepts a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_constants/match_flag_type.html" title="Type match_flag_type">match_flag_type</a></code></code>
|
||
bitmask. With <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_constants/match_flag_type.html" title="Type match_flag_type">match_flag_type</a></code></code>,
|
||
you can control certain aspects of how the match is evaluated. See the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_constants/match_flag_type.html" title="Type match_flag_type">match_flag_type</a></code></code>
|
||
reference for a complete list of the flags and their meanings.
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"hello"</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">sre</span> <span class="special">=</span> <span class="identifier">bol</span> <span class="special">>></span> <span class="special">+</span><span class="identifier">_w</span><span class="special">;</span>

<span class="comment">// match_not_bol means that "bol" should not match at [begin,begin)</span>
<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">sre</span><span class="special">,</span> <span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">match_not_bol</span> <span class="special">)</span> <span class="special">)</span>
<span class="special">{</span>
    <span class="comment">// should never get here!!!</span>
<span class="special">}</span>
</pre>
|
||
<p>
|
||
Click <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_whole_string_matches_a_regex">here</a>
|
||
to see a complete example program that shows how to use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>.
|
||
And check the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
reference to see a complete list of the available overloads.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.matching_and_searching.h2"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.matching_and_searching.searching_for_matching_sub_strings"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.matching_and_searching.searching_for_matching_sub_strings">Searching
|
||
for Matching Sub-Strings</a>
|
||
</h3>
|
||
<p>
|
||
Use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
|
||
when you want to know if an input sequence contains a sub-sequence that a
|
||
regex matches. <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
|
||
will try to match the regex at the beginning of the input sequence and scan
|
||
forward in the sequence until it either finds a match or exhausts the sequence.
|
||
</p>
|
||
<p>
|
||
In all other regards, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
|
||
behaves like <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
<span class="emphasis"><em>(see above)</em></span>. In particular, it can operate on a bidirectional
|
||
range such as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>, C-style null-terminated strings
|
||
or iterator ranges. The same care must be taken to ensure that the iterator
|
||
type of your regex matches the iterator type of your input sequence. As with
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>,
|
||
you can optionally provide a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
struct to receive the results of the search, and a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_constants/match_flag_type.html" title="Type match_flag_type">match_flag_type</a></code></code>
|
||
bitmask to control how the match is evaluated.
|
||
</p>
|
||
<p>
|
||
Click <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_string_contains_a_sub_string_that_matches_a_regex">here</a>
|
||
to see a complete example program that shows how to use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>.
|
||
And check the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
|
||
reference to see a complete list of the available overloads.
|
||
</p>
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="boost_xpressive.user_s_guide.accessing_results"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results" title="Accessing Results">Accessing
|
||
Results</a>
|
||
</h3></div></div></div>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.accessing_results.h0"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.accessing_results.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results.overview">Overview</a>
|
||
</h3>
|
||
<p>
|
||
Sometimes it is not enough to know whether a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
|
||
succeeded. If you pass an object of type <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>,
|
||
then after the algorithm has completed successfully the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
will contain extra information about which parts of the regex matched which
|
||
parts of the sequence. In Perl, these sub-sequences are called <span class="emphasis"><em>back-references</em></span>,
|
||
and they are stored in the variables <code class="literal">$1</code>, <code class="literal">$2</code>,
|
||
etc. In xpressive, they are objects of type <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>,
|
||
and they are stored in the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
structure, which acts as a vector of <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
|
||
objects.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.accessing_results.h1"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.accessing_results.match_results"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results.match_results">match_results</a>
|
||
</h3>
|
||
<p>
|
||
So, you've passed a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
object to a regex algorithm, and the algorithm has succeeded. Now you want
|
||
to examine the results. Most of what you'll be doing with the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
object is indexing into it to access its internally stored <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
|
||
objects, but there are a few other things you can do with a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
object besides.
|
||
</p>
|
||
<p>
|
||
The table below shows how to access the information stored in a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
object named <code class="computeroutput"><span class="identifier">what</span></code>.
|
||
</p>
|
||
<div class="table">
|
||
<a name="boost_xpressive.user_s_guide.accessing_results.t0"></a><p class="title"><b>Table 47.5. match_results<> Accessors</b></p>
|
||
<div class="table-contents"><table class="table" summary="match_results<> Accessors">
|
||
<colgroup>
|
||
<col>
|
||
<col>
|
||
</colgroup>
|
||
<thead><tr>
|
||
<th>
|
||
<p>
|
||
Accessor
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
Effects
|
||
</p>
|
||
</th>
|
||
</tr></thead>
|
||
<tbody>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">size</span><span class="special">()</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Returns the number of sub-matches, which is always greater than
|
||
zero after a successful match because the full match is stored
|
||
in the zero-th sub-match.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">what</span><span class="special">[</span><span class="identifier">n</span><span class="special">]</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Returns the <span class="emphasis"><em>n</em></span>-th sub-match.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">length</span><span class="special">(</span><span class="identifier">n</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Returns the length of the <span class="emphasis"><em>n</em></span>-th sub-match.
|
||
Same as <code class="computeroutput"><span class="identifier">what</span><span class="special">[</span><span class="identifier">n</span><span class="special">].</span><span class="identifier">length</span><span class="special">()</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">position</span><span class="special">(</span><span class="identifier">n</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Returns the offset into the input sequence at which the <span class="emphasis"><em>n</em></span>-th
|
||
sub-match begins.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">str</span><span class="special">(</span><span class="identifier">n</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Returns a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special"><></span></code>
|
||
constructed from the <span class="emphasis"><em>n</em></span>-th sub-match. Same
|
||
as <code class="computeroutput"><span class="identifier">what</span><span class="special">[</span><span class="identifier">n</span><span class="special">].</span><span class="identifier">str</span><span class="special">()</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">prefix</span><span class="special">()</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Returns a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
|
||
object which represents the sub-sequence from the beginning of
|
||
the input sequence to the start of the full match.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">suffix</span><span class="special">()</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Returns a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
|
||
object which represents the sub-sequence from the end of the full
|
||
match to the end of the input sequence.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">regex_id</span><span class="special">()</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Returns the <code class="computeroutput"><span class="identifier">regex_id</span></code>
|
||
of the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
|
||
object that was last used with this <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
object.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
</tbody>
|
||
</table></div>
|
||
</div>
|
||
<br class="table-break"><p>
|
||
There is more you can do with the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
object, but that will be covered when we talk about <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches" title="Grammars and Nested Matches">Grammars
|
||
and Nested Matches</a>.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.accessing_results.h2"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.accessing_results.sub_match"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results.sub_match">sub_match</a>
|
||
</h3>
|
||
<p>
|
||
When you index into a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
object, you get back a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
|
||
object. A <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
|
||
is basically a pair of iterators. It is defined like this:
|
||
</p>
|
||
<pre class="programlisting"><span class="keyword">template</span><span class="special"><</span> <span class="keyword">class</span> <span class="identifier">BidirectionalIterator</span> <span class="special">></span>
<span class="keyword">struct</span> <span class="identifier">sub_match</span>
    <span class="special">:</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">pair</span><span class="special"><</span> <span class="identifier">BidirectionalIterator</span><span class="special">,</span> <span class="identifier">BidirectionalIterator</span> <span class="special">></span>
<span class="special">{</span>
    <span class="keyword">bool</span> <span class="identifier">matched</span><span class="special">;</span>
    <span class="comment">// ...</span>
<span class="special">};</span>
</pre>
|
||
<p>
|
||
Since it inherits publicly from <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">pair</span><span class="special"><></span></code>, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
|
||
has <code class="computeroutput"><span class="identifier">first</span></code> and <code class="computeroutput"><span class="identifier">second</span></code> data members of type <code class="computeroutput"><span class="identifier">BidirectionalIterator</span></code>. These are the beginning
|
||
and end of the sub-sequence this <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
|
||
represents. <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
|
||
also has a Boolean <code class="computeroutput"><span class="identifier">matched</span></code>
|
||
data member, which is true if this <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
|
||
participated in the full match.
|
||
</p>
|
||
<p>
|
||
The following table shows how you might access the information stored in
|
||
a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
|
||
object called <code class="computeroutput"><span class="identifier">sub</span></code>.
|
||
</p>
|
||
<div class="table">
|
||
<a name="boost_xpressive.user_s_guide.accessing_results.t1"></a><p class="title"><b>Table 47.6. sub_match<> Accessors</b></p>
|
||
<div class="table-contents"><table class="table" summary="sub_match<> Accessors">
|
||
<colgroup>
|
||
<col>
|
||
<col>
|
||
</colgroup>
|
||
<thead><tr>
|
||
<th>
|
||
<p>
|
||
Accessor
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
Effects
|
||
</p>
|
||
</th>
|
||
</tr></thead>
|
||
<tbody>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">sub</span><span class="special">.</span><span class="identifier">length</span><span class="special">()</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Returns the length of the sub-match. Same as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">distance</span><span class="special">(</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">first</span><span class="special">,</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">second</span><span class="special">)</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">sub</span><span class="special">.</span><span class="identifier">str</span><span class="special">()</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Returns a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special"><></span></code>
|
||
constructed from the sub-match. Same as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special"><</span><span class="identifier">char_type</span><span class="special">>(</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">first</span><span class="special">,</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">second</span><span class="special">)</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">sub</span><span class="special">.</span><span class="identifier">compare</span><span class="special">(</span><span class="identifier">str</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Performs a string comparison between the sub-match and <code class="computeroutput"><span class="identifier">str</span></code>, where <code class="computeroutput"><span class="identifier">str</span></code>
|
||
can be a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special"><></span></code>,
|
||
C-style null-terminated string, or another sub-match. Same as
|
||
<code class="computeroutput"><span class="identifier">sub</span><span class="special">.</span><span class="identifier">str</span><span class="special">().</span><span class="identifier">compare</span><span class="special">(</span><span class="identifier">str</span><span class="special">)</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
</tbody>
|
||
</table></div>
|
||
</div>
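<p>
As a rough illustration of the accessors in the table, the sketch below uses <code class="computeroutput">std::regex</code> from the C++ standard library instead of xpressive, so that it builds without Boost. That substitution is an assumption made for this example; it relies on <code class="computeroutput">std::sub_match</code> offering members with the same names (<code class="computeroutput">matched</code>, <code class="computeroutput">length()</code>, <code class="computeroutput">str()</code> and <code class="computeroutput">compare()</code>). The names <code class="computeroutput">SubInfo</code> and <code class="computeroutput">inspect_year</code> are hypothetical.
</p>

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Summary of one captured group, gathered through the accessors in the table.
struct SubInfo {
    bool matched;        // sub.matched: did the group participate in the match?
    std::size_t length;  // sub.length()
    std::string text;    // sub.str()
    int vs_2024;         // sub.compare("2024"); 0 means the texts are equal
};

// Hypothetical helper: matches "<word>-<digits>" and inspects the second group.
SubInfo inspect_year(const std::string& input) {
    static const std::regex re("(\\w+)-(\\d+)");
    std::smatch what;
    if (!std::regex_match(input, what, re))
        return SubInfo{false, 0, "", 0};

    const std::ssub_match& sub = what[2];
    // length() is the same as std::distance(sub.first, sub.second).
    assert(sub.length() == std::distance(sub.first, sub.second));
    return SubInfo{sub.matched,
                   static_cast<std::size_t>(sub.length()),
                   sub.str(),
                   sub.compare("2024")};
}
```

<p>
For the input <code class="computeroutput">"boost-2024"</code>, the second group holds <code class="computeroutput">"2024"</code>, so the helper reports a length of 4 and a comparison result of 0.
</p>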
|
||
<br class="table-break"><h3>
|
||
<a name="boost_xpressive.user_s_guide.accessing_results.h3"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.accessing_results._inlinemediaobject__imageobject__imagedata_fileref__images_caution_png____imagedata___imageobject__textobject__phrase_caution__phrase___textobject___inlinemediaobject__results_invalidation__inlinemediaobject__imageobject__imagedata_fileref__images_caution_png____imagedata___imageobject__textobject__phrase_caution__phrase___textobject___inlinemediaobject_"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.accessing_results._inlinemediaobject__imageobject__imagedata_fileref__images_caution_png____imagedata___imageobject__textobject__phrase_caution__phrase___textobject___inlinemediaobject__results_invalidation__inlinemediaobject__imageobject__imagedata_fileref__images_caution_png____imagedata___imageobject__textobject__phrase_caution__phrase___textobject___inlinemediaobject_"><span class="inlinemediaobject"><img src="../images/caution.png" alt="caution"></span> Results Invalidation <span class="inlinemediaobject"><img src="../images/caution.png" alt="caution"></span></a>
|
||
</h3>
|
||
<p>
|
||
Results are stored as iterators into the input sequence. Anything that invalidates
|
||
the input sequence will invalidate the match results. For instance, if you
|
||
match a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code> object, the results are only valid
|
||
until your next call to a non-const member function of that <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>
|
||
object. After that, the results held by the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
object are invalid. Don't use them!
|
||
</p>
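<p>
One way to stay clear of invalidated results is to copy what you need out of the match before touching the input. The sketch below shows that pattern with the standard library's <code class="computeroutput">std::regex</code>, used here as a stand-in for xpressive on the assumption that the same iterator-based storage applies; <code class="computeroutput">take_digits</code> is a hypothetical helper.
</p>

```cpp
#include <regex>
#include <string>

// Hypothetical helper: extracts the first run of digits, then clears the input.
std::string take_digits(std::string& input) {
    static const std::regex digits("\\d+");
    std::smatch what;
    std::string copy;
    if (std::regex_search(input, what, digits))
        copy = what[0].str();  // copy out while the stored iterators are valid
    input.clear();             // non-const call: `what` must not be read after this
    return copy;               // the copy owns its characters and remains usable
}
```

<p>
The returned string is independent of the input, so it stays valid after the input has been modified.
</p>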
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="boost_xpressive.user_s_guide.string_substitutions"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions" title="String Substitutions">String
|
||
Substitutions</a>
|
||
</h3></div></div></div>
|
||
<p>
|
||
Regular expressions are not only good for searching text; they're also good at
|
||
<span class="emphasis"><em>manipulating</em></span> it. And one of the most common text manipulation
|
||
tasks is search-and-replace. xpressive provides the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
|
||
algorithm for searching and replacing.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.string_substitutions.h0"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.regex_replace__"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.regex_replace__">regex_replace()</a>
|
||
</h3>
|
||
<p>
|
||
Performing search-and-replace using <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
|
||
is simple. All you need is an input sequence, a regex object, and a format
|
||
string or a formatter object. There are several versions of the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
|
||
algorithm. Some accept the input sequence as a bidirectional container such
|
||
as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code> and return the result in a new
|
||
container of the same type. Others accept the input as a null-terminated
|
||
string and return a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>. Still others accept the input sequence
|
||
as a pair of iterators and write the result into an output iterator. The
|
||
substitution may be specified as a string with format sequences or as a formatter
|
||
object. Below are some simple examples of using string-based substitutions.
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"This is his face"</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">as_xpr</span><span class="special">(</span><span class="string">"his"</span><span class="special">);</span> <span class="comment">// find all occurrences of "his" ...</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">format</span><span class="special">(</span><span class="string">"her"</span><span class="special">);</span> <span class="comment">// ... and replace them with "her"</span>

<span class="comment">// use the version of regex_replace() that operates on strings</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">output</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span> <span class="identifier">input</span><span class="special">,</span> <span class="identifier">re</span><span class="special">,</span> <span class="identifier">format</span> <span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">output</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>

<span class="comment">// use the version of regex_replace() that operates on iterators</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special"><</span> <span class="keyword">char</span> <span class="special">></span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">);</span>
<span class="identifier">regex_replace</span><span class="special">(</span> <span class="identifier">out_iter</span><span class="special">,</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span><span class="special">,</span> <span class="identifier">format</span> <span class="special">);</span>
</pre>
|
||
<p>
|
||
The above program prints out the following:
|
||
</p>
|
||
<pre class="programlisting">Ther is her face
Ther is her face
</pre>
|
||
<p>
|
||
Notice that <span class="emphasis"><em>all</em></span> the occurrences of <code class="computeroutput"><span class="string">"his"</span></code>
|
||
have been replaced with <code class="computeroutput"><span class="string">"her"</span></code>.
|
||
</p>
|
||
<p>
|
||
Click <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.replace_all_sub_strings_that_match_a_regex">here</a>
|
||
to see a complete example program that shows how to use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>.
|
||
And check the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
|
||
reference to see a complete list of the available overloads.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.string_substitutions.h1"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.replace_options"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.replace_options">Replace
|
||
Options</a>
|
||
</h3>
|
||
<p>
|
||
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
|
||
algorithm takes an optional bitmask parameter to control the formatting.
|
||
The possible values of the bitmask are:
|
||
</p>
|
||
<div class="table">
|
||
<a name="boost_xpressive.user_s_guide.string_substitutions.t0"></a><p class="title"><b>Table 47.7. Format Flags</b></p>
|
||
<div class="table-contents"><table class="table" summary="Format Flags">
|
||
<colgroup>
|
||
<col>
|
||
<col>
|
||
</colgroup>
|
||
<thead><tr>
|
||
<th>
|
||
<p>
|
||
Flag
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
Meaning
|
||
</p>
|
||
</th>
|
||
</tr></thead>
|
||
<tbody>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">format_default</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Recognize the ECMA-262 format sequences (see below).
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">format_first_only</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Only replace the first match, not all of them.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">format_no_copy</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Don't copy the parts of the input sequence that didn't match the
|
||
regex to the output sequence.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">format_literal</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Treat the format string as a literal; that is, don't recognize
|
||
any escape sequences.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">format_perl</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Recognize the Perl format sequences (see below).
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">format_sed</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Recognize the sed format sequences (see below).
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">format_all</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
In addition to the Perl format sequences, recognize some Boost-specific
|
||
format sequences.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
</tbody>
|
||
</table></div>
|
||
</div>
|
||
<br class="table-break"><p>
|
||
These flags live in the <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">regex_constants</span></code>
|
||
namespace. If the substitution parameter is a function object instead of
|
||
a string, the flags <code class="computeroutput"><span class="identifier">format_literal</span></code>,
|
||
<code class="computeroutput"><span class="identifier">format_perl</span></code>, <code class="computeroutput"><span class="identifier">format_sed</span></code>, and <code class="computeroutput"><span class="identifier">format_all</span></code>
|
||
are ignored.
|
||
</p>
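<p>
To make the effect of the flags concrete, here is a minimal sketch that uses the standard library's <code class="computeroutput">std::regex_replace</code> in place of xpressive's, so that it compiles without Boost. The standard library defines <code class="computeroutput">format_default</code>, <code class="computeroutput">format_first_only</code> and <code class="computeroutput">format_no_copy</code> in <code class="computeroutput">std::regex_constants</code>; treating them as equivalent to the xpressive flags of the same name is an assumption of this example, and <code class="computeroutput">replace_his</code> is a hypothetical helper.
</p>

```cpp
#include <regex>
#include <string>

namespace rc = std::regex_constants;

// Hypothetical helper: replaces "his" with "her" under the given format flags.
std::string replace_his(const std::string& input, rc::match_flag_type flags) {
    static const std::regex re("his");
    return std::regex_replace(input, re, std::string("her"), flags);
}
```

<p>
With the input <code class="computeroutput">"This is his face"</code>, <code class="computeroutput">format_default</code> yields <code class="computeroutput">"Ther is her face"</code>, <code class="computeroutput">format_first_only</code> yields <code class="computeroutput">"Ther is his face"</code>, and <code class="computeroutput">format_no_copy</code> yields <code class="computeroutput">"herher"</code>, because only the substituted text reaches the output. The flags are bitmask values, so they can be combined with <code class="computeroutput">|</code>.
</p>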
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.string_substitutions.h2"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.the_ecma_262_format_sequences"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.the_ecma_262_format_sequences">The
|
||
ECMA-262 Format Sequences</a>
|
||
</h3>
|
||
<p>
|
||
When you haven't specified a substitution string dialect with one of the
|
||
format flags above, you get the dialect defined by ECMA-262, the standard
|
||
for ECMAScript. The table below shows the escape sequences recognized in
|
||
ECMA-262 mode.
|
||
</p>
|
||
<div class="table">
|
||
<a name="boost_xpressive.user_s_guide.string_substitutions.t1"></a><p class="title"><b>Table 47.8. Format Escape Sequences</b></p>
|
||
<div class="table-contents"><table class="table" summary="Format Escape Sequences">
|
||
<colgroup>
|
||
<col>
|
||
<col>
|
||
</colgroup>
|
||
<thead><tr>
|
||
<th>
|
||
<p>
|
||
Escape Sequence
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
Meaning
|
||
</p>
|
||
</th>
|
||
</tr></thead>
|
||
<tbody>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">$1</code>, <code class="literal">$2</code>, etc.
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
the corresponding sub-match
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">$&</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
the full match
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">$`</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
the match prefix
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">$'</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
the match suffix
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">$$</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
a literal <code class="computeroutput"><span class="char">'$'</span></code> character
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
</tbody>
|
||
</table></div>
|
||
</div>
|
||
<br class="table-break"><p>
|
||
Any other sequence beginning with <code class="computeroutput"><span class="char">'$'</span></code>
|
||
simply represents itself. For example, if the format string were <code class="computeroutput"><span class="string">"$a"</span></code> then <code class="computeroutput"><span class="string">"$a"</span></code>
|
||
would be inserted into the output sequence.
|
||
</p>
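<p>
The escape sequences in the table can be tried out with a small helper. The sketch below again substitutes the standard library's <code class="computeroutput">std::regex_replace</code>, whose default format dialect is also the ECMA-262 one; using it as a stand-in for xpressive is an assumption of this example. The helper <code class="computeroutput">reformat_pair</code> and the fixed input <code class="computeroutput">"&lt;key=value&gt;"</code> are hypothetical.
</p>

```cpp
#include <regex>
#include <string>

// Hypothetical helper: applies an ECMA-262 style format string to the single
// match "key=value" found inside the fixed input "<key=value>".
// Sub-match 1 is "key", sub-match 2 is "value", the match prefix is "<"
// and the match suffix is ">".
std::string reformat_pair(const std::string& fmt) {
    static const std::regex pair("(\\w+)=(\\w+)");
    return std::regex_replace(std::string("<key=value>"), pair, fmt);
}
```

<p>
For instance, the format string <code class="computeroutput">"$2=$1"</code> swaps the two sub-matches and produces <code class="computeroutput">"&lt;value=key&gt;"</code>, while <code class="computeroutput">"[$&amp;]"</code> wraps the full match and produces <code class="computeroutput">"&lt;[key=value]&gt;"</code>. The unmatched parts of the input are copied through unchanged.
</p>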
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.string_substitutions.h3"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.the_sed_format_sequences"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.the_sed_format_sequences">The
|
||
Sed Format Sequences</a>
|
||
</h3>
|
||
<p>
|
||
When specifying the <code class="computeroutput"><span class="identifier">format_sed</span></code>
|
||
flag to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>,
|
||
the following escape sequences are recognized:
|
||
</p>
|
||
<div class="table">
|
||
<a name="boost_xpressive.user_s_guide.string_substitutions.t2"></a><p class="title"><b>Table 47.9. Sed Format Escape Sequences</b></p>
|
||
<div class="table-contents"><table class="table" summary="Sed Format Escape Sequences">
|
||
<colgroup>
|
||
<col>
|
||
<col>
|
||
</colgroup>
|
||
<thead><tr>
|
||
<th>
|
||
<p>
|
||
Escape Sequence
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
Meaning
|
||
</p>
|
||
</th>
|
||
</tr></thead>
|
||
<tbody>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\1</code>, <code class="literal">\2</code>, etc.
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
The corresponding sub-match
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">&</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
The full match
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\a</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A literal <code class="computeroutput"><span class="char">'\a'</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\e</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">27</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\f</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A literal <code class="computeroutput"><span class="char">'\f'</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\n</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A literal <code class="computeroutput"><span class="char">'\n'</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\r</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A literal <code class="computeroutput"><span class="char">'\r'</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\t</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A literal <code class="computeroutput"><span class="char">'\t'</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\v</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A literal <code class="computeroutput"><span class="char">'\v'</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\xFF</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">0xFF</span><span class="special">)</span></code>, where <code class="literal"><span class="emphasis"><em>F</em></span></code>
|
||
is any hex digit
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\x{FFFF}</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">0xFFFF</span><span class="special">)</span></code>, where <code class="literal"><span class="emphasis"><em>F</em></span></code>
|
||
is any hex digit
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\cX</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
The control character <code class="literal"><span class="emphasis"><em>X</em></span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
</tbody>
|
||
</table></div>
|
||
</div>
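<p>
As a small illustration of the sed dialect, the sketch below exercises only the two sequences shared with the standard library's <code class="computeroutput">format_sed</code> mode, namely <code class="literal">\1</code>-style sub-match references and <code class="literal">&amp;</code>. It uses <code class="computeroutput">std::regex_replace</code> rather than xpressive so that it compiles without Boost; that substitution is an assumption of this example, and the remaining escapes in the table are not covered by it. The helper <code class="computeroutput">swap_words</code> is hypothetical.
</p>

```cpp
#include <regex>
#include <string>

// Hypothetical helper: swaps two space-separated words and appends the
// full match in brackets, using sed-style format sequences.
std::string swap_words(const std::string& input) {
    static const std::regex two_words("(\\w+) (\\w+)");
    // In the C++ literal, "\\2" is the two characters \2 seen by the formatter.
    return std::regex_replace(input, two_words, std::string("\\2 \\1 [&]"),
                              std::regex_constants::format_sed);
}
```

<p>
Given <code class="computeroutput">"hello world"</code>, the helper returns <code class="computeroutput">"world hello [hello world]"</code>.
</p>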
|
||
<br class="table-break"><h3>
|
||
<a name="boost_xpressive.user_s_guide.string_substitutions.h4"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.the_perl_format_sequences"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.the_perl_format_sequences">The
|
||
Perl Format Sequences</a>
|
||
</h3>
|
||
<p>
|
||
When specifying the <code class="computeroutput"><span class="identifier">format_perl</span></code>
|
||
flag to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>,
|
||
the following escape sequences are recognized:
|
||
</p>
|
||
<div class="table">
|
||
<a name="boost_xpressive.user_s_guide.string_substitutions.t3"></a><p class="title"><b>Table 47.10. Perl Format Escape Sequences</b></p>
|
||
<div class="table-contents"><table class="table" summary="Perl Format Escape Sequences">
|
||
<colgroup>
|
||
<col>
|
||
<col>
|
||
</colgroup>
|
||
<thead><tr>
|
||
<th>
|
||
<p>
|
||
Escape Sequence
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
Meaning
|
||
</p>
|
||
</th>
|
||
</tr></thead>
|
||
<tbody>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">$1</code>, <code class="literal">$2</code>, etc.
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
the corresponding sub-match
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">$&</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
the full match
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">$`</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
the match prefix
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">$'</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
the match suffix
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">$$</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
a literal <code class="computeroutput"><span class="char">'$'</span></code> character
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\a</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A literal <code class="computeroutput"><span class="char">'\a'</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\e</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">27</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\f</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A literal <code class="computeroutput"><span class="char">'\f'</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\n</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A literal <code class="computeroutput"><span class="char">'\n'</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\r</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A literal <code class="computeroutput"><span class="char">'\r'</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\t</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A literal <code class="computeroutput"><span class="char">'\t'</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\v</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A literal <code class="computeroutput"><span class="char">'\v'</span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\xFF</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">0xFF</span><span class="special">)</span></code>, where <code class="literal"><span class="emphasis"><em>F</em></span></code>
|
||
is any hex digit
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\x{FFFF}</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A literal <code class="computeroutput"><span class="identifier">char_type</span><span class="special">(</span><span class="number">0xFFFF</span><span class="special">)</span></code>, where <code class="literal"><span class="emphasis"><em>F</em></span></code>
|
||
is any hex digit
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\cX</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
The control character <code class="literal"><span class="emphasis"><em>X</em></span></code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\l</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Make the next character lowercase
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\L</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Make the rest of the substitution lowercase until the next <code class="literal">\E</code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\u</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Make the next character uppercase
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\U</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Make the rest of the substitution uppercase until the next <code class="literal">\E</code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\E</code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Terminate <code class="literal">\L</code> or <code class="literal">\U</code>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\1</code>, <code class="literal">\2</code>, etc.
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
The corresponding sub-match
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="literal">\g<name></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
The named backref <span class="emphasis"><em>name</em></span>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
</tbody>
|
||
</table></div>
|
||
</div>
|
||
<br class="table-break"><h3>
|
||
<a name="boost_xpressive.user_s_guide.string_substitutions.h5"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.the_boost_specific_format_sequences"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.the_boost_specific_format_sequences">The
|
||
Boost-Specific Format Sequences</a>
|
||
</h3>
|
||
<p>
|
||
When specifying the <code class="computeroutput"><span class="identifier">format_all</span></code>
|
||
flag to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>,
|
||
the escape sequences recognized are the same as those above for <code class="computeroutput"><span class="identifier">format_perl</span></code>. In addition, conditional expressions
|
||
of the following form are recognized:
|
||
</p>
|
||
<pre class="programlisting">?Ntrue-expression:false-expression
</pre>
|
||
<p>
|
||
where <span class="emphasis"><em>N</em></span> is a decimal digit representing a sub-match.
|
||
If the corresponding sub-match participated in the full match, then the substitution
|
||
is <span class="emphasis"><em>true-expression</em></span>. Otherwise, it is <span class="emphasis"><em>false-expression</em></span>.
|
||
In this mode, you can use parens <code class="literal">()</code> for grouping. If you
|
||
want a literal paren, you must escape it as <code class="literal">\(</code>.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.string_substitutions.h6"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.formatter_objects"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.formatter_objects">Formatter
|
||
Objects</a>
|
||
</h3>
|
||
<p>
|
||
Format strings are not always expressive enough for all your text substitution
|
||
needs. Consider the simple example of wanting to map input strings to output
|
||
strings, as you may want to do with environment variables. Rather than a
|
||
format <span class="emphasis"><em>string</em></span>, for this you would use a formatter <span class="emphasis"><em>object</em></span>.
|
||
Consider the following code, which finds embedded environment variables of
|
||
the form <code class="computeroutput"><span class="string">"$(XYZ)"</span></code> and
|
||
computes the substitution string by looking up the environment variable in
|
||
a map.
|
||
</p>
|
||
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">map</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">string</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">xpressive</span><span class="special">;</span>

<span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">></span> <span class="identifier">env</span><span class="special">;</span>

<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="keyword">const</span> <span class="special">&</span><span class="identifier">format_fun</span><span class="special">(</span><span class="identifier">smatch</span> <span class="keyword">const</span> <span class="special">&</span><span class="identifier">what</span><span class="special">)</span>
<span class="special">{</span>
    <span class="keyword">return</span> <span class="identifier">env</span><span class="special">[</span><span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">].</span><span class="identifier">str</span><span class="special">()];</span>
<span class="special">}</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">env</span><span class="special">[</span><span class="string">"X"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"this"</span><span class="special">;</span>
    <span class="identifier">env</span><span class="special">[</span><span class="string">"Y"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"that"</span><span class="special">;</span>

    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"\"$(X)\" has the value \"$(Y)\""</span><span class="special">);</span>

    <span class="comment">// replace strings like "$(XYZ)" with the result of env["XYZ"]</span>
    <span class="identifier">sregex</span> <span class="identifier">envar</span> <span class="special">=</span> <span class="string">"$("</span> <span class="special">>></span> <span class="special">(</span><span class="identifier">s1</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="char">')'</span><span class="special">;</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">output</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">input</span><span class="special">,</span> <span class="identifier">envar</span><span class="special">,</span> <span class="identifier">format_fun</span><span class="special">);</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">output</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
      In this case, we use a function, <code class="computeroutput"><span class="identifier">format_fun</span><span class="special">()</span></code>, to compute the substitution string on the
      fly. It accepts a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
      object that contains the results of the current match. <code class="computeroutput"><span class="identifier">format_fun</span><span class="special">()</span></code> uses the first submatch as a key into the
      global <code class="computeroutput"><span class="identifier">env</span></code> map. The above
      code displays:
    </p>
<pre class="programlisting">"this" has the value "that"
</pre>
<p>
      The formatter need not be an ordinary function. It may be an object of class
      type. And rather than return a string, it may accept an output iterator into
      which it writes the substitution. Consider the following, which produces the
      same output as the example above.
    </p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">map</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">string</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">algorithm</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">;</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">struct</span> <span class="identifier">formatter</span>
<span class="special">{</span>
    <span class="keyword">typedef</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">></span> <span class="identifier">env_map</span><span class="special">;</span>
    <span class="identifier">env_map</span> <span class="identifier">env</span><span class="special">;</span>

    <span class="keyword">template</span><span class="special"><</span><span class="keyword">typename</span> <span class="identifier">Out</span><span class="special">></span>
    <span class="identifier">Out</span> <span class="keyword">operator</span><span class="special">()(</span><span class="identifier">smatch</span> <span class="keyword">const</span> <span class="special">&</span><span class="identifier">what</span><span class="special">,</span> <span class="identifier">Out</span> <span class="identifier">out</span><span class="special">)</span> <span class="keyword">const</span>
    <span class="special">{</span>
        <span class="identifier">env_map</span><span class="special">::</span><span class="identifier">const_iterator</span> <span class="identifier">where</span> <span class="special">=</span> <span class="identifier">env</span><span class="special">.</span><span class="identifier">find</span><span class="special">(</span><span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">]);</span>
        <span class="keyword">if</span><span class="special">(</span><span class="identifier">where</span> <span class="special">!=</span> <span class="identifier">env</span><span class="special">.</span><span class="identifier">end</span><span class="special">())</span>
        <span class="special">{</span>
            <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="keyword">const</span> <span class="special">&</span><span class="identifier">sub</span> <span class="special">=</span> <span class="identifier">where</span><span class="special">-></span><span class="identifier">second</span><span class="special">;</span>
            <span class="identifier">out</span> <span class="special">=</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span><span class="identifier">sub</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">sub</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">out</span><span class="special">);</span>
        <span class="special">}</span>
        <span class="keyword">return</span> <span class="identifier">out</span><span class="special">;</span>
    <span class="special">}</span>

<span class="special">};</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">formatter</span> <span class="identifier">fmt</span><span class="special">;</span>
    <span class="identifier">fmt</span><span class="special">.</span><span class="identifier">env</span><span class="special">[</span><span class="string">"X"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"this"</span><span class="special">;</span>
    <span class="identifier">fmt</span><span class="special">.</span><span class="identifier">env</span><span class="special">[</span><span class="string">"Y"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"that"</span><span class="special">;</span>

    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"\"$(X)\" has the value \"$(Y)\""</span><span class="special">);</span>

    <span class="identifier">sregex</span> <span class="identifier">envar</span> <span class="special">=</span> <span class="string">"$("</span> <span class="special">>></span> <span class="special">(</span><span class="identifier">s1</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="char">')'</span><span class="special">;</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">output</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">input</span><span class="special">,</span> <span class="identifier">envar</span><span class="special">,</span> <span class="identifier">fmt</span><span class="special">);</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">output</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
      The formatter must be a callable object (a function or a function object)
      that has one of three possible signatures, detailed in the table below.
      For the table, <code class="computeroutput"><span class="identifier">fmt</span></code> is a function
      pointer or function object, <code class="computeroutput"><span class="identifier">what</span></code>
      is a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
      object, <code class="computeroutput"><span class="identifier">out</span></code> is an OutputIterator,
      and <code class="computeroutput"><span class="identifier">flags</span></code> is a value of
      <code class="computeroutput"><span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">match_flag_type</span></code>:
    </p>
<div class="table">
<a name="boost_xpressive.user_s_guide.string_substitutions.t4"></a><p class="title"><b>Table 47.11. Formatter Signatures</b></p>
<div class="table-contents"><table class="table" summary="Formatter Signatures">
<colgroup>
<col>
<col>
<col>
</colgroup>
<thead><tr>
<th>
              <p>
                Formatter Invocation
              </p>
            </th>
<th>
              <p>
                Return Type
              </p>
            </th>
<th>
              <p>
                Semantics
              </p>
            </th>
</tr></thead>
<tbody>
<tr>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">fmt</span><span class="special">(</span><span class="identifier">what</span><span class="special">)</span></code>
              </p>
            </td>
<td>
              <p>
                Range of characters (e.g. <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>)
                or null-terminated string
              </p>
            </td>
<td>
              <p>
                The string matched by the regex is replaced with the string returned
                by the formatter.
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">fmt</span><span class="special">(</span><span class="identifier">what</span><span class="special">,</span>
                <span class="identifier">out</span><span class="special">)</span></code>
              </p>
            </td>
<td>
              <p>
                OutputIterator
              </p>
            </td>
<td>
              <p>
                The formatter writes the replacement string into <code class="computeroutput"><span class="identifier">out</span></code> and returns <code class="computeroutput"><span class="identifier">out</span></code>.
              </p>
            </td>
</tr>
<tr>
<td>
              <p>
                <code class="computeroutput"><span class="identifier">fmt</span><span class="special">(</span><span class="identifier">what</span><span class="special">,</span>
                <span class="identifier">out</span><span class="special">,</span>
                <span class="identifier">flags</span><span class="special">)</span></code>
              </p>
            </td>
<td>
              <p>
                OutputIterator
              </p>
            </td>
<td>
              <p>
                The formatter writes the replacement string into <code class="computeroutput"><span class="identifier">out</span></code> and returns <code class="computeroutput"><span class="identifier">out</span></code>. The <code class="computeroutput"><span class="identifier">flags</span></code>
                parameter is the value of the match flags passed to the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
                algorithm.
              </p>
            </td>
</tr>
</tbody>
</table></div>
</div>
<br class="table-break"><h3>
<a name="boost_xpressive.user_s_guide.string_substitutions.h7"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.string_substitutions.formatter_expressions"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_substitutions.formatter_expressions">Formatter
      Expressions</a>
    </h3>
<p>
      In addition to format <span class="emphasis"><em>strings</em></span> and formatter <span class="emphasis"><em>objects</em></span>,
      <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
      also accepts formatter <span class="emphasis"><em>expressions</em></span>. A formatter expression
      is a lambda expression that generates a string. It uses the same syntax as
      that for <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions" title="Semantic Actions and User-Defined Assertions">Semantic
      Actions</a>, which are covered later. The above example, which uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>
      to substitute strings for environment variables, is repeated here using a
      formatter expression.
    </p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">map</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">string</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">></span> <span class="identifier">env</span><span class="special">;</span>
    <span class="identifier">env</span><span class="special">[</span><span class="string">"X"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"this"</span><span class="special">;</span>
    <span class="identifier">env</span><span class="special">[</span><span class="string">"Y"</span><span class="special">]</span> <span class="special">=</span> <span class="string">"that"</span><span class="special">;</span>

    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"\"$(X)\" has the value \"$(Y)\""</span><span class="special">);</span>

    <span class="identifier">sregex</span> <span class="identifier">envar</span> <span class="special">=</span> <span class="string">"$("</span> <span class="special">>></span> <span class="special">(</span><span class="identifier">s1</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="char">')'</span><span class="special">;</span>
    <span class="comment">// ref() is qualified so that the call cannot be confused with std::ref()</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">output</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">input</span><span class="special">,</span> <span class="identifier">envar</span><span class="special">,</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">env</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]);</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">output</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
      In the above, the formatter expression is <code class="computeroutput"><span class="identifier">ref</span><span class="special">(</span><span class="identifier">env</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span></code>. This
      means to use the value of the first submatch, <code class="computeroutput"><span class="identifier">s1</span></code>,
      as a key into the <code class="computeroutput"><span class="identifier">env</span></code> map.
      The purpose of <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code>
      here is to make the reference to the <code class="computeroutput"><span class="identifier">env</span></code>
      local variable <span class="emphasis"><em>lazy</em></span> so that the index operation is deferred
      until we know what to replace <code class="computeroutput"><span class="identifier">s1</span></code>
      with.
    </p>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization" title="String Splitting and Tokenization">String
    Splitting and Tokenization</a>
</h3></div></div></div>
<p>
      <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      is the Ginsu knife of the text manipulation world. It slices! It dices! This
      section describes how to use the highly configurable <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      to chop up input sequences.
    </p>
<h3>
<a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.h0"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.overview">Overview</a>
    </h3>
<p>
      You initialize a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      with an input sequence, a regex, and some optional configuration parameters.
      The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
      to find the first place in the sequence that the regex matches. When dereferenced,
      the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      returns a <span class="emphasis"><em>token</em></span> in the form of a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special"><></span></code>. Which string it returns depends
      on the configuration parameters. By default it returns a string corresponding
      to the full match, but it could also return a string corresponding to a particular
      marked sub-expression, or even the part of the sequence that <span class="emphasis"><em>didn't</em></span>
      match. When you increment the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>,
      it moves to the next token. Which token is next also depends on the configuration
      parameters. It could simply be a different marked sub-expression in the current
      match, or it could be part or all of the next match. Or it could be the part
      that <span class="emphasis"><em>didn't</em></span> match.
    </p>
<p>
      As you can see, <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      can do a lot. That makes it hard to describe, but some examples should make
      it clear.
    </p>
<h3>
<a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.h1"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_1__simple_tokenization"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_1__simple_tokenization">Example
      1: Simple Tokenization</a>
    </h3>
<p>
      This example uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      to chop a sequence into a series of tokens consisting of words.
    </p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"This is his face"</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">;</span> <span class="comment">// find a word</span>

<span class="comment">// iterate over all the words in the input</span>
<span class="identifier">sregex_token_iterator</span> <span class="identifier">begin</span><span class="special">(</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span> <span class="special">),</span> <span class="identifier">end</span><span class="special">;</span>

<span class="comment">// write all the words to std::cout</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special"><</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="special">></span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span><span class="special">,</span> <span class="string">"\n"</span> <span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">out_iter</span> <span class="special">);</span>
</pre>
<p>
      This program displays the following:
    </p>
<pre class="programlisting">This
is
his
face
</pre>
<h3>
<a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.h2"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_2__simple_tokenization__reloaded"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_2__simple_tokenization__reloaded">Example
      2: Simple Tokenization, Reloaded</a>
    </h3>
<p>
      This example also uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      to chop a sequence into a series of tokens consisting of words, but it uses
      the regex as a delimiter. When we pass a <code class="computeroutput"><span class="special">-</span><span class="number">1</span></code> as the last parameter to the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      constructor, it instructs the token iterator to consider as tokens those
      parts of the input that <span class="emphasis"><em>didn't</em></span> match the regex.
    </p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"This is his face"</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_s</span><span class="special">;</span> <span class="comment">// find white space</span>

<span class="comment">// iterate over all non-white space in the input. Note the -1 below:</span>
<span class="identifier">sregex_token_iterator</span> <span class="identifier">begin</span><span class="special">(</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span><span class="special">,</span> <span class="special">-</span><span class="number">1</span> <span class="special">),</span> <span class="identifier">end</span><span class="special">;</span>

<span class="comment">// write all the words to std::cout</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special"><</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="special">></span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span><span class="special">,</span> <span class="string">"\n"</span> <span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">out_iter</span> <span class="special">);</span>
</pre>
<p>
      This program displays the following:
    </p>
<pre class="programlisting">This
is
his
face
</pre>
<h3>
<a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.h3"></a>
      <span class="phrase"><a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_3__simple_tokenization__revolutions"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_3__simple_tokenization__revolutions">Example
      3: Simple Tokenization, Revolutions</a>
    </h3>
<p>
      This example also uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      to chop a sequence containing several dates into a series of tokens consisting
      of just the years. When we pass a positive integer <code class="literal"><span class="emphasis"><em>N</em></span></code>
      as the last parameter to the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
      constructor, it instructs the token iterator to consider as tokens only the
      <code class="literal"><span class="emphasis"><em>N</em></span></code>-th marked sub-expression of each
      match.
    </p>
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"01/02/2003 blahblah 04/23/1999 blahblah 11/13/1981"</span><span class="special">);</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(\\d{2})/(\\d{2})/(\\d{4})"</span><span class="special">);</span> <span class="comment">// find a date</span>
|
||
|
||
<span class="comment">// iterate over all the years in the input. Note the 3 below, corresponding to the 3rd sub-expression:</span>
|
||
<span class="identifier">sregex_token_iterator</span> <span class="identifier">begin</span><span class="special">(</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span><span class="special">,</span> <span class="number">3</span> <span class="special">),</span> <span class="identifier">end</span><span class="special">;</span>
|
||
|
||
<span class="comment">// write all the years to std::cout</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special"><</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="special">></span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span><span class="special">,</span> <span class="string">"\n"</span> <span class="special">);</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">out_iter</span> <span class="special">);</span>
|
||
</pre>
|
||
<p>
|
||
This program displays the following:
|
||
</p>
|
||
<pre class="programlisting">2003
|
||
1999
|
||
1981
|
||
</pre>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.h4"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_4__not_so_simple_tokenization"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.string_splitting_and_tokenization.example_4__not_so_simple_tokenization">Example
|
||
4: Not-So-Simple Tokenization</a>
|
||
</h3>
|
||
<p>
|
||
This example is like the previous one, except that instead of tokenizing
|
||
just the years, this program turns the days, months and years into tokens.
|
||
When we pass an array of integers <code class="literal"><span class="emphasis"><em>{I,J,...}</em></span></code>
|
||
as the last parameter to the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
|
||
constructor, it instructs the token iterator to consider as tokens the <code class="literal"><span class="emphasis"><em>I</em></span></code>-th,
|
||
<code class="literal"><span class="emphasis"><em>J</em></span></code>-th, etc. marked sub-expression
|
||
of each match.
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">input</span><span class="special">(</span><span class="string">"01/02/2003 blahblah 04/23/1999 blahblah 11/13/1981"</span><span class="special">);</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(\\d{2})/(\\d{2})/(\\d{4})"</span><span class="special">);</span> <span class="comment">// find a date</span>
|
||
|
||
<span class="comment">// iterate over the days, months and years in the input</span>
|
||
<span class="keyword">int</span> <span class="keyword">const</span> <span class="identifier">sub_matches</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span> <span class="number">2</span><span class="special">,</span> <span class="number">1</span><span class="special">,</span> <span class="number">3</span> <span class="special">};</span> <span class="comment">// day, month, year</span>
|
||
<span class="identifier">sregex_token_iterator</span> <span class="identifier">begin</span><span class="special">(</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">input</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">re</span><span class="special">,</span> <span class="identifier">sub_matches</span> <span class="special">),</span> <span class="identifier">end</span><span class="special">;</span>
|
||
|
||
<span class="comment">// write all the tokens (days, months and years) to std::cout</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special"><</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="special">></span> <span class="identifier">out_iter</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span><span class="special">,</span> <span class="string">"\n"</span> <span class="special">);</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">copy</span><span class="special">(</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">out_iter</span> <span class="special">);</span>
|
||
</pre>
|
||
<p>
|
||
This program displays the following:
|
||
</p>
|
||
<pre class="programlisting">02
|
||
01
|
||
2003
|
||
23
|
||
04
|
||
1999
|
||
13
|
||
11
|
||
1981
|
||
</pre>
|
||
<p>
|
||
The <code class="computeroutput"><span class="identifier">sub_matches</span></code> array instructs
|
||
the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
|
||
to first take the value of the 2nd sub-match, then the 1st sub-match, and
|
||
finally the 3rd. Incrementing the iterator again instructs it to use <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
|
||
again to find the next match. At that point, the process repeats -- the token
|
||
iterator takes the value of the 2nd sub-match, then the 1st, et cetera.
|
||
</p>
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="boost_xpressive.user_s_guide.named_captures"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures" title="Named Captures">Named Captures</a>
|
||
</h3></div></div></div>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.named_captures.h0"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.named_captures.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures.overview">Overview</a>
|
||
</h3>
|
||
<p>
|
||
For complicated regular expressions, dealing with numbered captures can be
|
||
a pain. Counting left parentheses to figure out which capture to reference
|
||
is no fun. Less fun is the fact that merely editing a regular expression
|
||
could cause a capture to be assigned a new number, invalidating code that refers
|
||
back to it by the old number.
|
||
</p>
|
||
<p>
|
||
Other regular expression engines solve this problem with a feature called
|
||
<span class="emphasis"><em>named captures</em></span>. This feature allows you to assign a
|
||
name to a capture, and to refer back to the capture by name rather than by number.
|
||
Xpressive also supports named captures, both in dynamic and in static regexes.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.named_captures.h1"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.named_captures.dynamic_named_captures"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures.dynamic_named_captures">Dynamic
|
||
Named Captures</a>
|
||
</h3>
|
||
<p>
|
||
For dynamic regular expressions, xpressive follows the lead of other popular
|
||
regex engines with the syntax of named captures. You can create a named capture
|
||
with <code class="computeroutput"><span class="string">"(?P<xxx>...)"</span></code>
|
||
and refer back to that capture with <code class="computeroutput"><span class="string">"(?P=xxx)"</span></code>.
|
||
Here, for instance, is a regular expression that creates a named capture
|
||
and refers back to it:
|
||
</p>
|
||
<pre class="programlisting"><span class="comment">// Create a named capture called "char" that matches a single</span>
|
||
<span class="comment">// character and refer back to that capture by name.</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(?P<char>.)(?P=char)"</span><span class="special">);</span>
|
||
</pre>
|
||
<p>
|
||
The effect of the above regular expression is to find the first doubled character.
|
||
</p>
|
||
<p>
|
||
Once you have executed a match or search operation using a regex with named
|
||
captures, you can access the named capture through the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
object using the capture's name.
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"tweet"</span><span class="special">);</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(?P<char>.)(?P=char)"</span><span class="special">);</span>
|
||
<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
|
||
<span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">))</span>
|
||
<span class="special">{</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="string">"char = "</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="string">"char"</span><span class="special">]</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
|
||
<span class="special">}</span>
|
||
</pre>
|
||
<p>
|
||
The above code displays:
|
||
</p>
|
||
<pre class="programlisting">char = e
|
||
</pre>
|
||
<p>
|
||
You can also refer back to a named capture from within a substitution string.
|
||
The syntax for that is <code class="computeroutput"><span class="string">"\\g<xxx>"</span></code>.
|
||
Below is some code that demonstrates how to use named captures when doing
|
||
string substitution.
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"tweet"</span><span class="special">);</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(?P<char>.)(?P=char)"</span><span class="special">);</span>
|
||
<span class="identifier">str</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">,</span> <span class="string">"**\\g<char>**"</span><span class="special">,</span> <span class="identifier">regex_constants</span><span class="special">::</span><span class="identifier">format_perl</span><span class="special">);</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">str</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
|
||
</pre>
|
||
<p>
|
||
Notice that you have to specify <code class="computeroutput"><span class="identifier">format_perl</span></code>
|
||
when using named captures in a substitution string; only the Perl format syntax recognizes <code class="computeroutput"><span class="string">"\\g<xxx>"</span></code>. The above
|
||
code displays:
|
||
</p>
|
||
<pre class="programlisting">tw**e**t
|
||
</pre>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.named_captures.h2"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.named_captures.static_named_captures"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.named_captures.static_named_captures">Static
|
||
Named Captures</a>
|
||
</h3>
|
||
<p>
|
||
If you're using static regular expressions, creating and using named captures
|
||
is even easier. You can use the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/mark_tag.html" title="Struct mark_tag">mark_tag</a></code></code>
|
||
type to create a variable that you can use like <code class="computeroutput"><a class="link" href="../boost/xpressive/s1.html" title="Global s1">s1</a></code>, <code class="computeroutput"><a class="link" href="../boost/xpressive/s1.html" title="Global s1">s2</a></code> and friends, but with a
|
||
name that is more meaningful. Below is how the above example would look using
|
||
static regexes:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">mark_tag</span> <span class="identifier">char_</span><span class="special">(</span><span class="number">1</span><span class="special">);</span> <span class="comment">// char_ is now a synonym for s1</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">char_</span><span class="special">=</span> <span class="identifier">_</span><span class="special">)</span> <span class="special">>></span> <span class="identifier">char_</span><span class="special">;</span>
|
||
</pre>
|
||
<p>
|
||
After a match operation, you can use the <code class="computeroutput"><span class="identifier">mark_tag</span></code>
|
||
to index into the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
to access the named capture:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"tweet"</span><span class="special">);</span>
|
||
<span class="identifier">mark_tag</span> <span class="identifier">char_</span><span class="special">(</span><span class="number">1</span><span class="special">);</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">char_</span><span class="special">=</span> <span class="identifier">_</span><span class="special">)</span> <span class="special">>></span> <span class="identifier">char_</span><span class="special">;</span>
|
||
<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
|
||
<span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">))</span>
|
||
<span class="special">{</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">char_</span><span class="special">]</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
|
||
<span class="special">}</span>
|
||
</pre>
|
||
<p>
|
||
The above code displays:
|
||
</p>
|
||
<pre class="programlisting">e
|
||
</pre>
|
||
<p>
|
||
When doing string substitutions with <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_replace.html" title="Function regex_replace">regex_replace()</a></code></code>,
|
||
you can use named captures to create <span class="emphasis"><em>format expressions</em></span>
|
||
as below:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"tweet"</span><span class="special">);</span>
|
||
<span class="identifier">mark_tag</span> <span class="identifier">char_</span><span class="special">(</span><span class="number">1</span><span class="special">);</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">char_</span><span class="special">=</span> <span class="identifier">_</span><span class="special">)</span> <span class="special">>></span> <span class="identifier">char_</span><span class="special">;</span>
|
||
<span class="identifier">str</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">,</span> <span class="string">"**"</span> <span class="special">+</span> <span class="identifier">char_</span> <span class="special">+</span> <span class="string">"**"</span><span class="special">);</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">str</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
|
||
</pre>
|
||
<p>
|
||
The above code displays:
|
||
</p>
|
||
<pre class="programlisting">tw**e**t
|
||
</pre>
|
||
<div class="note"><table border="0" summary="Note">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
|
||
<th align="left">Note</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
You need to include <code class="literal"><boost/xpressive/regex_actions.hpp></code>
|
||
to use format expressions.
|
||
</p></td></tr>
|
||
</table></div>
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches" title="Grammars and Nested Matches">Grammars
|
||
and Nested Matches</a>
|
||
</h3></div></div></div>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h0"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.overview">Overview</a>
|
||
</h3>
|
||
<p>
|
||
One of the key benefits of representing regexes as C++ expressions is the
|
||
ability to easily refer to other C++ code and data from within the regex.
|
||
This enables programming idioms that are not possible with other regular
|
||
expression libraries. Of particular note is the ability for one regex to
|
||
refer to another regex, allowing you to build grammars out of regular expressions.
|
||
This section describes how to embed one regex in another by value and by
|
||
reference, how regex objects behave when they refer to other regexes, and
|
||
how to access the tree of results after a successful parse.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h1"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_value"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_value">Embedding
|
||
a Regex by Value</a>
|
||
</h3>
|
||
<p>
|
||
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
|
||
object has value semantics. When a regex object appears on the right-hand
|
||
side in the definition of another regex, it is as if the regex were embedded
|
||
by value; that is, a copy of the nested regex is stored by the enclosing
|
||
regex. The inner regex is invoked by the outer regex during pattern matching.
|
||
The inner regex participates fully in the match, back-tracking as needed
|
||
to make the match succeed.
|
||
</p>
|
||
<p>
|
||
Consider a text editor that has a regex-find feature with a whole-word option.
|
||
You can implement this with xpressive as follows:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">find_dialog</span> <span class="identifier">dlg</span><span class="special">;</span>
|
||
<span class="keyword">if</span><span class="special">(</span> <span class="identifier">dialog_ok</span> <span class="special">==</span> <span class="identifier">dlg</span><span class="special">.</span><span class="identifier">do_modal</span><span class="special">()</span> <span class="special">)</span>
|
||
<span class="special">{</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">pattern</span> <span class="special">=</span> <span class="identifier">dlg</span><span class="special">.</span><span class="identifier">get_text</span><span class="special">();</span> <span class="comment">// the pattern the user entered</span>
|
||
<span class="keyword">bool</span> <span class="identifier">whole_word</span> <span class="special">=</span> <span class="identifier">dlg</span><span class="special">.</span><span class="identifier">whole_word</span><span class="special">.</span><span class="identifier">is_checked</span><span class="special">();</span> <span class="comment">// did the user select the whole-word option?</span>
|
||
|
||
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="identifier">pattern</span> <span class="special">);</span> <span class="comment">// try to compile the pattern</span>
|
||
|
||
<span class="keyword">if</span><span class="special">(</span> <span class="identifier">whole_word</span> <span class="special">)</span>
|
||
<span class="special">{</span>
|
||
<span class="comment">// wrap the regex in begin-word / end-word assertions</span>
|
||
<span class="identifier">re</span> <span class="special">=</span> <span class="identifier">bow</span> <span class="special">>></span> <span class="identifier">re</span> <span class="special">>></span> <span class="identifier">eow</span><span class="special">;</span>
|
||
<span class="special">}</span>
|
||
|
||
<span class="comment">// ... use re ...</span>
|
||
<span class="special">}</span>
|
||
</pre>
|
||
<p>
|
||
Look closely at this line:
|
||
</p>
|
||
<pre class="programlisting"><span class="comment">// wrap the regex in begin-word / end-word assertions</span>
|
||
<span class="identifier">re</span> <span class="special">=</span> <span class="identifier">bow</span> <span class="special">>></span> <span class="identifier">re</span> <span class="special">>></span> <span class="identifier">eow</span><span class="special">;</span>
|
||
</pre>
|
||
<p>
|
||
This line creates a new regex that embeds the old regex by value. Then, the
|
||
new regex is assigned back to the original regex. Since a copy of the old
|
||
regex was made on the right-hand side, this works as you might expect: the
|
||
new regex has the behavior of the old regex wrapped in begin- and end-word
|
||
assertions.
|
||
</p>
|
||
<div class="note"><table border="0" summary="Note">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
|
||
<th align="left">Note</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
Note that <code class="computeroutput"><span class="identifier">re</span> <span class="special">=</span>
|
||
<span class="identifier">bow</span> <span class="special">>></span>
|
||
<span class="identifier">re</span> <span class="special">>></span>
|
||
<span class="identifier">eow</span></code> does <span class="emphasis"><em>not</em></span>
|
||
define a recursive regular expression, since regex objects embed by value
|
||
by default. The next section shows how to define a recursive regular expression
|
||
by embedding a regex by reference.
|
||
</p></td></tr>
|
||
</table></div>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h2"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_reference"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.embedding_a_regex_by_reference">Embedding
|
||
a Regex by Reference</a>
|
||
</h3>
|
||
<p>
|
||
If you want to be able to build recursive regular expressions and context-free
|
||
grammars, embedding a regex by value is not enough. You need to be able to
|
||
make your regular expressions self-referential. Most regular expression engines
|
||
don't give you that power, but xpressive does.
|
||
</p>
|
||
<div class="tip"><table border="0" summary="Tip">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td>
|
||
<th align="left">Tip</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
The theoretical computer scientists out there will correctly point out
|
||
that a self-referential regular expression is not "regular",
|
||
so in the strict sense, xpressive isn't really a <span class="emphasis"><em>regular</em></span>
|
||
expression engine at all. But as Larry Wall once said, "the term [regular expression] has
|
||
grown with the capabilities of our pattern matching engines, so I'm not
|
||
going to try to fight linguistic necessity here."
|
||
</p></td></tr>
|
||
</table></div>
|
||
<p>
|
||
Consider the following code, which uses the <code class="computeroutput"><span class="identifier">by_ref</span><span class="special">()</span></code> helper to define a recursive regular expression
|
||
that matches balanced, nested parentheses:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">parentheses</span><span class="special">;</span>
|
||
<span class="identifier">parentheses</span> <span class="comment">// A balanced set of parentheses ...</span>
|
||
<span class="special">=</span> <span class="char">'('</span> <span class="comment">// is an opening parenthesis ...</span>
|
||
<span class="special">>></span> <span class="comment">// followed by ...</span>
|
||
<span class="special">*(</span> <span class="comment">// zero or more ...</span>
|
||
<span class="identifier">keep</span><span class="special">(</span> <span class="special">+~(</span><span class="identifier">set</span><span class="special">=</span><span class="char">'('</span><span class="special">,</span><span class="char">')'</span><span class="special">)</span> <span class="special">)</span> <span class="comment">// of a bunch of things that are not parentheses ...</span>
|
||
<span class="special">|</span> <span class="comment">// or ...</span>
|
||
<span class="identifier">by_ref</span><span class="special">(</span><span class="identifier">parentheses</span><span class="special">)</span> <span class="comment">// a balanced set of parentheses</span>
|
||
<span class="special">)</span> <span class="comment">// (ooh, recursion!) ...</span>
|
||
<span class="special">>></span> <span class="comment">// followed by ...</span>
|
||
<span class="char">')'</span> <span class="comment">// a closing parenthesis</span>
|
||
<span class="special">;</span>
|
||
</pre>
|
||
<p>
|
||
Matching balanced, nested tags is an important text processing task, and
|
||
it is one that "classic" regular expressions cannot do. The <code class="computeroutput"><span class="identifier">by_ref</span><span class="special">()</span></code>
|
||
helper makes it possible. It allows one regex object to be embedded in another
|
||
<span class="emphasis"><em>by reference</em></span>. Since the right-hand side holds <code class="computeroutput"><span class="identifier">parentheses</span></code> by reference, assigning the
|
||
right-hand side back to <code class="computeroutput"><span class="identifier">parentheses</span></code>
|
||
creates a cycle, which will execute recursively.
|
||
</p>
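<p>
To see the recursive pattern in action, here is a minimal sketch that wraps
it in a function and searches a string with it. The helper name
<code class="computeroutput">find_balanced</code> is hypothetical and is
not part of xpressive; only the pattern itself comes from the text above.
</p>
<pre class="programlisting">#include &lt;string&gt;
#include &lt;boost/xpressive/xpressive.hpp&gt;

// Hypothetical helper, not part of xpressive: returns the first balanced,
// parenthesized group found in the input, or an empty string if there is none.
std::string find_balanced(std::string const &amp;input)
{
    using namespace boost::xpressive;

    // Same pattern as above: the regex refers to itself through by_ref().
    sregex parentheses;
    parentheses = '(' &gt;&gt; *( keep( +~(set= '(', ')') ) | by_ref(parentheses) ) &gt;&gt; ')';

    smatch what;
    if(regex_search(input, what, parentheses))
    {
        return what[0].str();
    }
    return std::string();
}
</pre>
<p>
Under these assumptions, <code class="computeroutput">find_balanced("f(a(b)c)g")</code>
should yield <code class="computeroutput">"(a(b)c)"</code>, and an input with
no parentheses should yield an empty string.
</p>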
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h3"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.building_a_grammar"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.building_a_grammar">Building
|
||
a Grammar</a>
|
||
</h3>
|
||
<p>
|
||
Once we allow self-reference in our regular expressions, the genie is out
|
||
of the bottle and all manner of fun things are possible. In particular, we
|
||
can now build grammars out of regular expressions. Let's have a look at the
|
||
textbook grammar example: the humble calculator.
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">group</span><span class="special">,</span> <span class="identifier">factor</span><span class="special">,</span> <span class="identifier">term</span><span class="special">,</span> <span class="identifier">expression</span><span class="special">;</span>
|
||
|
||
<span class="identifier">group</span> <span class="special">=</span> <span class="char">'('</span> <span class="special">>></span> <span class="identifier">by_ref</span><span class="special">(</span><span class="identifier">expression</span><span class="special">)</span> <span class="special">>></span> <span class="char">')'</span><span class="special">;</span>
|
||
<span class="identifier">factor</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span> <span class="special">|</span> <span class="identifier">group</span><span class="special">;</span>
|
||
<span class="identifier">term</span> <span class="special">=</span> <span class="identifier">factor</span> <span class="special">>></span> <span class="special">*((</span><span class="char">'*'</span> <span class="special">>></span> <span class="identifier">factor</span><span class="special">)</span> <span class="special">|</span> <span class="special">(</span><span class="char">'/'</span> <span class="special">>></span> <span class="identifier">factor</span><span class="special">));</span>
|
||
<span class="identifier">expression</span> <span class="special">=</span> <span class="identifier">term</span> <span class="special">>></span> <span class="special">*((</span><span class="char">'+'</span> <span class="special">>></span> <span class="identifier">term</span><span class="special">)</span> <span class="special">|</span> <span class="special">(</span><span class="char">'-'</span> <span class="special">>></span> <span class="identifier">term</span><span class="special">));</span>
|
||
</pre>
|
||
<p>
|
||
The regex <code class="computeroutput"><span class="identifier">expression</span></code> defined
|
||
above does something rather remarkable for a regular expression: it matches
|
||
mathematical expressions. For example, if the input string were <code class="computeroutput"><span class="string">"foo 9*(10+3) bar"</span></code>, this pattern
|
||
would match <code class="computeroutput"><span class="string">"9*(10+3)"</span></code>.
|
||
It only matches well-formed mathematical expressions, where the parentheses
|
||
are balanced and the infix operators have two arguments each. Don't try this
|
||
with just any regular expression engine!
|
||
</p>
|
||
<p>
|
||
Let's take a closer look at this regular expression grammar. Notice that
|
||
it is cyclic: <code class="computeroutput"><span class="identifier">expression</span></code>
|
||
is implemented in terms of <code class="computeroutput"><span class="identifier">term</span></code>,
|
||
which is implemented in terms of <code class="computeroutput"><span class="identifier">factor</span></code>,
|
||
which is implemented in terms of <code class="computeroutput"><span class="identifier">group</span></code>,
|
||
which is implemented in terms of <code class="computeroutput"><span class="identifier">expression</span></code>,
|
||
closing the loop. In general, the way to define a cyclic grammar is to forward-declare
|
||
the regex objects and embed by reference those regular expressions that have
|
||
not yet been initialized. In the above grammar, there is only one place where
|
||
we need to reference a regex object that has not yet been initialized: the
|
||
definition of <code class="computeroutput"><span class="identifier">group</span></code>. In that
|
||
place, we use <code class="computeroutput"><span class="identifier">by_ref</span><span class="special">()</span></code>
|
||
to embed <code class="computeroutput"><span class="identifier">expression</span></code> by reference.
|
||
In all other places, it is sufficient to embed the other regex objects by
|
||
value, since they have already been initialized and their values will not
|
||
change.
|
||
</p>
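<p>
The forward-declare-then-embed recipe can be sketched as a small,
self-contained function. The helper name
<code class="computeroutput">first_expression</code> is an illustrative
assumption; the four rules are the ones defined above.
</p>
<pre class="programlisting">#include &lt;string&gt;
#include &lt;boost/xpressive/xpressive.hpp&gt;

// Hypothetical helper: returns the first well-formed arithmetic expression
// found in the input, or an empty string if there is none.
std::string first_expression(std::string const &amp;input)
{
    using namespace boost::xpressive;

    // Forward-declare every rule so that the cycle can be closed later.
    sregex group, factor, term, expression;

    // expression is not initialized yet, so it is embedded by reference.
    group      = '(' &gt;&gt; by_ref(expression) &gt;&gt; ')';
    // Every regex used below is already initialized, so by value is enough.
    factor     = +_d | group;
    term       = factor &gt;&gt; *(('*' &gt;&gt; factor) | ('/' &gt;&gt; factor));
    expression = term &gt;&gt; *(('+' &gt;&gt; term) | ('-' &gt;&gt; term));

    smatch what;
    return regex_search(input, what, expression) ? what[0].str() : std::string();
}
</pre>
<p>
Called with <code class="computeroutput">"foo 9*(10+3) bar"</code>, this
sketch should return <code class="computeroutput">"9*(10+3)"</code>, which
is the match described earlier.
</p>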
|
||
<div class="tip"><table border="0" summary="Tip">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td>
|
||
<th align="left">Tip</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
<span class="bold"><strong>Embed by value if possible</strong></span> <br> <br>
|
||
In general, prefer embedding regular expressions by value rather than by
|
||
reference. It involves one less indirection, making your patterns match
|
||
a little faster. Besides, value semantics are simpler and will make your
|
||
grammars easier to reason about. Don't worry about the expense of "copying"
|
||
a regex. Each regex object shares its implementation with all of its copies.
|
||
</p></td></tr>
|
||
</table></div>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h4"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.dynamic_regex_grammars"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.dynamic_regex_grammars">Dynamic
|
||
Regex Grammars</a>
|
||
</h3>
|
||
<p>
|
||
Using <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>,
|
||
you can also build grammars out of dynamic regular expressions. You do that
|
||
by creating named regexes, and referring to other regexes by name. Each
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>
|
||
instance keeps a mapping from names to regexes that have been created with
|
||
it.
|
||
</p>
|
||
<p>
|
||
You can create a named dynamic regex by prefacing your regex with <code class="computeroutput"><span class="string">"(?$name=)"</span></code>, where <span class="emphasis"><em>name</em></span>
|
||
is the name of the regex. You can refer to a named regex from another regex
|
||
with <code class="computeroutput"><span class="string">"(?$name)"</span></code>. The
|
||
named regex does not need to exist yet at the time it is referenced in another
|
||
regex, but it must exist by the time you use the regex.
|
||
</p>
|
||
<p>
|
||
Below is a code fragment that uses dynamic regex grammars to implement the
|
||
calculator example from above.
|
||
</p>
|
||
<pre class="programlisting"><span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
|
||
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">regex_constants</span><span class="special">;</span>
|
||
|
||
<span class="identifier">sregex</span> <span class="identifier">expr</span><span class="special">;</span>
|
||
|
||
<span class="special">{</span>
|
||
<span class="identifier">sregex_compiler</span> <span class="identifier">compiler</span><span class="special">;</span>
|
||
<span class="identifier">syntax_option_type</span> <span class="identifier">x</span> <span class="special">=</span> <span class="identifier">ignore_white_space</span><span class="special">;</span>
|
||
|
||
<span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(? $group = ) \\( (? $expr ) \\) "</span><span class="special">,</span> <span class="identifier">x</span><span class="special">);</span>
|
||
<span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(? $factor = ) \\d+ | (? $group ) "</span><span class="special">,</span> <span class="identifier">x</span><span class="special">);</span>
|
||
<span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(? $term = ) (? $factor )"</span>
|
||
<span class="string">" ( \\* (? $factor ) | / (? $factor ) )* "</span><span class="special">,</span> <span class="identifier">x</span><span class="special">);</span>
|
||
<span class="identifier">expr</span> <span class="special">=</span> <span class="identifier">compiler</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span><span class="string">"(? $expr = ) (? $term )"</span>
|
||
<span class="string">" ( \\+ (? $term ) | - (? $term ) )* "</span><span class="special">,</span> <span class="identifier">x</span><span class="special">);</span>
|
||
<span class="special">}</span>
|
||
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"foo 9*(10+3) bar"</span><span class="special">);</span>
|
||
<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
|
||
|
||
<span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">expr</span><span class="special">))</span>
|
||
<span class="special">{</span>
|
||
<span class="comment">// This prints "9*(10+3)":</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
|
||
<span class="special">}</span>
|
||
</pre>
|
||
<p>
|
||
As with static regex grammars, nested regex invocations create nested match
|
||
results (see <span class="emphasis"><em>Nested Results</em></span> below). The result is a
|
||
complete parse tree for the string that matched. Unlike static regexes, dynamic
|
||
regexes are always embedded by reference, not by value.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h5"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.cyclic_patterns__copying_and_memory_management__oh_my_"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.cyclic_patterns__copying_and_memory_management__oh_my_">Cyclic
|
||
Patterns, Copying and Memory Management, Oh My!</a>
|
||
</h3>
|
||
<p>
|
||
The calculator examples above raise a number of complicated memory-management
|
||
issues. The four regex objects refer to each other, some directly
|
||
and some indirectly, some by value and some by reference. What if we were
|
||
to return one of them from a function and let the others go out of scope?
|
||
What becomes of the references? The answer is that the regex objects are
|
||
internally reference counted, such that they keep their referenced regex
|
||
objects alive as long as they need them. So passing a regex object by value
|
||
is never a problem, even if it refers to other regex objects that have gone
|
||
out of scope.
|
||
</p>
|
||
<p>
|
||
Those of you who have dealt with reference counting are probably familiar
|
||
with its Achilles' heel: cyclic references. If regex objects are reference
|
||
counted, what happens to cycles like the one created in the calculator examples?
|
||
Are they leaked? The answer is no, they are not leaked. The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
|
||
object has some tricky reference tracking code that ensures that even cyclic
|
||
regex grammars are cleaned up when the last external reference goes away.
|
||
So don't worry about it. Create cyclic grammars, pass your regex objects
|
||
around and copy them all you want. It is fast and efficient and guaranteed
|
||
not to leak or result in dangling references.
|
||
</p>
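<p>
As an illustration of that guarantee, the sketch below builds the static
calculator grammar inside a function and returns only the top-level regex,
letting the other three go out of scope. The names
<code class="computeroutput">make_calculator</code> and
<code class="computeroutput">is_expression</code> are hypothetical helpers
introduced for this example.
</p>
<pre class="programlisting">#include &lt;string&gt;
#include &lt;boost/xpressive/xpressive.hpp&gt;

// Hypothetical factory: builds the cyclic grammar and returns one regex by value.
boost::xpressive::sregex make_calculator()
{
    using namespace boost::xpressive;

    sregex group, factor, term, expression;
    group      = '(' &gt;&gt; by_ref(expression) &gt;&gt; ')';
    factor     = +_d | group;
    term       = factor &gt;&gt; *(('*' &gt;&gt; factor) | ('/' &gt;&gt; factor));
    expression = term &gt;&gt; *(('+' &gt;&gt; term) | ('-' &gt;&gt; term));

    // group, factor and term are destroyed on return; the returned copy
    // keeps whatever it still needs alive.
    return expression;
}

// Hypothetical helper: true if the whole input is a well-formed expression.
bool is_expression(std::string const &amp;input)
{
    boost::xpressive::sregex const calc = make_calculator();
    return boost::xpressive::regex_match(input, calc);
}
</pre>
<p>
The returned regex is used after the locals that it refers to have been
destroyed, which is the situation the text above describes as safe.
</p>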
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h6"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.nested_regexes_and_sub_match_scoping"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.nested_regexes_and_sub_match_scoping">Nested
|
||
Regexes and Sub-Match Scoping</a>
|
||
</h3>
|
||
<p>
|
||
Nested regular expressions raise the issue of sub-match scoping. If both
|
||
the inner and outer regex wrote to and read from the same sub-match vector,
|
||
chaos would ensue. The inner regex would stomp on the sub-matches written
|
||
by the outer regex. For example, what does this do?
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">inner</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(.)\\1"</span> <span class="special">);</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">outer</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="identifier">_</span><span class="special">)</span> <span class="special">>></span> <span class="identifier">inner</span> <span class="special">>></span> <span class="identifier">s1</span><span class="special">;</span>
|
||
</pre>
|
||
<p>
|
||
The author probably didn't intend for the inner regex to overwrite the sub-match
|
||
written by the outer regex. The problem is particularly acute when the inner
|
||
regex is accepted from the user as input. The author has no way of knowing
|
||
whether the inner regex will stomp the sub-match vector or not. This is clearly
|
||
not acceptable.
|
||
</p>
|
||
<p>
|
||
Instead, what actually happens is that each invocation of a nested regex
|
||
gets its own scope. Sub-matches belong to that scope. That is, each nested
|
||
regex invocation gets its own copy of the sub-match vector to play with,
|
||
so there is no way for an inner regex to stomp on the sub-matches of an outer
|
||
regex. So, for example, the regex <code class="computeroutput"><span class="identifier">outer</span></code>
|
||
defined above would match <code class="computeroutput"><span class="string">"ABBA"</span></code>,
|
||
as it should.
|
||
</p>
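<p>
A minimal sketch that checks this behavior is shown below. The helper name
<code class="computeroutput">matches_outer</code> is an assumption made for
the example; the two regexes are the ones defined above.
</p>
<pre class="programlisting">#include &lt;string&gt;
#include &lt;boost/xpressive/xpressive.hpp&gt;

// Hypothetical helper: true if the whole input matches the outer regex.
bool matches_outer(std::string const &amp;input)
{
    using namespace boost::xpressive;

    // The inner regex has its own first sub-match and a back-reference to it.
    sregex inner = sregex::compile("(.)\\1");
    // The outer regex captures one character in s1, runs inner, then
    // requires the character captured in its own s1 again.
    sregex outer = (s1= _) &gt;&gt; inner &gt;&gt; s1;

    return regex_match(input, outer);
}
</pre>
<p>
With scoped sub-matches, <code class="computeroutput">"ABBA"</code> should
match and <code class="computeroutput">"ABBB"</code> should not, because the
trailing back-reference refers to the outer capture
(<code class="computeroutput">'A'</code>) rather than to the inner one.
</p>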
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h7"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.nested_results"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.nested_results">Nested
|
||
Results</a>
|
||
</h3>
|
||
<p>
|
||
If nested regexes have their own sub-matches, there should be a way to access
|
||
them after a successful match. In fact, there is. After a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>,
|
||
the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
struct behaves like the head of a tree of nested results. The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
class provides a <code class="computeroutput"><span class="identifier">nested_results</span><span class="special">()</span></code> member function that returns an ordered
|
||
sequence of <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
structures, representing the results of the nested regexes. The order of
|
||
the nested results is the same as the order in which the nested regex objects
|
||
matched.
|
||
</p>
|
||
<p>
|
||
Take as an example the regex for balanced, nested parentheses we saw earlier:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">parentheses</span><span class="special">;</span>
|
||
<span class="identifier">parentheses</span> <span class="special">=</span> <span class="char">'('</span> <span class="special">>></span> <span class="special">*(</span> <span class="identifier">keep</span><span class="special">(</span> <span class="special">+~(</span><span class="identifier">set</span><span class="special">=</span><span class="char">'('</span><span class="special">,</span><span class="char">')'</span><span class="special">)</span> <span class="special">)</span> <span class="special">|</span> <span class="identifier">by_ref</span><span class="special">(</span><span class="identifier">parentheses</span><span class="special">)</span> <span class="special">)</span> <span class="special">>></span> <span class="char">')'</span><span class="special">;</span>
|
||
|
||
<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"blah blah( a(b)c (c(e)f (g)h )i (j)6 )blah"</span> <span class="special">);</span>
|
||
|
||
<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_search</span><span class="special">(</span> <span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">parentheses</span> <span class="special">)</span> <span class="special">)</span>
|
||
<span class="special">{</span>
|
||
<span class="comment">// display the whole match</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
|
||
|
||
<span class="comment">// display the nested results</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">for_each</span><span class="special">(</span>
|
||
<span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">begin</span><span class="special">(),</span>
|
||
<span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">end</span><span class="special">(),</span>
|
||
<span class="identifier">output_nested_results</span><span class="special">()</span> <span class="special">);</span>
|
||
<span class="special">}</span>
|
||
</pre>
|
||
<p>
|
||
This program displays the following:
|
||
</p>
|
||
<pre class="programlisting">( a(b)c (c(e)f (g)h )i (j)6 )
|
||
(b)
|
||
(c(e)f (g)h )
|
||
(e)
|
||
(g)
|
||
(j)
|
||
</pre>
|
||
<p>
|
||
Here you can see how the results are nested and that they are stored in the
|
||
order in which they are found.
|
||
</p>
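<p>
The tree can also be walked by hand. The following sketch is not the
<code class="computeroutput">output_nested_results</code> function object
from the Examples section; it is a simplified, hypothetical pair of helpers
(<code class="computeroutput">collect_nested</code> and
<code class="computeroutput">balanced_groups</code>) that gathers each
result's full match in depth-first order.
</p>
<pre class="programlisting">#include &lt;string&gt;
#include &lt;vector&gt;
#include &lt;boost/xpressive/xpressive.hpp&gt;

// Hypothetical helper: appends the full match of this result, then recurses
// into each of its nested results in the order in which they matched.
void collect_nested(boost::xpressive::smatch const &amp;what, std::vector&lt;std::string&gt; &amp;out)
{
    typedef boost::xpressive::smatch::nested_results_type::const_iterator iter;

    out.push_back(what[0].str());
    for(iter it = what.nested_results().begin(); it != what.nested_results().end(); ++it)
    {
        collect_nested(*it, out);
    }
}

// Hypothetical helper: finds the first balanced group and flattens its tree.
std::vector&lt;std::string&gt; balanced_groups(std::string const &amp;input)
{
    using namespace boost::xpressive;

    sregex parentheses;
    parentheses = '(' &gt;&gt; *( keep( +~(set= '(', ')') ) | by_ref(parentheses) ) &gt;&gt; ')';

    std::vector&lt;std::string&gt; out;
    smatch what;
    if(regex_search(input, what, parentheses))
    {
        collect_nested(what, out);
    }
    return out;
}
</pre>
<p>
For the input string used above, the returned sequence should contain the
same six matches in the same order as the program output, without the
indentation.
</p>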
|
||
<div class="tip"><table border="0" summary="Tip">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../../doc/src/images/tip.png"></td>
|
||
<th align="left">Tip</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
See the definition of <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.display_a_tree_of_nested_results">output_nested_results</a>
|
||
in the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">Examples</a>
|
||
section.
|
||
</p></td></tr>
|
||
</table></div>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.h8"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.grammars_and_nested_matches.filtering_nested_results"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.grammars_and_nested_matches.filtering_nested_results">Filtering
|
||
Nested Results</a>
|
||
</h3>
|
||
<p>
|
||
Sometimes a regex will have several nested regex objects, and you want to
|
||
know which result corresponds to which regex object. That's where <code class="computeroutput"><span class="identifier">basic_regex</span><span class="special"><>::</span><span class="identifier">regex_id</span><span class="special">()</span></code>
|
||
and <code class="computeroutput"><span class="identifier">match_results</span><span class="special"><>::</span><span class="identifier">regex_id</span><span class="special">()</span></code>
|
||
come in handy. When iterating over the nested results, you can compare the
|
||
regex id from the results to the id of the regex object you're interested
|
||
in.
|
||
</p>
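<p>
Done by hand, that comparison might look like the sketch below. The helper
name <code class="computeroutput">names_in</code> is hypothetical; it keeps
only the nested results whose regex id equals the id of the
<code class="computeroutput">name</code> regex.
</p>
<pre class="programlisting">#include &lt;string&gt;
#include &lt;vector&gt;
#include &lt;boost/xpressive/xpressive.hpp&gt;

// Hypothetical helper: returns the text matched by each nested invocation
// of the name regex, in the order in which those invocations matched.
std::vector&lt;std::string&gt; names_in(std::string const &amp;input)
{
    using namespace boost::xpressive;

    sregex name    = +alpha;
    sregex integer = +_d;
    sregex re      = *( *_s &gt;&gt; ( name | integer ) );

    std::vector&lt;std::string&gt; names;
    smatch what;
    if(regex_match(input, what, re))
    {
        typedef smatch::nested_results_type::const_iterator iter;
        for(iter it = what.nested_results().begin(); it != what.nested_results().end(); ++it)
        {
            // Keep this nested result only if it was produced by the name regex.
            if(it-&gt;regex_id() == name.regex_id())
            {
                names.push_back((*it)[0].str());
            }
        }
    }
    return names;
}
</pre>
<p>
For <code class="computeroutput">"marsha 123 jan 456 cindy 789"</code>, this
sketch should collect <code class="computeroutput">marsha</code>,
<code class="computeroutput">jan</code> and
<code class="computeroutput">cindy</code> and skip the integers.
</p>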
|
||
<p>
|
||
To make this a bit easier, xpressive provides a predicate for iterating over
just the results that correspond to a certain nested regex.
|
||
It is called <code class="computeroutput"><span class="identifier">regex_id_filter_predicate</span></code>,
|
||
and it is intended to be used with <a href="../../../libs/iterator/doc/index.html" target="_top">Boost.Iterator</a>.
|
||
You can use it as follows:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">name</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">alpha</span><span class="special">;</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">integer</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span><span class="special">;</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">re</span> <span class="special">=</span> <span class="special">*(</span> <span class="special">*</span><span class="identifier">_s</span> <span class="special">>></span> <span class="special">(</span> <span class="identifier">name</span> <span class="special">|</span> <span class="identifier">integer</span> <span class="special">)</span> <span class="special">);</span>
|
||
|
||
<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"marsha 123 jan 456 cindy 789"</span> <span class="special">);</span>
|
||
|
||
<span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">re</span> <span class="special">)</span> <span class="special">)</span>
|
||
<span class="special">{</span>
|
||
<span class="identifier">smatch</span><span class="special">::</span><span class="identifier">nested_results_type</span><span class="special">::</span><span class="identifier">const_iterator</span> <span class="identifier">begin</span> <span class="special">=</span> <span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">begin</span><span class="special">();</span>
|
||
<span class="identifier">smatch</span><span class="special">::</span><span class="identifier">nested_results_type</span><span class="special">::</span><span class="identifier">const_iterator</span> <span class="identifier">end</span> <span class="special">=</span> <span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">end</span><span class="special">();</span>
|
||
|
||
<span class="comment">// declare filter predicates to select just the names or the integers</span>
|
||
<span class="identifier">sregex_id_filter_predicate</span> <span class="identifier">name_id</span><span class="special">(</span> <span class="identifier">name</span><span class="special">.</span><span class="identifier">regex_id</span><span class="special">()</span> <span class="special">);</span>
|
||
<span class="identifier">sregex_id_filter_predicate</span> <span class="identifier">integer_id</span><span class="special">(</span> <span class="identifier">integer</span><span class="special">.</span><span class="identifier">regex_id</span><span class="special">()</span> <span class="special">);</span>
|
||
|
||
<span class="comment">// iterate over only the results from the name regex</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">for_each</span><span class="special">(</span>
|
||
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">(</span> <span class="identifier">name_id</span><span class="special">,</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span> <span class="special">),</span>
|
||
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">(</span> <span class="identifier">name_id</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">end</span> <span class="special">),</span>
|
||
<span class="identifier">output_result</span>
|
||
<span class="special">);</span>
|
||
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
|
||
|
||
<span class="comment">// iterate over only the results from the integer regex</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">for_each</span><span class="special">(</span>
|
||
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">(</span> <span class="identifier">integer_id</span><span class="special">,</span> <span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span> <span class="special">),</span>
|
||
<span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">(</span> <span class="identifier">integer_id</span><span class="special">,</span> <span class="identifier">end</span><span class="special">,</span> <span class="identifier">end</span> <span class="special">),</span>
|
||
<span class="identifier">output_result</span>
|
||
<span class="special">);</span>
|
||
<span class="special">}</span>
|
||
</pre>
|
||
<p>
|
||
where <code class="computeroutput"><span class="identifier">output_result</span></code> is a
|
||
simple function that takes a <code class="computeroutput"><span class="identifier">smatch</span></code>
|
||
and displays the full match. Notice how we use the <code class="computeroutput"><span class="identifier">regex_id_filter_predicate</span></code>
|
||
together with <code class="computeroutput"><span class="identifier">basic_regex</span><span class="special"><>::</span><span class="identifier">regex_id</span><span class="special">()</span></code> and <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">make_filter_iterator</span><span class="special">()</span></code> from the <a href="../../../libs/iterator/doc/index.html" target="_top">Boost.Iterator</a>
|
||
library to select only those results corresponding to a particular nested regex.
|
||
This program displays the following:
|
||
</p>
|
||
<pre class="programlisting">marsha
|
||
jan
|
||
cindy
|
||
123
|
||
456
|
||
789
|
||
</pre>
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions" title="Semantic Actions and User-Defined Assertions">Semantic
|
||
Actions and User-Defined Assertions</a>
|
||
</h3></div></div></div>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h0"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.overview">Overview</a>
|
||
</h3>
|
||
<p>
|
||
Imagine you want to parse an input string and build a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><></span></code>
|
||
from it. For something like that, matching a regular expression isn't enough.
|
||
You want to <span class="emphasis"><em>do something</em></span> when parts of your regular
|
||
expression match. Xpressive lets you attach semantic actions to parts of
|
||
your static regular expressions. This section shows you how.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h1"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.semantic_actions"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.semantic_actions">Semantic
|
||
Actions</a>
|
||
</h3>
|
||
<p>
|
||
Consider the following code, which uses xpressive's semantic actions to parse
|
||
a string of word/integer pairs and stuffs them into a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><></span></code>.
|
||
It is described below.
|
||
</p>
|
||
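<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">map</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">string</span><span class="special">></span>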
|
||
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
|
||
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
|
||
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
|
||
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
|
||
|
||
<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
|
||
<span class="special">{</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="identifier">result</span><span class="special">;</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"aaa=>1 bbb=>23 ccc=>456"</span><span class="special">);</span>
|
||
|
||
<span class="comment">// Match a word and an integer, separated by =>,</span>
|
||
<span class="comment">// and then stuff the result into a std::map<></span>
|
||
<span class="identifier">sregex</span> <span class="identifier">pair</span> <span class="special">=</span> <span class="special">(</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="string">"=>"</span> <span class="special">>></span> <span class="special">(</span><span class="identifier">s2</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">)</span>
|
||
<span class="special">[</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span> <span class="special">=</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">s2</span><span class="special">)</span> <span class="special">];</span>
|
||
|
||
<span class="comment">// Match one or more word/integer pairs, separated</span>
|
||
<span class="comment">// by whitespace.</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">pair</span> <span class="special">>></span> <span class="special">*(+</span><span class="identifier">_s</span> <span class="special">>></span> <span class="identifier">pair</span><span class="special">);</span>
|
||
|
||
<span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_match</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">))</span>
|
||
<span class="special">{</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"aaa"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"bbb"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"ccc"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
|
||
<span class="special">}</span>
|
||
|
||
<span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
|
||
<span class="special">}</span>
|
||
</pre>
|
||
<p>
|
||
This program prints the following:
|
||
</p>
|
||
<pre class="programlisting">1
|
||
23
|
||
456
|
||
</pre>
|
||
<p>
|
||
The regular expression <code class="computeroutput"><span class="identifier">pair</span></code>
|
||
has two parts: the pattern and the action. The pattern says to match a word,
|
||
capturing it in sub-match 1, and an integer, capturing it in sub-match 2,
|
||
separated by <code class="computeroutput"><span class="string">"=>"</span></code>.
|
||
The action is the part in square brackets: <code class="computeroutput"><span class="special">[</span>
|
||
<span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span> <span class="special">=</span>
|
||
<span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">s2</span><span class="special">)</span> <span class="special">]</span></code>. It says
|
||
      to take sub-match 1 and use it to index into the <code class="computeroutput"><span class="identifier">result</span></code>
|
||
map, and assign to it the result of converting sub-match 2 to an integer.
|
||
</p>
|
||
<div class="note"><table border="0" summary="Note">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
|
||
<th align="left">Note</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
To use semantic actions with your static regexes, you must <code class="computeroutput"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span></code>
|
||
</p></td></tr>
|
||
</table></div>
|
||
<p>
|
||
      How does this work? Just like the rest of the static regular expression, the
|
||
part between brackets is an expression template. It encodes the action and
|
||
executes it later. The expression <code class="computeroutput"><span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)</span></code> creates a lazy reference to the <code class="computeroutput"><span class="identifier">result</span></code> object. The larger expression <code class="computeroutput"><span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span></code>
|
||
      is a lazy map index operation. Later, when this action is executed,
|
||
<code class="computeroutput"><span class="identifier">s1</span></code> gets replaced with the
|
||
first <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>.
|
||
Likewise, when <code class="computeroutput"><span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">s2</span><span class="special">)</span></code> gets executed, <code class="computeroutput"><span class="identifier">s2</span></code>
|
||
is replaced with the second <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>.
|
||
The <code class="computeroutput"><span class="identifier">as</span><span class="special"><></span></code>
|
||
action converts its argument to the requested type using Boost.Lexical_cast.
|
||
The effect of the whole action is to insert a new word/integer pair into
|
||
the map.
|
||
</p>
|
||
<div class="note"><table border="0" summary="Note">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
|
||
<th align="left">Note</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
There is an important difference between the function <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code> in <code class="computeroutput"><span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">ref</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span></code>
|
||
and <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code>
|
||
in <code class="computeroutput"><span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span></code>. The first returns a plain <code class="computeroutput"><span class="identifier">reference_wrapper</span><span class="special"><></span></code>
|
||
which behaves in many respects like an ordinary reference. By contrast,
|
||
<code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code>
|
||
returns a <span class="emphasis"><em>lazy</em></span> reference that you can use in expressions
|
||
that are executed lazily. That is why we can say <code class="computeroutput"><span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)[</span><span class="identifier">s1</span><span class="special">]</span></code>, even though <code class="computeroutput"><span class="identifier">result</span></code>
|
||
doesn't have an <code class="computeroutput"><span class="keyword">operator</span><span class="special">[]</span></code>
|
||
that would accept <code class="computeroutput"><span class="identifier">s1</span></code>.
|
||
</p></td></tr>
|
||
</table></div>
|
||
<p>
|
||
In addition to the sub-match placeholders <code class="computeroutput"><span class="identifier">s1</span></code>,
|
||
<code class="computeroutput"><span class="identifier">s2</span></code>, etc., you can also use
|
||
the placeholder <code class="computeroutput"><span class="identifier">_</span></code> within
|
||
an action to refer back to the string matched by the sub-expression to which
|
||
the action is attached. For instance, you can use the following regex to
|
||
match a bunch of digits, interpret them as an integer and assign the result
|
||
to a local variable:
|
||
</p>
|
||
<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
|
||
<span class="comment">// Here, _ refers back to all the</span>
|
||
<span class="comment">// characters matched by (+_d)</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">(+</span><span class="identifier">_d</span><span class="special">)[</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">=</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">];</span>
|
||
</pre>
|
||
<h4>
|
||
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h2"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.lazy_action_execution"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.lazy_action_execution">Lazy
|
||
Action Execution</a>
|
||
</h4>
|
||
<p>
|
||
What does it mean, exactly, to attach an action to part of a regular expression
|
||
and perform a match? When does the action execute? If the action is part
|
||
of a repeated sub-expression, does the action execute once or many times?
|
||
And if the sub-expression initially matches, but ultimately fails because
|
||
the rest of the regular expression fails to match, is the action executed
|
||
at all?
|
||
</p>
|
||
<p>
|
||
The answer is that by default, actions are executed <span class="emphasis"><em>lazily</em></span>.
|
||
When a sub-expression matches a string, its action is placed on a queue,
|
||
along with the current values of any sub-matches to which the action refers.
|
||
If the match algorithm must backtrack, actions are popped off the queue as
|
||
necessary. Only after the entire regex has matched successfully are the actions
|
||
      actually executed. They are executed all at once, in the order in which they
|
||
were added to the queue, as the last step before <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
returns.
|
||
</p>
|
||
<p>
|
||
For example, consider the following regex that increments a counter whenever
|
||
it finds a digit.
|
||
</p>
|
||
<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"1!2!3?"</span><span class="special">);</span>
|
||
<span class="comment">// count the exciting digits, but not the</span>
|
||
<span class="comment">// questionable ones.</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span>
|
||
<span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rex</span><span class="special">);</span>
|
||
<span class="identifier">assert</span><span class="special">(</span> <span class="identifier">i</span> <span class="special">==</span> <span class="number">2</span> <span class="special">);</span>
|
||
</pre>
|
||
<p>
|
||
The action <code class="computeroutput"><span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span></code>
|
||
is queued three times: once for each found digit. But it is only <span class="emphasis"><em>executed</em></span>
|
||
twice: once for each digit that precedes a <code class="computeroutput"><span class="char">'!'</span></code>
|
||
character. When the <code class="computeroutput"><span class="char">'?'</span></code> character
|
||
is encountered, the match algorithm backtracks, removing the final action
|
||
from the queue.
|
||
</p>
|
||
<h4>
|
||
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h3"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.immediate_action_execution"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.immediate_action_execution">Immediate
|
||
Action Execution</a>
|
||
</h4>
|
||
<p>
|
||
When you want semantic actions to execute immediately, you can wrap the sub-expression
|
||
containing the action in a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/keep.html" title="Function template keep">keep()</a></code></code>.
|
||
<code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>
|
||
turns off back-tracking for its sub-expression, but it also causes any actions
|
||
queued by the sub-expression to execute at the end of the <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>. It is as if the sub-expression in the
|
||
<code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>
|
||
were compiled into an independent regex object, and matching the <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>
|
||
is like a separate invocation of <code class="computeroutput"><span class="identifier">regex_search</span><span class="special">()</span></code>. It matches characters and executes actions
|
||
but never backtracks or unwinds. For example, imagine the above example had
|
||
been written as follows:
|
||
</p>
|
||
<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"1!2!3?"</span><span class="special">);</span>
|
||
<span class="comment">// count all the digits.</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">keep</span><span class="special">(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span> <span class="special">)</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span>
|
||
<span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rex</span><span class="special">);</span>
|
||
<span class="identifier">assert</span><span class="special">(</span> <span class="identifier">i</span> <span class="special">==</span> <span class="number">3</span> <span class="special">);</span>
|
||
</pre>
|
||
<p>
|
||
We have wrapped the sub-expression <code class="computeroutput"><span class="identifier">_d</span>
|
||
<span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span></code> in <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>.
|
||
Now, whenever this regex matches a digit, the action will be queued and then
|
||
immediately executed before we try to match a <code class="computeroutput"><span class="char">'!'</span></code>
|
||
character. In this case, the action executes three times.
|
||
</p>
|
||
<div class="note"><table border="0" summary="Note">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
|
||
<th align="left">Note</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
        As with <code class="computeroutput"><span class="identifier">keep</span><span class="special">()</span></code>,
|
||
actions within <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/before.html" title="Function template before">before()</a></code></code>
|
||
and <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/after.html" title="Function template after">after()</a></code></code>
|
||
are also executed early when their sub-expressions have matched.
|
||
</p></td></tr>
|
||
</table></div>
|
||
<h4>
|
||
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h4"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.lazy_functions"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.lazy_functions">Lazy
|
||
Functions</a>
|
||
</h4>
|
||
<p>
|
||
So far, we've seen how to write semantic actions consisting of variables
|
||
and operators. But what if you want to be able to call a function from a
|
||
semantic action? Xpressive provides a mechanism to do this.
|
||
</p>
|
||
<p>
|
||
The first step is to define a function object type. Here, for instance, is
|
||
a function object type that calls <code class="computeroutput"><span class="identifier">push</span><span class="special">()</span></code> on its argument:
|
||
</p>
|
||
<pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">push_impl</span>
|
||
<span class="special">{</span>
|
||
<span class="comment">// Result type, needed for tr1::result_of</span>
|
||
<span class="keyword">typedef</span> <span class="keyword">void</span> <span class="identifier">result_type</span><span class="special">;</span>
|
||
|
||
<span class="keyword">template</span><span class="special"><</span><span class="keyword">typename</span> <span class="identifier">Sequence</span><span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Value</span><span class="special">></span>
|
||
<span class="keyword">void</span> <span class="keyword">operator</span><span class="special">()(</span><span class="identifier">Sequence</span> <span class="special">&</span><span class="identifier">seq</span><span class="special">,</span> <span class="identifier">Value</span> <span class="keyword">const</span> <span class="special">&</span><span class="identifier">val</span><span class="special">)</span> <span class="keyword">const</span>
|
||
<span class="special">{</span>
|
||
<span class="identifier">seq</span><span class="special">.</span><span class="identifier">push</span><span class="special">(</span><span class="identifier">val</span><span class="special">);</span>
|
||
<span class="special">}</span>
|
||
<span class="special">};</span>
|
||
</pre>
|
||
<p>
|
||
The next step is to use xpressive's <code class="computeroutput"><span class="identifier">function</span><span class="special"><></span></code> template to define a function object
|
||
named <code class="computeroutput"><span class="identifier">push</span></code>:
|
||
</p>
|
||
<pre class="programlisting"><span class="comment">// Global "push" function object.</span>
|
||
<span class="identifier">function</span><span class="special"><</span><span class="identifier">push_impl</span><span class="special">>::</span><span class="identifier">type</span> <span class="keyword">const</span> <span class="identifier">push</span> <span class="special">=</span> <span class="special">{{}};</span>
|
||
</pre>
|
||
<p>
|
||
The initialization looks a bit odd, but this is because <code class="computeroutput"><span class="identifier">push</span></code>
|
||
is being statically initialized. That means it doesn't need to be constructed
|
||
at runtime. We can use <code class="computeroutput"><span class="identifier">push</span></code>
|
||
in semantic actions as follows:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">std</span><span class="special">::</span><span class="identifier">stack</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">ints</span><span class="special">;</span>
|
||
<span class="comment">// Match digits, cast them to an int</span>
|
||
<span class="comment">// and push it on the stack.</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">(+</span><span class="identifier">_d</span><span class="special">)[</span><span class="identifier">push</span><span class="special">(</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">ints</span><span class="special">),</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">))];</span>
|
||
</pre>
|
||
<p>
|
||
You'll notice that doing it this way causes member function invocations to
|
||
look like ordinary function invocations. You can choose to write your semantic
|
||
action in a different way that makes it look a bit more like a member function
|
||
call:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">(+</span><span class="identifier">_d</span><span class="special">)[</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">ints</span><span class="special">)->*</span><span class="identifier">push</span><span class="special">(</span><span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">))];</span>
|
||
</pre>
|
||
<p>
|
||
Xpressive recognizes the use of the <code class="computeroutput"><span class="special">->*</span></code>
|
||
and treats this expression exactly the same as the one above.
|
||
</p>
|
||
<p>
|
||
When your function object must return a type that depends on its arguments,
|
||
you can use a <code class="computeroutput"><span class="identifier">result</span><span class="special"><></span></code>
|
||
member template instead of the <code class="computeroutput"><span class="identifier">result_type</span></code>
|
||
typedef. Here, for example, is a <code class="computeroutput"><span class="identifier">first</span></code>
|
||
function object that returns the <code class="computeroutput"><span class="identifier">first</span></code>
|
||
member of a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">pair</span><span class="special"><></span></code>
|
||
or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>:
|
||
</p>
|
||
<pre class="programlisting"><span class="comment">// Function object that returns the</span>
|
||
<span class="comment">// first element of a pair.</span>
|
||
<span class="keyword">struct</span> <span class="identifier">first_impl</span>
|
||
<span class="special">{</span>
|
||
<span class="keyword">template</span><span class="special"><</span><span class="keyword">typename</span> <span class="identifier">Sig</span><span class="special">></span> <span class="keyword">struct</span> <span class="identifier">result</span> <span class="special">{};</span>
|
||
|
||
<span class="keyword">template</span><span class="special"><</span><span class="keyword">typename</span> <span class="identifier">This</span><span class="special">,</span> <span class="keyword">typename</span> <span class="identifier">Pair</span><span class="special">></span>
|
||
<span class="keyword">struct</span> <span class="identifier">result</span><span class="special"><</span><span class="identifier">This</span><span class="special">(</span><span class="identifier">Pair</span><span class="special">)></span>
|
||
<span class="special">{</span>
|
||
<span class="keyword">typedef</span> <span class="keyword">typename</span> <span class="identifier">remove_reference</span><span class="special"><</span><span class="identifier">Pair</span><span class="special">></span>
|
||
<span class="special">::</span><span class="identifier">type</span><span class="special">::</span><span class="identifier">first_type</span> <span class="identifier">type</span><span class="special">;</span>
|
||
<span class="special">};</span>
|
||
|
||
<span class="keyword">template</span><span class="special"><</span><span class="keyword">typename</span> <span class="identifier">Pair</span><span class="special">></span>
|
||
<span class="keyword">typename</span> <span class="identifier">Pair</span><span class="special">::</span><span class="identifier">first_type</span>
|
||
<span class="keyword">operator</span><span class="special">()(</span><span class="identifier">Pair</span> <span class="keyword">const</span> <span class="special">&</span><span class="identifier">p</span><span class="special">)</span> <span class="keyword">const</span>
|
||
<span class="special">{</span>
|
||
<span class="keyword">return</span> <span class="identifier">p</span><span class="special">.</span><span class="identifier">first</span><span class="special">;</span>
|
||
<span class="special">}</span>
|
||
<span class="special">};</span>
|
||
|
||
<span class="comment">// OK, use as first(s1) to get the begin iterator</span>
|
||
<span class="comment">// of the sub-match referred to by s1.</span>
|
||
<span class="identifier">function</span><span class="special"><</span><span class="identifier">first_impl</span><span class="special">>::</span><span class="identifier">type</span> <span class="keyword">const</span> <span class="identifier">first</span> <span class="special">=</span> <span class="special">{{}};</span>
|
||
</pre>
|
||
<h4>
|
||
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h5"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.referring_to_local_variables"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.referring_to_local_variables">Referring
|
||
to Local Variables</a>
|
||
</h4>
|
||
<p>
|
||
As we've seen in the examples above, we can refer to local variables within
|
||
an action using <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">ref</span><span class="special">()</span></code>.
|
||
Any such variables are held by reference by the regular expression, and care
|
||
should be taken to avoid letting those references dangle. For instance, in
|
||
the following code, the reference to <code class="computeroutput"><span class="identifier">i</span></code>
|
||
is left to dangle when <code class="computeroutput"><span class="identifier">bad_voodoo</span><span class="special">()</span></code> returns:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">bad_voodoo</span><span class="special">()</span>
|
||
<span class="special">{</span>
|
||
<span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span>
|
||
<span class="comment">// ERROR! rex refers by reference to a local</span>
|
||
<span class="comment">// variable, which will dangle after bad_voodoo()</span>
|
||
<span class="comment">// returns.</span>
|
||
<span class="keyword">return</span> <span class="identifier">rex</span><span class="special">;</span>
|
||
<span class="special">}</span>
|
||
</pre>
|
||
<p>
|
||
When writing semantic actions, it is your responsibility to make sure that
none of the references dangle. One way to do that is to make the
variables shared pointers that are held by the regex by value.
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">sregex</span> <span class="identifier">good_voodoo</span><span class="special">(</span><span class="identifier">boost</span><span class="special">::</span><span class="identifier">shared_ptr</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">pi</span><span class="special">)</span>
|
||
<span class="special">{</span>
|
||
<span class="comment">// Use val() to hold the shared_ptr by value:</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++*</span><span class="identifier">val</span><span class="special">(</span><span class="identifier">pi</span><span class="special">)</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span>
|
||
<span class="comment">// OK, rex holds a reference count to the integer.</span>
|
||
<span class="keyword">return</span> <span class="identifier">rex</span><span class="special">;</span>
|
||
<span class="special">}</span>
|
||
</pre>
|
||
<p>
|
||
In the above code, we use <code class="computeroutput"><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">val</span><span class="special">()</span></code>
|
||
to hold the shared pointer by value. That's not normally necessary because
|
||
local variables appearing in actions are held by value by default, but in
|
||
this case, it is necessary. Had we written the action as <code class="computeroutput"><span class="special">++*</span><span class="identifier">pi</span></code>, it would have executed immediately, while the regex was being built, rather than during a match.
|
||
That's because <code class="computeroutput"><span class="special">++*</span><span class="identifier">pi</span></code>
|
||
is not an expression template, but <code class="computeroutput"><span class="special">++*</span><span class="identifier">val</span><span class="special">(</span><span class="identifier">pi</span><span class="special">)</span></code> is.
|
||
</p>
|
||
<p>
|
||
It can be tedious to wrap all your variables in <code class="computeroutput"><span class="identifier">ref</span><span class="special">()</span></code> and <code class="computeroutput"><span class="identifier">val</span><span class="special">()</span></code> in your semantic actions. Xpressive provides
|
||
the <code class="computeroutput"><span class="identifier">reference</span><span class="special"><></span></code>
|
||
and <code class="computeroutput"><span class="identifier">value</span><span class="special"><></span></code>
|
||
templates to make things easier. The following table shows the equivalencies:
|
||
</p>
|
||
<div class="table">
|
||
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.t0"></a><p class="title"><b>Table 47.12. reference<> and value<></b></p>
|
||
<div class="table-contents"><table class="table" summary="reference<> and value<>">
|
||
<colgroup>
|
||
<col>
|
||
<col>
|
||
</colgroup>
|
||
<thead><tr>
|
||
<th>
|
||
<p>
|
||
This ...
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
... is equivalent to this ...
|
||
</p>
|
||
</th>
|
||
</tr></thead>
|
||
<tbody>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
</p>
|
||
<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
|
||
|
||
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">i</span><span class="special">)</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span></pre>
|
||
<p>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
</p>
|
||
<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
|
||
<span class="identifier">reference</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">ri</span><span class="special">(</span><span class="identifier">i</span><span class="special">);</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ri</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span></pre>
|
||
<p>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">shared_ptr</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">pi</span><span class="special">(</span><span class="keyword">new</span> <span class="keyword">int</span><span class="special">(</span><span class="number">0</span><span class="special">));</span>
|
||
|
||
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++*</span><span class="identifier">val</span><span class="special">(</span><span class="identifier">pi</span><span class="special">)</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span></pre>
|
||
<p>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">shared_ptr</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">pi</span><span class="special">(</span><span class="keyword">new</span> <span class="keyword">int</span><span class="special">(</span><span class="number">0</span><span class="special">));</span>
|
||
<span class="identifier">value</span><span class="special"><</span><span class="identifier">boost</span><span class="special">::</span><span class="identifier">shared_ptr</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="special">></span> <span class="identifier">vpi</span><span class="special">(</span><span class="identifier">pi</span><span class="special">);</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++*</span><span class="identifier">vpi</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span></pre>
|
||
<p>
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
</tbody>
|
||
</table></div>
|
||
</div>
|
||
<br class="table-break"><p>
|
||
As you can see, when using <code class="computeroutput"><span class="identifier">reference</span><span class="special"><></span></code>, you need to first declare a local
|
||
variable and then declare a <code class="computeroutput"><span class="identifier">reference</span><span class="special"><></span></code> to it. These two steps can be combined
|
||
into one using <code class="computeroutput"><span class="identifier">local</span><span class="special"><></span></code>.
|
||
</p>
|
||
<div class="table">
|
||
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.t1"></a><p class="title"><b>Table 47.13. local<> vs. reference<></b></p>
|
||
<div class="table-contents"><table class="table" summary="local<> vs. reference<>">
|
||
<colgroup>
|
||
<col>
|
||
<col>
|
||
</colgroup>
|
||
<thead><tr>
|
||
<th>
|
||
<p>
|
||
This ...
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
... is equivalent to this ...
|
||
</p>
|
||
</th>
|
||
</tr></thead>
|
||
<tbody><tr>
|
||
<td>
|
||
<p>
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">local</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">i</span><span class="special">(</span><span class="number">0</span><span class="special">);</span>
|
||
|
||
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">i</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span></pre>
|
||
<p>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
</p>
|
||
<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">i</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
|
||
<span class="identifier">reference</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">ri</span><span class="special">(</span><span class="identifier">i</span><span class="special">);</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">ri</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span></pre>
|
||
<p>
|
||
</p>
|
||
</td>
|
||
</tr></tbody>
|
||
</table></div>
|
||
</div>
|
||
<br class="table-break"><p>
|
||
We can use <code class="computeroutput"><span class="identifier">local</span><span class="special"><></span></code>
|
||
to rewrite the above example as follows:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">local</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">i</span><span class="special">(</span><span class="number">0</span><span class="special">);</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"1!2!3?"</span><span class="special">);</span>
|
||
<span class="comment">// count the exciting digits, but not the</span>
|
||
<span class="comment">// questionable ones.</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="special">+(</span> <span class="identifier">_d</span> <span class="special">[</span> <span class="special">++</span><span class="identifier">i</span> <span class="special">]</span> <span class="special">>></span> <span class="char">'!'</span> <span class="special">);</span>
|
||
<span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">rex</span><span class="special">);</span>
|
||
<span class="identifier">assert</span><span class="special">(</span> <span class="identifier">i</span><span class="special">.</span><span class="identifier">get</span><span class="special">()</span> <span class="special">==</span> <span class="number">2</span> <span class="special">);</span>
|
||
</pre>
|
||
<p>
|
||
Notice that we use <code class="computeroutput"><span class="identifier">local</span><span class="special"><>::</span><span class="identifier">get</span><span class="special">()</span></code> to access the value of the local variable.
|
||
Also, beware that <code class="computeroutput"><span class="identifier">local</span><span class="special"><></span></code>
|
||
can be used to create a dangling reference, just as <code class="computeroutput"><span class="identifier">reference</span><span class="special"><></span></code> can.
|
||
</p>
|
||
<h4>
|
||
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h6"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.referring_to_non_local_variables"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.referring_to_non_local_variables">Referring
|
||
to Non-Local Variables</a>
|
||
</h4>
|
||
<p>
|
||
At the beginning of this section, we used a regex with a semantic action
|
||
to parse a string of word/integer pairs and stuff them into a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><></span></code>. That required that the map and the
|
||
regex be defined together and used before either could go out of scope. What
|
||
if we wanted to define the regex once and use it to fill lots of different
|
||
maps? We would prefer to pass the map into the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
algorithm rather than embed a reference to it directly in the regex object.
To do that, we can define a placeholder and use it in the semantic
action in place of the map itself. Later, when we call one of the regex algorithms,
we can bind the placeholder to an actual map object. The following code shows
|
||
how.
|
||
</p>
|
||
<pre class="programlisting"><span class="comment">// Define a placeholder for a map object:</span>
|
||
<span class="identifier">placeholder</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="special">></span> <span class="identifier">_map</span><span class="special">;</span>
|
||
|
||
<span class="comment">// Match a word and an integer, separated by =>,</span>
|
||
<span class="comment">// and then stuff the result into a std::map<></span>
|
||
<span class="identifier">sregex</span> <span class="identifier">pair</span> <span class="special">=</span> <span class="special">(</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="string">"=>"</span> <span class="special">>></span> <span class="special">(</span><span class="identifier">s2</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">)</span>
|
||
<span class="special">[</span> <span class="identifier">_map</span><span class="special">[</span><span class="identifier">s1</span><span class="special">]</span> <span class="special">=</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">s2</span><span class="special">)</span> <span class="special">];</span>
|
||
|
||
<span class="comment">// Match one or more word/integer pairs, separated</span>
|
||
<span class="comment">// by whitespace.</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="identifier">pair</span> <span class="special">>></span> <span class="special">*(+</span><span class="identifier">_s</span> <span class="special">>></span> <span class="identifier">pair</span><span class="special">);</span>
|
||
|
||
<span class="comment">// The string to parse</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"aaa=>1 bbb=>23 ccc=>456"</span><span class="special">);</span>
|
||
|
||
<span class="comment">// Here is the actual map to fill in:</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="identifier">result</span><span class="special">;</span>
|
||
|
||
<span class="comment">// Bind the _map placeholder to the actual map</span>
|
||
<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
|
||
<span class="identifier">what</span><span class="special">.</span><span class="identifier">let</span><span class="special">(</span> <span class="identifier">_map</span> <span class="special">=</span> <span class="identifier">result</span> <span class="special">);</span>
|
||
|
||
<span class="comment">// Execute the match and fill in result map</span>
|
||
<span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_match</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">))</span>
|
||
<span class="special">{</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"aaa"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"bbb"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"ccc"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
|
||
<span class="special">}</span>
|
||
</pre>
|
||
<p>
|
||
This program displays:
|
||
</p>
|
||
<pre class="programlisting">1
|
||
23
|
||
456
|
||
</pre>
|
||
<p>
|
||
We use <code class="computeroutput"><span class="identifier">placeholder</span><span class="special"><></span></code>
|
||
here to define <code class="computeroutput"><span class="identifier">_map</span></code>, which
|
||
stands in for a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><></span></code>
|
||
variable. We can use the placeholder in the semantic action as if it were
|
||
a map. Then, we define a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
struct and bind an actual map to the placeholder with "<code class="computeroutput"><span class="identifier">what</span><span class="special">.</span><span class="identifier">let</span><span class="special">(</span> <span class="identifier">_map</span> <span class="special">=</span> <span class="identifier">result</span> <span class="special">);</span></code>". The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
call behaves as if the placeholder in the semantic action had been replaced
|
||
with a reference to <code class="computeroutput"><span class="identifier">result</span></code>.
|
||
</p>
|
||
<div class="note"><table border="0" summary="Note">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
|
||
<th align="left">Note</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
Placeholders in semantic actions are not <span class="emphasis"><em>actually</em></span>
|
||
replaced at runtime with references to variables. The regex object is never
mutated in any way during any of the regex algorithms, so a regex that uses
placeholders is safe to use in multiple threads.
|
||
</p></td></tr>
|
||
</table></div>
|
||
<p>
|
||
The syntax for late-bound action arguments is a little different if you are
|
||
using <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_iterator.html" title="Struct template regex_iterator">regex_iterator<></a></code></code>
|
||
or <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>.
|
||
The regex iterators accept an extra constructor parameter for specifying
|
||
the argument bindings. There is a <code class="computeroutput"><span class="identifier">let</span><span class="special">()</span></code> function that you can use to bind variables
|
||
to their placeholders. The following code demonstrates how.
|
||
</p>
|
||
<pre class="programlisting"><span class="comment">// Define a placeholder for a map object:</span>
|
||
<span class="identifier">placeholder</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="special">></span> <span class="identifier">_map</span><span class="special">;</span>
|
||
|
||
<span class="comment">// Match a word and an integer, separated by =>,</span>
|
||
<span class="comment">// and then stuff the result into a std::map<></span>
|
||
<span class="identifier">sregex</span> <span class="identifier">pair</span> <span class="special">=</span> <span class="special">(</span> <span class="special">(</span><span class="identifier">s1</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_w</span><span class="special">)</span> <span class="special">>></span> <span class="string">"=>"</span> <span class="special">>></span> <span class="special">(</span><span class="identifier">s2</span><span class="special">=</span> <span class="special">+</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">)</span>
|
||
<span class="special">[</span> <span class="identifier">_map</span><span class="special">[</span><span class="identifier">s1</span><span class="special">]</span> <span class="special">=</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">s2</span><span class="special">)</span> <span class="special">];</span>
|
||
|
||
<span class="comment">// The string to parse</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"aaa=>1 bbb=>23 ccc=>456"</span><span class="special">);</span>
|
||
|
||
<span class="comment">// Here is the actual map to fill in:</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="identifier">result</span><span class="special">;</span>
|
||
|
||
<span class="comment">// Create a regex_iterator to find all the matches</span>
|
||
<span class="identifier">sregex_iterator</span> <span class="identifier">it</span><span class="special">(</span><span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">pair</span><span class="special">,</span> <span class="identifier">let</span><span class="special">(</span><span class="identifier">_map</span><span class="special">=</span><span class="identifier">result</span><span class="special">));</span>
|
||
<span class="identifier">sregex_iterator</span> <span class="identifier">end</span><span class="special">;</span>
|
||
|
||
<span class="comment">// step through all the matches, and fill in</span>
|
||
<span class="comment">// the result map</span>
|
||
<span class="keyword">while</span><span class="special">(</span><span class="identifier">it</span> <span class="special">!=</span> <span class="identifier">end</span><span class="special">)</span>
|
||
<span class="special">++</span><span class="identifier">it</span><span class="special">;</span>
|
||
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"aaa"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"bbb"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span><span class="special">[</span><span class="string">"ccc"</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
|
||
</pre>
|
||
<p>
|
||
This program displays:
|
||
</p>
|
||
<pre class="programlisting">1
|
||
23
|
||
456
|
||
</pre>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.h7"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.user_defined_assertions"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.semantic_actions_and_user_defined_assertions.user_defined_assertions">User-Defined
|
||
Assertions</a>
|
||
</h3>
|
||
<p>
|
||
You are probably already familiar with regular expression <span class="emphasis"><em>assertions</em></span>.
|
||
In Perl, some examples are the <code class="literal">^</code> and <code class="literal">$</code>
|
||
assertions, which you can use to match the beginning and end of a string,
|
||
respectively. Xpressive lets you define your own assertions. A custom assertion
|
||
is a condition which must be true at a point in the match in order for the
|
||
match to succeed. You can check a custom assertion with xpressive's <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/check.html" title="Function template check">check()</a></code></code> function.
|
||
</p>
|
||
<p>
|
||
There are a couple of ways to define a custom assertion. The simplest is
|
||
to use a function object. Let's say that you want to ensure that a sub-expression
|
||
matches a sub-string that is either 3 or 6 characters long. The following
|
||
struct defines such a predicate:
|
||
</p>
|
||
<pre class="programlisting"><span class="comment">// A predicate that is true IFF a sub-match is</span>
|
||
<span class="comment">// either 3 or 6 characters long.</span>
|
||
<span class="keyword">struct</span> <span class="identifier">three_or_six</span>
|
||
<span class="special">{</span>
|
||
<span class="keyword">bool</span> <span class="keyword">operator</span><span class="special">()(</span><span class="identifier">ssub_match</span> <span class="keyword">const</span> <span class="special">&</span><span class="identifier">sub</span><span class="special">)</span> <span class="keyword">const</span>
|
||
<span class="special">{</span>
|
||
<span class="keyword">return</span> <span class="identifier">sub</span><span class="special">.</span><span class="identifier">length</span><span class="special">()</span> <span class="special">==</span> <span class="number">3</span> <span class="special">||</span> <span class="identifier">sub</span><span class="special">.</span><span class="identifier">length</span><span class="special">()</span> <span class="special">==</span> <span class="number">6</span><span class="special">;</span>
|
||
<span class="special">}</span>
|
||
<span class="special">};</span>
|
||
</pre>
|
||
<p>
|
||
You can use this predicate within a regular expression as follows:
|
||
</p>
|
||
<pre class="programlisting"><span class="comment">// match words of 3 characters or 6 characters.</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">bow</span> <span class="special">>></span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">>></span> <span class="identifier">eow</span><span class="special">)[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">three_or_six</span><span class="special">())</span> <span class="special">]</span> <span class="special">;</span>
|
||
</pre>
|
||
<p>
|
||
The above regular expression will find whole words that are either 3 or 6
|
||
characters long. The <code class="computeroutput"><span class="identifier">three_or_six</span></code>
|
||
predicate accepts a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/sub_match.html" title="Struct template sub_match">sub_match<></a></code></code>
|
||
that refers back to the part of the string matched by the sub-expression
|
||
to which the custom assertion is attached.
|
||
</p>
|
||
<div class="note"><table border="0" summary="Note">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
|
||
<th align="left">Note</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
The custom assertion participates in determining whether the match succeeds
|
||
or fails. Unlike actions, which execute lazily, custom assertions execute
|
||
immediately while the regex engine is searching for a match.
|
||
</p></td></tr>
|
||
</table></div>
|
||
<p>
|
||
Custom assertions can also be defined inline using the same syntax as for
|
||
semantic actions. Below is the same custom assertion written inline:
|
||
</p>
|
||
<pre class="programlisting"><span class="comment">// match words of 3 characters or 6 characters.</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">bow</span> <span class="special">>></span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">>></span> <span class="identifier">eow</span><span class="special">)[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">length</span><span class="special">(</span><span class="identifier">_</span><span class="special">)==</span><span class="number">3</span> <span class="special">||</span> <span class="identifier">length</span><span class="special">(</span><span class="identifier">_</span><span class="special">)==</span><span class="number">6</span><span class="special">)</span> <span class="special">]</span> <span class="special">;</span>
|
||
</pre>
|
||
<p>
|
||
In the above, <code class="computeroutput"><span class="identifier">length</span><span class="special">()</span></code>
|
||
is a lazy function that calls the <code class="computeroutput"><span class="identifier">length</span><span class="special">()</span></code> member function of its argument, and <code class="computeroutput"><span class="identifier">_</span></code> is a placeholder that receives the <code class="computeroutput"><span class="identifier">sub_match</span></code>.
|
||
</p>
|
||
<p>
|
||
Once you get the hang of writing custom assertions inline, they can be very
|
||
powerful. For example, you can write a regular expression that only matches
|
||
valid dates (for some suitably liberal definition of the term <span class="quote">“<span class="quote">valid</span>”</span>).
|
||
</p>
|
||
<pre class="programlisting"><span class="keyword">int</span> <span class="keyword">const</span> <span class="identifier">days_per_month</span><span class="special">[]</span> <span class="special">=</span>
|
||
<span class="special">{</span><span class="number">31</span><span class="special">,</span> <span class="number">29</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">30</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">30</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">30</span><span class="special">,</span> <span class="number">31</span><span class="special">,</span> <span class="number">30</span><span class="special">,</span> <span class="number">31</span><span class="special">};</span>
|
||
|
||
<span class="identifier">mark_tag</span> <span class="identifier">month</span><span class="special">(</span><span class="number">1</span><span class="special">),</span> <span class="identifier">day</span><span class="special">(</span><span class="number">2</span><span class="special">);</span>
|
||
<span class="comment">// find a valid date of the form month/day/year.</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">date</span> <span class="special">=</span>
|
||
<span class="special">(</span>
|
||
<span class="comment">// Month must be between 1 and 12 inclusive</span>
|
||
<span class="special">(</span><span class="identifier">month</span><span class="special">=</span> <span class="identifier">_d</span> <span class="special">>></span> <span class="special">!</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">>=</span> <span class="number">1</span>
|
||
<span class="special">&&</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special"><=</span> <span class="number">12</span><span class="special">)</span> <span class="special">]</span>
|
||
<span class="special">>></span> <span class="char">'/'</span>
|
||
<span class="comment">// Day must be between 1 and 31 inclusive</span>
|
||
<span class="special">>></span> <span class="special">(</span><span class="identifier">day</span><span class="special">=</span> <span class="identifier">_d</span> <span class="special">>></span> <span class="special">!</span><span class="identifier">_d</span><span class="special">)</span> <span class="special">[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">>=</span> <span class="number">1</span>
|
||
<span class="special">&&</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special"><=</span> <span class="number">31</span><span class="special">)</span> <span class="special">]</span>
|
||
<span class="special">>></span> <span class="char">'/'</span>
|
||
<span class="comment">// Only consider years between 1970 and 2038</span>
|
||
<span class="special">>></span> <span class="special">(</span><span class="identifier">_d</span> <span class="special">>></span> <span class="identifier">_d</span> <span class="special">>></span> <span class="identifier">_d</span> <span class="special">>></span> <span class="identifier">_d</span><span class="special">)</span> <span class="special">[</span> <span class="identifier">check</span><span class="special">(</span><span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special">>=</span> <span class="number">1970</span>
|
||
<span class="special">&&</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">_</span><span class="special">)</span> <span class="special"><=</span> <span class="number">2038</span><span class="special">)</span> <span class="special">]</span>
|
||
<span class="special">)</span>
|
||
<span class="comment">// Ensure the month actually has that many days!</span>
|
||
<span class="special">[</span> <span class="identifier">check</span><span class="special">(</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">days_per_month</span><span class="special">)[</span><span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">month</span><span class="special">)-</span><span class="number">1</span><span class="special">]</span> <span class="special">>=</span> <span class="identifier">as</span><span class="special"><</span><span class="keyword">int</span><span class="special">>(</span><span class="identifier">day</span><span class="special">)</span> <span class="special">)</span> <span class="special">]</span>
|
||
<span class="special">;</span>
|
||
|
||
<span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span><span class="string">"99/99/9999 2/30/2006 2/28/2006"</span><span class="special">);</span>
|
||
|
||
<span class="keyword">if</span><span class="special">(</span><span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">date</span><span class="special">))</span>
|
||
<span class="special">{</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">endl</span><span class="special">;</span>
|
||
<span class="special">}</span>
|
||
</pre>
|
||
<p>
|
||
The above program prints out the following:
|
||
</p>
|
||
<pre class="programlisting">2/28/2006
|
||
</pre>
|
||
<p>
|
||
Notice how the inline custom assertions are used to range-check the values
|
||
for the month, day and year. The regular expression doesn't match <code class="computeroutput"><span class="string">"99/99/9999"</span></code> or <code class="computeroutput"><span class="string">"2/30/2006"</span></code>
|
||
because they are not valid dates. (There is no 99th month, and February doesn't
|
||
have 30 days.)
|
||
</p>
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes" title="Symbol Tables and Attributes">Symbol
|
||
Tables and Attributes</a>
|
||
</h3></div></div></div>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.h0"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes.overview">Overview</a>
|
||
</h3>
|
||
<p>
|
||
Symbol tables can be built into xpressive regular expressions with just a
|
||
<code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><></span></code>.
|
||
The map keys are the strings to be matched and the map values are the data
|
||
to be returned to your semantic action. Xpressive attributes, named <code class="computeroutput"><span class="identifier">a1</span></code>, <code class="computeroutput"><span class="identifier">a2</span></code>,
|
||
through <code class="computeroutput"><span class="identifier">a9</span></code>, hold the value
|
||
corresponding to a matching key so that it can be used in a semantic action.
|
||
A default value can be specified for an attribute if a symbol is not found.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.h1"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.symbol_tables"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes.symbol_tables">Symbol
|
||
Tables</a>
|
||
</h3>
|
||
<p>
|
||
An xpressive symbol table is just a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><></span></code>,
|
||
where the key is a string type and the value can be anything. For example,
|
||
the following regular expression matches a key from map1 and assigns the
|
||
corresponding value to the attribute <code class="computeroutput"><span class="identifier">a1</span></code>.
|
||
Then, in the semantic action, it assigns the value stored in attribute <code class="computeroutput"><span class="identifier">a1</span></code> to an integer result.
|
||
</p>
|
||
<pre class="programlisting"><span class="keyword">int</span> <span class="identifier">result</span><span class="special">;</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="identifier">map1</span><span class="special">;</span>
|
||
<span class="comment">// ... (fill the map)</span>
|
||
<span class="identifier">sregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">(</span> <span class="identifier">a1</span> <span class="special">=</span> <span class="identifier">map1</span> <span class="special">)</span> <span class="special">[</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)</span> <span class="special">=</span> <span class="identifier">a1</span> <span class="special">];</span>
|
||
</pre>
|
||
<p>
|
||
Consider the following example code, which translates number names into integers.
|
||
It is described below.
|
||
</p>
|
||
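<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">map</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">string</span><span class="special">></span>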
|
||
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
|
||
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
|
||
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">regex_actions</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
|
||
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
|
||
|
||
<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
|
||
<span class="special">{</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">map</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="keyword">int</span><span class="special">></span> <span class="identifier">number_map</span><span class="special">;</span>
|
||
<span class="identifier">number_map</span><span class="special">[</span><span class="string">"one"</span><span class="special">]</span> <span class="special">=</span> <span class="number">1</span><span class="special">;</span>
|
||
<span class="identifier">number_map</span><span class="special">[</span><span class="string">"two"</span><span class="special">]</span> <span class="special">=</span> <span class="number">2</span><span class="special">;</span>
|
||
<span class="identifier">number_map</span><span class="special">[</span><span class="string">"three"</span><span class="special">]</span> <span class="special">=</span> <span class="number">3</span><span class="special">;</span>
|
||
<span class="comment">// Match a string from number_map</span>
|
||
<span class="comment">// and store the integer value in 'result'</span>
|
||
<span class="comment">// if not found, store -1 in 'result'</span>
|
||
<span class="keyword">int</span> <span class="identifier">result</span> <span class="special">=</span> <span class="number">0</span><span class="special">;</span>
|
||
<span class="identifier">cregex</span> <span class="identifier">rx</span> <span class="special">=</span> <span class="special">((</span><span class="identifier">a1</span> <span class="special">=</span> <span class="identifier">number_map</span> <span class="special">)</span> <span class="special">|</span> <span class="special">*</span><span class="identifier">_</span><span class="special">)</span>
|
||
<span class="special">[</span> <span class="identifier">ref</span><span class="special">(</span><span class="identifier">result</span><span class="special">)</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">a1</span> <span class="special">|</span> <span class="special">-</span><span class="number">1</span><span class="special">)];</span>
|
||
|
||
<span class="identifier">regex_match</span><span class="special">(</span><span class="string">"three"</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">);</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
|
||
<span class="identifier">regex_match</span><span class="special">(</span><span class="string">"two"</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">);</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
|
||
<span class="identifier">regex_match</span><span class="special">(</span><span class="string">"stuff"</span><span class="special">,</span> <span class="identifier">rx</span><span class="special">);</span>
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">result</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
|
||
<span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
|
||
<span class="special">}</span>
|
||
</pre>
|
||
<p>
|
||
This program prints the following:
|
||
</p>
|
||
<pre class="programlisting">3
|
||
2
|
||
-1
|
||
</pre>
|
||
<p>
|
||
First the program builds a number map, with number names as string keys and
|
||
the corresponding integers as values. Then it constructs a static regular
|
||
expression using an attribute <code class="computeroutput"><span class="identifier">a1</span></code>
|
||
to represent the result of the symbol table lookup. In the semantic action,
|
||
the attribute is assigned to an integer variable <code class="computeroutput"><span class="identifier">result</span></code>.
|
||
If the symbol was not found, a default value of <code class="computeroutput"><span class="special">-</span><span class="number">1</span></code> is assigned to <code class="computeroutput"><span class="identifier">result</span></code>.
|
||
A wildcard, <code class="computeroutput"><span class="special">*</span><span class="identifier">_</span></code>,
|
||
makes sure the regex matches even if the symbol is not found.
|
||
</p>
|
||
<p>
|
||
A more complete version of this example can be found in <code class="literal">libs/xpressive/example/numbers.cpp</code><a href="#ftn.boost_xpressive.user_s_guide.symbol_tables_and_attributes.f0" class="footnote" name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.f0"><sup class="footnote">[37]</sup></a>. It translates number names up to "nine hundred ninety nine
|
||
million nine hundred ninety nine thousand nine hundred ninety nine"
|
||
along with some special number names like "dozen".
|
||
</p>
|
||
<p>
|
||
Symbol table matches are case sensitive by default, but they can be made
|
||
case-insensitive by enclosing the expression in <code class="computeroutput"><span class="identifier">icase</span><span class="special">()</span></code>.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.h2"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.symbol_tables_and_attributes.attributes"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.symbol_tables_and_attributes.attributes">Attributes</a>
|
||
</h3>
|
||
<p>
|
||
Up to nine attributes can be used in a regular expression. They are named
|
||
<code class="computeroutput"><span class="identifier">a1</span></code>, <code class="computeroutput"><span class="identifier">a2</span></code>,
|
||
..., <code class="computeroutput"><span class="identifier">a9</span></code> in the <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span></code> namespace. The attribute type
|
||
is the same as the second component of the map that is assigned to it. A
|
||
default value for an attribute can be specified in a semantic action with
|
||
the syntax <code class="computeroutput"><span class="special">(</span><span class="identifier">a1</span>
|
||
<span class="special">|</span> <em class="replaceable"><code>default-value</code></em><span class="special">)</span></code>.
|
||
</p>
|
||
<p>
|
||
Attributes are properly scoped, so you can do crazy things like: <code class="computeroutput"><span class="special">(</span> <span class="special">(</span><span class="identifier">a1</span><span class="special">=</span><span class="identifier">sym1</span><span class="special">)</span>
|
||
<span class="special">>></span> <span class="special">(</span><span class="identifier">a1</span><span class="special">=</span><span class="identifier">sym2</span><span class="special">)[</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">x</span><span class="special">)=</span><span class="identifier">a1</span><span class="special">]</span> <span class="special">)[</span><span class="identifier">ref</span><span class="special">(</span><span class="identifier">y</span><span class="special">)=</span><span class="identifier">a1</span><span class="special">]</span></code>. The
|
||
inner semantic action sees the inner <code class="computeroutput"><span class="identifier">a1</span></code>,
|
||
and the outer semantic action sees the outer one. They can even have different
|
||
types.
|
||
</p>
|
||
<div class="note"><table border="0" summary="Note">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
|
||
<th align="left">Note</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
Xpressive builds a hidden ternary search trie from the map so it can search
|
||
quickly. If BOOST_DISABLE_THREADS is defined, the hidden ternary search
|
||
trie "self adjusts", so after each search it restructures itself
|
||
to improve the efficiency of future searches based on the frequency of
|
||
previous searches.
|
||
</p></td></tr>
|
||
</table></div>
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="boost_xpressive.user_s_guide.localization_and_regex_traits"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits" title="Localization and Regex Traits">Localization
|
||
and Regex Traits</a>
|
||
</h3></div></div></div>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.localization_and_regex_traits.h0"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.localization_and_regex_traits.overview"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.overview">Overview</a>
|
||
</h3>
|
||
<p>
|
||
Matching a regular expression against a string often requires locale-dependent
|
||
information. For example, how are case-insensitive comparisons performed?
|
||
The locale-sensitive behavior is captured in a traits class. Xpressive provides
|
||
three traits class templates: <code class="computeroutput"><span class="identifier">cpp_regex_traits</span><span class="special"><></span></code>, <code class="computeroutput"><span class="identifier">c_regex_traits</span><span class="special"><></span></code> and <code class="computeroutput"><span class="identifier">null_regex_traits</span><span class="special"><></span></code>. The first wraps a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>,
|
||
the second wraps the global C locale, and the third is a stub traits type
|
||
for use when searching non-character data. All traits templates conform to
|
||
the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.traits_requirements">Regex
|
||
Traits Concept</a>.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.localization_and_regex_traits.h1"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.localization_and_regex_traits.setting_the_default_regex_trait"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.setting_the_default_regex_trait">Setting
|
||
the Default Regex Trait</a>
|
||
</h3>
|
||
<p>
|
||
By default, xpressive uses <code class="computeroutput"><span class="identifier">cpp_regex_traits</span><span class="special"><></span></code> for all patterns. This causes all
|
||
regex objects to use the global <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span></code>.
|
||
If you compile with <code class="computeroutput"><span class="identifier">BOOST_XPRESSIVE_USE_C_TRAITS</span></code>
|
||
defined, then xpressive will use <code class="computeroutput"><span class="identifier">c_regex_traits</span><span class="special"><></span></code> by default.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.localization_and_regex_traits.h2"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.localization_and_regex_traits.using_custom_traits_with_dynamic_regexes"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.using_custom_traits_with_dynamic_regexes">Using
|
||
Custom Traits with Dynamic Regexes</a>
|
||
</h3>
|
||
<p>
|
||
To create a dynamic regex that uses a custom traits object, you must use
|
||
<code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_compiler.html" title="Struct template regex_compiler">regex_compiler<></a></code></code>.
|
||
The basic steps are shown in the following example:
|
||
</p>
|
||
<pre class="programlisting"><span class="comment">// Declare a regex_compiler that uses the global C locale</span>
<span class="identifier">regex_compiler</span><span class="special"><</span><span class="keyword">char</span> <span class="keyword">const</span> <span class="special">*,</span> <span class="identifier">c_regex_traits</span><span class="special"><</span><span class="keyword">char</span><span class="special">></span> <span class="special">></span> <span class="identifier">crxcomp</span><span class="special">;</span>
<span class="identifier">cregex</span> <span class="identifier">crx</span> <span class="special">=</span> <span class="identifier">crxcomp</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"\\w+"</span> <span class="special">);</span>

<span class="comment">// Declare a regex_compiler that uses a custom std::locale</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span> <span class="identifier">loc</span> <span class="special">=</span> <span class="comment">/* ... create a locale here ... */</span><span class="special">;</span>
<span class="identifier">regex_compiler</span><span class="special"><</span><span class="keyword">char</span> <span class="keyword">const</span> <span class="special">*,</span> <span class="identifier">cpp_regex_traits</span><span class="special"><</span><span class="keyword">char</span><span class="special">></span> <span class="special">></span> <span class="identifier">cpprxcomp</span><span class="special">(</span><span class="identifier">loc</span><span class="special">);</span>
<span class="identifier">cregex</span> <span class="identifier">cpprx</span> <span class="special">=</span> <span class="identifier">cpprxcomp</span><span class="special">.</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"\\w+"</span> <span class="special">);</span>
</pre>
|
||
<p>
|
||
The <code class="computeroutput"><span class="identifier">regex_compiler</span></code> objects
|
||
act as regex factories. Once they have been imbued with a locale, every regex
|
||
object they create will use that locale.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.localization_and_regex_traits.h3"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.localization_and_regex_traits.using_custom_traits_with_static_regexes"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.using_custom_traits_with_static_regexes">Using
|
||
Custom Traits with Static Regexes</a>
|
||
</h3>
|
||
<p>
|
||
If you want a particular static regex to use a different set of traits, you
|
||
can use the special <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code> pattern modifier. For instance:
|
||
</p>
|
||
<pre class="programlisting"><span class="comment">// Define a regex that uses the global C locale</span>
<span class="identifier">c_regex_traits</span><span class="special"><</span><span class="keyword">char</span><span class="special">></span> <span class="identifier">ctraits</span><span class="special">;</span>
<span class="identifier">sregex</span> <span class="identifier">crx</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">ctraits</span><span class="special">)(</span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">);</span>

<span class="comment">// Define a regex that uses a customized std::locale</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">locale</span> <span class="identifier">loc</span> <span class="special">=</span> <span class="comment">/* ... create a locale here ... */</span><span class="special">;</span>
<span class="identifier">cpp_regex_traits</span><span class="special"><</span><span class="keyword">char</span><span class="special">></span> <span class="identifier">cpptraits</span><span class="special">(</span><span class="identifier">loc</span><span class="special">);</span>
<span class="identifier">sregex</span> <span class="identifier">cpprx1</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">cpptraits</span><span class="special">)(</span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">);</span>

<span class="comment">// A shorthand for above</span>
<span class="identifier">sregex</span> <span class="identifier">cpprx2</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">loc</span><span class="special">)(</span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">);</span>
</pre>
|
||
<p>
|
||
The <code class="computeroutput"><span class="identifier">imbue</span><span class="special">()</span></code>
|
||
pattern modifier must wrap the entire pattern. It is an error to <code class="computeroutput"><span class="identifier">imbue</span></code> only part of a static regex. For
|
||
example:
|
||
</p>
|
||
<pre class="programlisting"><span class="comment">// ERROR! Cannot imbue() only part of a regex</span>
<span class="identifier">sregex</span> <span class="identifier">error</span> <span class="special">=</span> <span class="identifier">_w</span> <span class="special">>></span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">loc</span><span class="special">)(</span> <span class="identifier">_w</span> <span class="special">);</span>
</pre>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.localization_and_regex_traits.h4"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.localization_and_regex_traits.searching_non_character_data_with__literal_null_regex_traits__literal_"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.localization_and_regex_traits.searching_non_character_data_with__literal_null_regex_traits__literal_">Searching
|
||
Non-Character Data With <code class="literal">null_regex_traits</code></a>
|
||
</h3>
|
||
<p>
|
||
        With xpressive static regexes, you are not limited to searching for patterns
|
||
in character sequences. You can search for patterns in raw bytes, integers,
|
||
or anything that conforms to the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.chart_requirements">Char
|
||
Concept</a>. The <code class="computeroutput"><span class="identifier">null_regex_traits</span><span class="special"><></span></code> makes it simple. It is a stub implementation
|
||
of the <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.traits_requirements">Regex
|
||
        Traits Concept</a>. It recognizes no character classes and does no case-insensitive
|
||
mappings.
|
||
</p>
|
||
<p>
|
||
For example, with <code class="computeroutput"><span class="identifier">null_regex_traits</span><span class="special"><></span></code>, you can write a static regex to
|
||
find a pattern in a sequence of integers as follows:
|
||
</p>
|
||
<pre class="programlisting"><span class="comment">// some integral data to search</span>
<span class="keyword">int</span> <span class="keyword">const</span> <span class="identifier">data</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span><span class="number">0</span><span class="special">,</span> <span class="number">1</span><span class="special">,</span> <span class="number">2</span><span class="special">,</span> <span class="number">3</span><span class="special">,</span> <span class="number">4</span><span class="special">,</span> <span class="number">5</span><span class="special">,</span> <span class="number">6</span><span class="special">};</span>

<span class="comment">// create a null_regex_traits<> object for searching integers ...</span>
<span class="identifier">null_regex_traits</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">nul</span><span class="special">;</span>

<span class="comment">// imbue a regex object with the null_regex_traits ...</span>
<span class="identifier">basic_regex</span><span class="special"><</span><span class="keyword">int</span> <span class="keyword">const</span> <span class="special">*></span> <span class="identifier">rex</span> <span class="special">=</span> <span class="identifier">imbue</span><span class="special">(</span><span class="identifier">nul</span><span class="special">)(</span><span class="number">1</span> <span class="special">>></span> <span class="special">+((</span><span class="identifier">set</span><span class="special">=</span> <span class="number">2</span><span class="special">,</span><span class="number">3</span><span class="special">)</span> <span class="special">|</span> <span class="number">4</span><span class="special">)</span> <span class="special">>></span> <span class="number">5</span><span class="special">);</span>
<span class="identifier">match_results</span><span class="special"><</span><span class="keyword">int</span> <span class="keyword">const</span> <span class="special">*></span> <span class="identifier">what</span><span class="special">;</span>

<span class="comment">// search for the pattern in the array of integers ...</span>
<span class="identifier">regex_search</span><span class="special">(</span><span class="identifier">data</span><span class="special">,</span> <span class="identifier">data</span> <span class="special">+</span> <span class="number">7</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rex</span><span class="special">);</span>

<span class="identifier">assert</span><span class="special">(</span><span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">].</span><span class="identifier">matched</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(*</span><span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">].</span><span class="identifier">first</span> <span class="special">==</span> <span class="number">1</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(*</span><span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">].</span><span class="identifier">second</span> <span class="special">==</span> <span class="number">6</span><span class="special">);</span>
</pre>
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="boost_xpressive.user_s_guide.tips_n_tricks"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks" title="Tips 'N Tricks">Tips 'N Tricks</a>
|
||
</h3></div></div></div>
|
||
<p>
|
||
Squeeze the most performance out of xpressive with these tips and tricks.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.tips_n_tricks.h0"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.compile_patterns_once_and_reuse_them"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.compile_patterns_once_and_reuse_them">Compile
|
||
Patterns Once And Reuse Them</a>
|
||
</h3>
|
||
<p>
|
||
Compiling a regex (dynamic or static) is <span class="emphasis"><em>far</em></span> more expensive
|
||
than executing a match or search. If you have the option, prefer to compile
|
||
a pattern into a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
|
||
object once and reuse it rather than recreating it over and over.
|
||
</p>
|
||
<p>
|
||
Since <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
|
||
objects are not mutated by any of the regex algorithms, they are completely
|
||
thread-safe once their initialization (and that of any grammars of which
|
||
they are members) completes. The easiest way to reuse your patterns is to
|
||
simply make your <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>
|
||
objects "static const".
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.tips_n_tricks.h1"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.reuse__literal__classname_alt__boost__xpressive__match_results__match_results_lt__gt___classname___literal__objects"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.reuse__literal__classname_alt__boost__xpressive__match_results__match_results_lt__gt___classname___literal__objects">Reuse
|
||
match_results<>
|
||
Objects</a>
|
||
</h3>
|
||
<p>
|
||
The <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
object caches dynamically allocated memory. For this reason, it is far better
|
||
to reuse the same <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
object if you have to do many regex searches.
|
||
</p>
|
||
<p>
|
||
Caveat: <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
objects are not thread-safe, so don't go wild reusing them across threads.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.tips_n_tricks.h2"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.prefer_algorithms_that_take_a__literal__classname_alt__boost__xpressive__match_results__match_results_lt__gt___classname___literal__object"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.prefer_algorithms_that_take_a__literal__classname_alt__boost__xpressive__match_results__match_results_lt__gt___classname___literal__object">Prefer
|
||
Algorithms That Take A match_results<>
|
||
Object</a>
|
||
</h3>
|
||
<p>
|
||
This is a corollary to the previous tip. If you are doing multiple searches,
|
||
you should prefer the regex algorithms that accept a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
object over the ones that don't, and you should reuse the same <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
object each time. If you don't provide a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results<></a></code></code>
|
||
object, a temporary one will be created for you and discarded when the algorithm
|
||
returns. Any memory cached in the object will be deallocated and will have
|
||
to be reallocated the next time.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.tips_n_tricks.h3"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.prefer_algorithms_that_accept_iterator_ranges_over_null_terminated_strings"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.prefer_algorithms_that_accept_iterator_ranges_over_null_terminated_strings">Prefer
|
||
Algorithms That Accept Iterator Ranges Over Null-Terminated Strings</a>
|
||
</h3>
|
||
<p>
|
||
xpressive provides overloads of the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_match.html" title="Function regex_match">regex_match()</a></code></code>
|
||
and <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_search.html" title="Function regex_search">regex_search()</a></code></code>
|
||
algorithms that operate on C-style null-terminated strings. You should prefer
|
||
the overloads that take iterator ranges. When you pass a null-terminated
|
||
string to a regex algorithm, the end iterator is calculated immediately by
|
||
calling <code class="computeroutput"><span class="identifier">strlen</span></code>. If you already
|
||
know the length of the string, you can avoid this overhead by calling the
|
||
regex algorithms with a <code class="computeroutput"><span class="special">[</span><span class="identifier">begin</span><span class="special">,</span> <span class="identifier">end</span><span class="special">)</span></code>
|
||
pair.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.tips_n_tricks.h4"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.use_static_regexes"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.use_static_regexes">Use
|
||
Static Regexes</a>
|
||
</h3>
|
||
<p>
|
||
On average, static regexes execute about 10 to 15% faster than their dynamic
|
||
counterparts. It's worth familiarizing yourself with the static regex dialect.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.tips_n_tricks.h5"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.understand__literal_syntax_option_type__optimize__literal_"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.understand__literal_syntax_option_type__optimize__literal_">Understand
|
||
<code class="literal">syntax_option_type::optimize</code></a>
|
||
</h3>
|
||
<p>
|
||
The <code class="computeroutput"><span class="identifier">optimize</span></code> flag tells the
|
||
regex compiler to spend some extra time analyzing the pattern. It can cause
|
||
some patterns to execute faster, but it increases the time to compile the
|
||
pattern, and often increases the amount of memory consumed by the pattern.
|
||
If you plan to reuse your pattern, <code class="computeroutput"><span class="identifier">optimize</span></code>
|
||
is usually a win. If you will only use the pattern once, don't use <code class="computeroutput"><span class="identifier">optimize</span></code>.
|
||
</p>
|
||
<h2>
|
||
<a name="boost_xpressive.user_s_guide.tips_n_tricks.h6"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.common_pitfalls"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.common_pitfalls">Common
|
||
Pitfalls</a>
|
||
</h2>
|
||
<p>
|
||
Keep the following tips in mind to avoid stepping in potholes with xpressive.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.tips_n_tricks.h7"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.create_grammars_on_a_single_thread"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.create_grammars_on_a_single_thread">Create
|
||
Grammars On A Single Thread</a>
|
||
</h3>
|
||
<p>
|
||
With static regexes, you can create grammars by nesting regexes inside one
|
||
another. When compiling the outer regex, both the outer and inner regex objects,
|
||
and all the regex objects to which they refer either directly or indirectly,
|
||
are modified. For this reason, it's dangerous for global regex objects to
|
||
participate in grammars. It's best to build regex grammars from a single
|
||
thread. Once built, the resulting regex grammar can be executed from multiple
|
||
threads without problems.
|
||
</p>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.tips_n_tricks.h8"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.tips_n_tricks.beware_nested_quantifiers"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.tips_n_tricks.beware_nested_quantifiers">Beware
|
||
Nested Quantifiers</a>
|
||
</h3>
|
||
<p>
|
||
This is a pitfall common to many regular expression engines. Some patterns
|
||
can cause exponentially bad performance. Often these patterns involve one
|
||
        quantified term nested within another quantifier, such as <code class="computeroutput"><span class="string">"(a*)*"</span></code>, although in many cases,
|
||
the problem is harder to spot. Beware of patterns that have nested quantifiers.
|
||
</p>
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="boost_xpressive.user_s_guide.concepts"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts" title="Concepts">Concepts</a>
|
||
</h3></div></div></div>
|
||
<h3>
|
||
<a name="boost_xpressive.user_s_guide.concepts.h0"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.concepts.chart_requirements"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.chart_requirements">CharT
|
||
requirements</a>
|
||
</h3>
|
||
<p>
|
||
If type <code class="computeroutput"><span class="identifier">BidiIterT</span></code> is used
|
||
as a template argument to <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>,
|
||
then <code class="computeroutput"><span class="identifier">CharT</span></code> is <code class="computeroutput"><span class="identifier">iterator_traits</span><span class="special"><</span><span class="identifier">BidiIterT</span><span class="special">>::</span><span class="identifier">value_type</span></code>. Type <code class="computeroutput"><span class="identifier">CharT</span></code>
|
||
must have a trivial default constructor, copy constructor, assignment operator,
|
||
        and destructor. In addition, the following requirements must be met for objects
|
||
<code class="computeroutput"><span class="identifier">c</span></code> of type <code class="computeroutput"><span class="identifier">CharT</span></code>,
|
||
<code class="computeroutput"><span class="identifier">c1</span></code> and <code class="computeroutput"><span class="identifier">c2</span></code>
|
||
of type <code class="computeroutput"><span class="identifier">CharT</span> <span class="keyword">const</span></code>,
|
||
and <code class="computeroutput"><span class="identifier">i</span></code> of type <code class="computeroutput"><span class="keyword">int</span></code>:
|
||
</p>
|
||
<div class="table">
|
||
<a name="boost_xpressive.user_s_guide.concepts.t0"></a><p class="title"><b>Table 47.14. CharT Requirements</b></p>
|
||
<div class="table-contents"><table class="table" summary="CharT Requirements">
|
||
<colgroup>
|
||
<col>
|
||
<col>
|
||
<col>
|
||
</colgroup>
|
||
<thead><tr>
|
||
<th>
|
||
<p>
|
||
<span class="bold"><strong>Expression</strong></span>
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
<span class="bold"><strong>Return type</strong></span>
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
<span class="bold"><strong>Assertion / Note / Pre- / Post-condition</strong></span>
|
||
</p>
|
||
</th>
|
||
</tr></thead>
|
||
<tbody>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">CharT</span> <span class="identifier">c</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">CharT</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Default constructor (must be trivial).
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">CharT</span> <span class="identifier">c</span><span class="special">(</span><span class="identifier">c1</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">CharT</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Copy constructor (must be trivial).
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">c1</span> <span class="special">=</span>
|
||
<span class="identifier">c2</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">CharT</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Assignment operator (must be trivial).
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">c1</span> <span class="special">==</span>
|
||
<span class="identifier">c2</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="keyword">bool</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">c1</span></code> has the same value as <code class="computeroutput"><span class="identifier">c2</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">c1</span> <span class="special">!=</span>
|
||
<span class="identifier">c2</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="keyword">bool</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">c1</span></code> and <code class="computeroutput"><span class="identifier">c2</span></code>
|
||
are not equal.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">c1</span> <span class="special"><</span>
|
||
<span class="identifier">c2</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="keyword">bool</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="keyword">true</span></code> if the value
|
||
of <code class="computeroutput"><span class="identifier">c1</span></code> is less than
|
||
<code class="computeroutput"><span class="identifier">c2</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">c1</span> <span class="special">></span>
|
||
<span class="identifier">c2</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="keyword">bool</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="keyword">true</span></code> if the value
|
||
of <code class="computeroutput"><span class="identifier">c1</span></code> is greater
|
||
than <code class="computeroutput"><span class="identifier">c2</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">c1</span> <span class="special"><=</span>
|
||
<span class="identifier">c2</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="keyword">bool</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">c1</span></code> is less than or equal to
|
||
<code class="computeroutput"><span class="identifier">c2</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">c1</span> <span class="special">>=</span>
|
||
<span class="identifier">c2</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="keyword">bool</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">c1</span></code> is greater than or equal to
|
||
<code class="computeroutput"><span class="identifier">c2</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">intmax_t</span> <span class="identifier">i</span>
|
||
<span class="special">=</span> <span class="identifier">c1</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="keyword">int</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">CharT</span></code> must be convertible
|
||
to an integral type.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">CharT</span> <span class="identifier">c</span><span class="special">(</span><span class="identifier">i</span><span class="special">);</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">CharT</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
                  <code class="computeroutput"><span class="identifier">CharT</span></code> must be constructible
|
||
from an integral type.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
</tbody>
|
||
</table></div>
|
||
</div>
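<p>
Part of the table above can be checked at compile time with the standard
<code class="computeroutput"><type_traits></code> header. The following is
only a sketch: <code class="computeroutput">meets_chart_requirements</code> is a
hypothetical name that xpressive does not provide, and the comparison operators in
the table are not checked here.
</p>

```cpp
#include <string>
#include <type_traits>

// Hypothetical compile-time check (not part of xpressive) covering the
// triviality and integral-conversion rows of the CharT requirements table.
// The comparison-operator rows are deliberately left out of this sketch.
template<typename CharT>
constexpr bool meets_chart_requirements()
{
    return std::is_trivially_default_constructible<CharT>::value
        && std::is_trivially_copy_constructible<CharT>::value
        && std::is_trivially_copy_assignable<CharT>::value
        && std::is_trivially_destructible<CharT>::value
        && std::is_convertible<CharT, int>::value      // int i = c1
        && std::is_constructible<CharT, int>::value;   // CharT c(i)
}

static_assert(meets_chart_requirements<char>(), "char passes the check");
static_assert(meets_chart_requirements<int>(), "int passes the check");
static_assert(!meets_chart_requirements<std::string>(), "std::string is not trivial");
```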
|
||
<br class="table-break"><h3>
|
||
<a name="boost_xpressive.user_s_guide.concepts.h1"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.concepts.traits_requirements"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.traits_requirements">Traits
|
||
Requirements</a>
|
||
</h3>
|
||
<p>
|
||
In the following table <code class="computeroutput"><span class="identifier">X</span></code>
|
||
denotes a traits class defining types and functions for the character container
|
||
type <code class="computeroutput"><span class="identifier">CharT</span></code>; <code class="computeroutput"><span class="identifier">u</span></code> is an object of type <code class="computeroutput"><span class="identifier">X</span></code>;
|
||
<code class="computeroutput"><span class="identifier">v</span></code> is an object of type <code class="computeroutput"><span class="keyword">const</span> <span class="identifier">X</span></code>;
|
||
<code class="computeroutput"><span class="identifier">p</span></code> is a value of type <code class="computeroutput"><span class="keyword">const</span> <span class="identifier">CharT</span><span class="special">*</span></code>; <code class="computeroutput"><span class="identifier">I1</span></code>
|
||
and <code class="computeroutput"><span class="identifier">I2</span></code> are <code class="computeroutput"><span class="identifier">Input</span> <span class="identifier">Iterators</span></code>;
|
||
<code class="computeroutput"><span class="identifier">c</span></code> is a value of type <code class="computeroutput"><span class="keyword">const</span> <span class="identifier">CharT</span></code>;
|
||
<code class="computeroutput"><span class="identifier">s</span></code> is an object of type <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>;
|
||
<code class="computeroutput"><span class="identifier">cs</span></code> is an object of type
|
||
<code class="computeroutput"><span class="keyword">const</span> <span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>;
|
||
<code class="computeroutput"><span class="identifier">b</span></code> is a value of type <code class="computeroutput"><span class="keyword">bool</span></code>; <code class="computeroutput"><span class="identifier">i</span></code>
|
||
is a value of type <code class="computeroutput"><span class="keyword">int</span></code>; <code class="computeroutput"><span class="identifier">F1</span></code> and <code class="computeroutput"><span class="identifier">F2</span></code>
|
||
are values of type <code class="computeroutput"><span class="keyword">const</span> <span class="identifier">CharT</span><span class="special">*</span></code>; <code class="computeroutput"><span class="identifier">loc</span></code>
|
||
is an object of type <code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">locale_type</span></code>; and <code class="computeroutput"><span class="identifier">ch</span></code>
|
||
        is an object of type <code class="computeroutput"><span class="keyword">const</span> <span class="keyword">char</span></code>.
|
||
</p>
|
||
<div class="table">
|
||
<a name="boost_xpressive.user_s_guide.concepts.t1"></a><p class="title"><b>Table 47.15. Traits Requirements</b></p>
|
||
<div class="table-contents"><table class="table" summary="Traits Requirements">
|
||
<colgroup>
|
||
<col>
|
||
<col>
|
||
<col>
|
||
</colgroup>
|
||
<thead><tr>
|
||
<th>
|
||
<p>
|
||
<span class="bold"><strong>Expression</strong></span>
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
<span class="bold"><strong>Return type</strong></span>
|
||
</p>
|
||
</th>
|
||
<th>
|
||
<p>
|
||
<span class="bold"><strong>Assertion / Note<br> Pre / Post condition</strong></span>
|
||
</p>
|
||
</th>
|
||
</tr></thead>
|
||
<tbody>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_type</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">CharT</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
The character container type used in the implementation of class
|
||
template <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/basic_regex.html" title="Struct template basic_regex">basic_regex<></a></code></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">basic_string</span><span class="special"><</span><span class="identifier">CharT</span><span class="special">></span></code>
|
||
or <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="identifier">CharT</span><span class="special">></span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">locale_type</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<span class="emphasis"><em>Implementation defined</em></span>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A copy constructible type that represents the locale used by the
|
||
traits class.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_class_type</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<span class="emphasis"><em>Implementation defined</em></span>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
A bitmask type representing a particular character classification.
|
||
Multiple values of this type can be bitwise-or'ed together to obtain
|
||
a new valid value.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">hash</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="keyword">unsigned</span> <span class="keyword">char</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Yields a value between <code class="computeroutput"><span class="number">0</span></code>
|
||
and <code class="computeroutput"><span class="identifier">UCHAR_MAX</span></code> inclusive.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">widen</span><span class="special">(</span><span class="identifier">ch</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">CharT</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Widens the specified <code class="computeroutput"><span class="keyword">char</span></code>
|
||
and returns the resulting <code class="computeroutput"><span class="identifier">CharT</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">in_range</span><span class="special">(</span><span class="identifier">r1</span><span class="special">,</span>
|
||
<span class="identifier">r2</span><span class="special">,</span>
|
||
<span class="identifier">c</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="keyword">bool</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
For any characters <code class="computeroutput"><span class="identifier">r1</span></code>
|
||
and <code class="computeroutput"><span class="identifier">r2</span></code>, returns
|
||
<code class="computeroutput"><span class="keyword">true</span></code> if <code class="computeroutput"><span class="identifier">r1</span> <span class="special"><=</span>
|
||
<span class="identifier">c</span> <span class="special">&&</span>
|
||
<span class="identifier">c</span> <span class="special"><=</span>
|
||
<span class="identifier">r2</span></code>. Requires that <code class="computeroutput"><span class="identifier">r1</span> <span class="special"><=</span>
|
||
<span class="identifier">r2</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">in_range_nocase</span><span class="special">(</span><span class="identifier">r1</span><span class="special">,</span>
|
||
<span class="identifier">r2</span><span class="special">,</span>
|
||
<span class="identifier">c</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="keyword">bool</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
For characters <code class="computeroutput"><span class="identifier">r1</span></code>
|
||
and <code class="computeroutput"><span class="identifier">r2</span></code>, returns
|
||
<code class="computeroutput"><span class="keyword">true</span></code> if there is some
|
||
character <code class="computeroutput"><span class="identifier">d</span></code> for
|
||
which <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">d</span><span class="special">)</span>
|
||
<span class="special">==</span> <span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span></code> and <code class="computeroutput"><span class="identifier">r1</span>
|
||
<span class="special"><=</span> <span class="identifier">d</span>
|
||
<span class="special">&&</span> <span class="identifier">d</span>
|
||
<span class="special"><=</span> <span class="identifier">r2</span></code>.
|
||
Requires that <code class="computeroutput"><span class="identifier">r1</span> <span class="special"><=</span> <span class="identifier">r2</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_type</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Returns a character such that, for any character <code class="computeroutput"><span class="identifier">d</span></code>
that is to be considered equivalent to <code class="computeroutput"><span class="identifier">c</span></code>,
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span>
<span class="special">==</span> <span class="identifier">v</span><span class="special">.</span><span class="identifier">translate</span><span class="special">(</span><span class="identifier">d</span><span class="special">)</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_type</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
For all characters <code class="computeroutput"><span class="identifier">C</span></code>
that are to be considered equivalent to <code class="computeroutput"><span class="identifier">c</span></code>
when comparisons are performed without regard to case,
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">c</span><span class="special">)</span>
<span class="special">==</span> <span class="identifier">v</span><span class="special">.</span><span class="identifier">translate_nocase</span><span class="special">(</span><span class="identifier">C</span><span class="special">)</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
|
||
<span class="identifier">F2</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Returns a sort key for the character sequence designated by the
|
||
iterator range <code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span> <span class="identifier">F2</span><span class="special">)</span></code> such that if the character sequence
|
||
<code class="computeroutput"><span class="special">[</span><span class="identifier">G1</span><span class="special">,</span> <span class="identifier">G2</span><span class="special">)</span></code> sorts before the character sequence
|
||
<code class="computeroutput"><span class="special">[</span><span class="identifier">H1</span><span class="special">,</span> <span class="identifier">H2</span><span class="special">)</span></code> then <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">G1</span><span class="special">,</span> <span class="identifier">G2</span><span class="special">)</span> <span class="special"><</span>
|
||
<span class="identifier">v</span><span class="special">.</span><span class="identifier">transform</span><span class="special">(</span><span class="identifier">H1</span><span class="special">,</span>
|
||
<span class="identifier">H2</span><span class="special">)</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">transform_primary</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
|
||
<span class="identifier">F2</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Returns a sort key for the character sequence designated by the
|
||
iterator range <code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span> <span class="identifier">F2</span><span class="special">)</span></code> such that if the character sequence
|
||
<code class="computeroutput"><span class="special">[</span><span class="identifier">G1</span><span class="special">,</span> <span class="identifier">G2</span><span class="special">)</span></code> sorts before the character sequence
|
||
<code class="computeroutput"><span class="special">[</span><span class="identifier">H1</span><span class="special">,</span> <span class="identifier">H2</span><span class="special">)</span></code> when character case is not considered
|
||
then <code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">transform_primary</span><span class="special">(</span><span class="identifier">G1</span><span class="special">,</span>
|
||
<span class="identifier">G2</span><span class="special">)</span>
|
||
<span class="special"><</span> <span class="identifier">v</span><span class="special">.</span><span class="identifier">transform_primary</span><span class="special">(</span><span class="identifier">H1</span><span class="special">,</span> <span class="identifier">H2</span><span class="special">)</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">lookup_classname</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
|
||
<span class="identifier">F2</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">char_class_type</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Converts the character sequence designated by the iterator range
<code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span><span class="identifier">F2</span><span class="special">)</span></code> into a bitmask value that can subsequently
be passed to <code class="computeroutput"><span class="identifier">isctype</span></code>.
Values returned from <code class="computeroutput"><span class="identifier">lookup_classname</span></code>
can be safely bitwise-or'ed together. Returns <code class="computeroutput"><span class="number">0</span></code>
if the character sequence is not the name of a character class
recognized by <code class="computeroutput"><span class="identifier">X</span></code>.
The value returned shall be independent of the case of the characters
in the sequence.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">lookup_collatename</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
|
||
<span class="identifier">F2</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">string_type</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Returns a sequence of characters that represents the collating
|
||
element consisting of the character sequence designated by the
|
||
iterator range <code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span> <span class="identifier">F2</span><span class="special">)</span></code>. Returns an empty string if the
|
||
character sequence is not a valid collating element.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">isctype</span><span class="special">(</span><span class="identifier">c</span><span class="special">,</span>
|
||
<span class="identifier">v</span><span class="special">.</span><span class="identifier">lookup_classname</span><span class="special">(</span><span class="identifier">F1</span><span class="special">,</span>
|
||
<span class="identifier">F2</span><span class="special">))</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="keyword">bool</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Returns <code class="computeroutput"><span class="keyword">true</span></code> if character
|
||
<code class="computeroutput"><span class="identifier">c</span></code> is a member of
|
||
the character class designated by the iterator range <code class="computeroutput"><span class="special">[</span><span class="identifier">F1</span><span class="special">,</span> <span class="identifier">F2</span><span class="special">)</span></code>, <code class="computeroutput"><span class="keyword">false</span></code>
|
||
otherwise.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">value</span><span class="special">(</span><span class="identifier">c</span><span class="special">,</span>
|
||
<span class="identifier">i</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="keyword">int</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Returns the value represented by the digit <code class="computeroutput"><span class="identifier">c</span></code>
|
||
in base <code class="computeroutput"><span class="identifier">i</span></code> if the
|
||
character <code class="computeroutput"><span class="identifier">c</span></code> is
|
||
a valid digit in base <code class="computeroutput"><span class="identifier">i</span></code>;
|
||
otherwise returns <code class="computeroutput"><span class="special">-</span><span class="number">1</span></code>.<br> [Note: the value of <code class="computeroutput"><span class="identifier">i</span></code> will only be <code class="computeroutput"><span class="number">8</span></code>, <code class="computeroutput"><span class="number">10</span></code>,
|
||
or <code class="computeroutput"><span class="number">16</span></code>. -end note]
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">u</span><span class="special">.</span><span class="identifier">imbue</span><span class="special">(</span><span class="identifier">loc</span><span class="special">)</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">locale_type</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Imbues <code class="computeroutput"><span class="identifier">u</span></code> with the
|
||
locale <code class="computeroutput"><span class="identifier">loc</span></code>, returns
|
||
the previous locale used by <code class="computeroutput"><span class="identifier">u</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
<tr>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">v</span><span class="special">.</span><span class="identifier">getloc</span><span class="special">()</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
<code class="computeroutput"><span class="identifier">X</span><span class="special">::</span><span class="identifier">locale_type</span></code>
|
||
</p>
|
||
</td>
|
||
<td>
|
||
<p>
|
||
Returns the current locale used by <code class="computeroutput"><span class="identifier">v</span></code>.
|
||
</p>
|
||
</td>
|
||
</tr>
|
||
</tbody>
|
||
</table></div>
|
||
</div>
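<p>
As a rough illustration of a few of the requirements in the table above, the
following sketch defines a hypothetical, ASCII-only traits-like class using only
the standard library. The name <code class="computeroutput">ascii_traits_sketch</code>
and everything in it are assumptions made for illustration; it is not
Boost.Xpressive's own traits class. It covers only <code class="computeroutput">hash</code>,
<code class="computeroutput">widen</code>, <code class="computeroutput">translate</code>,
<code class="computeroutput">translate_nocase</code>, <code class="computeroutput">in_range</code>,
<code class="computeroutput">in_range_nocase</code> and <code class="computeroutput">value</code>,
and it leaves out the locale, collation and character-class members.
</p>

```cpp
#include <cassert>
#include <cctype>
#include <climits>
#include <string>

// Hypothetical, ASCII-only sketch of part of the traits requirements.
// This is NOT the traits class shipped with Boost.Xpressive.
struct ascii_traits_sketch
{
    typedef char char_type;
    typedef std::string string_type;

    // hash(c): any value in [0, UCHAR_MAX] is acceptable.
    static unsigned char hash(char_type c)
    {
        return static_cast<unsigned char>(c);
    }

    // widen(ch): char -> char_type; the identity when char_type is char.
    char_type widen(char ch) const { return ch; }

    // translate(c): identity, so no two distinct characters are equivalent.
    char_type translate(char_type c) const { return c; }

    // translate_nocase(c): fold to lower case, so 'A' and 'a' compare equal.
    char_type translate_nocase(char_type c) const
    {
        return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }

    // in_range(r1, r2, c): true if r1 <= c && c <= r2 (requires r1 <= r2).
    bool in_range(char_type r1, char_type r2, char_type c) const
    {
        return r1 <= c && c <= r2;
    }

    // in_range_nocase(r1, r2, c): true if some d that folds to the same
    // character as c lies in [r1, r2]. For ASCII letters the only candidates
    // are the lower-case and upper-case forms of c.
    bool in_range_nocase(char_type r1, char_type r2, char_type c) const
    {
        unsigned char uc = static_cast<unsigned char>(c);
        char lower = static_cast<char>(std::tolower(uc));
        char upper = static_cast<char>(std::toupper(uc));
        return in_range(r1, r2, lower) || in_range(r1, r2, upper);
    }

    // value(c, radix): numeric value of digit c in the given base, or -1.
    int value(char_type c, int radix) const
    {
        int digit = -1;
        if (c >= '0' && c <= '9')      digit = c - '0';
        else if (c >= 'a' && c <= 'f') digit = c - 'a' + 10;
        else if (c >= 'A' && c <= 'F') digit = c - 'A' + 10;
        return (digit >= 0 && digit < radix) ? digit : -1;
    }
};
```

<p>
With this sketch, <code class="computeroutput">in_range('a', 'z', 'Q')</code> is false
while <code class="computeroutput">in_range_nocase('a', 'z', 'Q')</code> is true, because
<code class="computeroutput">'q'</code> folds to the same character as
<code class="computeroutput">'Q'</code> and lies inside the range.
</p>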
|
||
<br class="table-break"><h3>
|
||
<a name="boost_xpressive.user_s_guide.concepts.h2"></a>
|
||
<span class="phrase"><a name="boost_xpressive.user_s_guide.concepts.acknowledgements"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.concepts.acknowledgements">Acknowledgements</a>
|
||
</h3>
|
||
<p>
This section is adapted from the equivalent page in the <a href="../../../libs/regex" target="_top">Boost.Regex</a>
documentation and from the <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1429.htm" target="_top">proposal</a>
to add regular expressions to the Standard Library.
</p>
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h3 class="title">
|
||
<a name="boost_xpressive.user_s_guide.examples"></a><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">Examples</a>
|
||
</h3></div></div></div>
|
||
<p>
Below you can find six complete sample programs. <br>
</p>
<p></p>
<h5>
<a name="boost_xpressive.user_s_guide.examples.h0"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.examples.see_if_a_whole_string_matches_a_regex"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_whole_string_matches_a_regex">See
if a whole string matches a regex</a>
</h5>
<p>
This is the example from the Introduction. It is reproduced here for your
convenience.
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">iostream</span><span class="special">&gt;</span>
<span class="preprocessor">#include</span> <span class="special">&lt;</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">&gt;</span>

<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">hello</span><span class="special">(</span> <span class="string">"hello world!"</span> <span class="special">);</span>

    <span class="identifier">sregex</span> <span class="identifier">rex</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\w+) (\\w+)!"</span> <span class="special">);</span>
    <span class="identifier">smatch</span> <span class="identifier">what</span><span class="special">;</span>

    <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_match</span><span class="special">(</span> <span class="identifier">hello</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">rex</span> <span class="special">)</span> <span class="special">)</span>
    <span class="special">{</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// whole match</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">1</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// first capture</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">&lt;&lt;</span> <span class="identifier">what</span><span class="special">[</span><span class="number">2</span><span class="special">]</span> <span class="special">&lt;&lt;</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// second capture</span>
    <span class="special">}</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
This program outputs the following:
</p>
<pre class="programlisting">hello world!
hello
world
</pre>
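<p>
To see the whole-string versus sub-string distinction in isolation, here is a
minimal sketch that uses <code class="computeroutput">std::regex</code> from the
standard library rather than xpressive. The pattern text is the one from the
example above; the helper names <code class="computeroutput">matches_whole</code>
and <code class="computeroutput">contains_match</code> are hypothetical and exist
only for this illustration.
</p>

```cpp
#include <cassert>
#include <regex>
#include <string>

// Standard-library analogy, not xpressive code.
// True only if the pattern consumes the entire input.
inline bool matches_whole(const std::string& text)
{
    static const std::regex rex("(\\w+) (\\w+)!");
    return std::regex_match(text, rex);
}

// True if the pattern matches anywhere inside the input.
inline bool contains_match(const std::string& text)
{
    static const std::regex rex("(\\w+) (\\w+)!");
    return std::regex_search(text, rex);
}
```

<p>
Under these assumptions, <code class="computeroutput">"hello world!"</code> satisfies
both helpers, whereas <code class="computeroutput">"oh, hello world!"</code> satisfies
only <code class="computeroutput">contains_match</code>, because the leading text is
not part of the pattern.
</p>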
<p>
<br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
</p>
<p></p>
<h5>
<a name="boost_xpressive.user_s_guide.examples.h1"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.examples.see_if_a_string_contains_a_sub_string_that_matches_a_regex"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.see_if_a_string_contains_a_sub_string_that_matches_a_regex">See
if a string contains a sub-string that matches a regex</a>
</h5>
<p>
Notice in this example how we use custom <code class="computeroutput"><span class="identifier">mark_tag</span></code>s
to make the pattern more readable. We can use the <code class="computeroutput"><span class="identifier">mark_tag</span></code>s
later to index into the <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/match_results.html" title="Struct template match_results">match_results&lt;&gt;</a></code></code>.
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
|
||
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>
|
||
|
||
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>
|
||
|
||
<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
|
||
<span class="special">{</span>
|
||
<span class="keyword">char</span> <span class="keyword">const</span> <span class="special">*</span><span class="identifier">str</span> <span class="special">=</span> <span class="string">"I was born on 5/30/1973 at 7am."</span><span class="special">;</span>
|
||
|
||
<span class="comment">// define some custom mark_tags with names more meaningful than s1, s2, etc.</span>
|
||
<span class="identifier">mark_tag</span> <span class="identifier">day</span><span class="special">(</span><span class="number">1</span><span class="special">),</span> <span class="identifier">month</span><span class="special">(</span><span class="number">2</span><span class="special">),</span> <span class="identifier">year</span><span class="special">(</span><span class="number">3</span><span class="special">),</span> <span class="identifier">delim</span><span class="special">(</span><span class="number">4</span><span class="special">);</span>
|
||
|
||
<span class="comment">// this regex finds a date</span>
|
||
<span class="identifier">cregex</span> <span class="identifier">date</span> <span class="special">=</span> <span class="special">(</span><span class="identifier">month</span><span class="special">=</span> <span class="identifier">repeat</span><span class="special"><</span><span class="number">1</span><span class="special">,</span><span class="number">2</span><span class="special">>(</span><span class="identifier">_d</span><span class="special">))</span> <span class="comment">// find the month ...</span>
|
||
<span class="special">>></span> <span class="special">(</span><span class="identifier">delim</span><span class="special">=</span> <span class="special">(</span><span class="identifier">set</span><span class="special">=</span> <span class="char">'/'</span><span class="special">,</span><span class="char">'-'</span><span class="special">))</span> <span class="comment">// followed by a delimiter ...</span>
|
||
<span class="special">>></span> <span class="special">(</span><span class="identifier">day</span><span class="special">=</span> <span class="identifier">repeat</span><span class="special"><</span><span class="number">1</span><span class="special">,</span><span class="number">2</span><span class="special">>(</span><span class="identifier">_d</span><span class="special">))</span> <span class="special">>></span> <span class="identifier">delim</span> <span class="comment">// and a day followed by the same delimiter ...</span>
|
||
        <span class="special">>></span> <span class="special">(</span><span class="identifier">year</span><span class="special">=</span> <span class="identifier">repeat</span><span class="special"><</span><span class="number">1</span><span class="special">,</span><span class="number">2</span><span class="special">>(</span><span class="identifier">_d</span> <span class="special">>></span> <span class="identifier">_d</span><span class="special">));</span> <span class="comment">// and the year.</span>

    <span class="identifier">cmatch</span> <span class="identifier">what</span><span class="special">;</span>

    <span class="keyword">if</span><span class="special">(</span> <span class="identifier">regex_search</span><span class="special">(</span> <span class="identifier">str</span><span class="special">,</span> <span class="identifier">what</span><span class="special">,</span> <span class="identifier">date</span> <span class="special">)</span> <span class="special">)</span>
    <span class="special">{</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// whole match</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">day</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// the day</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">month</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// the month</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">year</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// the year</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="identifier">delim</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span> <span class="comment">// the delimiter</span>
    <span class="special">}</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
This program outputs the following:
</p>
<pre class="programlisting">5/30/1973
30
5
1973
/
</pre>
<p>
<br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
</p>
<p></p>
<h5>
<a name="boost_xpressive.user_s_guide.examples.h2"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.examples.replace_all_sub_strings_that_match_a_regex"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.replace_all_sub_strings_that_match_a_regex">Replace all sub-strings that match a regex</a>
</h5>
<p>
The following program finds dates in a string and marks them up with pseudo-HTML.
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">string</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>

<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"I was born on 5/30/1973 at 7am."</span> <span class="special">);</span>

    <span class="comment">// essentially the same regex as in the previous example, but using a dynamic regex</span>
    <span class="identifier">sregex</span> <span class="identifier">date</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\d{1,2})([/-])(\\d{1,2})\\2((?:\\d{2}){1,2})"</span> <span class="special">);</span>

    <span class="comment">// As in Perl, $& is a reference to the sub-string that matched the regex</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">format</span><span class="special">(</span> <span class="string">"<date>$&</date>"</span> <span class="special">);</span>

    <span class="identifier">str</span> <span class="special">=</span> <span class="identifier">regex_replace</span><span class="special">(</span> <span class="identifier">str</span><span class="special">,</span> <span class="identifier">date</span><span class="special">,</span> <span class="identifier">format</span> <span class="special">);</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">str</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
This program outputs the following:
</p>
<pre class="programlisting">I was born on <date>5/30/1973</date> at 7am.
</pre>
<p>
<br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
</p>
<p></p>
<h5>
<a name="boost_xpressive.user_s_guide.examples.h3"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.examples.find_all_the_sub_strings_that_match_a_regex_and_step_through_them_one_at_a_time"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.find_all_the_sub_strings_that_match_a_regex_and_step_through_them_one_at_a_time">Find all the sub-strings that match a regex and step through them one at a time</a>
</h5>
<p>
The following program finds the words in a wide-character string. It uses
<code class="computeroutput"><span class="identifier">wsregex_iterator</span></code>. Notice
that dereferencing a <code class="computeroutput"><span class="identifier">wsregex_iterator</span></code>
yields a <code class="computeroutput"><span class="identifier">wsmatch</span></code> object.
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">string</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>

<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">wstring</span> <span class="identifier">str</span><span class="special">(</span> <span class="identifier">L</span><span class="string">"This is his face."</span> <span class="special">);</span>

    <span class="comment">// find a whole word</span>
    <span class="identifier">wsregex</span> <span class="identifier">token</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">alnum</span><span class="special">;</span>

    <span class="identifier">wsregex_iterator</span> <span class="identifier">cur</span><span class="special">(</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">token</span> <span class="special">);</span>
    <span class="identifier">wsregex_iterator</span> <span class="identifier">end</span><span class="special">;</span>

    <span class="keyword">for</span><span class="special">(</span> <span class="special">;</span> <span class="identifier">cur</span> <span class="special">!=</span> <span class="identifier">end</span><span class="special">;</span> <span class="special">++</span><span class="identifier">cur</span> <span class="special">)</span>
    <span class="special">{</span>
        <span class="identifier">wsmatch</span> <span class="keyword">const</span> <span class="special">&</span><span class="identifier">what</span> <span class="special">=</span> <span class="special">*</span><span class="identifier">cur</span><span class="special">;</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">wcout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="identifier">L</span><span class="char">'\n'</span><span class="special">;</span>
    <span class="special">}</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
This program outputs the following:
</p>
<pre class="programlisting">This
is
his
face
</pre>
<p>
<br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
</p>
<p></p>
<h5>
<a name="boost_xpressive.user_s_guide.examples.h4"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.examples.split_a_string_into_tokens_that_each_match_a_regex"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.split_a_string_into_tokens_that_each_match_a_regex">Split a string into tokens that each match a regex</a>
</h5>
<p>
The following program finds race times in a string and displays first the
minutes and then the seconds. It uses <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>.
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">string</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>

<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"Eric: 4:40, Karl: 3:35, Francesca: 2:32"</span> <span class="special">);</span>

    <span class="comment">// find a race time</span>
    <span class="identifier">sregex</span> <span class="identifier">time</span> <span class="special">=</span> <span class="identifier">sregex</span><span class="special">::</span><span class="identifier">compile</span><span class="special">(</span> <span class="string">"(\\d):(\\d\\d)"</span> <span class="special">);</span>

    <span class="comment">// for each match, the token iterator should first take the value of</span>
    <span class="comment">// the first marked sub-expression followed by the value of the second</span>
    <span class="comment">// marked sub-expression</span>
    <span class="keyword">int</span> <span class="keyword">const</span> <span class="identifier">subs</span><span class="special">[]</span> <span class="special">=</span> <span class="special">{</span> <span class="number">1</span><span class="special">,</span> <span class="number">2</span> <span class="special">};</span>

    <span class="identifier">sregex_token_iterator</span> <span class="identifier">cur</span><span class="special">(</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">time</span><span class="special">,</span> <span class="identifier">subs</span> <span class="special">);</span>
    <span class="identifier">sregex_token_iterator</span> <span class="identifier">end</span><span class="special">;</span>

    <span class="keyword">for</span><span class="special">(</span> <span class="special">;</span> <span class="identifier">cur</span> <span class="special">!=</span> <span class="identifier">end</span><span class="special">;</span> <span class="special">++</span><span class="identifier">cur</span> <span class="special">)</span>
    <span class="special">{</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="special">*</span><span class="identifier">cur</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>
    <span class="special">}</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
This program outputs the following:
</p>
<pre class="programlisting">4
40
3
35
2
32
</pre>
<p>
<br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
</p>
<p></p>
<h5>
<a name="boost_xpressive.user_s_guide.examples.h5"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.examples.split_a_string_using_a_regex_as_a_delimiter"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.split_a_string_using_a_regex_as_a_delimiter">Split a string using a regex as a delimiter</a>
</h5>
<p>
The following program takes some text that has been marked up with HTML and
strips out the mark-up. It uses a regex that matches an HTML tag and a <code class="literal"><code class="computeroutput"><a class="link" href="../boost/xpressive/regex_token_iterator.html" title="Struct template regex_token_iterator">regex_token_iterator<></a></code></code>
that returns the parts of the string that do <span class="emphasis"><em>not</em></span> match
the regex.
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">string</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">/</span><span class="identifier">xpressive</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span>

<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">;</span>

<span class="keyword">int</span> <span class="identifier">main</span><span class="special">()</span>
<span class="special">{</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span><span class="special">(</span> <span class="string">"Now <bold>is the time <i>for all good men</i> to come to the aid of their</bold> country."</span> <span class="special">);</span>

    <span class="comment">// find an HTML tag</span>
    <span class="identifier">sregex</span> <span class="identifier">html</span> <span class="special">=</span> <span class="char">'<'</span> <span class="special">>></span> <span class="identifier">optional</span><span class="special">(</span><span class="char">'/'</span><span class="special">)</span> <span class="special">>></span> <span class="special">+</span><span class="identifier">_w</span> <span class="special">>></span> <span class="char">'>'</span><span class="special">;</span>

    <span class="comment">// the -1 below directs the token iterator to display the parts of</span>
    <span class="comment">// the string that did NOT match the regular expression.</span>
    <span class="identifier">sregex_token_iterator</span> <span class="identifier">cur</span><span class="special">(</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">begin</span><span class="special">(),</span> <span class="identifier">str</span><span class="special">.</span><span class="identifier">end</span><span class="special">(),</span> <span class="identifier">html</span><span class="special">,</span> <span class="special">-</span><span class="number">1</span> <span class="special">);</span>
    <span class="identifier">sregex_token_iterator</span> <span class="identifier">end</span><span class="special">;</span>

    <span class="keyword">for</span><span class="special">(</span> <span class="special">;</span> <span class="identifier">cur</span> <span class="special">!=</span> <span class="identifier">end</span><span class="special">;</span> <span class="special">++</span><span class="identifier">cur</span> <span class="special">)</span>
    <span class="special">{</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="char">'{'</span> <span class="special"><<</span> <span class="special">*</span><span class="identifier">cur</span> <span class="special"><<</span> <span class="char">'}'</span><span class="special">;</span>
    <span class="special">}</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>

    <span class="keyword">return</span> <span class="number">0</span><span class="special">;</span>
<span class="special">}</span>
</pre>
<p>
This program outputs the following:
</p>
<pre class="programlisting">{Now }{is the time }{for all good men}{ to come to the aid of their}{ country.}
</pre>
<p>
<br> <a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
</p>
<p></p>
<h5>
<a name="boost_xpressive.user_s_guide.examples.h6"></a>
<span class="phrase"><a name="boost_xpressive.user_s_guide.examples.display_a_tree_of_nested_results"></a></span><a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples.display_a_tree_of_nested_results">Display a tree of nested results</a>
</h5>
<p>
Here is a helper class to demonstrate how you might display a tree of nested
results:
</p>
<pre class="programlisting"><span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">algorithm</span><span class="special">></span> <span class="comment">// std::fill_n, std::for_each</span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iostream</span><span class="special">></span>
<span class="preprocessor">#include</span> <span class="special"><</span><span class="identifier">iterator</span><span class="special">></span> <span class="comment">// std::iterator_traits, std::ostream_iterator</span>

<span class="comment">// Displays nested results to std::cout with indenting</span>
<span class="keyword">struct</span> <span class="identifier">output_nested_results</span>
<span class="special">{</span>
    <span class="keyword">int</span> <span class="identifier">tabs_</span><span class="special">;</span>

    <span class="identifier">output_nested_results</span><span class="special">(</span> <span class="keyword">int</span> <span class="identifier">tabs</span> <span class="special">=</span> <span class="number">0</span> <span class="special">)</span>
        <span class="special">:</span> <span class="identifier">tabs_</span><span class="special">(</span> <span class="identifier">tabs</span> <span class="special">)</span>
    <span class="special">{</span>
    <span class="special">}</span>

    <span class="keyword">template</span><span class="special"><</span> <span class="keyword">typename</span> <span class="identifier">BidiIterT</span> <span class="special">></span>
    <span class="keyword">void</span> <span class="keyword">operator</span> <span class="special">()(</span> <span class="identifier">match_results</span><span class="special"><</span> <span class="identifier">BidiIterT</span> <span class="special">></span> <span class="keyword">const</span> <span class="special">&</span><span class="identifier">what</span> <span class="special">)</span> <span class="keyword">const</span>
    <span class="special">{</span>
        <span class="comment">// first, do some indenting</span>
        <span class="keyword">typedef</span> <span class="keyword">typename</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">iterator_traits</span><span class="special"><</span> <span class="identifier">BidiIterT</span> <span class="special">>::</span><span class="identifier">value_type</span> <span class="identifier">char_type</span><span class="special">;</span>
        <span class="identifier">char_type</span> <span class="identifier">space_ch</span> <span class="special">=</span> <span class="identifier">char_type</span><span class="special">(</span><span class="char">' '</span><span class="special">);</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">fill_n</span><span class="special">(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">ostream_iterator</span><span class="special"><</span><span class="identifier">char_type</span><span class="special">>(</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special">),</span> <span class="identifier">tabs_</span> <span class="special">*</span> <span class="number">4</span><span class="special">,</span> <span class="identifier">space_ch</span> <span class="special">);</span>

        <span class="comment">// output the match</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">cout</span> <span class="special"><<</span> <span class="identifier">what</span><span class="special">[</span><span class="number">0</span><span class="special">]</span> <span class="special"><<</span> <span class="char">'\n'</span><span class="special">;</span>

        <span class="comment">// output any nested matches</span>
        <span class="identifier">std</span><span class="special">::</span><span class="identifier">for_each</span><span class="special">(</span>
            <span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">begin</span><span class="special">(),</span>
            <span class="identifier">what</span><span class="special">.</span><span class="identifier">nested_results</span><span class="special">().</span><span class="identifier">end</span><span class="special">(),</span>
            <span class="identifier">output_nested_results</span><span class="special">(</span> <span class="identifier">tabs_</span> <span class="special">+</span> <span class="number">1</span> <span class="special">)</span> <span class="special">);</span>
    <span class="special">}</span>
<span class="special">};</span>
</pre>
<p>
<a class="link" href="user_s_guide.html#boost_xpressive.user_s_guide.examples" title="Examples">top</a>
</p>
</div>
<div class="footnotes">
<br><hr style="width:100; text-align:left;margin-left: 0">
<div id="ftn.boost_xpressive.user_s_guide.introduction.f0" class="footnote"><p><a href="#boost_xpressive.user_s_guide.introduction.f0" class="para"><sup class="para">[36] </sup></a>
See <a href="http://www.osl.iu.edu/~tveldhui/papers/Expression-Templates/exprtmpl.html" target="_top">Expression Templates</a>
</p></div>
<div id="ftn.boost_xpressive.user_s_guide.symbol_tables_and_attributes.f0" class="footnote"><p><a href="#boost_xpressive.user_s_guide.symbol_tables_and_attributes.f0" class="para"><sup class="para">[37] </sup></a>
Many thanks to David Jenkins, who contributed this example.
</p></div>
</div>
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright © 2007 Eric Niebler<p>
Distributed under the Boost Software License, Version 1.0. (See accompanying
file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
</p>
</div></td>
</tr></table>
</body>
</html>