<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>User manual</title>
<link rel="stylesheet" href="../../../doc/src/boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="../index.html" title="The Boost C++ Libraries BoostBook Documentation Subset">
<link rel="up" href="../metaparse.html" title="Chapter 24. Boost.Metaparse">
<link rel="prev" href="getting_started_with_boost_metap.html" title="Getting started with Boost.Metaparse">
<link rel="next" href="versioning.html" title="Versioning">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<table cellpadding="2" width="100%"><tr>
<td valign="top"><img alt="Boost C++ Libraries" width="277" height="86" src="../../../boost.png"></td>
<td align="center"><a href="../../../index.html">Home</a></td>
<td align="center"><a href="../../../libs/libraries.htm">Libraries</a></td>
<td align="center"><a href="http://www.boost.org/users/people.html">People</a></td>
<td align="center"><a href="http://www.boost.org/users/faq.html">FAQ</a></td>
<td align="center"><a href="../../../more/index.htm">More</a></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="getting_started_with_boost_metap.html"><img src="../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../metaparse.html"><img src="../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="versioning.html"><img src="../../../doc/src/images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="metaparse.user_manual"></a><a name="manual"></a><a class="link" href="user_manual.html" title="User manual">User manual</a>
</h2></div></div></div>
<div class="toc"><dl class="toc">
<dt><span class="section"><a href="user_manual.html#metaparse.user_manual.what_is_a_parser">What is a parser</a></span></dt>
<dt><span class="section"><a href="user_manual.html#metaparse.user_manual.parsing_based_on_constexpr">Parsing
based on <code class="computeroutput"><span class="keyword">constexpr</span></code></a></span></dt>
<dt><span class="section"><a href="user_manual.html#metaparse.user_manual.what_types_of_grammars_can_be_us">What
types of grammars can be used?</a></span></dt>
</dl></div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="metaparse.user_manual.what_is_a_parser"></a><a class="link" href="user_manual.html#metaparse.user_manual.what_is_a_parser" title="What is a parser">What is a parser</a>
</h3></div></div></div>
<div class="toc"><dl class="toc">
<dt><span class="section"><a href="user_manual.html#metaparse.user_manual.what_is_a_parser.the_input_of_the_parsers">The
input of the parsers</a></span></dt>
<dt><span class="section"><a href="user_manual.html#metaparse.user_manual.what_is_a_parser.source_positions">Source
positions</a></span></dt>
<dt><span class="section"><a href="user_manual.html#metaparse.user_manual.what_is_a_parser.error_handling">Error
handling</a></span></dt>
<dt><span class="section"><a href="user_manual.html#metaparse.user_manual.what_is_a_parser.some_examples_of_simple_parsers">Some
examples of simple parsers</a></span></dt>
<dt><span class="section"><a href="user_manual.html#metaparse.user_manual.what_is_a_parser.combining_parsers">Combining
parsers</a></span></dt>
<dt><span class="section"><a href="user_manual.html#metaparse.user_manual.what_is_a_parser.sequence">Sequence</a></span></dt>
<dt><span class="section"><a href="user_manual.html#metaparse.user_manual.what_is_a_parser.repetition">Repetition</a></span></dt>
<dt><span class="section"><a href="user_manual.html#metaparse.user_manual.what_is_a_parser.what_can_be_built_from_a_compile">What
can be built from a compile-time string?</a></span></dt>
<dt><span class="section"><a href="user_manual.html#metaparse.user_manual.what_is_a_parser.grammars">Grammars</a></span></dt>
</dl></div>
<p>
See the <a class="link" href="reference.html#parser">parser</a> section of the <a class="link" href="reference.html#reference">reference</a>
for the explanation of what a parser is.
</p>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="metaparse.user_manual.what_is_a_parser.the_input_of_the_parsers"></a><a class="link" href="user_manual.html#metaparse.user_manual.what_is_a_parser.the_input_of_the_parsers" title="The input of the parsers">The
input of the parsers</a>
</h4></div></div></div>
<p>
Parsers take a <a class="link" href="reference.html#string"><code class="computeroutput"><span class="identifier">string</span></code></a>
as input, which represents a string for template metaprograms. For example,
the string <code class="computeroutput"><span class="string">"Hello World!"</span></code>
can be defined in the following way:
</p>
<pre class="programlisting"><span class="identifier">string</span><span class="special"><</span><span class="char">'H'</span><span class="special">,</span><span class="char">'e'</span><span class="special">,</span><span class="char">'l'</span><span class="special">,</span><span class="char">'l'</span><span class="special">,</span><span class="char">'o'</span><span class="special">,</span><span class="char">' '</span><span class="special">,</span><span class="char">'W'</span><span class="special">,</span><span class="char">'o'</span><span class="special">,</span><span class="char">'r'</span><span class="special">,</span><span class="char">'l'</span><span class="special">,</span><span class="char">'d'</span><span class="special">,</span><span class="char">'!'</span><span class="special">></span>
</pre>
<p>
This syntax makes the input of the parsers difficult to read. Metaparse
works with C++98 compilers, but with them the input of the parsers has to be
defined as shown above.
</p>
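<p>
The sketch below illustrates the idea of carrying a string in a type. It is
not Metaparse's implementation: <code class="computeroutput">ct_string</code>,
<code class="computeroutput">front</code> and <code class="computeroutput">pop_front</code>
are hypothetical names, and C++11 variadic templates are assumed for brevity,
which a C++98 implementation cannot use.
</p>

```cpp
#include <cstddef>

// Hypothetical stand-in for a compile-time string: every character is a
// template argument, so the text is encoded in the type itself.
template <char... Cs>
struct ct_string {
  static constexpr std::size_t size = sizeof...(Cs);
};

// front<S>::value is the first character of a non-empty ct_string.
template <class S>
struct front;

template <char C, char... Cs>
struct front<ct_string<C, Cs...>> {
  static constexpr char value = C;
};

// pop_front<S>::type is S without its first character.
template <class S>
struct pop_front;

template <char C, char... Cs>
struct pop_front<ct_string<C, Cs...>> {
  using type = ct_string<Cs...>;
};

using hello = ct_string<'H','e','l','l','o',' ','W','o','r','l','d','!'>;

static_assert(hello::size == 12, "the type carries all twelve characters");
static_assert(front<hello>::value == 'H', "the first character is 'H'");
```

<p>
Because the characters are template arguments, a metaprogram can take the
string apart one character at a time, which is what a compile-time parser
needs to do.
</p>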
<p>
Based on <code class="computeroutput"><span class="keyword">constexpr</span></code>, a feature
introduced in C++11, Metaparse provides a macro, <a class="link" href="reference.html#BOOST_METAPARSE_STRING"><code class="computeroutput"><span class="identifier">BOOST_METAPARSE_STRING</span></code></a>, for defining
strings:
</p>
<pre class="programlisting"><span class="identifier">BOOST_METAPARSE_STRING</span><span class="special">(</span><span class="string">"Hello World!"</span><span class="special">)</span>
</pre>
<p>
This defines a <a class="link" href="reference.html#string"><code class="computeroutput"><span class="identifier">string</span></code></a>
as well, but it is easier to read. The maximum length of a string
defined this way is limited, but the limit is configurable:
it is specified by the <code class="computeroutput"><span class="identifier">BOOST_METAPARSE_LIMIT_STRING_SIZE</span></code>
macro.
</p>
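<p>
A minimal sketch of how <code class="computeroutput">constexpr</code> can turn
a string literal into such a type. It is an illustration under assumptions,
not the definition of the Metaparse macro: <code class="computeroutput">char_at</code>,
<code class="computeroutput">append_unless_nul</code>, <code class="computeroutput">make_string4</code>
and <code class="computeroutput">SKETCH_STRING4</code> are hypothetical names, and
the length limit is fixed at four characters.
</p>

```cpp
#include <cstddef>
#include <type_traits>

template <char... Cs>
struct ct_string {};

// constexpr makes it possible to read one character of a string literal
// in a constant expression; positions past the end yield '\0'.
template <std::size_t N>
constexpr char char_at(const char (&s)[N], std::size_t i) {
  return i < N ? s[i] : '\0';
}

// Appends C to a ct_string, except when C is the '\0' padding.
template <class S, char C>
struct append_unless_nul;

template <char... Cs, char C>
struct append_unless_nul<ct_string<Cs...>, C> {
  using type = ct_string<Cs..., C>;
};

template <char... Cs>
struct append_unless_nul<ct_string<Cs...>, '\0'> {
  using type = ct_string<Cs...>;
};

// Builds a ct_string from exactly four (possibly '\0') characters.
template <char C0, char C1, char C2, char C3>
struct make_string4 {
  using s0 = typename append_unless_nul<ct_string<>, C0>::type;
  using s1 = typename append_unless_nul<s0, C1>::type;
  using s2 = typename append_unless_nul<s1, C2>::type;
  using type = typename append_unless_nul<s2, C3>::type;
};

// Hypothetical macro: the fixed number of char_at calls is the size limit.
#define SKETCH_STRING4(s) \
  make_string4<char_at(s, 0), char_at(s, 1), char_at(s, 2), char_at(s, 3)>::type

static_assert(
    std::is_same<SKETCH_STRING4("Hi"), ct_string<'H', 'i'>>::value,
    "the literal becomes a character pack");
```

<p>
In this sketch the number of <code class="computeroutput">char_at</code> calls
written into the macro bounds the length of the literal, which suggests why a
macro of this kind needs a configurable size limit.
</p>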
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="metaparse.user_manual.what_is_a_parser.source_positions"></a><a class="link" href="user_manual.html#metaparse.user_manual.what_is_a_parser.source_positions" title="Source positions">Source
positions</a>
</h4></div></div></div>
<p>
A source position is described using a compile-time data structure. The
following functions can be used to query it:
</p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
<a class="link" href="reference.html#get_col"><code class="computeroutput"><span class="identifier">get_col</span></code></a>
</li>
<li class="listitem">
<a class="link" href="reference.html#get_line"><code class="computeroutput"><span class="identifier">get_line</span></code></a>
</li>
</ul></div>
<p>
The beginning of the input is <a class="link" href="reference.html#start"><code class="computeroutput"><span class="identifier">start</span></code></a>,
which requires <code class="computeroutput"><span class="special"><</span><span class="identifier">boost</span><span class="special">/</span><span class="identifier">metaparse</span><span class="special">/</span><span class="identifier">start</span><span class="special">.</span><span class="identifier">hpp</span><span class="special">></span></code> to be included.
</p>
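<p>
The following sketch shows one way a source position can live in a type. It
is an assumption-laden illustration, not Metaparse's data structure:
<code class="computeroutput">position</code>, <code class="computeroutput">start_position</code>
and <code class="computeroutput">next_position</code> are hypothetical names, the
line and column are read through plain static members rather than through
<code class="computeroutput">get_line</code> and <code class="computeroutput">get_col</code>,
and counting is assumed to start at line 1, column 1.
</p>

```cpp
// Hypothetical source position: line and column are template arguments.
template <int Line, int Col>
struct position {
  static constexpr int line = Line;
  static constexpr int col = Col;
};

// Assumed starting point: line 1, column 1.
using start_position = position<1, 1>;

// next_position<Pos, C>::type is the position after consuming character C.
template <class Pos, char C>
struct next_position {
  using type = position<Pos::line, Pos::col + 1>;
};

// A newline moves to the first column of the next line.
template <class Pos>
struct next_position<Pos, '\n'> {
  using type = position<Pos::line + 1, 1>;
};

using after_a = next_position<start_position, 'a'>::type;
using after_newline = next_position<after_a, '\n'>::type;

static_assert(after_a::line == 1 && after_a::col == 2,
              "an ordinary character moves one column to the right");
static_assert(after_newline::line == 2 && after_newline::col == 1,
              "a newline moves to the start of the next line");
```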
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="metaparse.user_manual.what_is_a_parser.error_handling"></a><a class="link" href="user_manual.html#metaparse.user_manual.what_is_a_parser.error_handling" title="Error handling">Error
handling</a>
</h4></div></div></div>
<p>
An error is described using a compile-time data structure. It contains
information about the source position where the error was detected and
a <a class="link" href="reference.html#parsing_error_message">description</a> of the
error. <a class="link" href="reference.html#debug_parsing_error"><code class="computeroutput"><span class="identifier">debug_parsing_error</span></code></a>
can be used to display the error message. Metaparse provides the <a class="link" href="reference.html#BOOST_METAPARSE_DEFINE_ERROR"><code class="computeroutput"><span class="identifier">BOOST_METAPARSE_DEFINE_ERROR</span></code></a>
macro for defining simple <a class="link" href="reference.html#parsing_error_message">parsing
error message</a>s.
</p>
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="metaparse.user_manual.what_is_a_parser.some_examples_of_simple_parsers"></a><a class="link" href="user_manual.html#metaparse.user_manual.what_is_a_parser.some_examples_of_simple_parsers" title="Some examples of simple parsers">Some
examples of simple parsers</a>
</h4></div></div></div>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
A parser that parses nothing and always succeeds is <a class="link" href="reference.html#return_"><code class="computeroutput"><span class="identifier">return_</span></code></a>.
</li>
<li class="listitem">
A parser that always fails is <a class="link" href="reference.html#fail"><code class="computeroutput"><span class="identifier">fail</span></code></a>.
</li>
<li class="listitem">
A parser that parses one character and returns the parsed character
as the result is <a class="link" href="reference.html#one_char"><code class="computeroutput"><span class="identifier">one_char</span></code></a>.
</li>
</ul></div>
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="metaparse.user_manual.what_is_a_parser.combining_parsers"></a><a class="link" href="user_manual.html#metaparse.user_manual.what_is_a_parser.combining_parsers" title="Combining parsers">Combining
parsers</a>
</h4></div></div></div>
<p>
Complex parsers can be built by combining simple parsers. The parser library
contains a number of parser combinators that build new parsers from already
existing ones.
</p>
<p>
For example, <a class="link" href="reference.html#accept_when"><code class="computeroutput"><span class="identifier">accept_when</span></code></a><code class="computeroutput"><span class="special"><</span><span class="identifier">Parser</span><span class="special">,</span> <span class="identifier">Predicate</span><span class="special">,</span> <span class="identifier">RejectErrorMsg</span><span class="special">></span></code> is a parser. It uses <code class="computeroutput"><span class="identifier">Parser</span></code> to parse the input. When <code class="computeroutput"><span class="identifier">Parser</span></code> rejects the input, the combinator
returns the error <code class="computeroutput"><span class="identifier">Parser</span></code>
failed with. When <code class="computeroutput"><span class="identifier">Parser</span></code>
is successful, the combinator validates the result using <code class="computeroutput"><span class="identifier">Predicate</span></code>. If the predicate returns true,
the combinator accepts the input; otherwise it generates an error with
the message <code class="computeroutput"><span class="identifier">RejectErrorMsg</span></code>.
</p>
<p>
With <a class="link" href="reference.html#accept_when"><code class="computeroutput"><span class="identifier">accept_when</span></code></a>,
<a class="link" href="reference.html#one_char"><code class="computeroutput"><span class="identifier">one_char</span></code></a>
can be used to build parsers that accept only digit characters, only whitespace,
etc. For example, <a class="link" href="reference.html#digit"><code class="computeroutput"><span class="identifier">digit</span></code></a>
accepts only digit characters:
</p>
<pre class="programlisting"><span class="keyword">typedef</span>
  <span class="identifier">boost</span><span class="special">::</span><span class="identifier">metaparse</span><span class="special">::</span><span class="identifier">accept_when</span><span class="special"><</span>
    <span class="identifier">boost</span><span class="special">::</span><span class="identifier">metaparse</span><span class="special">::</span><span class="identifier">one_char</span><span class="special">,</span>
    <span class="identifier">boost</span><span class="special">::</span><span class="identifier">metaparse</span><span class="special">::</span><span class="identifier">util</span><span class="special">::</span><span class="identifier">is_digit</span><span class="special">,</span>
    <span class="identifier">boost</span><span class="special">::</span><span class="identifier">metaparse</span><span class="special">::</span><span class="identifier">errors</span><span class="special">::</span><span class="identifier">digit_expected</span>
  <span class="special">></span>
  <span class="identifier">digit</span><span class="special">;</span>
</pre>
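<p>
The behaviour described above can be sketched without the library. The code
below is a simplified model, not Metaparse's implementation: every name in it
(<code class="computeroutput">ct_string</code>, <code class="computeroutput">success</code>,
<code class="computeroutput">failure</code>, <code class="computeroutput">check</code>
and the local versions of <code class="computeroutput">one_char</code>,
<code class="computeroutput">accept_when</code>, <code class="computeroutput">is_digit</code>
and <code class="computeroutput">digit</code>) is defined here for illustration,
and source positions are left out.
</p>

```cpp
#include <type_traits>

template <char... Cs> struct ct_string {};
template <char C> struct char_ { static constexpr char value = C; };

// Outcome of a parser: a result plus the unparsed rest, or an error.
template <class Result, class Remaining>
struct success {
  static constexpr bool is_error = false;
  using result = Result;
  using remaining = Remaining;
};

template <class Message>
struct failure {
  static constexpr bool is_error = true;
  using message = Message;
};

struct unexpected_end_of_input {};
struct digit_expected {};

// one_char: consumes one character, fails on empty input.
template <class S> struct one_char_impl;
template <char C, char... Cs>
struct one_char_impl<ct_string<C, Cs...>> : success<char_<C>, ct_string<Cs...>> {};
template <>
struct one_char_impl<ct_string<>> : failure<unexpected_end_of_input> {};

struct one_char {
  template <class S> using apply = one_char_impl<S>;
};

// check<R, Pred, Msg>: R is what the wrapped parser returned.
template <class R, template <class> class Pred, class Msg,
          bool IsError = R::is_error>
struct check : R {};  // the wrapped parser failed: keep its error

template <class R, template <class> class Pred, class Msg>
struct check<R, Pred, Msg, false>
  : std::conditional<Pred<typename R::result>::value, R, failure<Msg>>::type {};

template <class P, template <class> class Pred, class Msg>
struct accept_when {
  template <class S>
  using apply = check<typename P::template apply<S>, Pred, Msg>;
};

template <class C>
struct is_digit {
  static constexpr bool value = C::value >= '0' && C::value <= '9';
};

using digit = accept_when<one_char, is_digit, digit_expected>;

using accepted = digit::apply<ct_string<'4', '2'>>;
static_assert(!accepted::is_error, "a digit is accepted");
static_assert(accepted::result::value == '4', "the parsed character is the result");

using rejected = digit::apply<ct_string<'x'>>;
static_assert(std::is_same<rejected::message, digit_expected>::value,
              "the predicate rejected the character");
```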
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="metaparse.user_manual.what_is_a_parser.sequence"></a><a class="link" href="user_manual.html#metaparse.user_manual.what_is_a_parser.sequence" title="Sequence">Sequence</a>
</h4></div></div></div>
<p>
The result of a successful parsing is some value and the remaining string
that was not parsed. The remaining string can be processed by another parser.
The parser library provides a parser combinator, <a class="link" href="reference.html#sequence"><code class="computeroutput"><span class="identifier">sequence</span></code></a>, that takes a number
of parsers as arguments and builds a new parser from them that:
</p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
Parses the input using the first parser
</li>
<li class="listitem">
If parsing succeeds, it parses the remaining string with the second
parser
</li>
<li class="listitem">
It continues applying the parsers in order as long as they succeed
</li>
<li class="listitem">
If all of them succeed, it returns the list of results
</li>
<li class="listitem">
If any of the parsers fails, the combinator fails as well and returns
the error the first failing parser returned with
</li>
</ul></div>
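<p>
The steps above can be sketched as a small, self-contained metaprogram. This
is a simplified model rather than Metaparse's implementation: all names
(<code class="computeroutput">run</code>, <code class="computeroutput">step</code>,
<code class="computeroutput">type_list</code> and the local versions of
<code class="computeroutput">one_char</code> and <code class="computeroutput">sequence</code>)
are defined here for illustration, and source positions are left out.
</p>

```cpp
#include <type_traits>

template <char... Cs> struct ct_string {};
template <char C> struct char_ {};
template <class... Ts> struct type_list {};

template <class Result, class Remaining>
struct success {
  static constexpr bool is_error = false;
  using result = Result;
  using remaining = Remaining;
};

template <class Message>
struct failure {
  static constexpr bool is_error = true;
  using message = Message;
};

struct unexpected_end_of_input {};

// one_char: consumes one character, fails on empty input.
template <class S> struct one_char_impl;
template <char C, char... Cs>
struct one_char_impl<ct_string<C, Cs...>> : success<char_<C>, ct_string<Cs...>> {};
template <>
struct one_char_impl<ct_string<>> : failure<unexpected_end_of_input> {};

struct one_char {
  template <class S> using apply = one_char_impl<S>;
};

// run<Results, S, Ps...>: applies the parsers Ps to S in order.
template <class Results, class S, class... Ps>
struct run;

// step: inspects what the current parser returned (R).
template <bool IsError, class Results, class R, class... Ps>
struct step : R {};  // failed: the sequence fails with the same error

template <class... Rs, class R, class... Ps>
struct step<false, type_list<Rs...>, R, Ps...>
  : run<type_list<Rs..., typename R::result>, typename R::remaining, Ps...> {};

// No parser left: succeed with the collected results.
template <class Results, class S>
struct run<Results, S> : success<Results, S> {};

// At least one parser left: apply it and decide how to continue.
template <class Results, class S, class P, class... Ps>
struct run<Results, S, P, Ps...>
  : step<P::template apply<S>::is_error, Results,
         typename P::template apply<S>, Ps...> {};

template <class... Ps>
struct sequence {
  template <class S> using apply = run<type_list<>, S, Ps...>;
};

using two_chars = sequence<one_char, one_char>;

using ok = two_chars::apply<ct_string<'a', 'b', 'c'>>;
static_assert(!ok::is_error, "both parsers succeeded");
static_assert(
    std::is_same<ok::result, type_list<char_<'a'>, char_<'b'>>>::value,
    "the results are collected in order");

using too_short = two_chars::apply<ct_string<'a'>>;
static_assert(
    std::is_same<too_short::message, unexpected_end_of_input>::value,
    "the error of the first failing parser is returned");
```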
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="metaparse.user_manual.what_is_a_parser.repetition"></a><a name="repetition"></a><a class="link" href="user_manual.html#metaparse.user_manual.what_is_a_parser.repetition" title="Repetition">Repetition</a>
</h4></div></div></div>
<div class="toc"><dl class="toc">
<dt><span class="section"><a href="user_manual.html#metaparse.user_manual.what_is_a_parser.repetition.introducing_foldl">Introducing
foldl</a></span></dt>
<dt><span class="section"><a href="user_manual.html#metaparse.user_manual.what_is_a_parser.repetition.introducing_foldr">Introducing
foldr</a></span></dt>
<dt><span class="section"><a href="user_manual.html#metaparse.user_manual.what_is_a_parser.repetition.introducing_foldl_start_with_par">Introducing
foldl_start_with_parser</a></span></dt>
<dt><span class="section"><a href="user_manual.html#metaparse.user_manual.what_is_a_parser.repetition.introducing_foldr_start_with_par">Introducing
foldr_start_with_parser</a></span></dt>
<dt><span class="section"><a href="user_manual.html#metaparse.user_manual.what_is_a_parser.repetition.introducing_foldl_reject_incompl">Introducing
foldl_reject_incomplete_start_with_parser</a></span></dt>
<dt><span class="section"><a href="user_manual.html#metaparse.user_manual.what_is_a_parser.repetition.finding_the_right_folding_parser">Finding
the right folding parser combinator</a></span></dt>
</dl></div>
<p>
Parsing a list of unknown length is a common task. As an
example, let's start with something simple: the text is a list of numbers.
For example:
</p>
<pre class="programlisting"><span class="number">11</span> <span class="number">13</span> <span class="number">3</span> <span class="number">21</span>
</pre>
<p>
We want the result of parsing to be the sum of these values. Metaparse
provides the <a class="link" href="reference.html#int_"><code class="computeroutput"><span class="identifier">int_</span></code></a>
parser, which we can use to parse one of these numbers, and the
<a class="link" href="reference.html#token"><code class="computeroutput"><span class="identifier">token</span></code></a>
combinator to consume the whitespace after the number. So the following
parser parses one number and the whitespace after it:
</p>
<pre class="programlisting"><span class="keyword">using</span> <span class="identifier">int_token</span> <span class="special">=</span> <span class="identifier">token</span><span class="special"><</span><span class="identifier">int_</span><span class="special">>;</span>
</pre>
<p>
The result of parsing is a boxed integer value: the value of the parsed
number. For example, parsing <a class="link" href="reference.html#BOOST_METAPARSE_STRING"><code class="computeroutput"><span class="identifier">BOOST_METAPARSE_STRING</span></code></a><code class="computeroutput"><span class="special">(</span><span class="string">"13 "</span><span class="special">)</span></code> gives <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">mpl</span><span class="special">::</span><span class="identifier">int_</span><span class="special"><</span><span class="number">13</span><span class="special">></span></code>
as the result.
</p>
<p>
Our example input is a list of numbers. Each number can be parsed by <code class="computeroutput"><span class="identifier">int_token</span></code>:
</p>
<p>
<span class="inlinemediaobject"><img src="../images/metaparse/repeated_diag0.png" width="70%"></span>
</p>
<p>
This diagram shows how the repeated application of <code class="computeroutput"><span class="identifier">int_token</span></code>
can parse the example input. Metaparse provides the <a class="link" href="reference.html#repeated"><code class="computeroutput"><span class="identifier">repeated</span></code></a> parser to easily implement
this. The result of parsing is a typelist: the list of the individual numbers.
</p>
<p>
<span class="inlinemediaobject"><img src="../images/metaparse/repeated_diag1.png" width="70%"></span>
</p>
<p>
This diagram shows how <a class="link" href="reference.html#repeated"><code class="computeroutput"><span class="identifier">repeated</span></code></a><code class="computeroutput"><span class="special"><</span><span class="identifier">int_token</span><span class="special">></span></code> works. It uses the <code class="computeroutput"><span class="identifier">int_token</span></code>
parser repeatedly and builds a <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">mpl</span><span class="special">::</span><span class="identifier">vector</span></code>
from the results it provides.
</p>
<p>
But we need the sum of these, so we need to summarise the result. We can
do this by wrapping our parser, <a class="link" href="reference.html#repeated"><code class="computeroutput"><span class="identifier">repeated</span></code></a><code class="computeroutput"><span class="special"><</span><span class="identifier">int_token</span><span class="special">></span></code>
with <a class="link" href="reference.html#transform"><code class="computeroutput"><span class="identifier">transform</span></code></a>.
That gives us the opportunity to specify a function transforming this typelist
to some other value - the sum of the elements in our case. Initially let's
ignore how to summarise the elements in the vector. Let's assume that it
can be implemented by a lambda expression and use <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">mpl</span><span class="special">::</span><span class="identifier">lambda</span><span class="special"><...>::</span><span class="identifier">type</span></code>
representing that lambda expression. Here is an example using <a class="link" href="reference.html#transform"><code class="computeroutput"><span class="identifier">transform</span></code></a> and this lambda expression:
</p>
<pre class="programlisting"><span class="keyword">using</span> <span class="identifier">sum_parser</span> <span class="special">=</span>
  <span class="identifier">transform</span><span class="special"><</span>
    <span class="identifier">repeated</span><span class="special"><</span><span class="identifier">int_token</span><span class="special">>,</span>
    <span class="identifier">boost</span><span class="special">::</span><span class="identifier">mpl</span><span class="special">::</span><span class="identifier">lambda</span><span class="special"><...>::</span><span class="identifier">type</span>
  <span class="special">>;</span>
</pre>
<p>
The <a class="link" href="reference.html#transform"><code class="computeroutput"><span class="identifier">transform</span></code></a><code class="computeroutput"><span class="special"><></span></code> parser combinator wraps the <a class="link" href="reference.html#repeated"><code class="computeroutput"><span class="identifier">repeated</span></code></a><code class="computeroutput"><span class="special"><</span><span class="identifier">int_token</span><span class="special">></span></code> to build the parser we need. Here is
a diagram showing how it works:
</p>
<p>
<span class="inlinemediaobject"><img src="../images/metaparse/repeated_diag2.png" width="70%"></span>
</p>
<p>
As the diagram shows, the <a class="link" href="reference.html#transform"><code class="computeroutput"><span class="identifier">transform</span></code></a><code class="computeroutput"><span class="special"><</span></code><a class="link" href="reference.html#repeated"><code class="computeroutput"><span class="identifier">repeated</span></code></a><code class="computeroutput"><span class="special"><</span><span class="identifier">int_token</span><span class="special">>,</span>
<span class="special">...></span></code> parser parses the input
using <a class="link" href="reference.html#repeated"><code class="computeroutput"><span class="identifier">repeated</span></code></a><code class="computeroutput"><span class="special"><</span><span class="identifier">int_token</span><span class="special">></span></code> and then does some processing on the
result of parsing.
</p>
<p>
Let's implement the missing lambda expression that tells <a class="link" href="reference.html#transform"><code class="computeroutput"><span class="identifier">transform</span></code></a> how to change the result
coming from <a class="link" href="reference.html#repeated"><code class="computeroutput"><span class="identifier">repeated</span></code></a><code class="computeroutput"><span class="special"><</span><span class="identifier">int_token</span><span class="special">></span></code>. We can summarise the numbers in a
typelist by using Boost.MPL's <code class="computeroutput"><span class="identifier">fold</span></code>
or <code class="computeroutput"><span class="identifier">accumulate</span></code>. Here is
an example doing that:
</p>
<pre class="programlisting"><span class="keyword">using</span> <span class="identifier">sum_op</span> <span class="special">=</span> <span class="identifier">mpl</span><span class="special">::</span><span class="identifier">lambda</span><span class="special"><</span><span class="identifier">mpl</span><span class="special">::</span><span class="identifier">plus</span><span class="special"><</span><span class="identifier">mpl</span><span class="special">::</span><span class="identifier">_1</span><span class="special">,</span> <span class="identifier">mpl</span><span class="special">::</span><span class="identifier">_2</span><span class="special">>>::</span><span class="identifier">type</span><span class="special">;</span>

<span class="keyword">using</span> <span class="identifier">sum_parser</span> <span class="special">=</span>
  <span class="identifier">transform</span><span class="special"><</span>
    <span class="identifier">repeated</span><span class="special"><</span><span class="identifier">int_token</span><span class="special">>,</span>
    <span class="identifier">mpl</span><span class="special">::</span><span class="identifier">lambda</span><span class="special"><</span>
      <span class="identifier">mpl</span><span class="special">::</span><span class="identifier">fold</span><span class="special"><</span><span class="identifier">mpl</span><span class="special">::</span><span class="identifier">_1</span><span class="special">,</span> <span class="identifier">mpl</span><span class="special">::</span><span class="identifier">int_</span><span class="special"><</span><span class="number">0</span><span class="special">>,</span> <span class="identifier">sum_op</span><span class="special">></span>
    <span class="special">>::</span><span class="identifier">type</span>
  <span class="special">>;</span>
</pre>
<p>
Here is an extended version of the above diagram showing what happens here:
</p>
<p>
<span class="inlinemediaobject"><img src="../images/metaparse/repeated_diag3.png" width="70%"></span>
</p>
<p>
This example parses the input, builds the list of numbers and then loops
over it and summarises the values. It starts with the second argument of
<code class="computeroutput"><span class="identifier">fold</span></code>, <code class="computeroutput"><span class="identifier">int_</span><span class="special"><</span><span class="number">0</span><span class="special">></span></code>,
and adds the items of the list of numbers (which is the result of the
parser <a class="link" href="reference.html#repeated"><code class="computeroutput"><span class="identifier">repeated</span></code></a><code class="computeroutput"><span class="special"><</span><span class="identifier">int_token</span><span class="special">></span></code>) to it one by one.
</p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
Note that <a class="link" href="reference.html#transform"><code class="computeroutput"><span class="identifier">transform</span></code></a>
wraps another parser, <a class="link" href="reference.html#repeated"><code class="computeroutput"><span class="identifier">repeated</span></code></a><code class="computeroutput"><span class="special"><</span><span class="identifier">int_token</span><span class="special">></span></code> here. It parses the input with that
parser, gets the result of that parsing and changes that result. <a class="link" href="reference.html#transform"><code class="computeroutput"><span class="identifier">transform</span></code></a>
itself will be a parser returning that updated result.
</p></td></tr>
</table></div>
<div class="section">
<div class="titlepage"><div><div><h5 class="title">
<a name="metaparse.user_manual.what_is_a_parser.repetition.introducing_foldl"></a><a name="introducing-foldl"></a><a class="link" href="user_manual.html#metaparse.user_manual.what_is_a_parser.repetition.introducing_foldl" title="Introducing foldl">Introducing
foldl</a>
</h5></div></div></div>
<p>
This works, but it is rather inefficient: one loop parses
the integers one by one and builds a typelist, and a second loop goes over this
typelist to summarise the result. Using template metaprograms in your
applications can have a serious impact on the compiler's memory usage
and the speed of compilation, so I recommend avoiding such
redundant work.
</p>
<p>
Metaparse offers more efficient ways of achieving the same result. You
don't need two loops: you can merge them together and add every number
to the running total right after parsing it. Metaparse offers the <a class="link" href="reference.html#foldl"><code class="computeroutput"><span class="identifier">foldl</span></code></a> parser combinator for this.
</p>
<p>
With <a class="link" href="reference.html#foldl"><code class="computeroutput"><span class="identifier">foldl</span></code></a>
you specify:
</p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
the parser to parse the individual elements of the list (which is
<code class="computeroutput"><span class="identifier">int_token</span></code> in our
example)
</li>
<li class="listitem">
the initial value used for folding (which is <code class="computeroutput"><span class="identifier">int_</span><span class="special"><</span><span class="number">0</span><span class="special">></span></code> in our example)
</li>
<li class="listitem">
the forward operation merging the sub-result we have so far and the
value coming from the last application of the parser (this was <code class="computeroutput"><span class="identifier">sum_op</span></code> in our example)
</li>
</ul></div>
<p>
Our parser can be implemented this way:
|
||
</p>
|
||
<pre class="programlisting"><span class="keyword">using</span> <span class="identifier">better_sum_parser</span> <span class="special">=</span> <span class="identifier">foldl</span><span class="special"><</span><span class="identifier">int_token</span><span class="special">,</span> <span class="identifier">mpl</span><span class="special">::</span><span class="identifier">int_</span><span class="special"><</span><span class="number">0</span><span class="special">>,</span> <span class="identifier">sum_op</span><span class="special">>;</span>
|
||
</pre>
|
||
<p>
|
||
As you can see the implementation of the parser is more compact. Here
|
||
is a diagram showing what happens when you use this parser to parse some
|
||
input:
|
||
</p>
|
||
<p>
|
||
<span class="inlinemediaobject"><img src="../images/metaparse/foldl_diag1.png" width="70%"></span>
|
||
</p>
|
||
<p>
|
||
As you can see, not only is the implementation of the parser more compact,
but it also achieves the same result by doing less. It parses the
|
||
input by applying <code class="computeroutput"><span class="identifier">int_token</span></code>
|
||
repeatedly, just like the previous solution. But it produces the final
|
||
result without building a typelist as an internal step. Here is how it
|
||
works internally:
|
||
</p>
|
||
<p>
|
||
<span class="inlinemediaobject"><img src="../images/metaparse/foldl_diag2.png" width="70%"></span>
|
||
</p>
|
||
<p>
|
||
It summarises the results of the repeated <code class="computeroutput"><span class="identifier">int_token</span></code>
|
||
application using <code class="computeroutput"><span class="identifier">sum_op</span></code>.
|
||
This implementation is more efficient. It accepts an empty string as
|
||
a valid input: the sum of it is <code class="computeroutput"><span class="number">0</span></code>.
|
||
It may be good for you, in which case you are done. If you don't want
|
||
to accept it, you can use <a class="link" href="reference.html#foldl1"><code class="computeroutput"><span class="identifier">foldl1</span></code></a>
|
||
instead of <a class="link" href="reference.html#foldl"><code class="computeroutput"><span class="identifier">foldl</span></code></a>.
|
||
This is the same, but it rejects empty input. (Metaparse offers <a class="link" href="reference.html#repeated1"><code class="computeroutput"><span class="identifier">repeated1</span></code></a>
|
||
as well, if you choose the first approach and would like to reject the empty
string.)
|
||
</p>
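<p>
The single-pass idea can be modelled in plain C++ (C++14 or later). This is only a sketch of the behaviour described above, not Metaparse code: <code class="computeroutput"><span class="identifier">fold_result</span></code>, <code class="computeroutput"><span class="identifier">fold_sum</span></code> and the <code class="computeroutput"><span class="identifier">reject_empty</span></code> flag are hypothetical names invented for the illustration.
</p>

```cpp
#include <cassert>

// Hypothetical result type of this model (not part of Metaparse).
struct fold_result {
    int value;  // the folded state
    bool ok;    // whether the input was accepted
};

constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }

// Models foldl<int_token, int_<0>, sum_op>: every number is added to the
// running state right after it is parsed, so no intermediate list is built.
// With reject_empty == true the model behaves like foldl1.
constexpr fold_result fold_sum(const char* s, bool reject_empty = false) {
    int state = 0;            // the initial value
    bool parsed_any = false;
    while (is_digit(*s)) {
        int n = 0;
        while (is_digit(*s)) { n = n * 10 + (*s - '0'); ++s; }  // the number
        while (*s == ' ') ++s;                                  // trailing whitespace
        state += n;                                             // the forward operation
        parsed_any = true;
    }
    return fold_result{state, parsed_any || !reject_empty};
}

static_assert(fold_sum("11 13 3 21").value == 48, "folded while parsing");
```

<p>
In this model the empty string is accepted with the value <code class="computeroutput"><span class="number">0</span></code> unless <code class="computeroutput"><span class="identifier">reject_empty</span></code> is set, which mirrors the difference between the two fold variants described above.
</p>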
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h5 class="title">
|
||
<a name="metaparse.user_manual.what_is_a_parser.repetition.introducing_foldr"></a><a name="introducing-foldr"></a><a class="link" href="user_manual.html#metaparse.user_manual.what_is_a_parser.repetition.introducing_foldr" title="Introducing foldr">Introducing
|
||
foldr</a>
|
||
</h5></div></div></div>
|
||
<div class="note"><table border="0" summary="Note">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
|
||
<th align="left">Note</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
Note that if you are reading this manual for the first time, you probably
|
||
want to skip this section and proceed with <a class="link" href="user_manual.html#introducing-foldl_start_with_parser">Introducing
|
||
foldl_start_with_parser</a>
|
||
</p></td></tr>
|
||
</table></div>
|
||
<p>
|
||
You might have noticed that Metaparse offers <a class="link" href="reference.html#foldr"><code class="computeroutput"><span class="identifier">foldr</span></code></a> as well. The difference
|
||
between <a class="link" href="reference.html#foldl"><code class="computeroutput"><span class="identifier">foldl</span></code></a>
|
||
and <a class="link" href="reference.html#foldr"><code class="computeroutput"><span class="identifier">foldr</span></code></a>
|
||
is the direction in which the results are summarised. (<code class="computeroutput"><span class="identifier">l</span></code> stands for <span class="emphasis"><em>from the Left</em></span>
|
||
and <code class="computeroutput"><span class="identifier">r</span></code> stands for <span class="emphasis"><em>from
|
||
the Right</em></span>.) Here is a diagram showing how <code class="computeroutput"><span class="identifier">better_sum_parser</span></code>
|
||
works if it is implemented using <a class="link" href="reference.html#foldr"><code class="computeroutput"><span class="identifier">foldr</span></code></a>:
|
||
</p>
|
||
<p>
|
||
<span class="inlinemediaobject"><img src="../images/metaparse/foldr_diag1.png" width="70%"></span>
|
||
</p>
|
||
<p>
|
||
As you can see this is very similar to using <a class="link" href="reference.html#foldl"><code class="computeroutput"><span class="identifier">foldl</span></code></a>, but the results coming
|
||
out of the individual applications of <code class="computeroutput"><span class="identifier">int_token</span></code>
|
||
are summarised in a right-to-left order. As <code class="computeroutput"><span class="identifier">sum_op</span></code>
|
||
is addition, it does not affect the end result, but in other cases it
|
||
might.
|
||
</p>
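<p>
To see when the direction matters, here is a small plain C++ model (C++14 or later) using an order-sensitive operation. It is a sketch, not Metaparse code: <code class="computeroutput"><span class="identifier">append_digit</span></code>, <code class="computeroutput"><span class="identifier">fold_left</span></code> and <code class="computeroutput"><span class="identifier">fold_right</span></code> are hypothetical names, and the model assumes the operation receives the accumulated state as its first argument.
</p>

```cpp
#include <cassert>

// Hypothetical order-sensitive operation: appends a digit to the state.
constexpr int append_digit(int state, int value) { return state * 10 + value; }

// Left fold: combine the results in the order in which they were parsed.
constexpr int fold_left(const int* xs, int n, int state) {
    for (int i = 0; i < n; ++i) state = append_digit(state, xs[i]);
    return state;
}

// Right fold: combine the results starting from the last one.
constexpr int fold_right(const int* xs, int n, int state) {
    for (int i = n - 1; i >= 0; --i) state = append_digit(state, xs[i]);
    return state;
}

// Stands for the results of three successive applications of the element parser.
constexpr int parsed[] = {1, 2, 3};

static_assert(fold_left(parsed, 3, 0) == 123, "left to right");
static_assert(fold_right(parsed, 3, 0) == 321, "right to left");
```

<p>
With addition both functions would return the same value, which is why the choice of direction does not change the result of the summing example.
</p>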
|
||
<div class="note"><table border="0" summary="Note">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
|
||
<th align="left">Note</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
Note that the implementation of <a class="link" href="reference.html#foldl"><code class="computeroutput"><span class="identifier">foldl</span></code></a> is more efficient than
|
||
<a class="link" href="reference.html#foldr"><code class="computeroutput"><span class="identifier">foldr</span></code></a>.
|
||
Prefer <a class="link" href="reference.html#foldl"><code class="computeroutput"><span class="identifier">foldl</span></code></a>
|
||
whenever possible.
|
||
</p></td></tr>
|
||
</table></div>
|
||
<p>
|
||
As you might expect, Metaparse offers <a class="link" href="reference.html#foldr1"><code class="computeroutput"><span class="identifier">foldr1</span></code></a> as well, which folds
|
||
from the right and rejects empty input.
|
||
</p>
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h5 class="title">
|
||
<a name="metaparse.user_manual.what_is_a_parser.repetition.introducing_foldl_start_with_par"></a><a name="introducing-foldl_start_with_parser"></a><a class="link" href="user_manual.html#metaparse.user_manual.what_is_a_parser.repetition.introducing_foldl_start_with_par" title="Introducing foldl_start_with_parser">Introducing
|
||
foldl_start_with_parser</a>
|
||
</h5></div></div></div>
|
||
<p>
|
||
Let's change the grammar of our little language. Instead of a list of
|
||
numbers, let's expect numbers separated by a <code class="computeroutput"><span class="special">+</span></code>
|
||
symbol. Our example input becomes the following:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">BOOST_METAPARSE_STRING</span><span class="special">(</span><span class="string">"11 + 13 + 3 + 21"</span><span class="special">)</span>
|
||
</pre>
|
||
<p>
|
||
Parsing it with <a class="link" href="reference.html#foldl"><code class="computeroutput"><span class="identifier">foldl</span></code></a>
|
||
or <a class="link" href="reference.html#repeated"><code class="computeroutput"><span class="identifier">repeated</span></code></a>
|
||
is difficult: there has to be a <code class="computeroutput"><span class="special">+</span></code>
|
||
symbol before every element <span class="emphasis"><em>except</em></span> the first one.
|
||
None of the already introduced repetition constructs offer a way of treating
|
||
the first element in a different way.
|
||
</p>
|
||
<p>
|
||
If we forget about the first number for a moment, the rest of the input
|
||
is <code class="computeroutput"><span class="string">"+ 13 + 3 + 21"</span></code>.
|
||
This can easily be parsed by <a class="link" href="reference.html#foldl"><code class="computeroutput"><span class="identifier">foldl</span></code></a>
|
||
(or <a class="link" href="reference.html#repeated"><code class="computeroutput"><span class="identifier">repeated</span></code></a>):
|
||
</p>
|
||
<pre class="programlisting"><span class="keyword">using</span> <span class="identifier">plus_token</span> <span class="special">=</span> <span class="identifier">token</span><span class="special"><</span><span class="identifier">lit_c</span><span class="special"><</span><span class="char">'+'</span><span class="special">>>;</span>
|
||
<span class="keyword">using</span> <span class="identifier">plus_int</span> <span class="special">=</span> <span class="identifier">last_of</span><span class="special"><</span><span class="identifier">plus_token</span><span class="special">,</span> <span class="identifier">int_token</span><span class="special">>;</span>
|
||
|
||
<span class="keyword">using</span> <span class="identifier">sum_parser2</span> <span class="special">=</span> <span class="identifier">foldl</span><span class="special"><</span><span class="identifier">plus_int</span><span class="special">,</span> <span class="identifier">int_</span><span class="special"><</span><span class="number">0</span><span class="special">>,</span> <span class="identifier">sum_op</span><span class="special">>;</span>
|
||
</pre>
|
||
<p>
|
||
It uses <code class="computeroutput"><span class="identifier">plus_int</span></code>, that
|
||
is <a class="link" href="reference.html#last_of"><code class="computeroutput"><span class="identifier">last_of</span></code></a><code class="computeroutput"><span class="special"><</span><span class="identifier">plus_token</span><span class="special">,</span> <span class="identifier">int_token</span><span class="special">></span></code> as the parser that is used repeatedly
|
||
to get the numbers. It does the following:
|
||
</p>
|
||
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
|
||
<li class="listitem">
|
||
Uses <code class="computeroutput"><span class="identifier">plus_token</span></code> to
|
||
parse the <code class="computeroutput"><span class="special">+</span></code> symbol and
|
||
any whitespace that might follow it.
|
||
</li>
|
||
<li class="listitem">
|
||
Then uses <code class="computeroutput"><span class="identifier">int_token</span></code>
|
||
to parse the number
|
||
</li>
|
||
<li class="listitem">
|
||
Combines the above two with <a class="link" href="reference.html#last_of"><code class="computeroutput"><span class="identifier">last_of</span></code></a> to use both parsers
|
||
in order and keep only the result of using the second one (the result
|
||
of parsing the <code class="computeroutput"><span class="special">+</span></code> symbol
|
||
is thrown away - we don't care about it).
|
||
</li>
|
||
</ul></div>
|
||
<p>
|
||
This way <a class="link" href="reference.html#last_of"><code class="computeroutput"><span class="identifier">last_of</span></code></a><code class="computeroutput"><span class="special"><</span><span class="identifier">plus_token</span><span class="special">,</span> <span class="identifier">int_token</span><span class="special">></span></code> returns the value of the number as
|
||
the result of parsing, just like our previous parser, <code class="computeroutput"><span class="identifier">int_token</span></code>
|
||
did. Because of this, it can be used as a drop-in replacement of <code class="computeroutput"><span class="identifier">int_token</span></code> in the previous example and
|
||
we get a parser for our updated language. Or at least for all numbers
|
||
except the first one.
|
||
</p>
|
||
<p>
|
||
This <a class="link" href="reference.html#foldl"><code class="computeroutput"><span class="identifier">foldl</span></code></a>
|
||
cannot parse the first element, because it expects a <code class="computeroutput"><span class="special">+</span></code>
|
||
symbol before every number. You might think of making the <code class="computeroutput"><span class="special">+</span></code> symbol optional in the above approach
|
||
- don't do that. It makes the parser accept <code class="computeroutput"><span class="string">"11 + 13 3 21"</span></code> as well, because the <code class="computeroutput"><span class="special">+</span></code>
|
||
symbol is now optional <span class="emphasis"><em>everywhere</em></span>.
|
||
</p>
|
||
<p>
|
||
What you could do is parse the first element with <code class="computeroutput"><span class="identifier">int_token</span></code>,
|
||
the rest of the elements with the above <a class="link" href="reference.html#foldl"><code class="computeroutput"><span class="identifier">foldl</span></code></a>-based solution and add
|
||
the two results. This is left as an exercise for the reader.
|
||
</p>
|
||
<p>
|
||
Metaparse offers <a class="link" href="reference.html#foldl_start_with_parser"><code class="computeroutput"><span class="identifier">foldl_start_with_parser</span></code></a> to implement
|
||
this. <a class="link" href="reference.html#foldl_start_with_parser"><code class="computeroutput"><span class="identifier">foldl_start_with_parser</span></code></a>
|
||
works like <a class="link" href="reference.html#foldl"><code class="computeroutput"><span class="identifier">foldl</span></code></a>.
The difference is that, instead of an initial value to combine the list
elements with, it takes an <span class="emphasis"><em>initial parser</em></span>:
|
||
</p>
|
||
<pre class="programlisting"><span class="keyword">using</span> <span class="identifier">plus_token</span> <span class="special">=</span> <span class="identifier">token</span><span class="special"><</span><span class="identifier">lit_c</span><span class="special"><</span><span class="char">'+'</span><span class="special">>>;</span>
|
||
<span class="keyword">using</span> <span class="identifier">plus_int</span> <span class="special">=</span> <span class="identifier">last_of</span><span class="special"><</span><span class="identifier">plus_token</span><span class="special">,</span> <span class="identifier">int_token</span><span class="special">>;</span>
|
||
|
||
<span class="keyword">using</span> <span class="identifier">sum_parser3</span> <span class="special">=</span> <span class="identifier">foldl_start_with_parser</span><span class="special"><</span><span class="identifier">plus_int</span><span class="special">,</span> <span class="identifier">int_token</span><span class="special">,</span> <span class="identifier">sum_op</span><span class="special">>;</span>
|
||
</pre>
|
||
<p>
|
||
<a class="link" href="reference.html#foldl_start_with_parser"><code class="computeroutput"><span class="identifier">foldl_start_with_parser</span></code></a>
|
||
starts with applying that initial parser and uses the result it returns
|
||
as the initial value for folding. It does the same as <a class="link" href="reference.html#foldl"><code class="computeroutput"><span class="identifier">foldl</span></code></a> after that. The following
|
||
diagram shows how it can be used to parse a list of numbers separated
|
||
by <code class="computeroutput"><span class="special">+</span></code> symbols:
|
||
</p>
|
||
<p>
|
||
<span class="inlinemediaobject"><img src="../images/metaparse/foldl_start_with_parser_diag1.png" width="70%"></span>
|
||
</p>
|
||
<p>
|
||
As the diagram shows, it starts parsing the list of numbers with <code class="computeroutput"><span class="identifier">int_token</span></code>, uses its value as the starting
|
||
value for folding (earlier approaches were using the value <code class="computeroutput"><span class="identifier">int_</span><span class="special"><</span><span class="number">0</span><span class="special">></span></code> as
|
||
this starting value). Then it parses the remaining elements of the list by using
|
||
<code class="computeroutput"><span class="identifier">plus_int</span></code> multiple times.
|
||
</p>
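<p>
The steps in the diagram can be modelled in plain C++ (C++14 or later). This is a sketch of the described behaviour and not Metaparse's implementation: the functions below only borrow the names <code class="computeroutput"><span class="identifier">int_token</span></code> and <code class="computeroutput"><span class="identifier">plus_int</span></code> from the text, while <code class="computeroutput"><span class="identifier">parse_result</span></code> and <code class="computeroutput"><span class="identifier">sum3</span></code> are hypothetical.
</p>

```cpp
#include <cassert>

// Hypothetical result type of this model (not part of Metaparse).
struct parse_result {
    int value;         // parsed or folded value
    const char* rest;  // remaining, unparsed input
    bool ok;           // whether parsing succeeded
};

constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }
constexpr const char* skip_spaces(const char* s) { while (*s == ' ') ++s; return s; }

// Models int_token: a number followed by optional whitespace.
constexpr parse_result int_token(const char* s) {
    if (!is_digit(*s)) return parse_result{0, s, false};
    int n = 0;
    while (is_digit(*s)) { n = n * 10 + (*s - '0'); ++s; }
    return parse_result{n, skip_spaces(s), true};
}

// Models plus_int: a '+' token, then a number; only the number is kept.
constexpr parse_result plus_int(const char* s) {
    if (*s != '+') return parse_result{0, s, false};
    return int_token(skip_spaces(s + 1));
}

// Models foldl_start_with_parser<plus_int, int_token, sum_op>.
constexpr parse_result sum3(const char* s) {
    parse_result state = int_token(s);   // the initial parser yields the initial value
    if (!state.ok) return state;
    for (;;) {
        parse_result next = plus_int(state.rest);
        if (!next.ok) return state;      // stop at the first failure, keep the result so far
        state = parse_result{state.value + next.value, next.rest, true};
    }
}

static_assert(sum3("11 + 13 + 3 + 21").value == 48, "all four numbers are summed");
```

<p>
Because the <code class="computeroutput"><span class="special">+</span></code> symbol is mandatory before every element except the first, the model stops after <code class="computeroutput"><span class="number">13</span></code> when given <code class="computeroutput"><span class="string">"11 + 13 3 21"</span></code> and leaves <code class="computeroutput"><span class="string">"3 21"</span></code> unparsed.
</p>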
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h5 class="title">
|
||
<a name="metaparse.user_manual.what_is_a_parser.repetition.introducing_foldr_start_with_par"></a><a name="introducing-foldr_start_with_parser"></a><a class="link" href="user_manual.html#metaparse.user_manual.what_is_a_parser.repetition.introducing_foldr_start_with_par" title="Introducing foldr_start_with_parser">Introducing
|
||
foldr_start_with_parser</a>
|
||
</h5></div></div></div>
|
||
<div class="note"><table border="0" summary="Note">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
|
||
<th align="left">Note</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
Note that if you are reading this manual for the first time, you probably
|
||
want to skip this section and try creating some parsers using <a class="link" href="reference.html#foldl_start_with_parser"><code class="computeroutput"><span class="identifier">foldl_start_with_parser</span></code></a>
|
||
instead.
|
||
</p></td></tr>
|
||
</table></div>
|
||
<p>
|
||
<a class="link" href="reference.html#foldl_start_with_parser"><code class="computeroutput"><span class="identifier">foldl_start_with_parser</span></code></a>
|
||
has a <span class="emphasis"><em>from the right</em></span> counterpart, <a class="link" href="reference.html#foldr_start_with_parser"><code class="computeroutput"><span class="identifier">foldr_start_with_parser</span></code></a>. It
|
||
uses the same elements as <a class="link" href="reference.html#foldl_start_with_parser"><code class="computeroutput"><span class="identifier">foldl_start_with_parser</span></code></a> but
|
||
in a different order. Here is a parser for our example language implemented
|
||
with <a class="link" href="reference.html#foldr_start_with_parser"><code class="computeroutput"><span class="identifier">foldr_start_with_parser</span></code></a>:
|
||
</p>
|
||
<pre class="programlisting"><span class="keyword">using</span> <span class="identifier">plus_token</span> <span class="special">=</span> <span class="identifier">token</span><span class="special"><</span><span class="identifier">lit_c</span><span class="special"><</span><span class="char">'+'</span><span class="special">>>;</span>
|
||
<span class="keyword">using</span> <span class="identifier">int_plus</span> <span class="special">=</span> <span class="identifier">first_of</span><span class="special"><</span><span class="identifier">int_token</span><span class="special">,</span> <span class="identifier">plus_token</span><span class="special">>;</span>
|
||
|
||
<span class="keyword">using</span> <span class="identifier">sum_parser4</span> <span class="special">=</span> <span class="identifier">foldr_start_with_parser</span><span class="special"><</span><span class="identifier">int_plus</span><span class="special">,</span> <span class="identifier">int_token</span><span class="special">,</span> <span class="identifier">sum_op</span><span class="special">>;</span>
|
||
</pre>
|
||
<p>
|
||
Note that it uses <code class="computeroutput"><span class="identifier">int_plus</span></code>
|
||
instead of <code class="computeroutput"><span class="identifier">plus_int</span></code>.
|
||
This is because the parser that provides the initial value for folding is
|
||
used after <code class="computeroutput"><span class="identifier">int_plus</span></code> has
|
||
parsed the input as many times as it could. It might sound strange at
first, but the following diagram should help you understand
|
||
how it works:
|
||
</p>
|
||
<p>
|
||
<span class="inlinemediaobject"><img src="../images/metaparse/foldr_start_with_parser_diag1.png" width="70%"></span>
|
||
</p>
|
||
<p>
|
||
As you can see, it starts with the parser that is applied repeatedly
|
||
on the input, thus instead of parsing <code class="computeroutput"><span class="identifier">plus_token</span>
|
||
<span class="identifier">int_token</span></code> repeatedly, we need
|
||
to parse <code class="computeroutput"><span class="identifier">int_token</span> <span class="identifier">plus_token</span></code>
|
||
repeatedly. The last number is not followed by <code class="computeroutput"><span class="special">+</span></code>,
|
||
thus <code class="computeroutput"><span class="identifier">int_plus</span></code> fails to
|
||
parse it and it stops the iteration. <a class="link" href="reference.html#foldr_start_with_parser"><code class="computeroutput"><span class="identifier">foldr_start_with_parser</span></code></a> then
|
||
uses the other parser, <code class="computeroutput"><span class="identifier">int_token</span></code>,
|
||
to parse the input. It succeeds and the result it returns is used as
|
||
the starting value for folding from the right.
|
||
</p>
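<p>
The order of events described above can be modelled in plain C++ (C++14 or later). This is a sketch and not Metaparse's implementation: the functions only borrow the names <code class="computeroutput"><span class="identifier">int_token</span></code> and <code class="computeroutput"><span class="identifier">int_plus</span></code> from the text, while <code class="computeroutput"><span class="identifier">parse_result</span></code> and <code class="computeroutput"><span class="identifier">sum4</span></code> are hypothetical.
</p>

```cpp
#include <cassert>

// Hypothetical result type of this model (not part of Metaparse).
struct parse_result {
    int value;         // parsed or folded value
    const char* rest;  // remaining, unparsed input
    bool ok;           // whether parsing succeeded
};

constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }
constexpr const char* skip_spaces(const char* s) { while (*s == ' ') ++s; return s; }

// Models int_token: a number followed by optional whitespace.
constexpr parse_result int_token(const char* s) {
    if (!is_digit(*s)) return parse_result{0, s, false};
    int n = 0;
    while (is_digit(*s)) { n = n * 10 + (*s - '0'); ++s; }
    return parse_result{n, skip_spaces(s), true};
}

// Models int_plus: a number, then a '+' token; only the number is kept.
constexpr parse_result int_plus(const char* s) {
    parse_result n = int_token(s);
    if (!n.ok || *n.rest != '+') return parse_result{0, s, false};
    return parse_result{n.value, skip_spaces(n.rest + 1), true};
}

// Models foldr_start_with_parser<int_plus, int_token, sum_op>.
constexpr parse_result sum4(const char* s) {
    parse_result head = int_plus(s);
    if (!head.ok) return int_token(s);    // repetition is over: the other parser yields the initial value
    parse_result tail = sum4(head.rest);  // everything to the right is folded first
    if (!tail.ok) return tail;
    return parse_result{tail.value + head.value, tail.rest, true};  // then this element is combined
}

static_assert(sum4("11 + 13 + 3 + 21").value == 48, "folded from the right");
```

<p>
The recursion makes the cost visible: every element has to wait until the whole remainder has been parsed before it can be combined, whereas a left fold can combine each element immediately.
</p>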
|
||
<div class="note"><table border="0" summary="Note">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
|
||
<th align="left">Note</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
Note that as the above description also suggests, the implementation
|
||
of <a class="link" href="reference.html#foldl_start_with_parser"><code class="computeroutput"><span class="identifier">foldl_start_with_parser</span></code></a>
|
||
is more efficient than <a class="link" href="reference.html#foldr_start_with_parser"><code class="computeroutput"><span class="identifier">foldr_start_with_parser</span></code></a>. Prefer
|
||
<a class="link" href="reference.html#foldl_start_with_parser"><code class="computeroutput"><span class="identifier">foldl_start_with_parser</span></code></a>
|
||
whenever possible.
|
||
</p></td></tr>
|
||
</table></div>
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h5 class="title">
|
||
<a name="metaparse.user_manual.what_is_a_parser.repetition.introducing_foldl_reject_incompl"></a><a name="introducing-foldl_reject_incomplete_start_with_parser"></a><a class="link" href="user_manual.html#metaparse.user_manual.what_is_a_parser.repetition.introducing_foldl_reject_incompl" title="Introducing foldl_reject_incomplete_start_with_parser">Introducing
|
||
foldl_reject_incomplete_start_with_parser</a>
|
||
</h5></div></div></div>
|
||
<p>
|
||
Using a parser built with <a class="link" href="reference.html#foldl_start_with_parser"><code class="computeroutput"><span class="identifier">foldl_start_with_parser</span></code></a> we can
|
||
parse the input when the input is correct. However, this is not always
|
||
the case. Consider the following input for example:
|
||
</p>
|
||
<pre class="programlisting"><span class="identifier">BOOST_METAPARSE_STRING</span><span class="special">(</span><span class="string">"11 + 13 + 3 + 21 +"</span><span class="special">)</span>
|
||
</pre>
|
||
<p>
|
||
This is an invalid expression. However, if we parse it using the <a class="link" href="reference.html#foldl_start_with_parser"><code class="computeroutput"><span class="identifier">foldl_start_with_parser</span></code></a>-based
|
||
parser presented earlier (<code class="computeroutput"><span class="identifier">sum_parser3</span></code>),
|
||
it accepts the input and the result is <code class="computeroutput"><span class="number">48</span></code>.
|
||
This is because <a class="link" href="reference.html#foldl_start_with_parser"><code class="computeroutput"><span class="identifier">foldl_start_with_parser</span></code></a> parses
|
||
the input <span class="emphasis"><em>as long as it can</em></span>. It parses the first <code class="computeroutput"><span class="identifier">int_token</span></code> (<code class="computeroutput"><span class="number">11</span></code>)
|
||
and then it starts parsing the <code class="computeroutput"><span class="identifier">plus_int</span></code>
|
||
elements (<code class="computeroutput"><span class="special">+</span> <span class="number">13</span></code>,
|
||
<code class="computeroutput"><span class="special">+</span> <span class="number">3</span></code>,
|
||
<code class="computeroutput"><span class="special">+</span> <span class="number">21</span></code>).
|
||
After parsing all of these, it tries to parse the remaining <code class="computeroutput"><span class="string">" +"</span></code> input using <code class="computeroutput"><span class="identifier">plus_int</span></code> which fails and therefore
|
||
<a class="link" href="reference.html#foldl_start_with_parser"><code class="computeroutput"><span class="identifier">foldl_start_with_parser</span></code></a>
|
||
stops after <code class="computeroutput"><span class="special">+</span> <span class="number">21</span></code>.
|
||
</p>
|
||
<p>
|
||
The problem is that the parser parses the longest sub-expression starting
|
||
from the beginning, that represents a valid expression. The rest is ignored.
|
||
The parser can be wrapped by <a class="link" href="reference.html#entire_input"><code class="computeroutput"><span class="identifier">entire_input</span></code></a> to make sure to
|
||
reject expressions with invalid extra characters at the end, however,
|
||
that won't make the error message useful. (<a class="link" href="reference.html#entire_input"><code class="computeroutput"><span class="identifier">entire_input</span></code></a> can only tell the
|
||
author of the invalid expression that something is wrong after <code class="computeroutput"><span class="special">+</span> <span class="number">21</span></code>.)
|
||
</p>
|
||
<p>
|
||
Metaparse provides <a class="link" href="reference.html#foldl_reject_incomplete_start_with_parser"><code class="computeroutput"><span class="identifier">foldl_reject_incomplete_start_with_parser</span></code></a>,
|
||
which does the same as <a class="link" href="reference.html#foldl_start_with_parser"><code class="computeroutput"><span class="identifier">foldl_start_with_parser</span></code></a>, except
|
||
that once no further repetitions are found, it checks <span class="emphasis"><em>where</em></span>
|
||
the repeated parser (in our example <code class="computeroutput"><span class="identifier">plus_int</span></code>)
|
||
fails. When it can make any progress (e.g. it finds a <code class="computeroutput"><span class="special">+</span></code>
|
||
symbol), then <a class="link" href="reference.html#foldl_reject_incomplete_start_with_parser"><code class="computeroutput"><span class="identifier">foldl_reject_incomplete_start_with_parser</span></code></a>
|
||
assumes that the expression's author intended to make the repetition
longer but made a mistake, and propagates the error message coming from
|
||
that last broken expression.
|
||
</p>
|
||
<p>
|
||
<span class="inlinemediaobject"><img src="../images/metaparse/foldl_reject_incomplete_start_with_parser_diag1.png" width="70%"></span>
|
||
</p>
|
||
<p>
|
||
The above diagram shows how <a class="link" href="reference.html#foldl_reject_incomplete_start_with_parser"><code class="computeroutput"><span class="identifier">foldl_reject_incomplete_start_with_parser</span></code></a>
|
||
parses the example invalid input and how it fails. This can be used for
|
||
better error reporting from the parsers.
|
||
</p>
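<p>
The extra check can be modelled in plain C++ (C++14 or later). This is a sketch under stated assumptions, not Metaparse's implementation: "progress" is modelled by comparing the position where the repeated parser failed with the position where it started, and <code class="computeroutput"><span class="identifier">parse_result</span></code> and <code class="computeroutput"><span class="identifier">sum_strict</span></code> are hypothetical names.
</p>

```cpp
#include <cassert>

// Hypothetical result type of this model (not part of Metaparse).
// On failure, rest points to the position where parsing failed.
struct parse_result {
    int value;
    const char* rest;
    bool ok;
};

constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }
constexpr const char* skip_spaces(const char* s) { while (*s == ' ') ++s; return s; }

// Models int_token: a number followed by optional whitespace.
constexpr parse_result int_token(const char* s) {
    if (!is_digit(*s)) return parse_result{0, s, false};
    int n = 0;
    while (is_digit(*s)) { n = n * 10 + (*s - '0'); ++s; }
    return parse_result{n, skip_spaces(s), true};
}

// Models plus_int: a '+' token, then a number; only the number is kept.
constexpr parse_result plus_int(const char* s) {
    if (*s != '+') return parse_result{0, s, false};  // fails without progress
    return int_token(skip_spaces(s + 1));             // may fail after consuming '+'
}

// Models foldl_reject_incomplete_start_with_parser<plus_int, int_token, sum_op>.
constexpr parse_result sum_strict(const char* s) {
    parse_result state = int_token(s);
    if (!state.ok) return state;
    for (;;) {
        parse_result next = plus_int(state.rest);
        if (!next.ok) {
            // The repeated parser consumed something before failing:
            // treat it as a broken repetition and propagate its error.
            if (next.rest != state.rest) return next;
            return state;  // no progress at all: the repetition simply ended
        }
        state = parse_result{state.value + next.value, next.rest, true};
    }
}
```

<p>
In this model <code class="computeroutput"><span class="string">"11 + 13 + 3 + 21 +"</span></code> is rejected, because the last <code class="computeroutput"><span class="special">+</span></code> is consumed before the failure, while <code class="computeroutput"><span class="string">"11 + 13 + 3 + 21"</span></code> is accepted as before.
</p>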
|
||
<p>
|
||
Other folding parsers also have their <code class="computeroutput"><span class="identifier">reject_incomplete</span></code>
|
||
version (e.g. <a class="link" href="reference.html#foldr_reject_incomplete"><code class="computeroutput"><span class="identifier">foldr_reject_incomplete</span></code></a>,
|
||
<a class="link" href="reference.html#foldl_reject_incomplete1"><code class="computeroutput"><span class="identifier">foldl_reject_incomplete1</span></code></a>,
|
||
etc).
|
||
</p>
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h5 class="title">
|
||
<a name="metaparse.user_manual.what_is_a_parser.repetition.finding_the_right_folding_parser"></a><a name="finding-the-right-folding-parser-combinator"></a><a class="link" href="user_manual.html#metaparse.user_manual.what_is_a_parser.repetition.finding_the_right_folding_parser" title="Finding the right folding parser combinator">Finding
|
||
the right folding parser combinator</a>
|
||
</h5></div></div></div>
|
||
<p>
|
||
As you might have noticed, there are a lot of different folding parser
|
||
combinators. To help you find the right one, the following naming convention
|
||
is used:
|
||
</p>
|
||
<p>
|
||
<span class="inlinemediaobject"><img src="../images/metaparse/folds.png" width="70%"></span>
|
||
</p>
|
||
<div class="note"><table border="0" summary="Note">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../../doc/src/images/note.png"></td>
|
||
<th align="left">Note</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
Note that there is no <code class="computeroutput"><span class="identifier">foldr_reject_incomplete_start_with_parser</span></code>.
|
||
The <code class="computeroutput"><span class="identifier">start_with_parser</span></code> version of the
|
||
right-folding parsers applies the special parser, whose result is the
|
||
initial value, after the repeated elements. Therefore, when the parser
|
||
parsing one repeated element fails, <code class="computeroutput"><span class="identifier">foldr_start_with_parser</span></code>
|
||
would apply that special final parser instead of checking how the repeated
|
||
element's parser failed.
|
||
</p></td></tr>
|
||
</table></div>
|
||
</div>
|
||
</div>
|
||
<div class="section">
|
||
<div class="titlepage"><div><div><h4 class="title">
|
||
<a name="metaparse.user_manual.what_is_a_parser.what_can_be_built_from_a_compile"></a><a name="result_types"></a><a class="link" href="user_manual.html#metaparse.user_manual.what_is_a_parser.what_can_be_built_from_a_compile" title="What can be built from a compile-time string?">What
|
||
can be built from a compile-time string?</a>
|
||
</h4></div></div></div>
|
||
<p>
|
||
Parsers built using Metaparse are template metaprograms parsing text (or
|
||
code) at compile-time. Here is a list of things that can be the "result"
|
||
of parsing:
|
||
</p>
|
||
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
|
||
<li class="listitem">
|
||
A <span class="emphasis"><em>type</em></span>. An example of this is a parser parsing
|
||
a <code class="computeroutput"><span class="identifier">printf</span></code> format string
|
||
and returning the typelist (e.g. <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">mpl</span><span class="special">::</span><span class="identifier">vector</span></code>)
|
||
of the expected arguments.
|
||
</li>
|
||
<li class="listitem">
|
||
A <span class="emphasis"><em>constant value</em></span>. An example of this is the result
|
||
of a calculator language. See the <a class="link" href="getting_started_with_boost_metap.html#getting_started">Getting
|
||
Started</a> section for further details.
|
||
</li>
|
||
<li class="listitem">
|
||
A <span class="emphasis"><em>runtime object</em></span>. A static runtime object can
|
||
be generated that might be used at runtime. An example of this is
parsing regular expressions at compile-time and building <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">xpressive</span><span class="special">::</span><span class="identifier">sregex</span></code> objects. See the <code class="computeroutput"><span class="identifier">regex</span></code> example of Metaparse for
details.
|
||
</li>
|
||
<li class="listitem">
|
||
A C++ <span class="emphasis"><em>function</em></span>, which might be called at runtime.
This is good for generating native (and optimised) code from EDSLs. See the
<code class="computeroutput"><span class="identifier">compile_to_native_code</span></code>
example of Metaparse for an illustration of this.
|
||
</li>
|
||
<li class="listitem">
|
||
A <a class="link" href="reference.html#metafunction_class"><span class="emphasis"><em>template metafunction
|
||
class</em></span></a>. The result of parsing might be a type, which
|
||
is a <a class="link" href="reference.html#metafunction_class">template metafunction class</a>.
|
||
This is good for building an EDSL for template metaprogramming. See
|
||
the <code class="computeroutput"><span class="identifier">meta_hs</span></code> example
|
||
of Metaparse as an example for this.
|
||
</li>
|
||
</ul></div>
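<p>
As a rough illustration of the first item (a <span class="emphasis"><em>type</em></span> as the result of parsing), the following sketch maps a tiny <code class="computeroutput">printf</code>-like format to a <code class="computeroutput">std::tuple</code> of the expected argument types. It is hand-written in plain C++11 and does not use Metaparse. The <code class="computeroutput">chars</code> and <code class="computeroutput">format_args</code> templates are hypothetical names introduced for this example, a character pack stands in for a compile-time string, and only <code class="computeroutput">%d</code> and <code class="computeroutput">%s</code> are recognised.
</p>
<pre class="programlisting">#include &lt;tuple&gt;
#include &lt;type_traits&gt;

// Hypothetical stand-in for a compile-time string: a pack of characters.
template &lt;char... Cs&gt;
struct chars {};

// format_args&lt;Input, Args...&gt;: consumes Input and collects argument types.
template &lt;class Input, class... Args&gt;
struct format_args;

// End of input: the collected argument types are the result of parsing.
template &lt;class... Args&gt;
struct format_args&lt;chars&lt;&gt;, Args...&gt;
{
  typedef std::tuple&lt;Args...&gt; type;
};

// "%d" expects an int argument.
template &lt;char... Cs, class... Args&gt;
struct format_args&lt;chars&lt;'%', 'd', Cs...&gt;, Args...&gt; :
  format_args&lt;chars&lt;Cs...&gt;, Args..., int&gt;
{};

// "%s" expects a C string argument.
template &lt;char... Cs, class... Args&gt;
struct format_args&lt;chars&lt;'%', 's', Cs...&gt;, Args...&gt; :
  format_args&lt;chars&lt;Cs...&gt;, Args..., const char*&gt;
{};

// Any other character is skipped.
template &lt;char C, char... Cs, class... Args&gt;
struct format_args&lt;chars&lt;C, Cs...&gt;, Args...&gt; :
  format_args&lt;chars&lt;Cs...&gt;, Args...&gt;
{};

// The result of "parsing" the format "%d: %s" is a type.
typedef format_args&lt;chars&lt;'%', 'd', ':', ' ', '%', 's'&gt;&gt;::type parsed_args;

static_assert(
  std::is_same&lt;parsed_args, std::tuple&lt;int, const char*&gt;&gt;::value,
  "unexpected argument types"
);
</pre>
<p>
The check is a <code class="computeroutput">static_assert</code>, so compiling the snippet is the test. A parser built with Metaparse would typically take a string type as its input and could return, for example, a <code class="computeroutput">boost::mpl::vector</code> instead of a tuple.
</p>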
</div>
<div class="section">
<div class="titlepage"><div><div><h4 class="title">
<a name="metaparse.user_manual.what_is_a_parser.grammars"></a><a class="link" href="user_manual.html#metaparse.user_manual.what_is_a_parser.grammars" title="Grammars">Grammars</a>
</h4></div></div></div>
<p>
Metaparse provides a way to define grammars in a syntax that resembles
EBNF. The <a class="link" href="reference.html#grammar"><code class="computeroutput"><span class="identifier">grammar</span></code></a>
template is used to define a grammar in the following way:
</p>
<pre class="programlisting"><span class="identifier">grammar</span><span class="special">&lt;</span><span class="identifier">BOOST_METAPARSE_STRING</span><span class="special">(</span><span class="string">"plus_exp"</span><span class="special">)&gt;</span>
  <span class="special">::</span><span class="identifier">import</span><span class="special">&lt;</span><span class="identifier">BOOST_METAPARSE_STRING</span><span class="special">(</span><span class="string">"int_token"</span><span class="special">),</span> <span class="identifier">token</span><span class="special">&lt;</span><span class="identifier">int_</span><span class="special">&gt;&gt;::</span><span class="identifier">type</span>

  <span class="special">::</span><span class="identifier">rule</span><span class="special">&lt;</span><span class="identifier">BOOST_METAPARSE_STRING</span><span class="special">(</span><span class="string">"ws ::= (' ' | '\n' | '\r' | '\t')*"</span><span class="special">)&gt;::</span><span class="identifier">type</span>
  <span class="special">::</span><span class="identifier">rule</span><span class="special">&lt;</span><span class="identifier">BOOST_METAPARSE_STRING</span><span class="special">(</span><span class="string">"plus_token ::= '+' ws"</span><span class="special">),</span> <span class="identifier">front</span><span class="special">&lt;</span><span class="identifier">_1</span><span class="special">&gt;&gt;::</span><span class="identifier">type</span>
  <span class="special">::</span><span class="identifier">rule</span><span class="special">&lt;</span><span class="identifier">BOOST_METAPARSE_STRING</span><span class="special">(</span><span class="string">"plus_exp ::= int_token (plus_token int_token)*"</span><span class="special">),</span> <span class="identifier">plus_action</span><span class="special">&gt;::</span><span class="identifier">type</span>
</pre>
<p>
The code above defines a parser from a grammar definition. The start symbol
of the grammar is <code class="computeroutput"><span class="identifier">plus_exp</span></code>.
The lines beginning with <code class="computeroutput"><span class="special">::</span><span class="identifier">rule</span></code> define rules. A rule optionally has
a semantic action, which is a metafunction class that transforms the result
of parsing after the rule has been applied. The lines beginning with
<code class="computeroutput"><span class="special">::</span><span class="identifier">import</span></code>
bind existing parsers to names, so that they can be used in the rules.
</p>
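<p>
To make the notion of a semantic action more concrete, here is a minimal sketch of a metafunction class, that is, a class with a nested <code class="computeroutput">apply</code> template. It is written in plain C++11 and is not the <code class="computeroutput">plus_action</code> used above, whose definition is not shown here. The names <code class="computeroutput">int_list</code>, <code class="computeroutput">sum</code> and <code class="computeroutput">sum_action</code> are hypothetical, and the shape of the value a real rule passes to its action is an assumption: a simple list of integers stands in for it.
</p>
<pre class="programlisting">#include &lt;type_traits&gt;

// Hypothetical stand-in for the sequence of parsed numbers.
template &lt;int... Ns&gt;
struct int_list {};

// sum&lt;int_list&lt;...&gt;&gt;: adds up the elements of the list.
template &lt;class List&gt;
struct sum;

template &lt;&gt;
struct sum&lt;int_list&lt;&gt;&gt; : std::integral_constant&lt;int, 0&gt; {};

template &lt;int N, int... Ns&gt;
struct sum&lt;int_list&lt;N, Ns...&gt;&gt; :
  std::integral_constant&lt;int, N + sum&lt;int_list&lt;Ns...&gt;&gt;::value&gt;
{};

// A metafunction class: the transformation is the nested apply template.
struct sum_action
{
  template &lt;class Parsed&gt;
  struct apply : sum&lt;Parsed&gt; {};
};

// Applying the action turns the parsed list into a single boxed value.
static_assert(sum_action::apply&lt;int_list&lt;1, 2, 3&gt;&gt;::type::value == 6, "");
</pre>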
<p>
The result of a grammar definition is a parser that can be given to other
parser combinators or be used directly. Since grammars can import
existing parsers and build new ones, they are parser combinators as well.
</p>
</div>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="metaparse.user_manual.parsing_based_on_constexpr"></a><a class="link" href="user_manual.html#metaparse.user_manual.parsing_based_on_constexpr" title="Parsing based on constexpr">Parsing
based on <code class="computeroutput"><span class="keyword">constexpr</span></code></a>
</h3></div></div></div>
<p>
Metaparse is based on template metaprogramming. However, C++11 provides
<code class="computeroutput"><span class="keyword">constexpr</span></code>, which can be used
for parsing at compile-time as well. Implementing parsers based on
<code class="computeroutput"><span class="keyword">constexpr</span></code> is easier for a C++
developer, since its syntax resembles the regular syntax of the language,
but the result of parsing has to be a <code class="computeroutput"><span class="keyword">constexpr</span></code>
value. Parsers based on template metaprogramming can build types as the result
of parsing. These types may be boxed <code class="computeroutput"><span class="keyword">constexpr</span></code>
values, but they can also be metafunction classes, classes with static functions
that can be called at runtime, etc.
</p>
<p>
When a parser built with Metaparse needs a sub-parser that processes a part
of the input text and generates a <code class="computeroutput"><span class="keyword">constexpr</span></code>
value as the result of parsing, the sub-parser can be implemented using
<code class="computeroutput"><span class="keyword">constexpr</span></code> functions. Metaparse
can be integrated with them and lift their results into C++ template metaprogramming.
The <code class="computeroutput"><span class="identifier">constexpr_parser</span></code> example demonstrates this feature. This capability makes
it possible to integrate Metaparse with parsing libraries based on <code class="computeroutput"><span class="keyword">constexpr</span></code>.
</p>
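<p>
The idea of lifting a <code class="computeroutput">constexpr</code> result into template metaprogramming can be sketched in plain C++11 as follows. This is not how the <code class="computeroutput">constexpr_parser</code> example is implemented and it does not use Metaparse. The function <code class="computeroutput">parse_digits</code> and the type <code class="computeroutput">parsed_number</code> are hypothetical names introduced for this illustration.
</p>
<pre class="programlisting">#include &lt;type_traits&gt;

// constexpr sub-parser (C++11 style: one return statement, recursion
// instead of a loop). Reads leading decimal digits and returns their value.
constexpr int parse_digits(const char* s, int acc)
{
  return (*s &gt;= '0' &amp;&amp; *s &lt;= '9') ?
    parse_digits(s + 1, acc * 10 + (*s - '0')) :
    acc;
}

// Lifting: the constexpr value is boxed into a type, which template
// metaprograms can then pass around like any other type.
typedef std::integral_constant&lt;int, parse_digits("123", 0)&gt; parsed_number;

static_assert(parsed_number::value == 123, "");
</pre>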
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="metaparse.user_manual.what_types_of_grammars_can_be_us"></a><a class="link" href="user_manual.html#metaparse.user_manual.what_types_of_grammars_can_be_us" title="What types of grammars can be used?">What
types of grammars can be used?</a>
</h3></div></div></div>
<p>
It is possible to write parsers for <span class="emphasis"><em>context free grammars</em></span>
using Metaparse. However, this is not the most general category of grammars
that can be used. As Metaparse is a highly extensible framework, it is not
clear what should be considered the limit of Metaparse itself. For
example, Metaparse provides the <a class="link" href="reference.html#accept_when"><code class="computeroutput"><span class="identifier">accept_when</span></code></a> <a class="link" href="reference.html#parser_combinator">parser
combinator</a>, which can be used to provide arbitrary predicates for enabling or disabling
a specific rule. One can go as far as providing the Turing machine (as a
<a class="link" href="reference.html#metafunction">metafunction</a>) of the entire grammar as
a predicate, so one can build parsers for <span class="emphasis"><em>unrestricted grammars</em></span>
that can be parsed using a Turing machine. Note that such a parser would
not be considered a parser built with Metaparse; however, it is not
clear how far a solution might go and still be considered to be using Metaparse.
</p>
<p>
Metaparse assumes that the parsers are <span class="emphasis"><em>deterministic</em></span>,
as they have only "one" result. It is of course possible to write
parsers and combinators that return a set (or list or some other container)
of results as that "one" result, but doing so can be considered building
a new parser library. There is no clear boundary for Metaparse.
</p>
<p>
Metaparse supports building <span class="emphasis"><em>top-down parsers</em></span>. <span class="emphasis"><em>Left-recursion</em></span>
is not supported, as it would lead to infinite recursion. <span class="emphasis"><em>Right-recursion</em></span>
is supported; however, in most cases the <a class="link" href="user_manual.html#repetition">iterative
parser combinators</a> provide better alternatives.
</p>
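<p>
The usual way around the lack of left-recursion is to rewrite the rule as a repetition. The sketch below illustrates this with a hand-written C++11 <code class="computeroutput">constexpr</code> function rather than with Metaparse. The grammar and the names <code class="computeroutput">is_digit</code>, <code class="computeroutput">parse_sum_tail</code> and <code class="computeroutput">parse_sum</code> are hypothetical, the input is assumed to start with a digit, and the repetition is written as a tail call because C++11 <code class="computeroutput">constexpr</code> functions cannot contain loops.
</p>
<pre class="programlisting">// Left-recursive form, which a top-down parser cannot handle:
//   sum ::= sum '+' digit | digit
// Equivalent form using repetition, implemented below:
//   sum ::= digit ('+' digit)*

constexpr bool is_digit(char c) { return c &gt;= '0' &amp;&amp; c &lt;= '9'; }

// Handles the repeated ('+' digit)* part, carrying the sum so far.
constexpr int parse_sum_tail(const char* s, int acc)
{
  return (s[0] == '+' &amp;&amp; is_digit(s[1])) ?
    parse_sum_tail(s + 2, acc + (s[1] - '0')) :
    acc;
}

// Parses the leading digit, then the repetition. There is no error
// handling in this sketch.
constexpr int parse_sum(const char* s)
{
  return parse_sum_tail(s + 1, s[0] - '0');
}

static_assert(parse_sum("1+2+3") == 6, "");
</pre>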
</div>
</div>
<table xmlns:rev="http://www.cs.rpi.edu/~gregod/boost/tools/doc/revision" width="100%"><tr>
<td align="left"></td>
<td align="right"><div class="copyright-footer">Copyright © 2015 Abel Sinkovics<p>
Distributed under the Boost Software License, Version 1.0. (See accompanying
file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
</p>
</div></td>
</tr></table>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="getting_started_with_boost_metap.html"><img src="../../../doc/src/images/prev.png" alt="Prev"></a><a accesskey="u" href="../metaparse.html"><img src="../../../doc/src/images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../../../doc/src/images/home.png" alt="Home"></a><a accesskey="n" href="versioning.html"><img src="../../../doc/src/images/next.png" alt="Next"></a>
</div>
</body>
</html>