Ticket #2672: tokenizer.patch

File tokenizer.patch, 3.1 KB (added by Charles Brockman <cmbrockman@…>, 14 years ago)
  • libs/tokenizer/introduc.htm

     
@@ -16,9 +16,9 @@
 
   <h1 align="center">Introduction</h1>
 
-  <p align="left">The boost Tokenizer package provides a flexible and easy to
-  use way to break of a string or other character sequence into a series of
-  tokens. Below is a simple example that will break up a phrase into
+  <p align="left">The Boost Tokenizer package provides a flexible and
+  easy-to-use way to break a string or other character sequence into a series
+  of tokens. Below is a simple example that will break up a phrase into
   words.</p>
 
   <div align="left">
@@ -40,19 +40,19 @@
 </pre>
   </div>
 
-  <p align="left">You can choose how the string gets broken up. You do this
-  by specifying the TokenizerFunction. If you do not specify anything, the
-  default TokenizerFunction is char_delimiters_separator&lt;char&gt; which
+  <p align="left">You can choose how the string gets parsed by using the
+  TokenizerFunction. If you do not specify anything, the default
+  TokenizerFunction is <em>char_delimiters_separator&lt;char&gt;</em> which
   defaults to breaking up a string based on space and punctuation. Here is an
-  example of using another TokenizerFunction called escaped_list_separator.
-  This TokenizerFunction parses a superset of comma separated value (csv)
-  lines. The format looks like this</p>
+  example using another TokenizerFunction called
+  <em>escaped_list_separator</em>. This TokenizerFunction parses a superset
+  of comma-separated value (CSV) lines. The format looks like this:</p>
 
   <p align="left">Field 1,"putting quotes around fields, allows commas",Field
   3</p>
 
   <p align="left">Below is an example that will break the previous line into
-  its 3 fields</p>
+  its three fields.</p>
 
   <div align="left">
     <pre>
@@ -73,14 +73,12 @@
 </pre>
   </div>
 
-  <p align="left">Finally, for some TokenizerFunctions you have to pass in
+  <p align="left">Finally, for some TokenizerFunctions you have to pass
   something into the constructor in order to do anything interesting. An
-  example is offset_separator. This class breaks a string into tokens based
-  on offsets for example</p>
+  example is the offset_separator. This class breaks a string into tokens based
+  on offsets. For example, when <em>12252001</em> is parsed using offsets of
+  2,2,4 it becomes <em>12 25 2001</em>. Below is the code used.</p>
 
-  <p align="left">12252001 when parsed using offsets of 2,2,4 becomes 12 25
-  2001. Below is an example to parse this.</p>
-
   <div align="left">
     <pre>
 // simple_example_3.cpp
@@ -105,10 +103,6 @@
   <p align="left">&nbsp;</p>
   <hr>
 
-  <p><a href="http://validator.w3.org/check?uri=referer"><img border="0" src=
-  "http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01 Transitional"
-  height="31" width="88"></a></p>
-
   <p>Revised
   <!--webbot bot="Timestamp" s-type="EDITED" s-format="%d %B, %Y" startspan -->25 December, 2006<!--webbot bot="Timestamp" endspan i-checksum="38518" --></p>
 
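
For reviewers checking the reworded paragraphs, here is a minimal standalone sketch of the two behaviours the patched text describes: splitting by offsets (12252001 with offsets 2,2,4) and splitting a line on commas while honouring double quotes. It is not the Boost implementation and does not use the Boost headers. The helper names `split_by_offsets` and `split_quoted_csv` are hypothetical, and the sketch omits features the real classes may offer, such as escape characters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost): cut `input` into consecutive
// pieces whose lengths are listed in `offsets`. Stops early when the input
// is exhausted; characters past the last offset are ignored.
std::vector<std::string> split_by_offsets(const std::string& input,
                                          const std::vector<std::size_t>& offsets) {
    std::vector<std::string> tokens;
    std::size_t pos = 0;
    for (std::size_t len : offsets) {
        if (pos >= input.size()) break;
        tokens.push_back(input.substr(pos, len));  // substr clamps at the end
        pos += len;
    }
    return tokens;
}

// Hypothetical helper (not part of Boost): split `line` on commas, except
// for commas that appear between double quotes. The quote characters
// themselves are dropped from the resulting fields.
std::vector<std::string> split_quoted_csv(const std::string& line) {
    std::vector<std::string> fields;
    std::string current;
    bool in_quotes = false;
    for (char c : line) {
        if (c == '"') {
            in_quotes = !in_quotes;        // toggle quoted state, drop the quote
        } else if (c == ',' && !in_quotes) {
            fields.push_back(current);     // an unquoted comma ends the field
            current.clear();
        } else {
            current += c;
        }
    }
    fields.push_back(current);             // last field has no trailing comma
    return fields;
}
```

Under these assumptions, `split_by_offsets("12252001", {2, 2, 4})` yields the three tokens 12, 25 and 2001, and `split_quoted_csv` applied to the sample line from the documentation yields its three fields with the embedded comma preserved in the second one.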