[[Image(ImprovingBoostDocs:ibdp.png,nolink)]]
* [ImprovingBoostDocs Improving Boost Docs]
* [ImprovingBoostDocs About this project]
* [BoostDocsRepository Boost docs repository]
* [UnifiedLookAndFeelProject Unified look and feel project]
* [HelpingBoostAuthors Helping Boost authors]
* [GlueDocsProject Glue docs project]
* [StandardCppLibraryDocumentation Standard C++ Library docs]
* [DocumentationBestPractices Documentation best practices]
* [DocumentationTools Documentation tools]
* [ImprovingBoostDocsSubprojects Subprojects]
* [BoostDocTest Boost.DocTest ]
* [BoostHtmlStylesheet Boost HTML stylesheet]
* [BoostKateSupport Boost Kate support]
* [BoostPdfStylesheet Boost PDF stylesheet]
* [BoostSpecificWikiMacros Boost-specific WikiMacros]
* [BoostTracStylesheet Boost Trac stylesheet]
* [BoostscriptProject Boostscript]
* [GoogleSearchBoxProject Google Search box project]
* '''[HtmlToDockbookProject HTML to docbook]'''
* [QuickbookWikiProcessor Quickbook WikiProcessor]
* [QuickbookSourceStylesheetProject Quickbook source stylesheet]
* [SvgIconsSetProject SVG icons set project]
* [SyntaxHighlightingProject Syntax-highlighting project]
* [BoostTracSyntaxColoring Trac syntax-coloring]
* [DebuggerVisualizers Debugger visualizers]
* [BrowserTestingChart Browser-testing chart]
* [LibrariesLogos Logo playground]
----
[[Image(ImprovingBoostDocsSubprojects:html_to_dockbook.png,nolink)]]
=== Problem ===
A complete conversion tool must be able to convert existing documentation written in HTML to Quickbook. As things currently stand, good progress is being made on the following parts of the document-conversion pipeline:
{{{
docbook --[boostbook + xsltproc]--> HTML --[quickbook css]--> quickbook
}}}
However, this project still lacks an important part:
{{{
HTML --[html to docbook (missing)]--> docbook --> [above pipeline] --> result
}}}
The aim of this subproject, then, is to investigate open-source solutions to this problem and determine which one will work best for Boost.
=== Converting HTML to docbook XML ===
What exactly should this tool do? As input, it should take an HTML document (which is not necessarily valid XHTML) and map its HTML tags to docbook XML. For example:
{{{
<h1>My Section</h1>
<p>Some text</p>
}}}
should become something like:
{{{
<section>
  <title>My Section</title>
  <para>Some text</para>
</section>
}}}
Two main problems present themselves. First, what should the tool do if the original document doesn't validate as XHTML? Second, there will certainly be a many-to-one mapping from HTML to docbook. Is it possible to determine a general solution for this?
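The tag mapping described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool this subproject is looking for: it assumes the input is already well-formed XHTML, that the body is a flat sequence of `h1` and `p` elements, and the names `INLINE_MAP`, `convert_inline` and `html_to_docbook` are hypothetical. It also shows the many-to-one problem directly, since `b` and `strong` both collapse to the same docbook element.

```python
import xml.etree.ElementTree as ET

# Hypothetical inline mapping: several HTML tags collapse to one docbook tag.
INLINE_MAP = {
    "em": ("emphasis", {}),
    "i": ("emphasis", {}),
    "strong": ("emphasis", {"role": "bold"}),
    "b": ("emphasis", {"role": "bold"}),
    "code": ("code", {}),
    "tt": ("code", {}),
}

def convert_inline(src, dst):
    """Copy text and inline children of src into dst, renaming mapped tags."""
    dst.text = src.text
    for child in src:
        # Unknown inline tags fall back to a generic docbook <phrase>.
        name, attrs = INLINE_MAP.get(child.tag, ("phrase", {}))
        out = ET.SubElement(dst, name, dict(attrs))
        convert_inline(child, out)
        out.tail = child.tail

def html_to_docbook(xhtml):
    """Convert a flat XHTML body (h1 and p elements only) to docbook sections."""
    body = ET.fromstring(xhtml)
    article = ET.Element("article")
    section = None
    for node in body:
        if node.tag == "h1":
            # Each h1 opens a new section and becomes its title.
            section = ET.SubElement(article, "section")
            convert_inline(node, ET.SubElement(section, "title"))
        elif node.tag == "p":
            # Paragraphs before the first heading attach to the article itself.
            parent = section if section is not None else article
            convert_inline(node, ET.SubElement(parent, "para"))
    return ET.tostring(article, encoding="unicode")

print(html_to_docbook(
    "<body><h1>My Section</h1>"
    "<p>Some <b>bold</b> and <strong>strong</strong> text</p></body>"
))
```

The output is not guaranteed to validate against the docbook DTD (for example, the `article` has no title), and once `b` and `strong` have been merged the original distinction cannot be recovered. Nested headings, lists and tables would each need their own rules, which is why an existing, more complete stylesheet is attractive.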
=== Open-source solutions ===
I am comfortable recommending [http://tidy.sourceforge.net Tidy] for producing valid XHTML. It is open-source and cross-platform. Furthermore, it is the original author's responsibility to ensure that his or her input is valid, and I feel that this task falls outside the scope of this subproject.
Regarding the second point, I have found the following projects that attempt to address this problem:
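As a rough sketch of how an author might run Tidy before conversion, the following Python wrapper builds and runs a Tidy command line. The file names and the helper names `tidy_command` and `run_tidy` are hypothetical; the options used (`-asxhtml`, `-numeric`, `-quiet`, `-output`) are standard Tidy command-line options, but the exact set an author needs may differ.

```python
import shutil
import subprocess

def tidy_command(src, dst):
    """Build a Tidy command line that writes src as XHTML to dst."""
    # -asxhtml converts the input to XHTML; -numeric emits numeric entities,
    # which XML tools can parse without the HTML entity definitions.
    return ["tidy", "-quiet", "-asxhtml", "-numeric", "-output", dst, src]

def run_tidy(src, dst):
    """Run Tidy if it is installed; return its exit status, or None if absent."""
    if shutil.which("tidy") is None:
        return None
    # Tidy exits with 0 when clean, 1 if there were warnings, 2 if errors.
    return subprocess.run(tidy_command(src, dst)).returncode

# Hypothetical file names, for illustration only.
print(tidy_command("index.html", "index.xhtml"))
```

An exit status of 2 means Tidy could not repair the input, in which case the author would need to fix the HTML by hand before any conversion to docbook is attempted.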
1. [http://www.eecs.umich.edu/~ppadala/projects/tidy/ http://www.eecs.umich.edu/~ppadala/projects/tidy/]
2. [http://wiki.docbook.org/topic/Html2DocBook http://wiki.docbook.org/topic/Html2DocBook]
The first of these initially seemed promising (and was proposed as a possible solution by Matias), but I was unable to make it compile, which makes me wonder whether the project is still maintained. I have sent an email to the developer and am awaiting a response.
The second of these, an XSL stylesheet, seems the more natural solution. It is not perfect and does not completely remove the need for manual rechecking and retagging, but I feel that adapting it for our own needs may be fruitful. I have not yet tried modifying the stylesheet (this is the next thing I will try), but of the things I have found so far, it seems the most promising.
=== Conclusion (so far) ===
From the (still limited) investigation I have done so far, I think that the most natural way to convert one XML format to another is an XSL stylesheet. Short of developing one specifically for this project, it is best to use the one provided in solution 2 above, as it was developed by someone who already has a lot of experience with docbook (I have not yet been in touch with him). Further adapting it for Boost's requirements may, I feel, be the most fruitful approach.
----
=== Active developers ===
[[Br]][[Image(People:glyn_matthews.png,50)]]
[[Br]]'''Glyn Matthews'''
[[Br]][http://www.linkedin.com/pub/4/74/b55 LinkedIn profile]
[[Br]]''glyn dot matthews at gmail dot com''