Version 7 (modified 15 years ago)
- Improving Boost Docs
- About this project
- Boost docs repository
- Unified look & feel project
- Docs translations project
- Glue docs project
- Subprojects
- Boost HTML stylesheet
- Boost Kate support
- Boost PDF stylesheet
- Boost specific WikiMacros
- Boost Trac stylesheet
- Boost.Build support for doc tools
- Boostscript
- Google Search Box project
- HTML to docbook
- Improving Boostbook
- Quickbook as a WikiProcessor for our Trac
- Quickbook source stylesheet
- SVG icons set project
- Syntax highlighting project
- Trac Syntax Coloring for Boostbook, Quickbook and Jamfiles
- Browser Testing Chart
- Logo Playground
Problem
To have a complete conversion tool, it must be possible to convert existing documentation, written in HTML, to quickbook. As things currently stand, good progress is being made on the following part of the document conversion pipeline:
docbook --[boostbook + xsltproc]--> HTML --[quickbook css]--> quickbook
However, this project still lacks an important part:
HTML --[html to docbook (missing)]--> docbook --> [above pipeline] --> result
The aim of this subproject, then, is to investigate open source solutions to this problem and determine which one will work best for boost.
Converting HTML to docbook XML
What exactly should this tool do? As input it should take an HTML document (which is not necessarily valid XHTML) and map its HTML tags to docbook XML elements. For example:
<h1>My Section</h1> <p>Some text</p>
should become something like:
<section id="my_section">
  <title>My Section</title>
  <para>Some text</para>
</section>
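To make the intended mapping concrete, here is a minimal sketch in Python using only the standard library's html.parser, which tolerates input that is not valid XHTML. It handles just the two tags from the example; the class name, the function name and the rule for deriving the section id (lower-case the title, replace runs of other characters with "_") are assumptions made for illustration, not part of any existing tool.

```python
import re
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr


class HtmlToDocbook(HTMLParser):
    """Map <h1> to <section>/<title> and <p> to <para>; other tags are ignored."""

    def __init__(self):
        super().__init__()
        self.out = []              # pieces of the docbook output
        self.current = None        # HTML tag whose text is being collected
        self.text = []             # text collected for the current tag
        self.section_open = False  # True once a <section> has been started

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "p"):
            self.current = tag
            self.text = []

    def handle_data(self, data):
        if self.current is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag != self.current:
            return
        content = "".join(self.text).strip()
        if tag == "h1":
            if self.section_open:
                self.out.append("</section>")
            # Hypothetical id scheme: lower-case, other characters become "_".
            section_id = re.sub(r"[^a-z0-9]+", "_", content.lower()).strip("_")
            self.out.append(f"<section id={quoteattr(section_id)}>")
            self.out.append(f"<title>{escape(content)}</title>")
            self.section_open = True
        else:
            self.out.append(f"<para>{escape(content)}</para>")
        self.current = None

    def result(self):
        closing = ["</section>"] if self.section_open else []
        return " ".join(self.out + closing)


def html_to_docbook(html):
    """Convert an HTML string to a flat docbook string (h1 and p only)."""
    parser = HtmlToDocbook()
    parser.feed(html)
    parser.close()
    return parser.result()


print(html_to_docbook("<h1>My Section</h1> <p>Some text</p>"))
```

A real converter would also need nested sections for h2, h3 and so on, lists, tables and links; the sketch only shows that a heading has to open a section that stays open until the next heading or the end of the document.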
Two main problems present themselves. First, what should the tool do if the original document doesn't validate as XHTML? Second, there will certainly be a many-to-one mapping from HTML to docbook: several different HTML constructs will have to be mapped to the same docbook element. Is it possible to determine a general solution for this?
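The many-to-one nature of the mapping can be illustrated with a small rule table. The sketch below uses Python's xml.etree.ElementTree on an already well-formed XHTML fragment; it is a stand-in for what template rules in a stylesheet would do, not an XSLT implementation. The rule table itself, the fallback to phrase for unknown tags and the function names are assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table: XHTML element name -> (docbook element, attributes).
# Several HTML tags deliberately share one docbook target.
RULES = {
    "p": ("para", {}),
    "em": ("emphasis", {}),
    "i": ("emphasis", {}),
    "strong": ("emphasis", {"role": "bold"}),
    "b": ("emphasis", {"role": "bold"}),
    "code": ("literal", {}),
    "tt": ("literal", {}),
}


def local_name(tag):
    """Drop a '{namespace}' prefix such as the XHTML namespace."""
    return tag.rsplit("}", 1)[-1]


def convert(node):
    """Recursively rebuild an element using RULES; unknown tags become <phrase>."""
    name, attrs = RULES.get(local_name(node.tag), ("phrase", {}))
    out = ET.Element(name, dict(attrs))
    out.text = node.text
    for child in node:
        new_child = convert(child)
        new_child.tail = child.tail  # keep the text that follows the child
        out.append(new_child)
    return out


def xhtml_fragment_to_docbook(xhtml):
    """Convert one well-formed XHTML element (given as a string) to docbook."""
    return ET.tostring(convert(ET.fromstring(xhtml)), encoding="unicode")


print(xhtml_fragment_to_docbook(
    "<p>Some <b>bold</b> and <strong>strong</strong> text</p>"))
```

Because b and strong collapse to the same docbook element, the conversion loses information that cannot be recovered automatically, which is why some manual rechecking is likely to remain necessary whichever tool is chosen.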
Open Source Solutions
For the first problem, I'm comfortable with recommending Tidy to produce validating XHTML. It's open source and cross-platform. Furthermore, it's the original author's responsibility to ensure that their input is valid, and I feel that this task falls outside the scope of this subproject.
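As a rough sketch of how the Tidy step could be scripted, the Python below builds and runs a Tidy command line that converts an HTML file to XHTML. It assumes the tidy executable is installed and on the PATH; the function names and file names are hypothetical, and the options shown are only one plausible choice.

```python
import shutil
import subprocess


def tidy_command(src, dst):
    """Build the argument list for converting src (HTML) to dst (XHTML)."""
    return [
        "tidy",
        "-quiet",     # suppress non-essential output
        "-asxhtml",   # write well-formed XHTML
        "-numeric",   # numeric rather than named entities
        "-o", dst,    # output file
        src,
    ]


def run_tidy(src, dst):
    """Run Tidy and return True if it finished without errors."""
    if shutil.which("tidy") is None:
        raise RuntimeError("tidy executable not found on PATH")
    completed = subprocess.run(tidy_command(src, dst))
    # Tidy exits with 0 on success, 1 if there were warnings, 2 on errors.
    return completed.returncode in (0, 1)


# Hypothetical file names, shown for illustration only.
print(" ".join(tidy_command("page.html", "page.xhtml")))
```

Numeric entities are requested here because named HTML entities such as &nbsp; are not defined in plain XML, so they would otherwise trip up the later XML processing steps.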
For the second problem, I have found the following projects that have attempted to address it:
The first of these initially seemed promising (and was proposed as a possible solution by Matias), but I was unable to get it to compile. This makes me wonder whether the project is still alive. I've sent an e-mail to the developer and I'm awaiting a response.
The second of these, as an XSL stylesheet, seems the more natural solution. It's still not perfect and doesn't completely obviate the need for manual rechecking and retagging, but I feel that using it and adapting it for our own needs may be fruitful. I haven't tried hacking the stylesheet yet (this will be the next thing I try), but of the things I've found so far this seems the most promising.
Conclusion (so far)
Based on the (still limited) investigation I've done so far, I think that the most natural way to convert one XML format to another is to use an XSL stylesheet. Short of developing one specifically for this project, it is best to use the one provided in solution 2, as it has been developed by someone who already has a lot of experience with docbook (I haven't been in touch with him yet). Adapting it further for boost's requirements may, I feel, be the most fruitful approach.