Boost C++ Libraries: Ticket #5644: tellg returns different results for gnu libstd++ and others standard libraries for codecvt_null
https://svn.boost.org/trac10/ticket/5644
<p> I believe the tellg call below should return stream position 2 after reading a single wide character, and it does so with non-GNU standard libraries. With GNU libstdc++ (MinGW 4.5.2), however, it returns 1, even though, as you can see, the wifstream::read call does read two bytes. Subsequent read operations are then incorrect as well: you will get 5bff instead of 005b. </p>
<p> You may also want to take <a class="ext-link" href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49269"><span class="icon">​</span>http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49269</a> into consideration. </p>
<p> -- </p>
<p> Michael Kochetkov </p>
<pre class="wiki">#include &lt;fstream&gt;
#include &lt;iostream&gt;
#include &lt;stdexcept&gt;
#include &lt;boost/archive/codecvt_null.hpp&gt;

int main()
{
    try {
        std::wifstream is;
        // Install codecvt_null so wide characters are read without conversion.
        is.imbue(std::locale(std::locale::classic(),
                             new boost::archive::codecvt_null&lt;wchar_t&gt;()));
        is.exceptions(std::ios::badbit);
        // samples.txt should look like this in hex:
        // FEFF 005B 0030 0020 ¦ 0020 0020 0031 005D | ?[0 1]
        is.open("samples.txt", std::ios::in | std::ios::binary);
        unsigned int bom = 0;
        // Read exactly one wide character (the BOM).
        is.read(reinterpret_cast&lt;wchar_t*&gt;(&amp;bom), 1);
        const unsigned short bomLE = 0xFEFF;
        if (bom != bomLE) {
            throw std::runtime_error("Invalid BOM. Only LE is supported");
        }
        // Expected 2; libstdc++ (MinGW 4.5.2) reports 1.
        std::cout &lt;&lt; "Current position: " &lt;&lt; is.tellg() &lt;&lt; std::endl;
    }
    catch (const std::exception&amp; e) {
        std::cout &lt;&lt; e.what() &lt;&lt; std::endl;
    }
}
</pre>
Michael.Kv@… Mon, 27 Jun 2011 11:05:17 GMT attachment set
<ul> <li><strong>attachment</strong> → <span class="trac-field-new">samples.txt</span> </li> </ul>
<p> The sample file for the code example. </p>
Robert Ramey Sun, 31 Jul 2011 17:56:24 GMT status changed; resolution set
https://svn.boost.org/trac10/ticket/5644#comment:1
<ul> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">closed</span> </li> <li><strong>resolution</strong> → <span class="trac-field-new">wontfix</span> </li> </ul>
<p> I've looked at the above and also at <a class="ext-link" href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49269"><span class="icon">​</span>http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49269</a>. </p>
<p> I have to confess I'm not all that knowledgeable about codecvt. I didn't write the original code; I just included it. Still, I'm happy to look at your issue. Having done so, I have a couple of observations. </p>
<p> a) The response to <a class="ext-link" href="http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49269"><span class="icon">​</span>http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49269</a> by someone a lot more knowledgeable than I am suggests that this shouldn't be messed with. </p>
<p> b) In your example above, we're opening the stream with std::ios::binary. This seems to me to conflict with the use of codecvt facets. I know that they are implemented at the stream buffer level, so this "should" be OK, but it still seems fundamentally wrong to me.
Specifically, if reading one byte from the file yields two bytes after conversion, we would expect tellg to report 1 even though we just received two bytes. </p>
<p> All of the above suggests there is nothing for me to do here. </p>
<p> Robert Ramey </p>
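<p> For comparison, the question of whether tellg counts bytes or converted characters can be sidestepped by reading the file through a narrow std::ifstream in binary mode and assembling the 16-bit units by hand. The following is only a minimal sketch under the assumption that the input is UTF-16LE; read_u16le, position_after_bom, and the scratch file name are hypothetical names introduced for illustration and come neither from the ticket nor from Boost. It does not use codecvt_null and is not a fix for the reported behaviour. </p>

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <ios>
#include <istream>
#include <string>

// Hypothetical helper (not from the ticket or Boost): reads one 16-bit
// little-endian code unit from a byte-oriented stream.
bool read_u16le(std::istream& is, std::uint16_t& out) {
    unsigned char bytes[2] = {0, 0};
    if (!is.read(reinterpret_cast<char*>(bytes), 2)) {
        return false;  // short read or stream error
    }
    out = static_cast<std::uint16_t>(bytes[0] | (bytes[1] << 8));
    return true;
}

// Hypothetical demo: writes a four-byte scratch file (BOM as FF FE, then '['
// as 5B 00), reads both code units back, and returns the byte offset that
// tellg() reported right after the BOM was read.
std::streamoff position_after_bom(const std::string& path,
                                  std::uint16_t& bom, std::uint16_t& next) {
    {
        std::ofstream os(path.c_str(), std::ios::binary);
        assert(os && "could not create scratch file");
        const unsigned char data[] = {0xFF, 0xFE, 0x5B, 0x00};
        os.write(reinterpret_cast<const char*>(data), sizeof data);
    }  // leaving the scope flushes and closes the file

    std::ifstream is(path.c_str(), std::ios::binary);
    bool ok = read_u16le(is, bom);
    const std::streamoff pos = is.tellg();  // byte offset; no wide facet involved
    ok = ok && read_u16le(is, next);
    assert(ok && "scratch file was shorter than expected");
    return pos;
}
```

<p> Called as position_after_bom("tellg_sketch.bin", bom, next), the sketch reads 0xFEFF and then 0x005B, and the offset recorded after the BOM is a plain byte count, because a narrow binary stream performs no character conversion between the file and the caller. </p>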