Opened 11 years ago
Closed 11 years ago
#5644 closed Bugs (wontfix)
tellg returns different results for GNU libstdc++ and other standard libraries for codecvt_null
| Reported by: | | Owned by: | Robert Ramey |
|---|---|---|---|
| Milestone: | To Be Determined | Component: | serialization |
| Version: | Boost 1.46.1 | Severity: | Problem |
| Keywords: | | Cc: | |
Description
I believe the tellg call below should return stream position 2 after reading a single wide character, and it does so for non-GNU standard libraries. With GNU libstdc++ (MinGW 4.5.2), however, it returns 1, even though the wifstream::read call does read two bytes. Subsequent read operations are also incorrect: you get 5bff instead of 005b.
You may also want to take http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49269 into consideration.
--
Michael Kochetkov
#include <fstream>
#include <iostream>
#include <stdexcept>
#include <boost/archive/codecvt_null.hpp>
int
main() {
	try {
		std::wifstream is;
		is.imbue(std::locale(std::locale::classic(), new boost::archive::codecvt_null<wchar_t>()));
		is.exceptions(std::ios::badbit);
		// Samples.txt shall look in hex like this:
		// FEFF 005B 0030 0020 ¦ 0020 0020 0031 005D | ?[0   1]
		is.open("samples.txt",std::ios::in | std::ios::binary);
		unsigned int bom = 0;
		is.read(reinterpret_cast<wchar_t*>(&bom),1);
		const unsigned short bomLE = 0xFEFF;
		if (bom != bomLE) {
			throw std::runtime_error("Invalid BOM. Only LE is supported");
		}
		std::cout << "Current position: " <<  is.tellg() << std::endl;
	}
	catch(const std::exception& e) {
		std::cout << e.what() << std::endl;
	}
}
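The attachment is not reproduced inline, so the following is a minimal sketch of how a file matching the hex dump in the comment above could be generated. It assumes the dump shows 16-bit words stored low byte first (UTF-16LE), which is what the 0xFEFF comparison implies on a little-endian machine. The helper name write_sample is hypothetical and not part of the original report.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: writes the 16-byte sample (BOM followed by "[0   1]").
// Assumption: each 16-bit unit is stored low byte first (UTF-16LE).
bool write_sample(const std::string& path) {
    static const unsigned short units[] = {
        0xFEFF, 0x005B, 0x0030, 0x0020, 0x0020, 0x0020, 0x0031, 0x005D};
    std::ofstream out(path.c_str(), std::ios::out | std::ios::binary);
    for (std::size_t i = 0; i < sizeof(units) / sizeof(units[0]); ++i) {
        const char bytes[2] = {
            static_cast<char>(units[i] & 0xFF),          // low byte
            static_cast<char>((units[i] >> 8) & 0xFF)};  // high byte
        out.write(bytes, 2);
    }
    out.close();
    return !out.fail();  // true if the file was opened and written
}
```

Calling write_sample("samples.txt") before running the program above should give it the input it expects.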
Attachments (1)
Change History (2)
by , 11 years ago
| Attachment: | samples.txt added | 
|---|
comment:1 by , 11 years ago
| Resolution: | → wontfix | 
|---|---|
| Status: | new → closed | 
I've looked at the above and also at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49269 .
I have to confess I'm not all that knowledgeable about codecvt. I didn't write the original code; I just included it. Still, I'm happy to look at your issue. Having done so, I've got a couple of observations.
a) the response to http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49269 by someone a lot more knowledgeable than I suggests that this shouldn't be messed with.
b) In your example above, we're opening the stream with std::ios::binary. This seems to me to conflict with the use of codecvt facets. I know that they are implemented at the stream buffer level, so this "should" be OK, but it still seems fundamentally wrong to me. Specifically, if reading one byte from the file results in two bytes after conversion, we expect tellg to respond with 1 even though we just got two bytes.
All of the above suggests there is nothing for me to do here.
Robert Ramey

The sample file for the code example.