Boost C++ Libraries: Ticket #8544: Calling managed DLL from within boost::context may cause a crash https://svn.boost.org/trac10/ticket/8544 <p> Only the Windows platform is affected. </p> <p> If the code running in the context (coroutine) invokes anything that involves crossing the clr.dll (mscorwks.dll) boundary, a crash occurs with about 50% probability. </p> <p> My investigation showed that the call stack at the point of failure is consistent with clr.dll!Thread::<a class="missing wiki">InitThread</a> throwing an <a class="missing wiki">OutOfMemory</a> exception. With some deep debugging I narrowed the problem down to the <a class="missing wiki">CommitThreadStack</a> function inside clr.dll. That function reads a dword located at FS:[0xE0C] (this is called the "deallocation stack" on the <a class="ext-link" href="http://en.wikipedia.org/wiki/Win32_Thread_Information_Block"><span class="icon">​</span>TIB wiki page</a>) and compares it with the current top of the stack (FS:[0x4]). It appears that the exception is thrown if the FS:[0xE0C] value is greater than FS:[0x4] (or, perhaps, FS:[0x8]). That variable is not very well documented, but I believe the pair FS:[0xE0C] - FS:[0x4] defines the maximum stack size. On Windows 7, the difference between these is always 0x100000, which gives a stack size of 1 MB. Interestingly, that is always the case, even if the fiber or thread was created with a smaller stack size. </p> <p> jump_context never touches that variable. As a result, the value in FS:[0xE0C] is defined by the calling thread, and therefore it contains an arbitrary value. If it is greater than the current top of the stack, the problem occurs. </p> <p> clr.dll!CommitThreadStack also appears to access FS:[0xF78], but its purpose, and whether the value stored in it affects the behavior, is unknown. 
</p> <p> My current workaround, writing the current bottom of the stack (FS:[0x8]) to FS:[0xE0C] before calling the managed DLL, appears to work: </p> <p> (MS VS specific) </p> <div class="wiki-code"><div class="code"><pre><span class="n">DWORD</span> <span class="n">store</span> <span class="o">=</span> <span class="n">__readfsdword</span><span class="p">(</span><span class="mh">0xE0C</span><span class="p">);</span>
<span class="n">__writefsdword</span><span class="p">(</span><span class="mh">0xE0C</span><span class="p">,</span> <span class="n">__readfsdword</span><span class="p">(</span><span class="mh">0x8</span><span class="p">));</span>
<span class="n">call_managed_dll</span><span class="p">();</span>
<span class="n">__writefsdword</span><span class="p">(</span><span class="mh">0xE0C</span><span class="p">,</span> <span class="n">store</span><span class="p">);</span>
</pre></div></div><p> This bug is very obscure, and so far I have only managed to observe it on Windows 7 and Windows Server 2008. </p> <p> Suggested fix: Store and restore FS:[0xE0C] in jump_context. </p> en-us Boost C++ Libraries /htdocs/site/boost.png https://svn.boost.org/trac10/ticket/8544 Trac 1.4.3 olli Sat, 04 May 2013 05:12:45 GMT <link>https://svn.boost.org/trac10/ticket/8544#comment:1 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/8544#comment:1</guid> <description> <p> I've committed a fix for Win32 to boost-trunk - could you verify the fix, please? What about 64-bit Windows? Does it check the 'deallocation stack' TIB member too? (If so, I assume it is located at a different offset.) </p> </description> <category>Ticket</category> </item> <item> <author>vitaly.blinov@…</author> <pubDate>Sat, 04 May 2013 14:32:35 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/8544#comment:2 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/8544#comment:2</guid> <description> <p> Thanks Oliver, </p> <p> I'll test it early next week. 
I haven't had a chance to try it with 64-bit libs, but I suspect the behaviour might be similar. I'll do my best to put together a small test project so we'll know for sure. </p> </description> <category>Ticket</category> </item> <item> <author>vitaly.blinov@…</author> <pubDate>Mon, 06 May 2013 14:07:01 GMT</pubDate> <title>attachment set https://svn.boost.org/trac10/ticket/8544 https://svn.boost.org/trac10/ticket/8544 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">CoroCLRTest.zip</span> </li> </ul> <p> VS 2012 solution demonstrating the problem </p> Ticket anonymous Mon, 06 May 2013 14:10:47 GMT <link>https://svn.boost.org/trac10/ticket/8544#comment:3 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/8544#comment:3</guid> <description> <p> Okay, I managed to put together a minimal solution that reproduces the problem. The attachment contains a VS 2012 solution with a CLR DLL that exports a native method, and a native EXE that calls that exported method. I believe such calls go through something MS calls "double thunking". This is a rather special use case, but it is not unusual at all. </p> <p> Will try to reproduce it on 64-bit Windows first. </p> </description> <category>Ticket</category> </item> <item> <author>vitaly.blinov@…</author> <pubDate>Mon, 06 May 2013 14:37:06 GMT</pubDate> <title/> <link>https://svn.boost.org/trac10/ticket/8544#comment:4 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/8544#comment:4</guid> <description> <p> The fix from boost-trunk appears to be working; I verified it with the attached test project (which has a bug: it falls into an infinite loop with the fix :) ). I will now try to adapt the tests to 64-bit Windows. There is no information about the NT_TIB structure on 64-bit Windows on the internet, though. 
</p> </description> <category>Ticket</category> </item> <item> <author>vitaly.blinov@…</author> <pubDate>Mon, 06 May 2013 15:19:06 GMT</pubDate> <title>attachment set https://svn.boost.org/trac10/ticket/8544 https://svn.boost.org/trac10/ticket/8544 <ul> <li><strong>attachment</strong> → <span class="trac-field-new">CoroCLRTest.2.zip</span> </li> </ul> <p> Updated solution; includes 64-bit libraries and the workaround </p> Ticket vitaly.blinov@… Mon, 06 May 2013 15:21:18 GMT <link>https://svn.boost.org/trac10/ticket/8544#comment:5 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/8544#comment:5</guid> <description> <p> The 64-bit libraries appear to have the same vulnerability. I attached an updated solution with a 64-bit configuration. In the depths of the Internet I found that on 64-bit Windows the "deallocation stack" should be located at GS:[0x1478]. This is unconfirmed, but the workaround based on this assumption works. </p> </description> <category>Ticket</category> </item> <item> <dc:creator>olli</dc:creator> <pubDate>Wed, 08 May 2013 06:20:43 GMT</pubDate> <title>status changed; resolution set https://svn.boost.org/trac10/ticket/8544#comment:6 https://svn.boost.org/trac10/ticket/8544#comment:6 <ul> <li><strong>status</strong> <span class="trac-field-old">new</span> → <span class="trac-field-new">closed</span> </li> <li><strong>resolution</strong> → <span class="trac-field-new">fixed</span> </li> </ul> <p> The deallocation stack for 64-bit Windows will be stored/restored too - please verify. Thank you! </p> Ticket vitaly.blinov@… Wed, 08 May 2013 11:53:53 GMT <link>https://svn.boost.org/trac10/ticket/8544#comment:7 </link> <guid isPermaLink="false">https://svn.boost.org/trac10/ticket/8544#comment:7</guid> <description> <p> Replying to <a class="ticket" href="https://svn.boost.org/trac10/ticket/8544#comment:6" title="Comment 6">olli</a>: </p> <blockquote class="citation"> <p> please verify. </p> </blockquote> <p> Compiled the context library from the trunk. 
Ran tests on both 32- and 64-bit platforms; CLR DLL calls were successful in all configurations. Verified. </p> <blockquote class="citation"> <p> thank you! </p> </blockquote> <p> No problem. Thank you for this library, it really rocks! </p> </description> <category>Ticket</category> </item> </channel> </rss>