#12818 closed Bugs (fixed)
regex: badly needs fuzzing
Reported by: | Owned by: | John Maddock | |
---|---|---|---|
Milestone: | Boost 1.64.0 | Component: | regex |
Version: | Boost 1.63.0 | Severity: | Problem |
Keywords: | Cc: | kcc@…, aizatsky@…, mmoroz@… |
Description
Hello,
I've applied libFuzzer (http://tutorial.libfuzzer.info) to regexp library and found 5 heap-buffer-overflows, stack overflow, assert failure, use of uninitialized data, SIGSEGV, infinite loop, undefined shift, invalid enum value and a bunch of memory leaks in just half an hour:
SUMMARY: AddressSanitizer: heap-buffer-overflow boost/regex/v4/perl_matcher.hpp:132:10 in char const* boost::re_detail_106300::re_skip_past_null<char>(char const*)
SUMMARY: AddressSanitizer: heap-buffer-overflow boost/regex/v4/perl_matcher.hpp:221:29 in gnu_cxx::normal_iterator<char const*, std::string> boost::re_detail_106300::re_is_set_member<gnu_cxx::normal_iterator<char const*, std::string>, char, boost::regex_traits<char, boost::cpp_regex_traits<char> >, unsigned int>(gnu_cxx::normal_iterator<char const*, std::string>, gnu_cxx::normal_iterator<char const*, std::string>, boost::re_detail_106300::re_set_long<unsigned int> const*, boost::re_detail_106300::regex_data<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > > const&, bool)
SUMMARY: AddressSanitizer: heap-buffer-overflow /sanitizer_common_interceptors.inc:278 in interceptor_strlen
SUMMARY: AddressSanitizer: heap-buffer-overflow boost/regex/v4/perl_matcher.hpp:166:19 in gnu_cxx::normal_iterator<char const*, std::string> boost::re_detail_106300::re_is_set_member<gnu_cxx::normal_iterator<char const*, std::string>, char, boost::regex_traits<char, boost::cpp_regex_traits<char> >, unsigned int>(gnu_cxx::normal_iterator<char const*, std::string>, gnu_cxx::normal_iterator<char const*, std::string>, boost::re_detail_106300::re_set_long<unsigned int> const*, boost::re_detail_106300::regex_data<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > > const&, bool)
a.out: boost/regex/v4/perl_matcher_common.hpp:606: bool boost::re_detail_106300::perl_matcher<gnu_cxx::normal_iterator<const char *, std::basic_string<char> >, std::allocator<boost::sub_match<__gnu_cxx::__normal_iterator<const char *, std::basic_string<char> > > >, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::match_backref() = __gnu_cxx::__normal_iterator<const char *, std::basic_string<char> >, Allocator = std::allocator<boost::sub_match<__gnu_cxx::__normal_iterator<const char *, std::basic_string<char> > > >, traits = boost::regex_traits<char, boost::cpp_regex_traits<char> >: Assertion `r.first != r.second' failed.
SUMMARY: MemorySanitizer: use-of-uninitialized-value boost/regex/v4/perl_matcher.hpp:166:13 in std::1::wrap_iter<char const*> boost::re_detail_106300::re_is_set_member<std::__1::__wrap_iter<char const*>, char, boost::regex_traits<char, boost::cpp_regex_traits<char> >, unsigned int>(std::1::wrap_iter<char const*>, std::1::wrap_iter<char const*>, boost::re_detail_106300::re_set_long<unsigned int> const*, boost::re_detail_106300::regex_data<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > > const&, bool)
SUMMARY: AddressSanitizer: heap-buffer-overflow ./boost/regex/v4/basic_regex_parser.hpp:2599:68 in boost::re_detail_106300::basic_regex_parser<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::parse_perl_extension()
boost/regex/v4/basic_regex_parser.hpp:2599:68: runtime error: load of value 56794092, which is not a valid value for type 'boost::re_detail_106300::syntax_element_type'
Direct leak of 4096 byte(s) in 1 object(s) allocated from:
SUMMARY: AddressSanitizer: stack-overflow ./boost/regex/v4/basic_regex_creator.hpp:1054 in boost::re_detail_106300::basic_regex_creator<char, boost::regex_traits<char, boost::cpp_regex_traits<char> > >::create_startmap(boost::re_detail_106300::re_syntax_base*, unsigned char*, unsigned int*, unsigned char)
SUMMARY: AddressSanitizer: SEGV
ALARM: working on the last Unit for 17 seconds
boost/regex/v4/basic_regex_parser.hpp:904:49: runtime error: shift exponent 325804978 is too large for 32-bit type 'unsigned int'
Full reports and triggering inputs for each bug are attached.
Test that I used is simply:
int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
try {
std::string str((char*)Data, Size); boost::regex e(str); boost::match_results<std::string::const_iterator> what; boost::regex_match(str, what, e, boost::match_default | boost::match_partial);
} catch (const std::exception&) {} return 0;
}
I would suggest to rerun the fuzzer after fixing these bugs as fuzzer was mostly choking on the existing bugs as they are easy to trigger.
Also it can make sense to set up continuous fuzzing using https://github.com/google/oss-fuzz which will automatically test latest code.
Attachments (1)
Change History (9)
by , 6 years ago
Attachment: | regexp.crashes.zip added |
---|
comment:1 by , 6 years ago
Summary: | regexp: badly needs fuzzing → regex: badly needs fuzzing |
---|
follow-up: 3 comment:2 by , 6 years ago
comment:3 by , 6 years ago
Replying to anonymous:
I strongly suggest running static analysis over regex before trying to find bugs by fuzzing. Coverity is free for open source and IME works well to identify the source of vulnerabilities, rather than providing N crashers that exercise some M<N bugs which need to be reverse-engineered with gdb.
Working as a Security Engineer I strongly disagree. Fuzzing is far more valuable than static analysis (if testcases are provided by the fuzzer) because static analysis often causes lots of false positives and its results are often hardly actionable for developers (we have tried both with Firefox). The OP already attached a set of testcases, the only thing that is better than that is patches for fixes. You should setup some fuzzing CI as it was suggested already and oss-fuzz even offers the resources for free.
comment:4 by , 6 years ago
Milestone: | To Be Determined → Boost 1.64.0 |
---|---|
Resolution: | → fixed |
Status: | new → closed |
Confirmed and fixed in develop: most of the issues you identified were duplicates of a couple of core issues, but further fuzzing with a dictionary revealed many more. Thanks for the heads up on this!
comment:5 by , 5 years ago
Since someone asked, and for future reference, most of the issues fixed here relate to parsing invalid regular expressions. The only runtime issues relate to recursive regular expressions: one infinite loop for some inputs, one memory leak, and one compiler/platform portability issue. The actual fixes are all labelled "de-fuzz" here: https://github.com/boostorg/regex/commits/develop
comment:6 by , 5 years ago
I've just added boost regex to oss-fuzz (using the fuzz target from this report). https://github.com/google/oss-fuzz/pull/851/files
If someone from boost wants to extend fuzzing to more parts of boost (or improve how regex is fuzzed) -- you are welcome!
comment:7 by , 5 years ago
There are at least 5 more bugs/crashes in the current trunk. Is someone interested in being automatically CC-ed to these and future bugs?
☆ 3460 boost: Integer-overflow in boost::re_detail_NUMBER::basic_regex_parser<char, boost::regex_traits<char, boos ☆ 3464 boost: Integer-overflow in boost::re_detail_NUMBER::perl_matcher<std::__1::__wrap_iter<char const*>, std::_ ☆ 3469 boost: ASSERT: jmp->type == syntax_element_jump ☆ 3471 boost: Stack-overflow in boost::re_detail_NUMBER::basic_regex_parser<char, boost::regex_traits<char, boos ☆ 3472 boost: Stack-overflow in boost::re_detail_NUMBER::perl_matcher<std::__1::__wrap_iter<char const*>, std::_
comment:8 by , 5 years ago
two more:
3478 boost: Stack-buffer-overflow in boost::re_detail_NUMBER::perl_matcher... 3479 boost: Null-dereference READ in boost::re_detail_NUMBER::basic_regex...
I strongly suggest running static analysis over regex before trying to find bugs by fuzzing. Coverity is free for open source and IME works well to identify the source of vulnerabilities, rather than providing N crashers that exercise some M<N bugs which need to be reverse-engineered with gdb.