Bug 2173 - stack frame size detection is broken
stack frame size detection is broken
Status: RESOLVED WONTFIX
Product: PCRE
Classification: Unclassified
Component: Code
8.41
x86 Linux
: medium bug
: ---
Assigned To: Philip Hazel
https://jira.mariadb.org/browse/MDEV-...
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2017-10-05 15:12 BST by Sergei Golubchik
Modified: 2017-10-09 10:50 BST (History)
2 users (show)

See Also:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Sergei Golubchik 2017-10-05 15:12:39 BST
The documented way to check the stack frame size is

     -pcre_exec(NULL, NULL, NULL, -999, -999, 0, NULL, 0)

This doesn't work for a while, at least since gcc 5.4.0, probably earlier. The code looks like that:
=====================================================
int
pcre_exec(const pcre *argument_re, const pcre_extra *extra_data,
  PCRE_SPTR subject, int length, int start_offset, int options, int *offsets,
  int offsetcount)
{
if (re == NULL && extra_data == NULL && subject == NULL && length == -999 && start_offset == -999)
  return match(NULL, NULL, NULL, 0, NULL, NULL, 0);
...
=====================================================
static int
match(PCRE_PUCHAR eptr, const pcre_uchar *ecode,
  PCRE_PUCHAR mstart, int offset_top, match_data *md, eptrblock *eptrb,
  unsigned int rdepth)
{
if (ecode == NULL)
  {
  if (rdepth == 0)
    return match((PCRE_PUCHAR)&rdepth, NULL, NULL, 0, NULL, NULL, 1);
  else
    {
    int len = (int)((char *)&rdepth - (char *)eptr);
    return (len > 0)? -len : len;
    }
  }
...
=====================================================
while match() itself is rather big and complex, the code path for ecode==NULL is simple.
Compiler clones match(), extracting that code path into a separate very small function and inlines it directly into pcre_exec(). The result is:

$ pcretest -m -C
PCRE version 8.41 2017-07-05
Compiled with
  8-bit support
  UTF-8 support
  Unicode properties support
  Just-in-time compiler support: x86 64bit (little endian + unaligned)
  Newline sequence is LF
  \R matches all Unicode newlines
  Internal link size = 2
  POSIX malloc threshold = 10
  Parentheses nest limit = 250
  Default match limit = 10000000
  Default recursion depth limit = 8192
  Match recursion uses stack: approximate frame size = 4 bytes

As a fix, I've declared match() with __attribute__((noinline,noclone)). This helped.
Comment 1 Philip Hazel 2017-10-05 17:57:33 BST
This was always somewhat dodgy code, and since it was released I have discovered that all kinds of compiler variations can alter the answer that you get. With hindsight, it should never have been released. PCRE1 (the 8.xx series) is very much in "maintenance only" mode. PCRE2 (the 10.xx series) has been out for nearly 3 years now, and its most recent release, 10.30, no longer uses the stack for remembering backtracking points. It keeps its own data in frames whose size is known (and available). The amount of heap used can be explicitly limited.

As no change in PCRE1 is likely in this area, I am closing this as WONTFIX.
Comment 2 Sergei Golubchik 2017-10-06 19:20:11 BST
Fair enough.

I've created https://jira.mariadb.org/browse/MDEV-14024 to migrate MariaDB to PCRE2.

As this migration can only happen in the development branch, old stable releases will have to use bundled patched PCRE1. If distributions will get unhappy with that, I'll refer them to this bug report.
Comment 3 Petr Pisar 2017-10-09 08:44:25 BST
Do you have patch for this bug? Otherwise I cannot see a reason why MariaDB should bundle PCRE1.
Comment 4 Sergei Golubchik 2017-10-09 10:50:43 BST
My patch was
=====================================================
diff --git a/pcre/pcre_exec.c b/pcre/pcre_exec.c
--- a/pcre/pcre_exec.c
+++ b/pcre/pcre_exec.c
@@ -509,6 +509,12 @@
                  (e.g. stopped by repeated call or recursion limit)
 */
 
+#ifdef __GNUC__
+static int
+match(REGISTER PCRE_PUCHAR eptr, REGISTER const pcre_uchar *ecode,
+  PCRE_PUCHAR mstart, int offset_top, match_data *md, eptrblock *eptrb,
+  unsigned int rdepth) __attribute__((noinline,noclone));
+#endif
 static int
 match(REGISTER PCRE_PUCHAR eptr, REGISTER const pcre_uchar *ecode,
   PCRE_PUCHAR mstart, int offset_top, match_data *md, eptrblock *eptrb,
=====================================================

I don't know if that's good enough for you, pcre doesn't seem to use __attribute__ much.

MariaDB source code bundles PCRE1 for years, but we generally compile with the system-wide PCRE1, if it exists and is usable. In the context of this bug I detect "usable" with this check:

=====================================================
      SET(CMAKE_REQUIRED_LIBRARIES "pcre")
      CHECK_C_SOURCE_RUNS("
        #include <pcre.h>
        int main() {
        return -pcre_exec(NULL, NULL, NULL, -999, -999, 0, NULL, 0) < 256;
        }"  PCRE_STACK_SIZE_OK)
      SET(CMAKE_REQUIRED_LIBRARIES)
=====================================================

Basically, if stack frame size detection works, cmake will use system PCRE1. But for my system PCRE1 (see above) stack frame size "is" 4.