Saturday, August 07, 2021

lang/ruby and powerpc* __builtin_setjmp

Hello Jeremy, ports list,

lang/ruby uses __builtin_setjmp and __builtin_longjmp, but these are
broken in clang-11 on powerpc and powerpc64. I want to avoid these
builtins and use _setjmp(3) on powerpc*, see diff below.

The symptoms of the broken __builtin_setjmp are easy to miss. Our
ruby packages for powerpc (macppc), linked by ld.bfd, seem to have
no symptoms. If I change the linker to ld.lld, then "make build"
sometimes crashes with SIGSEGV. In lang/ruby/2.7, the crash often
happened near the end of rdoc generation. I reproduced a similar
SIGSEGV by sending SIGALRM to the Ruby code "trap(:ALRM){exit}; sleep".

Yesterday (Fri), I found the problem in ruby-2.7.4/eval.c, where
rb_longjmp() uses EC_JUMP_TAG to call __builtin_longjmp. In the
powerpc ABI, each .c file has a different TOC pointer in register r30.
The broken builtin jumps from eval.c to another .c file without
changing r30, so some code runs with the wrong r30. SIGSEGV happens if
r30 + addend points outside of the global offset table. This depends
on the linker; ld.lld might use larger addends than ld.bfd.

In the powerpc64 ABI, every .c file in libruby should use the same
TOC pointer, so the previous problem shouldn't happen; but something
else must be wrong, because __builtin_setjmp causes a crash when I
pass a wrong option, like "ruby -e" with no -e code:

$ ruby27 -e
ruby27: [BUG] Segmentation fault at 0x0000000000000018
ruby 2.7.4p191 (2021-07-07 revision a21a3b7d23) [powerpc64-openbsd]
...

Avoiding __builtin_setjmp prevents the crash:

$ ruby27 -e
Traceback (most recent call last):
ruby27: no code specified for -e (RuntimeError)

I have long known about this powerpc64 crash, but didn't know the cause
until today (Sat). Jeremy, you asked me for a backtrace, but I never
provided one. (I don't know how to backtrace through shlibs with
devel/gdb on powerpc64.) I forgot about the crash, because I normally
pass correct options to ruby.

Upstream ruby disables __builtin_setjmp on powerpc64 Linux;
see WRKSRC/configure.ac "# __builtin longjmp in ppc64* Linux".

I have not checked with upstream llvm/clang. I have neither reported a
bug for __builtin_setjmp, nor checked whether a later version of clang
fixes the builtin. The code for these builtins is in or near
/usr/src/gnu/llvm/llvm/lib/Target/PowerPC/PPCISelLowering.cpp

If we disable the builtins on more archs, we might benefit from the
xor cookie (https://marc.info/?l=openbsd-cvs&m=146306879432422&w=2).
I didn't consider disabling the builtins on archs other than powerpc*.

Is this diff ok to commit?

--George

Index: Makefile.inc
===================================================================
RCS file: /cvs/ports/lang/ruby/Makefile.inc,v
retrieving revision 1.24
diff -u -p -r1.24 Makefile.inc
--- Makefile.inc 20 Mar 2020 16:44:24 -0000 1.24
+++ Makefile.inc 7 Aug 2021 15:55:24 -0000
@@ -38,6 +38,13 @@ CONFIGURE_ARGS += --enable-shared \
--without-bundled-libffi \
--disable-option-checking

+.if ${MACHINE_ARCH} == "powerpc" || ${MACHINE_ARCH} == "powerpc64"
+# clang-11's __builtin_setjmp is broken, has chance of SIGSEGV during
+# "make build" on powerpc with ld.lld, or when passing a wrong option
+# (like "ruby -e" with no -e code) on powerpc64.
+CONFIGURE_ARGS += --with-setjmp-type=_setjmp
+.endif
+
CONFIGURE_ENV += LIBruby${BINREV}_VERSION=${LIBruby${BINREV}_VERSION} \
PREFIX="${PREFIX}" \
CPPFLAGS="-DOPENSSL_NO_STATIC_ENGINE -I${LOCALBASE}/include" \
Index: 2.6/Makefile
===================================================================
RCS file: /cvs/ports/lang/ruby/2.6/Makefile,v
retrieving revision 1.15
diff -u -p -r1.15 Makefile
--- 2.6/Makefile 9 Jul 2021 17:05:46 -0000 1.15
+++ 2.6/Makefile 7 Aug 2021 15:55:24 -0000
@@ -1,6 +1,7 @@
# $OpenBSD: Makefile,v 1.15 2021/07/09 17:05:46 jeremy Exp $

VERSION = 2.6.8
+REVISION-main = 0
DISTNAME = ruby-${VERSION}
SHARED_LIBS = ruby26 0.0
NEXTVER = 2.7
Index: 2.7/Makefile
===================================================================
RCS file: /cvs/ports/lang/ruby/2.7/Makefile,v
retrieving revision 1.12
diff -u -p -r1.12 Makefile
--- 2.7/Makefile 9 Jul 2021 17:05:03 -0000 1.12
+++ 2.7/Makefile 7 Aug 2021 15:55:24 -0000
@@ -1,6 +1,7 @@
# $OpenBSD: Makefile,v 1.12 2021/07/09 17:05:03 jeremy Exp $

VERSION = 2.7.4
+REVISION-main = 0
DISTNAME = ruby-${VERSION}
SHARED_LIBS = ruby27 0.0
NEXTVER = 2.8
Index: 3.0/Makefile
===================================================================
RCS file: /cvs/ports/lang/ruby/3.0/Makefile,v
retrieving revision 1.4
diff -u -p -r1.4 Makefile
--- 3.0/Makefile 9 Jul 2021 17:04:32 -0000 1.4
+++ 3.0/Makefile 7 Aug 2021 15:55:24 -0000
@@ -1,6 +1,7 @@
# $OpenBSD: Makefile,v 1.4 2021/07/09 17:04:32 jeremy Exp $

VERSION = 3.0.2
+REVISION-main = 0
DISTNAME = ruby-${VERSION}
SHARED_LIBS = ruby30 0.0
NEXTVER = 3.1
