2026-04-20 23:47 AEST

ID: 0000583
Project: mercury
Category: Bug
View Status: public
Last Update: 2026-04-14 12:50
Reporter: wangp
Assigned To:
Priority: normal
Severity: crash
Reproducibility: always
Status: new
Resolution: open
Product Version:
Target Version:
Fixed in Version:
Summary: 0000583: crash in LLDS .par grades
Description: Here are some results of trying to bootcheck the asm_fast.par.gc grade on various Linux distributions. All tests were performed in a container on the same Linux host kernel, on an x86-64 machine.

--------

Debian 12
  GNU ld (GNU Binutils for Debian) 2.40
  gcc version 12.2.0 (Debian 12.2.0-14+deb12u1)
    - mercury-srcdist-rotd-2026-03-19 - ok

Debian 13
  GNU ld (GNU Binutils for Debian) 2.44
  gcc version 14.2.0 (Debian 14.2.0-19)
    - mercury-srcdist-rotd-2026-03-19 - ok

--------

Ubuntu 20.04
  GNU ld (GNU Binutils for Ubuntu) 2.34
  gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.2)
    - mercury-srcdist-rotd-2026-03-19 - ok

Ubuntu 22.04
  GNU ld (GNU Binutils for Ubuntu) 2.38
  gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04.3)
    - mercury-srcdist-rotd-2026-03-19 - ok

Ubuntu 24.04
  GNU ld (GNU Binutils for Ubuntu) 2.42
  gcc version 11.5.0 (Ubuntu 11.5.0-1ubuntu1~24.04.1)
    - mercury-srcdist-rotd-2026-03-19 - building of stage 3 dependencies not successful
  gcc version 12.4.0 (Ubuntu 12.4.0-2ubuntu1~24.04.1)
    - mercury-srcdist-rotd-2026-03-19 - building of stage 3 dependencies not successful
  gcc version 13.3.0 (Ubuntu 13.3.0-6ubuntu2~24.04.1) - default
    - mercury-srcdist-rotd-2026-03-19 - building of stage 3 dependencies not successful
  gcc version 14.2.0 (Ubuntu 14.2.0-4ubuntu2~24.04.1)
    - mercury-srcdist-rotd-2026-03-19 - building of stage 3 dependencies not successful

--------

Alpine Linux 3.20
  GNU ld (GNU Binutils) 2.42
  gcc version 13.2.1 20240309 (Alpine 13.2.1_git20240309)
    - mercury-srcdist-rotd-2026-03-19 - ok

Alpine Linux 3.22
  GNU ld (GNU Binutils) 2.44
  gcc version 14.2.0 (Alpine 14.2.0)
    - mercury-srcdist-rotd-2026-03-19 - ok

Alpine Linux 3.23
  GNU ld (GNU Binutils) 2.45.1
  gcc version 15.2.0 (Alpine 15.2.0)
    - mercury-srcdist-rotd-2026-03-19 - ok

--------

AlmaLinux 9
  GNU ld version 2.35.2-67.el9_7.1
  gcc version 11.5.0 20240719 (Red Hat 11.5.0-11) (GCC)
    - mercury-srcdist-rotd-2026-03-19 - ok

AlmaLinux 10
  GNU ld version 2.41-58.el10_1.2.alma.1
  gcc version 14.3.1 20250617 (Red Hat 14.3.1-2) (GCC) - default
    - mercury-srcdist-rotd-2026-03-19 - building of stage 3 dependencies not successful
  gcc version 15.1.1 20250521 (Red Hat 15.1.1-2) (GCC)
    - mercury-srcdist-rotd-2026-03-19 - building of stage 3 dependencies not successful

--------

Fedora 43
  GNU ld version 2.45.1-4.fc43
  gcc version 14.3.1 20250808 (Red Hat 14.3.1-1) (GCC)
    - mercury-srcdist-rotd-2026-03-19 - ok
  gcc version 15.2.1 20260123 (Red Hat 15.2.1-7) (GCC) - default
    - mercury-srcdist-rotd-2026-03-19 - ok

--------

OpenSuSE Leap 15.6
  GNU ld (GNU Binutils; SUSE Linux Enterprise 15) 2.45.0.20251103-150100757
  gcc version 7.5.0 (SUSE Linux)
    - mercury-srcdist-rotd-2026-03-19 - ok

OpenSuSE Tumbleweed
  GNU ld (GNU Binutils; openSUSE Tumbleweed) 2.45.0.20251103-2
  gcc version 15.2.1 20260202 (SUSE Linux)
    - mercury-srcdist-rotd-2026-03-19 - building of stage 3 dependencies not successful

--------

Void Linux
  GNU ld (GNU Binutils) 2.44
  gcc version 14.2.1 20250405 (GCC)
    - mercury-srcdist-rotd-2026-03-19 - ok

--------

I'm not sure what conclusions we can draw. The problem appears to be particular to a distribution, rather than to certain gcc or binutils versions. (Bear in mind, though, that distributions will have applied their own patches to those packages.)
Tags: No tags attached.
Attached Files

Notes

~0001244

wangp (developer)

GC_DONT_GC=1 allows mmake depend to pass. The problem is likely premature collection of some object.

GC_NPROCS=1 GC_MARKERS=1 doesn't help.

Taking the working stage2 mercury_compile binary from Ubuntu 22.04, and running it on Ubuntu 24.04 does NOT work.
Taking the working stage2 mercury_compile binary from AlmaLinux 9, and running it on AlmaLinux 10 does NOT work.
Therefore, I suspect an issue in the interaction of Boehm GC + asm_fast.par.gc + glibc version (glibc is dynamically linked).

Table of libc versions:

    Distribution         Result  libc
    Alpine Linux         ok      musl
    Ubuntu 20.04         ok      glibc 2.31
    AlmaLinux 9          ok      glibc 2.34
    Ubuntu 22.04         ok      glibc 2.35
    Debian 12            ok      glibc 2.36
    OpenSuSE Leap 15.6   ok      glibc 2.38

    AlmaLinux 10         crash   glibc 2.39
    Ubuntu 24.04         crash   glibc 2.39

    Debian 13            ok      glibc 2.41

    Fedora 43            ok      glibc 2.42
    OpenSuSE Tumbleweed  crash   glibc 2.42
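
One way to read the table above: if the libc version alone decided the outcome, each version would map to a single result. The following is a minimal sketch in Python that checks this against the rows of the table. The function names are illustrative only and are not part of any Mercury tooling.

```python
# Rows copied from the table of libc versions: (distribution, result, libc).
RESULTS = [
    ("Alpine Linux", "ok", "musl"),
    ("Ubuntu 20.04", "ok", "glibc 2.31"),
    ("AlmaLinux 9", "ok", "glibc 2.34"),
    ("Ubuntu 22.04", "ok", "glibc 2.35"),
    ("Debian 12", "ok", "glibc 2.36"),
    ("OpenSuSE Leap 15.6", "ok", "glibc 2.38"),
    ("AlmaLinux 10", "crash", "glibc 2.39"),
    ("Ubuntu 24.04", "crash", "glibc 2.39"),
    ("Debian 13", "ok", "glibc 2.41"),
    ("Fedora 43", "ok", "glibc 2.42"),
    ("OpenSuSE Tumbleweed", "crash", "glibc 2.42"),
]


def outcomes_by_libc(results):
    """Map each libc string to the set of results observed with it."""
    table = {}
    for _distro, outcome, libc in results:
        table.setdefault(libc, set()).add(outcome)
    return table


def mixed_libcs(results):
    """Return the libc versions for which both 'ok' and 'crash' were seen."""
    by_libc = outcomes_by_libc(results)
    return sorted(libc for libc, seen in by_libc.items() if len(seen) > 1)


print(mixed_libcs(RESULTS))  # prints ['glibc 2.42']
```

glibc 2.42 appears with both results (Fedora 43 and OpenSuSE Tumbleweed), and glibc 2.41 passes even though 2.39 crashes, so the upstream version number by itself does not separate the two groups.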

~0001245

zs (developer)

I remember that I diagnosed the issue years ago. It was that the various conjuncts did not flush their output variables to their own stack slots, because those variables did not have their "own" stack slots. This was because the stack slot allocation pass was not updated to tell the graph colouring algorithm "allocate the variables that are all live at the same time at the end of a parallel conjunction to different stack slots". The parallel conjunction will work

- either if the output vars are allocated to distinct stack slots anyway by chance,

- or if, of the variables incorrectly sharing a slot, only one is needed after the parallel conjunction, and it happens to be stored to the slot last.

The first condition does happen very often in small benchmark programs, and the second also happens often (due to left-to-right flow of data), which is why this was not detected during initial development.
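
The slot-sharing failure described above can be illustrated with a toy greedy graph colouring. This is a minimal sketch in Python, not the Mercury compiler's stack slot allocation pass: `colour_slots`, `add_clique`, and the variables `X` and `Y` are hypothetical names, standing for the outputs of the two conjuncts of a parallel conjunction such as `p(X) & q(Y)` where both are needed afterwards.

```python
from itertools import combinations


def colour_slots(variables, interference):
    """Greedy colouring: give each variable the lowest-numbered stack slot
    not already used by a variable it interferes with."""
    slots = {}
    for var in variables:
        taken = {slots[other]
                 for other in interference.get(var, ())
                 if other in slots}
        slot = 0
        while slot in taken:
            slot += 1
        slots[var] = slot
    return slots


def add_clique(interference, live_together):
    """Record that every pair of variables in live_together must not
    share a slot (they are all live at the same point)."""
    for a, b in combinations(live_together, 2):
        interference.setdefault(a, set()).add(b)
        interference.setdefault(b, set()).add(a)


variables = ["X", "Y"]

# No constraint between the conjuncts' outputs: both land in slot 0,
# so whichever conjunct stores last overwrites the other's result.
without_fix = colour_slots(variables, {})
print(without_fix)  # prints {'X': 0, 'Y': 0}

# With the outputs marked as live together at the end of the parallel
# conjunction, they are forced into distinct slots.
constraints = {}
add_clique(constraints, ["X", "Y"])
with_fix = colour_slots(variables, constraints)
print(with_fix)  # prints {'X': 0, 'Y': 1}
```

In the unconstrained case the program still appears to work if only one of the two variables is used afterwards and it happens to be written last, which matches the second condition in the list above.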

There is no point in looking for any correlations between gc and crashes until this issue is fixed.

~0001246

wangp (developer)

There is no parallel conjunction involved.

~0001247

juliensf (administrator)

Is the problem restricted to glibc-based systems?

~0001249

zs (developer)

A bootcheck in a parallel grade does involve parallel conjunctions. You just don't ordinarily notice them, because in non-parallel (or non-LLDS) grades they get converted to plain conjunctions. See for example library/integer.m, and specifically pos_mul_karatsuba. I think there are a couple in the compiler as well, though my bootcheck of a compiler that I modified to abort on parallel_conj does not get past the library due to integer.m.

~0001250

wangp (developer)

I haven't seen the problem on Alpine Linux (which uses musl), or on Void Linux (which has both glibc and musl variants; both test ok).

~0001251

wangp (developer)

The crash is more easily reproducible by building a small program in the asm_fast.par.gc grade, as long as it allocates enough memory to trigger GC.
Building samples/e.m then running `./e 5000` is sufficient to trigger the crash.

In the following, I will consider Debian and Ubuntu only.
I managed to narrow down the crash to the CFLAGS used to build glibc on Ubuntu 24.04.

Ubuntu 22.04 (ok):

    CFLAGS = -pipe -O2 -g -fdebug-prefix-map=/<<PKGBUILDDIR>>=. -O3

Ubuntu 24.04 (crash):

    CFLAGS = -pipe -O2 -g -fdebug-prefix-map=/<<PKGBUILDDIR>>=. -O3 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer

The CFLAGS can be seen in the buildlogs for the glibc packages.

Ubuntu 22.04:
https://blueprints.launchpad.net/~ubuntu-security-proposed/+archive/ubuntu/ppa/+build/32209404/+files/buildlog_ubuntu-jammy-amd64.glibc_2.35-0ubuntu3.13_BUILDING.txt.gz
Ubuntu 24.04:
https://blueprints.launchpad.net/~ubuntu-security-proposed/+archive/ubuntu/ppa/+build/32209392/+files/buildlog_ubuntu-noble-amd64.glibc_2.39-0ubuntu8.7_BUILDING.txt.gz


I manually built glibc 2.41 on Ubuntu 24.04 and was able to confirm that `./e 5000` crashes if glibc is built with:

    CFLAGS = ... -fno-omit-frame-pointer

and does NOT crash when glibc is built without `-fno-omit-frame-pointer`.

(Even though Ubuntu 24.04 ships with glibc 2.39, I had trouble building that version for reasons unrelated to the CFLAGS.)


Extra information that may be of interest: the CFLAGS for the Debian/Ubuntu glibc package are (in part) derived by filtering the output of `dpkg-buildflags --get CFLAGS`.
That command prints the following output on each distribution.

  Debian 12:
    -g -O2 -ffile-prefix-map=/=. -fstack-protector-strong -Wformat -Werror=format-security

  Debian 13:
    -g -O2 -Werror=implicit-function-declaration -ffile-prefix-map=/=. -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection

  Ubuntu 20.04:
    -g -O2 -fdebug-prefix-map=/=. -fstack-protector-strong -Wformat -Werror=format-security

  Ubuntu 22.04:
    -g -O2 -ffile-prefix-map=/=. -flto=auto -ffat-lto-objects -flto=auto -ffat-lto-objects -fstack-protector-strong -Wformat -Werror=format-security

  Ubuntu 24.04:
    -g -O2 -fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -ffile-prefix-map=/=. -flto=auto -ffat-lto-objects -fstack-protector-strong -fstack-clash-protection -Wformat -Werror=format-security -fcf-protection
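
To see at a glance which default flags are specific to one distribution, the outputs listed above can be compared as sets. The following is a minimal sketch in Python: the flag strings are copied from the listing, while `BUILDFLAGS` and `flags_unique_to` are illustrative names introduced here.

```python
# Output of `dpkg-buildflags --get CFLAGS` per distribution, as listed above.
BUILDFLAGS = {
    "Debian 12": "-g -O2 -ffile-prefix-map=/=. -fstack-protector-strong "
                 "-Wformat -Werror=format-security",
    "Debian 13": "-g -O2 -Werror=implicit-function-declaration "
                 "-ffile-prefix-map=/=. -fstack-protector-strong "
                 "-fstack-clash-protection -Wformat -Werror=format-security "
                 "-fcf-protection",
    "Ubuntu 20.04": "-g -O2 -fdebug-prefix-map=/=. -fstack-protector-strong "
                    "-Wformat -Werror=format-security",
    "Ubuntu 22.04": "-g -O2 -ffile-prefix-map=/=. -flto=auto "
                    "-ffat-lto-objects -flto=auto -ffat-lto-objects "
                    "-fstack-protector-strong -Wformat "
                    "-Werror=format-security",
    "Ubuntu 24.04": "-g -O2 -fno-omit-frame-pointer "
                    "-mno-omit-leaf-frame-pointer -ffile-prefix-map=/=. "
                    "-flto=auto -ffat-lto-objects -fstack-protector-strong "
                    "-fstack-clash-protection -Wformat "
                    "-Werror=format-security -fcf-protection",
}


def flags_unique_to(target, buildflags):
    """Return the flags listed for `target` that no other distribution
    lists, in their original order."""
    others = set()
    for distro, flags in buildflags.items():
        if distro != target:
            others.update(flags.split())
    return [flag for flag in buildflags[target].split() if flag not in others]


print(flags_unique_to("Ubuntu 24.04", BUILDFLAGS))
# prints ['-fno-omit-frame-pointer', '-mno-omit-leaf-frame-pointer']
```

Among these five distributions, the two frame pointer flags are the only defaults that Ubuntu 24.04 does not share with at least one other release. This only compares the defaults shown here; the glibc package filters them further before building.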

~0001252

wangp (developer)

Actually, the problem is most likely something to do with shared libraries or PIC (again).

The problem should not be anything to do with glibc. glibc is just one of the few libraries that every basic Mercury program will link to, *and* is usually dynamically linked.

I've confirmed that forcing binaries to be statically linked:

  - allows `./samples/e 5000` to pass

  - allows bootcheck in asm_fast.par.gc to finish, albeit with unexpected test case failures.[1]

I enabled static linking for bootcheck by adding these lines to Mmake.params:

    EXTRA_MCFLAGS += --linkage static
    EXTRA_MLFLAGS += -static
    EXTRA_CFLAGS += -static
    EXTRA_LDFLAGS += -static

[1] The unexpected test case failures look like the following.
The correct output was produced in accessibility_t1.out, but there is an "Aborted" error message coming from somewhere. The test case program itself runs fine.

    { test -f accessibility_t1.inp && cat accessibility_t1.inp; } | ./accessibility_t1 > accessibility_t1.out 2>&1 || \
        { grep . accessibility_t1.out /dev/null; exit 1; }
    Aborted
    accessibility_t1.out:Hello.

~0001253

wangp (developer)

The "Aborted" failures were unrelated, and fixed by commit f3e541949.

Issue History
Date Modified     Username  Field                Change
2026-03-31 14:16  wangp     New Issue
2026-04-01 16:52  wangp     Severity             minor => crash
2026-04-01 16:52  wangp     Reproducibility      sometimes => always
2026-04-02 16:49  wangp     Note Added: 0001244
2026-04-02 17:03  zs        Note Added: 0001245
2026-04-02 17:15  wangp     Note Added: 0001246
2026-04-02 17:19  juliensf  Note Added: 0001247
2026-04-02 17:43  zs        Note Added: 0001249
2026-04-02 17:48  wangp     Note Added: 0001250
2026-04-07 15:25  wangp     Note Added: 0001251
2026-04-10 18:16  wangp     Note Added: 0001252
2026-04-14 12:50  wangp     Note Added: 0001253