| View Issue Details | ||||||||||||
| ID | Project | Category | View Status | Date Submitted | Last Update | ||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0000583 | mercury | Bug | public | 2026-03-31 14:16 | 2026-04-02 17:48 | ||||||||
| Reporter | wangp | ||||||||||||
| Assigned To | |||||||||||||
| Priority | normal | Severity | crash | Reproducibility | always | ||||||||
| Status | new | Resolution | open | ||||||||||
| Product Version | |||||||||||||
| Target Version | Fixed in Version | ||||||||||||
| Summary | 0000583: crash in LLDS .par grades | ||||||||||||
| Description | Here are some results of trying to bootcheck the asm_fast.par.gc grade on various Linux distributions. All tests were performed in a container on the same Linux host kernel, on an x86-64 machine.

--------

Debian 12
GNU ld (GNU Binutils for Debian) 2.40
gcc version 12.2.0 (Debian 12.2.0-14+deb12u1) - mercury-srcdist-rotd-2026-03-19 - ok

Debian 13
GNU ld (GNU Binutils for Debian) 2.44
gcc version 14.2.0 (Debian 14.2.0-19) - mercury-srcdist-rotd-2026-03-19 - ok

--------

Ubuntu 20.04
GNU ld (GNU Binutils for Ubuntu) 2.34
gcc version 9.4.0 (Ubuntu 9.4.0-1ubuntu1~20.04.2) - mercury-srcdist-rotd-2026-03-19 - ok

Ubuntu 22.04
GNU ld (GNU Binutils for Ubuntu) 2.38
gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04.3) - mercury-srcdist-rotd-2026-03-19 - ok

Ubuntu 24.04
GNU ld (GNU Binutils for Ubuntu) 2.42
gcc version 11.5.0 (Ubuntu 11.5.0-1ubuntu1~24.04.1) - mercury-srcdist-rotd-2026-03-19 - building of stage 3 dependencies not successful
gcc version 12.4.0 (Ubuntu 12.4.0-2ubuntu1~24.04.1) - mercury-srcdist-rotd-2026-03-19 - building of stage 3 dependencies not successful
gcc version 13.3.0 (Ubuntu 13.3.0-6ubuntu2~24.04.1) - default - mercury-srcdist-rotd-2026-03-19 - building of stage 3 dependencies not successful
gcc version 14.2.0 (Ubuntu 14.2.0-4ubuntu2~24.04.1) - mercury-srcdist-rotd-2026-03-19 - building of stage 3 dependencies not successful

--------

Alpine Linux 3.20
GNU ld (GNU Binutils) 2.42
gcc version 13.2.1 20240309 (Alpine 13.2.1_git20240309) - mercury-srcdist-rotd-2026-03-19 - ok

Alpine Linux 3.22
GNU ld (GNU Binutils) 2.44
gcc version 14.2.0 (Alpine 14.2.0) - mercury-srcdist-rotd-2026-03-19 - ok

Alpine Linux 3.23
GNU ld (GNU Binutils) 2.45.1
gcc version 15.2.0 (Alpine 15.2.0) - mercury-srcdist-rotd-2026-03-19 - ok

--------

AlmaLinux 9
GNU ld version 2.35.2-67.el9_7.1
gcc version 11.5.0 20240719 (Red Hat 11.5.0-11) (GCC) - mercury-srcdist-rotd-2026-03-19 - ok

AlmaLinux 10
GNU ld version 2.41-58.el10_1.2.alma.1
gcc version 14.3.1 20250617 (Red Hat 14.3.1-2) (GCC) - default - mercury-srcdist-rotd-2026-03-19 - building of stage 3 dependencies not successful
gcc version 15.1.1 20250521 (Red Hat 15.1.1-2) (GCC) - mercury-srcdist-rotd-2026-03-19 - building of stage 3 dependencies not successful

--------

Fedora 43
GNU ld version 2.45.1-4.fc43
gcc version 14.3.1 20250808 (Red Hat 14.3.1-1) (GCC) - mercury-srcdist-rotd-2026-03-19 - ok
gcc version 15.2.1 20260123 (Red Hat 15.2.1-7) (GCC) - default - mercury-srcdist-rotd-2026-03-19 - ok

--------

OpenSuSE Leap 15.6
GNU ld (GNU Binutils; SUSE Linux Enterprise 15) 2.45.0.20251103-150100757
gcc version 7.5.0 (SUSE Linux) - mercury-srcdist-rotd-2026-03-19 - ok

OpenSuSE Tumbleweed
GNU ld (GNU Binutils; openSUSE Tumbleweed) 2.45.0.20251103-2
gcc version 15.2.1 20260202 (SUSE Linux) - mercury-srcdist-rotd-2026-03-19 - building of stage 3 dependencies not successful

--------

Void Linux
GNU ld (GNU Binutils) 2.44
gcc version 14.2.1 20250405 (GCC) - mercury-srcdist-rotd-2026-03-19 - ok

--------

I'm not sure what conclusions we can draw. The problem appears to be something particular to a distribution, rather than an issue with certain gcc or binutils versions. (Keep in mind, though, that distributions will have applied their own patches to those software packages.) | ||||||||||||
| Tags | No tags attached. | ||||||||||||
| Attached Files |
| ||||||||||||
Notes |
|
|
wangp (developer) 2026-04-02 16:49 |
GC_DONT_GC=1 allows mmake depend to pass. The problem is likely premature collection of some object.

GC_NPROCS=1 GC_MARKERS=1 doesn't help.

Taking the working stage2 mercury_compile binary from Ubuntu 22.04, and running it on Ubuntu 24.04 does NOT work.
Taking the working stage2 mercury_compile binary from AlmaLinux 9, and running it on AlmaLinux 10 does NOT work.

Therefore, I suspect an issue in the interaction of Boehm GC + asm_fast.par.gc + glibc version (glibc is dynamically linked).

Table of libc versions:

| Distribution | Result | libc |
|---|---|---|
| Alpine Linux | ok | musl |
| Ubuntu 20.04 | ok | glibc 2.31 |
| AlmaLinux 9 | ok | glibc 2.34 |
| Ubuntu 22.04 | ok | glibc 2.35 |
| Debian 12 | ok | glibc 2.36 |
| OpenSuSE Leap 15.6 | ok | glibc 2.38 |
| AlmaLinux 10 | crash | glibc 2.39 |
| Ubuntu 24.04 | crash | glibc 2.39 |
| Debian 13 | ok | glibc 2.41 |
| Fedora 43 | ok | glibc 2.42 |
| OpenSuSE Tumbleweed | crash | glibc 2.42 |
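
One way to read the libc table in this note is to group the outcomes by libc version and look for versions where distributions disagree. The sketch below does only that. The rows are copied from the note; the function names `outcomes_by_libc` and `mixed_libcs` are made up for illustration and are not part of any Mercury tooling.

```python
from collections import defaultdict

# Rows copied from the note: (distribution, result, libc).
RESULTS = [
    ("Alpine Linux", "ok", "musl"),
    ("Ubuntu 20.04", "ok", "glibc 2.31"),
    ("AlmaLinux 9", "ok", "glibc 2.34"),
    ("Ubuntu 22.04", "ok", "glibc 2.35"),
    ("Debian 12", "ok", "glibc 2.36"),
    ("OpenSuSE Leap 15.6", "ok", "glibc 2.38"),
    ("AlmaLinux 10", "crash", "glibc 2.39"),
    ("Ubuntu 24.04", "crash", "glibc 2.39"),
    ("Debian 13", "ok", "glibc 2.41"),
    ("Fedora 43", "ok", "glibc 2.42"),
    ("OpenSuSE Tumbleweed", "crash", "glibc 2.42"),
]


def outcomes_by_libc(rows):
    """Map each libc version to the set of results observed with it."""
    grouped = defaultdict(set)
    for _distro, result, libc in rows:
        grouped[libc].add(result)
    return dict(grouped)


def mixed_libcs(rows):
    """Return the libc versions that have both passing and crashing reports."""
    grouped = outcomes_by_libc(rows)
    return sorted(libc for libc, results in grouped.items() if len(results) > 1)


if __name__ == "__main__":
    for libc, results in outcomes_by_libc(RESULTS).items():
        print(f"{libc:12} {', '.join(sorted(results))}")
    print("libc versions with mixed results:", mixed_libcs(RESULTS))
```

On this data the only version with mixed results is glibc 2.42 (Fedora 43 passes, OpenSuSE Tumbleweed crashes), while both glibc 2.39 systems crash. The grouping says nothing about distribution-specific patches to glibc, so it can only narrow the search, not settle it.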
|
zs (developer) 2026-04-02 17:03 |
I remember that I diagnosed the issue years ago. It was that the various disjuncts did not flush their output variables to their own stack slots, because those variables did not have their "own" stack slots. This was because the stack slot allocation pass was not updated to tell the graph colouring algorithm "allocate the variables that are all live at the same time at the end of a parallel conjunction to different stack slots".

The parallel conjunction will work
- either if the output vars are allocated to distinct stack slots anyway by chance,
- or if, of the variables incorrectly sharing a slot, only one is needed after the parallel conjunction, and it happens to be stored to the slot last.

The first condition does happen very often in small benchmark programs, and the second also happens often (due to left-to-right flow of data), which is why this was not detected during initial development.

There is no point in looking for any correlations between gc and crashes until this issue is fixed. |
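
The slot-sharing problem described in this note can be illustrated with a toy graph colouring. The sketch below is not the Mercury compiler's stack slot allocator: it is a plain greedy colouring, and the names `allocate_slots`, `add_mutual_interference`, `X` and `Y` are hypothetical. It assumes `X` and `Y` are outputs of two different branches of a parallel conjunction, and that the allocator only separates variables joined by an interference edge.

```python
from itertools import combinations


def allocate_slots(variables, interference):
    """Greedy colouring: each variable gets the lowest slot no neighbour uses."""
    slots = {}
    for var in variables:
        taken = {slots[n] for n in interference.get(var, set()) if n in slots}
        slot = 0
        while slot in taken:
            slot += 1
        slots[var] = slot
    return slots


def add_mutual_interference(interference, group):
    """Return a copy of the graph in which all variables in `group` interfere."""
    result = {var: set(neighbours) for var, neighbours in interference.items()}
    for a, b in combinations(group, 2):
        result.setdefault(a, set()).add(b)
        result.setdefault(b, set()).add(a)
    return result


# Hypothetical outputs of two parallel branches, both live after the conjunction.
variables = ["X", "Y"]

# Without the extra constraint the allocator sees no conflict between X and Y.
no_edges = {}
without_fix = allocate_slots(variables, no_edges)

# With the constraint, everything live at the end of the conjunction interferes.
with_fix = allocate_slots(variables, add_mutual_interference(no_edges, variables))

if __name__ == "__main__":
    print("without constraint:", without_fix)
    print("with constraint:   ", with_fix)
```

Without the edge, both variables land in slot 0, so whichever branch stores last overwrites the other's result. With the edge, `X` gets slot 0 and `Y` gets slot 1. A real allocator may also separate such variables by chance, which matches the first of the two conditions listed in the note.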
|
wangp (developer) 2026-04-02 17:15 |
There is no parallel conjunction involved. |
|
juliensf (administrator) 2026-04-02 17:19 |
Is the problem restricted to glibc-based systems? |
|
zs (developer) 2026-04-02 17:43 |
A bootcheck in a parallel grade does involve parallel conjunctions. You just don't ordinarily notice them, because in non-parallel (or non-LLDS) grades they get converted to plain conjunctions. See for example library/integer.m, and specifically pos_mul_karatsuba. I think there are a couple in the compiler as well, though my bootcheck of a compiler that I modified to abort on parallel_conj does not get past the library due to integer.m. |
|
wangp (developer) 2026-04-02 17:48 |
I haven't seen the problem on Alpine Linux (which uses musl), or on Void Linux (which has both glibc and musl variants; both test ok). |
Issue History |
|||
| Date Modified | Username | Field | Change |
|---|---|---|---|
| 2026-03-31 14:16 | wangp | New Issue | |
| 2026-04-01 16:52 | wangp | Severity | minor => crash |
| 2026-04-01 16:52 | wangp | Reproducibility | sometimes => always |
| 2026-04-02 16:49 | wangp | Note Added: 0001244 | |
| 2026-04-02 17:03 | zs | Note Added: 0001245 | |
| 2026-04-02 17:15 | wangp | Note Added: 0001246 | |
| 2026-04-02 17:19 | juliensf | Note Added: 0001247 | |
| 2026-04-02 17:43 | zs | Note Added: 0001249 | |
| 2026-04-02 17:48 | wangp | Note Added: 0001250 | |


