Mercury Bugs - mercury
View Issue Details
0000544mercuryBugpublic2022-02-05 17:082022-02-07 18:10
Assigned Tozs 
PrioritynormalSeverityminorReproducibilityhave not tried
PlatformOSOS Version
Product Version 
Target VersionFixed in Version 
Summary0000544: Test failure: valid/static causes compiler abort in grade at -O5 and above
Descriptionvalid/static is failing with a compiler abort in the grade at -O5 and above.

Uncaught Mercury exception:
Software Error: map.lookup: key not found
        Key Type: term.var(parse_tree.prog_data.prog_var_type)
        Key Value: var(6)
        Value Type: ml_backend.ml_gen_info.ml_ground_term

Mercury: 22.01-beta-2022-02-04
OS: CentOS 7
GCC: 4.8.5
TagsNo tags attached.
Attached Files? bug544.m (597) 2022-02-06 12:55

2022-02-06 12:55   
I have diagnosed the problem on the attached cut-down test case.

The predicate that causes the problem, s, includes a cc_nondet scope
that constructs a static term, Result. In non-trailing grades, this works
just fine.

The add_trail_ops transformation replaces that cc_nondet scope with
a disjunction, whose first disjunct is the cc_nondet scope with some
trail ops around it, and whose second disjunct does some trail ops and
then fails.

The problem comes in the fact that ml_gen_ordinary_model_det_semi_disj
generates code for each disjunct by invoking ml_gen_goal_as_branch_block
on it, and this predicate discards any updates to the const_var_map
done while processing a disjunct. So that we generate an entry for
Result in the const_var_map when we process its generator unification,
but then throw away the resulting const_var_map at the end of the branch.
When the code generator reaches the code that wants to *use* Result,
just after the scope or the disjunction it turns into, its lookup
in the const_var_map fails.

The guilty party in this case is add_trail_ops.m, because it leaves
the unification that uses Result as in input marked with construct_statically,
even though its addition of the disjunction invalidated the grounds
that justified that marker. We could fix this by having add_trails_ops
marking everything that it can construct_dynamically. The next pass is
mark_static, which could turn unifications back to construct_statically
if it is safe to do so.

Another way to fix the bug is to change ml_gen_goal_as_block, and the
predicates that call it, so that at the end of each branched control
structure we end up not with the const_var_map that we entered with
(the current situation), but with the "consensus" of the const_var_maps
at the ends of the branches, *provided* the branch end is reachable.
In this case, and in all other cases where add_trail_ops introduces
disjunctions like this, there will be one reachable branch end and
one unreachable branch end, so we will end up with the const_var_map
of the first branch.

Note that both possible fixes are too drastic to add to the release
without at least a week or two of testing.

2022-02-06 13:01   
To clarify: by "consensus", I mean the intersection
of the const_var_maps at the ends of the reachable branches,
when each map is viewed as a set of key/value pairs.
We already have similar operations on other structures
in our code generators.
2022-02-06 14:55   
Having the HLDS coming out of add_trail_op pass be inconsistent is a possible source of future bugs (in addition to causing this one), for that reason I would favour the first fix. So far as merging a fix on to the release branch is concerned, the first fix has less of an impact on everything else (i.e. the non-trailing MLDS grades).

For the release, I've set up one of Opturion's larger servers to systematically bootcheck all the usual grades using -O0 to -O6, with and without --intermod-opt. We can do the same hlc grades on the master branch once a fix is available -- and for there release branch, if/when the fix is merged onto it. [Fortunately, bootchecks are reasonably fast with -j16 ;-)]
2022-02-06 15:13   
With the second approach, the HLDS coming out of add_trail_ops
would still be consistent, just with respect to a changed definition
of "under what circumstances is a construct unification allowed to assume
that one of the rhs variable is constant". The updated definition would
of course have to be documented somewhere, but

- the only thing that pays attention to the construct_how annotations
is the MLDS code generator, and
- we would update that in accordance with the updated definition.

About the first approach: probably the simplest way to implement it would be
to get add_trail_ops to add a marker to the disjunctions it creates, which
the following mark_static pass would look for. If it found it, then it would
replace every construct_how annotation reachable from the end of that disjunction
on forward execution with construct_dynamically (the default), and rebuild it
from its own knowledge. That way, you would avoid having to teach add trail ops
about static terms.

BTW, I would be comfortable doing the fix using the second approach,
but not the first.
2022-02-06 15:28   
Ok, not having to teach add_trail_ops about static terms is a convincing reason for the second approach.
2022-02-07 18:10   
Fix committed 2022 feb 7.

Issue History
2022-02-05 17:08juliensfNew Issue
2022-02-06 12:55zsFile Added: bug544.m
2022-02-06 12:55zsNote Added: 0001165
2022-02-06 13:01zsNote Added: 0001166
2022-02-06 14:55juliensfNote Added: 0001167
2022-02-06 15:13zsNote Added: 0001168
2022-02-06 15:28juliensfNote Added: 0001169
2022-02-07 18:10zsAssigned To => zs
2022-02-07 18:10zsStatusnew => resolved
2022-02-07 18:10zsResolutionopen => fixed
2022-02-07 18:10zsNote Added: 0001171