User:JoeyArmstrong/makefiles/gotcha

From MozillaWiki
Jump to: navigation, search

blog post

Makefiles: potential problems

When writing makefiles there are a few logic bombs to try and avoid as they can mask failure conditions.  When command exit status is ignored or overlooked the site can become a source for random failures in downstream testing.

A few problem sources to mention are:
  1) Not returning, propagating, testing or ignoring shell exit status
  2) Using line continuations to write multi-line commands within a makefile.
  3) Prefixing commands with a hyphen '-'

A simple makefile (gotcha.mk) is attached below to illustrate problem space for items #1 and #2 using the true & false shell commands.  True will succeed with a shell exit status of zero ($?).  False will fail with a non-zero exit status:
% gmake -f gotcha.mk show
makefile target is: show
     /bin/true: $?=0
    /bin/false: $?=1

% gmake -f gotcha.mk pass
makefile target is: pass
/bin/true

% gmake -f gotcha.mk fail
makefile target is: fail
/bin/false
gmake: *** [fail] Error 1

Line continuation problems

Why line continuation can be a problem:
=======================================
Line continuation is an efficiency that is often used to avoid the overhead
of having to spawn a new command shell for each command mentioned in a
target rule.  Most of the time line continuation will function as expected
but when exit status is lost or over-written, success is returned and
make will return a false positive.

An easy way to create this condition is by wrappering the logic in an if
block.  Commands can merrily fail in bulk within the block but final exit
status will be set by the shell if/test condition -- which always succeeds.

This can be illustrated by the makefile target 'bad'.  Run the true command,
then false and if all is well the target should fail on the false call.
Guess what, we have a logic hole for bugs to sneak in through:

% gmake -f gotcha.mk bad
makefile target is: bad
if [ ! -e '/non/existent/file' ]; then \
            /bin/true; \
            /bin/false; \
            echo "ASSERT: target bad: should not reach this point"; \
        fi
ASSERT: target bad: should not reach this point



There are a few workarounds for this problem space.  One of the easiest is
use of the shell '-e' set/command line arg.  When enabled a shell will exit
immediately with non-zero status whenever a command fails.

Uncomment this line in the makefile and the command will fail as expected:
   SHELL := $(SHELL) -e


% gmake -f gotcha.mk bad
makefile target is: bad
   [snip]
gmake: *** [bad] Error 1

The -e option or a morale equivalent is supported by bourne, bash, dash,
ksh or whatever other shells people may prefer to use.

suggestions

A few suggestions:
  o At a minimum for readability consider using GNU Make's .ONESHELL:
    directive rather than large line continuation blocks:

    http://www.gnu.org/software/make/manual/html_node/One-Shell.html

  o Make is very good at control flow and processing but not so much as
    a scripting language.  Blocks that span more than 5 commands or are
    encapsulated within an if block would be ripe for breaking up into
    several distinct makefile goals or moving the entire block into a
    shell script that will be invoked by the target.

  o shell/set -e should be enabled everywhere to improve error detection
    along with options to detect uninitialized variable usage.  Functionality
    is already enabled by default in a few places using the $(EXIT_ON_ERROR)
    macro defined in config/config.mk.  Listing $(EXIT_ON_ERROR) as the first
    item of a target rule will expand into bash -e @command_text:

check::
        @$(<font color=blue>EXIT_ON_ERROR</font>) \
          for f in $(subst .cpp,$(BIN_SUFFIX),$(CPP_UNIT_TESTS)); do \
            XPCOM_DEBUG_BREAK=stack-and-abort $(RUN_TEST_PROGRAM) $(DIST)/bin/$$f; \
          done

     Make judicous use of -e funcionality but do not blindly trust that it
     will be enabled for a given makefile target.  There will not always be
     visual cues that the flag has been enabled.

   o One trivial way to validate logic blocks like these is to setup a negative
     test using the try server.  Simply inline a call to 'exit ###' as the last
     statement in a block then submit to try.
<code>
all:
    if [ 1 -eq 1 ]; then\
    /bin/ps \
    && /bin/who \
    && echo "Failure forced"\
    && exit 99\
    fi
</code>
% hg push -f try
If all goes well failure email will soon be on it's way.
     Negative testing like this will sanity check two important items:
     First that the exit call was properly detected and make was able
     to record it as a failure in the build and test logs.  Second,
     the failure status was able to propagate through the try server
     and was reported accurately by the web interface.

     If either of these conditions are not met a bug should be created
     so the logic can be modified to prevent any future failure conditions
     from silently sneaking into the tree.

   o The configure script would also be a place to perform similar -e tests.
     Often when commands cannot be found or failures occur the configure script
     is able to run to completion but underlying failures are lost in the
     shuffle.

   o Lastly a plug for testing.  If you will be expending effort to test
     makefile logic go the extra setup and write a unit test and add a 'check:'
     target to automate the activity.  Even if the test is a trivial smoke test
     initially it can be expanded later.  Start with these three, if any fail
     something has gone horribly wrong or wastes processing time:
       positive test: target does not exist initially, generate and verify.
       negative test: target cannot be created and test fails.
       nop test: target exists, minimal overhead should be imposed

<code>
% gmake -f gotcha.mk check
makefile target is: check
Running positive test
OK
Running negative test
OK
Running negative test [failure forced]
gmake: *** [check] Error 1
</code>

hyphen usage

  3) Prefixing target commands with a hyphen '-'
================================================
AKA I do not care what type of explosion occurs on this line, ignore status
and always return success.

There are a few legitimate places where this option can be used.
  o File deletion on a system that does not support rm -f.
  o Shell/OS command bug that will force returning non-zero status for
    a command that should succeed (~transient problems).

Aside from cases like these, use of a hyphen in general makefile logic to
ignore a set of failure conditions will have an ugly side effect of also
ignoring an entire set of unrelated problems, in the filesystem and os area,
that a target should be aware of and fail on.

Ignoring conditions like say a file being non-existent when a target is
processed because the file will be generated by the target is one example.
This case might have been an instance of trying to work around a line
continuation, testing for file existance then loss of command exit status.

Rather than using hyphens to code around a set of conditions remove them
and use an external shell script to do the right thing.  Handle conditions,
test for existance and return exit status from the actions.

Suggestions:
  o Remove any un-necessary hyphens from makefiles and create scripts,
    write unit tests to accomodate the logic as needed.

Samples:

  • toolkit/locales/l10n.mk
    • -cp -r $(JARLOG_DIR)/en-US/*.jar.log $(JARLOG_DIR_AB_CD)
    • -$(PERL) -pi.old -e "s/en-US/$(AB_CD)/g" $(JARLOG_DIR_AB_CD)/*.jar.log
    • A few conditions these commands will succeed on:
      • Source files do not exist. When file existence is not mandatory $(if ) conditions and extra target rules can conditionally copy when needed. As could an external shell script that could bundle several statements and eventually be tested.
      • Destination directory does not exist or filesystem is full.
      • Source file is not readable.
      • Destination file exists but is not writable.
      • JARLOG_DIR_AB_CD does not exist and we attempt copying files into.
  • js/src/ctypes/libffi
    • -rm -f src/alpha/ffi.$(OBJEXT)
      • Implied by the $(RM) macro
      • $(RM) src/alpha/ffi.$(OBJEXT)
    • -test -z "$(lib_LTLIBRARIES)" || rm -f $(lib_LTLIBRARIES)
    • -test -z "$(noinst_LTLIBRARIES)" || rm -f $(noinst_LTLIBRARIES)
      • Avoid spawning extra shells
      • $(RM) will not fail when the macros are empty
      • $(RM) $(lib_LTLIBRARIES) $(noinst_LTLIBRARIES)
  • config/rules.mk
  • clean clobber realclobber clobber_all:: $(SUBMAKEFILES)
  • -$(RM) $(ALL_TRASH)
  • -$(RM) -r $(ALL_TRASH_DIRS)
    • -$(RM) is redundant of $(RM) unless RM != /bin/rm -f

File: gotcha.mk

# -*- makefile -*-

ifneq ($(null),$(MAKECMDGOALS))
  $(info makefile target is: $(MAKECMDGOALS))
endif

SHELL := $(SHELL) -e

all: pass fail bad

pass:
        /bin/true

fail:
        /bin/false

bad:
        if [ ! -e '/non/existent/file' ]; then \
            /bin/true; \
            /bin/false; \
            echo "ASSERT: target $@: should not reach this point"; \
        fi

check:
        @echo "Running positive test"
        @/bin/true && echo "OK" || exit 1
        @echo "Running negative test"
        @! /bin/false && echo "OK" || exit 1
        @echo "Running negative test [failure forced]"
        @/bin/false && echo "OK" || exit 1

show:
        @if [ 1 ]; then\
            /bin/true;  echo "     /bin/true: \$$?=$$?"; \
            /bin/false; echo "    /bin/false: \$$?=$$?"; \
        fi                          



Coop's notes:

  • I added some punctuation where things were unclear, and fixed up spelling in a few places.
  • Content is great, so make sure the formatting when you go to publish does it justice, e.g. make sure code blocks are marked up as such, lists have the correct bullets, etc.
  • You have a big section heading for point #3 but not for #1 and #2. You should probably add those headings.
  • Are there examples from the Mozilla codebase of each type of error? Those would really hit home, I think.