compilation-mode and next-error confusion

Question

I'm going crazy trying to understand why compilation-mode and next-error recognize some errors and not others.  It appears that we're long past the days of simple regexps for recognizing errors, but I can't find anything that explains what is or is not used now, much less how to debug it.
Why is this recognized as an error:
/Users/kpixley/projects/src-head/cevo/junos/ui/tests/Makefile.inc:5:0 (41): no match found, expected: ":", [ \t] or [\p{Latin}-_.${}/%0123456789]

While this is not?
/Users/kpixley/projects/src-head/cevo/jdid/jdid-infra/build-files/evo/src/Makefile.inc:14:24 (268): rule include_dir: include

phils · Answer

This isn't a complete answer, but it provides more context.
As @lawlist has shown, this matching is determined by the gnu regexp in compilation-error-regexp-alist-alist, which is currently defined as follows:
    (gnu
     ;; The first line matches the program name for

     ;;     PROGRAM:SOURCE-FILE-NAME:LINENO: MESSAGE

     ;; format, which is used for non-interactive programs other than
     ;; compilers (e.g. the "jade:" entry in compilation.txt).

     ;; This first line makes things ambiguous with output such as
     ;; "foo:344:50:blabla" since the "foo" part can match this first
     ;; line (in which case the file name as "344").  To avoid this,
     ;; the second line disallows filenames exclusively composed of
     ;; digits.

     ;; Similarly, we get lots of false positives with messages including
     ;; times of the form "HH:MM:SS" where MM is taken as a line number, so
     ;; the last line tries to rule out message where the info after the
     ;; line number starts with "SS".  --Stef

     ;; The core of the regexp is the one with *?.  It says that a file name
     ;; can be composed of any non-newline char, but it also rules out some
     ;; valid but unlikely cases, such as a trailing space or a space
     ;; followed by a -, or a colon followed by a space.
     ;;
     ;; The "in \\|from " exception was added to handle messages from Ruby.
     ,(rx
       bol
       (? (| (regexp "[[:alpha:]][-[:alnum:].]+: ?")
             (regexp "[ \t]+\\(?:in \\|from\\)")))
       (group-n 1 (: (regexp "[0-9]*[^0-9\n]")
                     (*? (| (regexp "[^\n :]")
                            (regexp " [^-/\n]")
                            (regexp ":[^ \n]")))))
       (regexp ": ?")
       (group-n 2 (regexp "[0-9]+"))
       (? (| (: "-"
                (group-n 4 (regexp "[0-9]+"))
                (? "." (group-n 5 (regexp "[0-9]+"))))
             (: (in ".:")
                (group-n 3 (regexp "[0-9]+"))
                (? "-"
                   (? (group-n 4 (regexp "[0-9]+")) ".")
                   (group-n 5 (regexp "[0-9]+"))))))
       ":"
       (| (: (* " ")
             (group-n 6 (| "FutureWarning"
                           "RuntimeWarning"
                           "Warning"
                           "warning"
                           "W:")))
          (: (* " ")
             (group-n 7 (| (regexp "[Ii]nfo\\(?:\\>\\|rmationa?l?\\)")
                           "I:"
                           (: "[ skipping " (+ nonl) " ]")
                           "instantiated from"
                           "required from"
                           (regexp "[Nn]ote"))))
          (: (* " ")
             (regexp "[Ee]rror"))
          (: (regexp "[0-9]?")
             (| (regexp "[^0-9\n]")
                eol))
          (regexp "[0-9][0-9][0-9]")))
     1 (2 . 4) (3 . 5) (6 . 7))

That alist is preceded with the comment:
;; If you make any changes to `compilation-error-regexp-alist-alist',
;; be sure to run the ERT test in test/lisp/progmodes/compile-tests.el.
;; emacs -batch -l compile-tests.el -f ert-run-tests-batch-and-exit

The current gnu test cases from compile-tests.el are:
;; gnu
foo.c:88: message
../foo.c:88: W: message
/tmp/foo.c:88:warning message
foo/bar.py:8: FutureWarning message
foo.py:88: RuntimeWarning message
foo.c:88:I: message
foo.c:88.23: note: message
foo.c:88.23: info: message
foo.c:88:23:information: message
foo.c:88.23-45: Informational: message
foo.c:88-23: message
;; The next one is not in the GNU standards AFAICS.
;; Here we seem to interpret it as LINE1-LINE2.COL2.
foo.c:88-45.37: message
foo.c:88.23-9.17: message
jade:dbcommon.dsl:133:17:E: missing argument for function call
G:/cygwin/dev/build-myproj.xml:54: Compiler Adapter 'javac' can't be found.
file:G:/cygwin/dev/build-myproj.xml:54: Compiler Adapter 'javac' can't be found.
{standard input}:27041: Warning: end of file not at end of a line; newline inserted
boost/container/detail/flat_tree.hpp:589:25:   [ skipping 5 instantiation contexts, use -ftemplate-backtrace-limit=0 to disable ]

to which we can add the two cases from this question (I've truncated the file paths, as that makes no difference).
Makefile.inc:5:0 (41): no match found, expected: ":", [ \t] or [\p{Latin}-_.${}/%0123456789]
Makefile.inc:14:24 (268): rule include_dir: include

This confirms the issue:  All of the original test cases match, but only one of the new cases matches.
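
For anyone who wants to repeat this check without re-builder, here is a minimal sketch that matches a single line against whatever gnu regexp the running Emacs has loaded. The helper name my/gnu-error-match is made up for illustration, and the result for the two lines from the question depends on the Emacs version, so no output is claimed for them here.

```elisp
(require 'compile)

(defun my/gnu-error-match (line)
  "Match LINE against the `gnu' entry of `compilation-error-regexp-alist-alist'.
Return a list (FILE LINENO COLUMN) of the matched groups, or nil if
the regexp does not match.  Hypothetical helper, for experimenting only."
  (let ((regexp (nth 1 (assq 'gnu compilation-error-regexp-alist-alist))))
    (when (string-match regexp line)
      (list (match-string 1 line)     ; group 1: file name
            (match-string 2 line)     ; group 2: line number
            (match-string 3 line))))) ; group 3: column, if any

;; A case taken from compile-tests.el:
(my/gnu-error-match "foo.c:88: message")

;; The two truncated lines from the question:
(my/gnu-error-match "Makefile.inc:5:0 (41): no match found")
(my/gnu-error-match "Makefile.inc:14:24 (268): rule include_dir: include")
```

Evaluating each form with C-x C-e in a scratch buffer shows which groups were captured, which is sometimes quicker than reading the highlighting in re-builder.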
As @lawlist identified, changing that [0-9]? makes a difference.  If we change that to [0-9]* then all of the cases are now matched; however there's so much going on in this pattern that it's currently unclear to me whether or not that's the correct fix.
In the failure case:
Makefile.inc:14:24 (268): rule include_dir: include

The line number is 14, but it's the subsequent 24 which is failing to match the zero-or-one-digit [0-9]?.  Reducing that to a single digit (as seen in the case which worked) means the original regexp matches the line.  (Use C-c C-u to ensure re-builder picks up the change, if necessary.)
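
To see that step in isolation, here is a reduced sketch of just the end of the pattern: the ":" after the line number followed by the last two alternatives. The keyword branches (warning, info, error) are left out, and the names my/gnu-tail and my/tail-matches-p are invented for this example, so treat it as an illustration rather than the real regexp.

```elisp
(require 'rx)

(defconst my/gnu-tail
  (rx bos ":"
      (| (: (regexp "[0-9]?")               ; at most one digit...
            (| (regexp "[^0-9\n]") eol))    ; ...then a non-digit or end of line
         (regexp "[0-9][0-9][0-9]")))       ; or exactly three digits
  "Reduced sketch of the end of the `gnu' regexp (keyword branches omitted).")

(defun my/tail-matches-p (text)
  "Return t if TEXT, the part right after the line number, fits `my/gnu-tail'."
  (and (string-match-p my/gnu-tail text) t))

(my/tail-matches-p ":0 (41): no match found")              ; => t   (one digit, then a space)
(my/tail-matches-p ":24 (268): rule include_dir: include") ; => nil (two digits)
(my/tail-matches-p ":268 rule")                            ; => t   (three digits)
```

A two-digit number after the colon is the one shape that neither alternative accepts, which is the situation the failing line is in.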
That [0-9]? dates back to commit 0ab31e4a9ff from 2006, and was part of a change intended to "rule out false positives due to time stamps":

we get lots of false positives with messages including times of the
form "HH:MM:SS" where MM is taken as a line number, so the last line
tries to rule out message where the info after the line number
starts with "SS".
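
That trade-off can be tried out on the same kind of reduced tail. The sketch below rebuilds the end of the pattern with a configurable digit prefix and compares [0-9]? with [0-9]*. The helper names are hypothetical, the keyword branches are again omitted, and ":56 build started" is a made-up sample standing in for the ":SS" part of a timestamp.

```elisp
(require 'rx)

(defun my/make-tail (digits)
  "Build a reduced `gnu' tail regexp using DIGITS as the digit-prefix regexp.
Hypothetical helper: the keyword branches of the real regexp are omitted."
  (rx-to-string
   `(: bos ":"
       (| (: (regexp ,digits)
             (| (regexp "[^0-9\n]") eol))
          (regexp "[0-9][0-9][0-9]")))
   t))

(defun my/tail-fits-p (digits text)
  "Return t if TEXT fits the reduced tail built from DIGITS."
  (and (string-match-p (my/make-tail digits) text) t))

;; The failing line from the question:
(my/tail-fits-p "[0-9]?" ":24 (268): rule include_dir: include") ; => nil
(my/tail-fits-p "[0-9]*" ":24 (268): rule include_dir: include") ; => t

;; The seconds field of a timestamp:
(my/tail-fits-p "[0-9]?" ":56 build started")                    ; => nil
(my/tail-fits-p "[0-9]*" ":56 build started")                    ; => t
```

In this reduced form, at least, [0-9]* accepts the question's line and the timestamp-like text alike, which is why I'm hesitant to call it the correct fix without running compile-tests.el.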

1 You'll need to make the top-level sequence explicit.  Refer to the discussion of this gotcha in https://emacs.stackexchange.com/a/5577/454

lawlist · Answer

In the following guess-timated answer (tested with Emacs 26.3), I have changed the fourth line from the bottom; i.e., (regexp "[0-9]?") to (regexp "[0-9]+?").  This permits Emacs to match the number 14 following the filename and the first : (colon).  To try out this answer, evaluate the Lisp code under the section labeled THE CODE, then paste the working data into a scratch buffer and type:  M-x compilation-mode
The following link provides other methods for changing an element of an alist: How to replace an element of an alist?  I chose to use the solution provided by Dan to modify the alist at issue in this thread.
WORKING DATA:
/Users/kpixley/projects/src-head/cevo/junos/ui/tests/Makefile.inc:5:0 (41): no match found, expected: ":", [ \t] or [\p{Latin}-_.${}/%0123456789]

/Users/kpixley/projects/src-head/cevo/jdid/jdid-infra/build-files/evo/src/Makefile.inc:14:24 (268): rule include_dir: include

THE CODE:
;; Load the library before trying to change `compilation-error-regexp-alist-alist'.
(require 'compile)

(setf (nth 1 (assoc 'gnu compilation-error-regexp-alist-alist))
  (rx
    bol
    (? (| (regexp "[[:alpha:]][-[:alnum:].]+: ?")
          (regexp "[ \t]+\\(?:in \\|from\\)")))
    (group-n 1 (: (regexp "[0-9]*[^0-9\n]")
                  (*? (| (regexp "[^\n :]")
                         (regexp " [^-/\n]")
                         (regexp ":[^ \n]")))))
    (regexp ": ?")
    (group-n 2 (regexp "[0-9]+"))
    (? (| (: "-"
             (group-n 4 (regexp "[0-9]+"))
             (? "." (group-n 5 (regexp "[0-9]+"))))
          (: (in ".:")
             (group-n 3 (regexp "[0-9]+"))
             (? "-"
                (? (group-n 4 (regexp "[0-9]+")) ".")
                (group-n 5 (regexp "[0-9]+"))))))
    ":"
    (| (: (* " ")
          (group-n 6 (| "FutureWarning"
                        "RuntimeWarning"
                        "Warning"
                        "warning"
                        "W:")))
       (: (* " ")
          (group-n 7 (| (regexp "[Ii]nfo\\(?:\\>\\|rmationa?l?\\)")
                        "I:"
                        (: "[ skipping " (+ ".") " ]")
                        "instantiated from"
                        "required from"
                        (regexp "[Nn]ote"))))
       (: (* " ")
          (regexp "[Ee]rror"))
       (: (regexp "[0-9]+?") ;; (regexp "[0-9]?")
          (| (regexp "[^0-9\n]")
             eol))
       (regexp "[0-9][0-9][0-9]"))))

Then, I took the working data and used M-x re-builder to see what the above-mentioned regexp matched.  I modified the second line of the working data by reducing the 14 to just one digit and that helped me zero in on the relevant section of the regexp at issue.  From there, I looked at the compilation-error-regexp-alist-alist to locate the corresponding section and found it in the gnu section of that variable.
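
Since this answer is a guess, it may be worth checking the changed branch on its own before relying on it. Below is a reduced sketch of only the end of the pattern, with the keyword branches omitted and hypothetical helper names, comparing the original [0-9]? with the [0-9]+? used above. The sample ": message" is the tail of the compile-tests.el case "foo.c:88: message".

```elisp
(require 'rx)

(defun my/tail-with (digits)
  "Build a reduced `gnu' tail regexp using DIGITS as the digit-prefix regexp.
Hypothetical helper: the keyword branches of the real regexp are omitted."
  (rx-to-string
   `(: bos ":"
       (| (: (regexp ,digits)
             (| (regexp "[^0-9\n]") eol))
          (regexp "[0-9][0-9][0-9]")))
   t))

(defun my/tail-ok-p (digits text)
  "Return t if TEXT fits the reduced tail built from DIGITS."
  (and (string-match-p (my/tail-with digits) text) t))

;; The line that originally failed:
(my/tail-ok-p "[0-9]?"  ":24 (268): rule include_dir: include") ; => nil
(my/tail-ok-p "[0-9]+?" ":24 (268): rule include_dir: include") ; => t

;; A plain message with no digits after the colon:
(my/tail-ok-p "[0-9]?"  ": message")                            ; => t
(my/tail-ok-p "[0-9]+?" ": message")                            ; => nil
```

In this reduced form, [0-9]+? needs at least one digit after the colon, so a plain ": message" tail no longer fits that branch.  Running the tests in compile-tests.el against the modified alist would show whether the full regexp is affected the same way.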
