Emacs Asked by kpixley on December 16, 2021
I’m going crazy trying to understand why compilation mode and next-error recognize some errors and not others. It appears as though we’re long past the days of simple regexps for recognizing errors but I’m not finding anything that explains what is, or is not used now, much less how to debug it.
Why is this recognized as an error:
/Users/kpixley/projects/src-head/cevo/junos/ui/tests/Makefile.inc:5:0 (41): no match found, expected: ":", [ \t] or [\p{Latin}-_.${}/%0123456789]
While this is not?
/Users/kpixley/projects/src-head/cevo/jdid/jdid-infra/build-files/evo/src/Makefile.inc:14:24 (268): rule include_dir: include
This isn't a complete answer, but it provides more context.
As @lawlist has shown, this matching is determined by the gnu regexp in compilation-error-regexp-alist-alist, which is currently defined as follows:
(gnu
 ;; The first line matches the program name for
 ;;     PROGRAM:SOURCE-FILE-NAME:LINENO: MESSAGE
 ;; format, which is used for non-interactive programs other than
 ;; compilers (e.g. the "jade:" entry in compilation.txt).
 ;; This first line makes things ambiguous with output such as
 ;; "foo:344:50:blabla" since the "foo" part can match this first
 ;; line (in which case the file name as "344"). To avoid this,
 ;; the second line disallows filenames exclusively composed of
 ;; digits.
 ;; Similarly, we get lots of false positives with messages including
 ;; times of the form "HH:MM:SS" where MM is taken as a line number, so
 ;; the last line tries to rule out message where the info after the
 ;; line number starts with "SS". --Stef
 ;; The core of the regexp is the one with *?. It says that a file name
 ;; can be composed of any non-newline char, but it also rules out some
 ;; valid but unlikely cases, such as a trailing space or a space
 ;; followed by a -, or a colon followed by a space.
 ;;
 ;; The "in \\|from " exception was added to handle messages from Ruby.
 ,(rx
   bol
   (? (| (regexp "[[:alpha:]][-[:alnum:].]+: ?")
         (regexp "[ \t]+\\(?:in \\|from\\)")))
   (group-n 1 (: (regexp "[0-9]*[^0-9\n]")
                 (*? (| (regexp "[^\n :]")
                        (regexp " [^-/\n]")
                        (regexp ":[^ \n]")))))
   (regexp ": ?")
   (group-n 2 (regexp "[0-9]+"))
   (? (| (: "-"
            (group-n 4 (regexp "[0-9]+"))
            (? "." (group-n 5 (regexp "[0-9]+"))))
         (: (in ".:")
            (group-n 3 (regexp "[0-9]+"))
            (? "-"
               (? (group-n 4 (regexp "[0-9]+")) ".")
               (group-n 5 (regexp "[0-9]+"))))))
   ":"
   (| (: (* " ")
         (group-n 6 (| "FutureWarning"
                       "RuntimeWarning"
                       "Warning"
                       "warning"
                       "W:")))
      (: (* " ")
         (group-n 7 (| (regexp "[Ii]nfo\\(?:\\>\\|rmationa?l?\\)")
                       "I:"
                       (: "[ skipping " (+ nonl) " ]")
                       "instantiated from"
                       "required from"
                       (regexp "[Nn]ote"))))
      (: (* " ")
         (regexp "[Ee]rror"))
      (: (regexp "[0-9]?")
         (| (regexp "[^0-9\n]")
            eol))
      (regexp "[0-9][0-9][0-9]")))
 1 (2 . 4) (3 . 5) (6 . 7))
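The trailing 1 (2 . 4) (3 . 5) (6 . 7) names the groups that compilation mode reads: group 1 is the file, groups 2 and 4 the line and end line, groups 3 and 5 the column and end column, and groups 6 and 7 mark the hit as a warning or an info message (otherwise it is an error). A minimal sketch for inspecting those groups on one line follows. It assumes an Emacs whose gnu entry is the rx-based definition quoted above, and my/gnu-groups is a made-up helper name, not part of compile.el.

```elisp
(require 'compile)

(defun my/gnu-groups (line)
  "Match LINE against the stock `gnu' regexp and return its groups.
Return nil if LINE does not match.  Hypothetical helper for illustration."
  (let ((case-fold-search nil)
        (re (cadr (assoc 'gnu compilation-error-regexp-alist-alist))))
    (when (string-match re line)
      (list :file    (match-string 1 line)
            :line    (match-string 2 line)
            :column  (match-string 3 line)
            :warning (match-string 6 line)
            :info    (match-string 7 line)))))

;; One of the test cases from compile-tests.el.
(my/gnu-groups "foo.c:88.23: note: message")
```

With that definition, the call should report file "foo.c", line "88", column "23", no warning marker, and "note" as the info marker.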
That alist is preceded with the comment:
;; If you make any changes to `compilation-error-regexp-alist-alist',
;; be sure to run the ERT test in test/lisp/progmodes/compile-tests.el.
;; emacs -batch -l compile-tests.el -f ert-run-tests-batch-and-exit
The current gnu test cases from compile-tests.el are:
;; gnu
foo.c:88: message
../foo.c:88: W: message
/tmp/foo.c:88:warning message
foo/bar.py:8: FutureWarning message
foo.py:88: RuntimeWarning message
foo.c:88:I: message
foo.c:88.23: note: message
foo.c:88.23: info: message
foo.c:88:23:information: message
foo.c:88.23-45: Informational: message
foo.c:88-23: message
;; The next one is not in the GNU standards AFAICS.
;; Here we seem to interpret it as LINE1-LINE2.COL2.
foo.c:88-45.37: message
foo.c:88.23-9.17: message
jade:dbcommon.dsl:133:17:E: missing argument for function call
G:/cygwin/dev/build-myproj.xml:54: Compiler Adapter 'javac' can't be found.
file:G:/cygwin/dev/build-myproj.xml:54: Compiler Adapter 'javac' can't be found.
{standard input}:27041: Warning: end of file not at end of a line; newline inserted
boost/container/detail/flat_tree.hpp:589:25: [ skipping 5 instantiation contexts, use -ftemplate-backtrace-limit=0 to disable ]
To these we can add the two cases from this question (I've truncated the file paths, as that makes no difference):
Makefile.inc:5:0 (41): no match found, expected: ":", [ \t] or [\p{Latin}-_.${}/%0123456789]
Makefile.inc:14:24 (268): rule include_dir: include
We can then test these with M-x re-builder, either by switching it to rx mode to use the original form (see note 1), or, for the default read mode, by using the value of (cadr (assoc 'gnu compilation-error-regexp-alist-alist)):
"^\\(?:[[:alpha:]][-[:alnum:].]+: ?\\|[ \t]+\\(?:in \\|from\\)\\)?\\(?1:\\(?:[0-9]*[^0-9\n]\\)\\(?:[^\n :]\\| [^-/\n]\\|:[^ \n]\\)*?\\)\\(?:: ?\\)\\(?2:[0-9]+\\)\\(?:-\\(?4:[0-9]+\\)\\(?:\\.\\(?5:[0-9]+\\)\\)?\\|[.:]\\(?3:[0-9]+\\)\\(?:-\\(?:\\(?4:[0-9]+\\)\\.\\)?\\(?5:[0-9]+\\)\\)?\\)?:\\(?: *\\(?6:\\(?:FutureWarning\\|RuntimeWarning\\|W\\(?::\\|arning\\)\\|warning\\)\\)\\| *\\(?7:[Ii]nfo\\(?:\\>\\|rmationa?l?\\)\\|I:\\|\\[ skipping \\.+ ]\\|instantiated from\\|required from\\|[Nn]ote\\)\\| *\\(?:[Ee]rror\\)\\|[0-9]?\\(?:[^0-9\n]\\|$\\)\\|[0-9][0-9][0-9]\\)"
This confirms the issue: all of the original test cases match, but only the first of the two new cases matches.
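The same check can be scripted rather than done by eye in re-builder. This is a minimal sketch, assuming the second element of the gnu entry is the regexp string (as in the definition quoted above); my/gnu-match-p is a hypothetical helper name, and the first sample message is shortened.

```elisp
(require 'compile)

(defun my/gnu-match-p (line)
  "Return non-nil if LINE matches the stock `gnu' regexp."
  (let ((case-fold-search nil)
        (re (cadr (assoc 'gnu compilation-error-regexp-alist-alist))))
    (and (string-match re line) t)))

;; Report each sample line in the *Messages* buffer.
(dolist (line '("foo.c:88: message"
                "Makefile.inc:5:0 (41): no match found"
                "Makefile.inc:14:24 (268): rule include_dir: include"))
  (message "%-8s %s" (if (my/gnu-match-p line) "match" "NO MATCH") line))
```

Against the definition quoted above, the first two lines should be reported as matches and the last one as a non-match.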
As @lawlist identified, changing that [0-9]? makes a difference. If we change it to [0-9]* then all of the cases match; however, there's so much going on in this pattern that it's currently unclear to me whether or not that's the correct fix.
In the failure case:
Makefile.inc:14:24 (268): rule include_dir: include
The line number is 14, but it's the subsequent 24 which fails to match the zero-or-one-digit [0-9]?. Reducing that to a single digit (as seen in the case which worked) means the original regexp matches the line. (Use C-c C-u to ensure re-builder picks up the change, if necessary.)
That [0-9]? dates back to commit 0ab31e4a9ff from 2006, and was part of a change intended to "rule out false positives due to time stamps":
we get lots of false positives with messages including times of the form "HH:MM:SS" where MM is taken as a line number, so the last line tries to rule out message where the info after the line number starts with "SS".
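To see that rule in isolation, here is a simplified sketch of just the last two alternatives, applied to the text that follows the FILE:LINE: part of a message. The constant and function names are made up for illustration, and the real pattern tries several other branches before these.

```elisp
(defconst my/gnu-tail
  (rx (| (: (regexp "[0-9]?")
            (| (regexp "[^0-9\n]") eol))
         (regexp "[0-9][0-9][0-9]")))
  "Simplified copy of the final alternatives of the `gnu' pattern.")

(defun my/gnu-tail-ok-p (text)
  "Return non-nil if TEXT (what follows \"FILE:LINE:\") passes `my/gnu-tail'."
  (let ((case-fold-search nil))
    ;; Anchor at the start of TEXT and group the alternatives explicitly.
    (and (string-match (concat "\\`\\(?:" my/gnu-tail "\\)") text) t)))

(my/gnu-tail-ok-p " message")          ; plain message: accepted
(my/gnu-tail-ok-p "0 (41): no match")  ; one digit, then a non-digit: accepted
(my/gnu-tail-ok-p "56 job finished")   ; looks like the SS of HH:MM:SS: rejected
(my/gnu-tail-ok-p "24 (268): rule")    ; same shape as a time stamp: rejected
```

The last two calls show why the failing line is caught by the filter: a two-digit number right after the colon is indistinguishable from the seconds field of a time stamp.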
1 You'll need to make the top-level sequence explicit. Refer to the discussion of this gotcha in https://emacs.stackexchange.com/a/5577/454
Answered by phils on December 16, 2021
In the following guess-timated answer (tested with Emacs 26.3), I have changed the fourth line from the bottom, i.e., (regexp "[0-9]?") to (regexp "[0-9]+?"). This permits Emacs to match the number 14 following the filename and the first : (colon). To try out this answer, evaluate the Lisp code under the section labeled THE CODE, then paste the working data into a scratch buffer and type M-x compilation-mode.
The following link provides other methods for changing an element of an alist: "How to replace an element of an alist?" I chose the solution provided by Dan to modify the alist at issue in this thread.
WORKING DATA:
/Users/kpixley/projects/src-head/cevo/junos/ui/tests/Makefile.inc:5:0 (41): no match found, expected: ":", [ \t] or [\p{Latin}-_.${}/%0123456789]
/Users/kpixley/projects/src-head/cevo/jdid/jdid-infra/build-files/evo/src/Makefile.inc:14:24 (268): rule include_dir: include
THE CODE:
;;; Load the library before trying to change `compilation-error-regexp-alist-alist'
(require 'compile)
(setf (nth 1 (assoc 'gnu compilation-error-regexp-alist-alist))
      (rx
       bol
       (? (| (regexp "[[:alpha:]][-[:alnum:].]+: ?")
             (regexp "[ \t]+\\(?:in \\|from\\)")))
       (group-n 1 (: (regexp "[0-9]*[^0-9\n]")
                     (*? (| (regexp "[^\n :]")
                            (regexp " [^-/\n]")
                            (regexp ":[^ \n]")))))
       (regexp ": ?")
       (group-n 2 (regexp "[0-9]+"))
       (? (| (: "-"
                (group-n 4 (regexp "[0-9]+"))
                (? "." (group-n 5 (regexp "[0-9]+"))))
             (: (in ".:")
                (group-n 3 (regexp "[0-9]+"))
                (? "-"
                   (? (group-n 4 (regexp "[0-9]+")) ".")
                   (group-n 5 (regexp "[0-9]+"))))))
       ":"
       (| (: (* " ")
             (group-n 6 (| "FutureWarning"
                           "RuntimeWarning"
                           "Warning"
                           "warning"
                           "W:")))
          (: (* " ")
             (group-n 7 (| (regexp "[Ii]nfo\\(?:\\>\\|rmationa?l?\\)")
                           "I:"
                           (: "[ skipping " (+ ".") " ]")
                           "instantiated from"
                           "required from"
                           (regexp "[Nn]ote"))))
          (: (* " ")
             (regexp "[Ee]rror"))
          (: (regexp "[0-9]+?") ;; (regexp "[0-9]?")
             (| (regexp "[^0-9\n]")
                eol))
          (regexp "[0-9][0-9][0-9]"))))
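Before installing a change like this, it can help to compare candidate quantifiers on a few representative tails (the text after FILE:LINE:). The sketch below uses a simplified stand-in for the digit branch only, with a hypothetical function name; it is not the full gnu pattern.

```elisp
(defun my/digit-branch-ok-p (quantifier text)
  "Return non-nil if TEXT passes the digit branch built with QUANTIFIER.
QUANTIFIER is a string such as \"?\", \"*\" or \"+?\"."
  (let ((case-fold-search nil)
        (re (concat "\\`[0-9]" quantifier "\\(?:[^0-9\n]\\|$\\)")))
    (and (string-match re text) t)))

;; Print one summary line per quantifier in the *Messages* buffer.
(dolist (q '("?" "*" "+?"))
  (message "[0-9]%-2s  two-digit: %-3s  plain: %-3s  time stamp: %s"
           q
           (if (my/digit-branch-ok-p q "24 (268): rule") "yes" "no")
           (if (my/digit-branch-ok-p q " message") "yes" "no")
           (if (my/digit-branch-ok-p q "56 job finished") "yes" "no")))
```

In this simplified form, +? accepts the two-digit case but requires at least one digit, so a plain " message" tail no longer passes through this branch, whereas * accepts both. Either variant also lets a time-stamp-like "56 ..." tail through, which is the case the original [0-9]? was meant to reject.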
SHOW YOUR WORK
One of my old math teachers always used to say "SHOW YOUR WORK". I came up with this guess-timated answer by first placing a message call within the function compilation-parse-errors, with an eye towards extracting the regexp used to process the relevant components of the working data, which yielded the following regexp:
"^ *\\(?:[[:alpha:]][-[:alnum:].]+: ?\\|[ \t]+\\(?:in \\|from\\)\\)?\\(?1:\\(?:[0-9]*[^0-9\n]\\)\\(?:[^\n :]\\| [^-/\n]\\|:[^ \n]\\)*?\\)\\(?:: ?\\)\\(?2:[0-9]+\\)\\(?:-\\(?4:[0-9]+\\)\\(?:\\.\\(?5:[0-9]+\\)\\)?\\|[.:]\\(?3:[0-9]+\\)\\(?:-\\(?:\\(?4:[0-9]+\\)\\.\\)?\\(?5:[0-9]+\\)\\)?\\)?:\\(?: *\\(?6:\\(?:FutureWarning\\|RuntimeWarning\\|W\\(?::\\|arning\\)\\|warning\\)\\)\\| *\\(?7:[Ii]nfo\\(?:\\>\\|rmationa?l?\\)\\|I:\\|\\[ skipping \\.+ ]\\|instantiated from\\|required from\\|[Nn]ote\\)\\| *\\(?:[Ee]rror\\)\\|[0-9]?\\(?:[^0-9\n]\\|$\\)\\|[0-9][0-9][0-9]\\)"
Then, I took the working data and used M-x re-builder to see what the above-mentioned regexp matched. I modified the second line of the working data by reducing the 14 to just one digit, and that helped me zero in on the relevant section of the regexp at issue. From there, I looked at compilation-error-regexp-alist-alist to locate the corresponding section and found it in the gnu section of that variable.
Answered by lawlist on December 16, 2021