TeX - LaTeX Asked on June 5, 2021
Consider the following MWE:
\documentclass{article}
\usepackage{listings}
\lstset{basicstyle=\ttfamily}
\begin{document}
\lstinline |asdf|asdf asdfasdf
\verb |asdf|asdf asdfasdf
\end{document}
My understanding of what to expect here has always been the following (let \cmd stand for either \verb or \lstinline in the following):
1. When TeX first tokenized \cmd |, it gobbles the space following it, leaving only the token \cmd in its "mouth" (and | behind it in the input stream).
2. TeX then expands \cmd, which leads to a series of category code changes, basically making every otherwise special character "other", followed by some macro that looks at the next token (in this case, |).
3. This macro then grabs everything up to the next occurrence of that token.
Notably, the space following \cmd is gobbled during that control sequence’s tokenization, i.e. before any category codes are changed.
With this understanding, I would expect both of the lines above to typeset
asdf
asdf asdfasdf
But I get the following output:
\lstinline behaves as expected, but \verb somehow knows about the space following it.
How?? To my knowledge, there shouldn’t ever have been a space token behind the \verb token.
At the very beginning you said:
“When TeX first tokenized \cmd |”
but that's wrong. TeX is a well-behaved gentleman and doesn't get ahead of itself scanning a space and a | before knowing what \cmd is supposed to do. As far as TeX is concerned, the space and the | and whatever other character could all mean the same thing, and could change in meaning, so pre-scanning would only cause confusion.
When TeX sees \cmd, the only “special” thing it does to blank spaces is to set state:=skip_blanks, so that when, say, typesetting, \TeX code will write "TeXcode", ignoring the spaces after the control sequence as usual. You can check for yourself with:
\def\test{\catcode`\ =12 \testx}
\def\testx{\futurelet\token\testy}
\def\testy{\show\token\afterassignment\testx\let\token = }
\test x
and you'll see that it shows "the character" followed by a space (the catcode-12 space) before showing "the letter x".
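For contrast, here is a minimal sketch of the same look-ahead without the catcode change. The names \peek, \report and \peekedtoken are hypothetical, chosen only for this illustration. Because the space after \peek still has catcode 10 when the look-ahead happens, it should be skipped in the skip_blanks state, so \show should report the letter x straight away:

```latex
\documentclass{article}
% Hypothetical helpers for illustration only.
\def\peek{\futurelet\peekedtoken\report}% look at the next token without removing it
\def\report{\show\peekedtoken}%           print the meaning of that token on the terminal
\begin{document}
\peek x
\end{document}
```

The only difference from the test above is that no \catcode assignment runs between tokenizing the control word and reading the next character.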
Now back to the problem at hand: update your LaTeX :-)
The old behaviour of \verb was to look at the next token, whichever it happened to be, and use that as a delimiter (given the exception of {). This has now been fixed for the 2020-10-01 LaTeX release (see LaTeX News Issue 32).
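To find out which side of that change a given installation is on, one option is to compare the format date in the preamble. The sketch below assumes the kernel's internal \@ifl@t@r test together with \fmtversion; the messages are made up for illustration. Writing the delimiter directly after \verb, with no space in between, avoids the difference on either kernel.

```latex
\documentclass{article}
\makeatletter
% \fmtversion holds the release date of the loaded format;
% \@ifl@t@r{A}{B} takes its first branch when date A is not earlier than date B.
\@ifl@t@r\fmtversion{2020/10/01}
  {\typeout{Format is 2020-10-01 or newer}}
  {\typeout{Format predates 2020-10-01}}
\makeatother
\begin{document}
% No space between \verb and its delimiter.
\verb|asdf|asdf asdfasdf
\end{document}
```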
Answered by Phelype Oleinik on June 5, 2021
I believe what happens is as follows:
\verb is first tokenized (the space character, which has catcode 10 just before \verb is tokenized, marks the end of this control word but is not discarded). TeX will go into state S, since \verb is a control word (control sequence whose name is made of “letters” only), but it doesn't skip blanks yet.
\verb is expanded and code from its expansion is executed. This code first gives spaces the catcode 12 (via \let\do\@makeother \dospecials); this is important.
At the end of \verb's replacement text, there is \@ifstar\@sverb\@verb. This \@ifstar looks ahead in the input, thus the state S kicks in. Since spaces have catcode 12 at this point, the space character following \verb is not skipped. It gets tokenized with catcode 12.
Since we used the no-star form of \verb and \@verb is defined as \def\@verb{\@vobeyspaces \frenchspacing \@sverb}, spaces are now made active, and \@sverb is expanded (so, the end delimiter will be a catcode-13 space, while the start delimiter was a catcode-12 space).
\@sverb grabs the catcode-12 space token as its only argument and defines active spaces to be \let-equal to \verb@egroup (if \verb* had been used, \@sverb would have done \@setupverbvisiblespace \@vobeyspaces too; thus, spaces end up active in all cases). This is how the verbatim text will end in non-erroneous conditions: \verb@egroup will yield \egroup, which will terminate the group started by \verb (there is a \bgroup in \verb's replacement text). Since the special catcode setup has been done locally inside this group, this terminates the special catcode setup.
Thus, the sentence from the question “This macro then grabs everything up to the next occurrence of that token” is not really correct: there is no grabbing of the verbatim contents as an argument. Tokens between the start and the end delimiters are simply processed as catcode-12 tokens, except space tokens, which are always active at the end of \@sverb, as we've seen.
Note: as Phelype Oleinik pointed out, the behavior of \verb was changed in the LaTeX format from 2020-10-01. My comments here are based on LaTeX2e <2020-02-02> patch level 5.
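The mechanism described above can be imitated with a much reduced toy macro. This is only a sketch under stated assumptions: \toyverb and \toy@grab are hypothetical names, the code is not the kernel's definition of \verb, and it omits the star form, the font setup and the end-of-line error handling. It keeps just the parts that matter here: the catcode change inside a group, the undelimited argument that picks up the delimiter, and the active delimiter that closes the group.

```latex
\documentclass{article}
\makeatletter
% Toy macros for illustration; NOT the kernel's definition of \verb.
\def\toyverb{%
  \leavevmode
  \bgroup
  \let\do\@makeother \dospecials % specials, including the space, get catcode 12
  \ttfamily
  \toy@grab}
\def\toy@grab#1{%
  \typeout{toyverb delimiter: \meaning#1}% log which token was taken as delimiter
  \catcode`#1\active               % the delimiter character becomes active
  \lccode`\~`#1%
  \lowercase{\let~}\egroup}        % active delimiter is \let-equal to \egroup
\makeatother
\begin{document}
\toyverb |asdf|asdf asdfasdf

\toyverb+asdf+asdf asdfasdf
\end{document}
```

In the first line the space after \toyverb is read only after it has been given catcode 12, so it should be taken as the delimiter and the typewriter text should run up to the next space. In the second line + is the delimiter and only the first asdf should be set in typewriter type. Because the toy macro does not depend on the kernel's \verb, it should behave this way on current formats as well.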
Answered by frougon on June 5, 2021