Parse Argument by Character while Executing Embedded Macros

TeX - LaTeX Asked by Steven B. Segletes on February 26, 2021

I am interested in parsing a string of input character by character and doing something to each character. In this MWE, I merely apply \textbf to each successive character as an example, to verify that the parser is working.

The problem arises if the argument contains embedded macros. What I would like to do, if I come across a macro in the argument, is to halt the parsing and let the macro execute, absorbing as many arguments as it requires from the original argument stream, and then resume parsing the remainder of the argument.

While \charparse is the parsing macro, I am trying to design a helper macro, \execmacro, to do what needs to occur when a macro is detected in the argument stream. In the MWE below, I can accomplish that, but only if I presuppose the nature of the embedded macros. In particular, I show 3 versions of \execmacro, depending on whether I presuppose the embedded macros to require 0, 1, or 2 arguments, respectively.

What I would like instead is a version of \execmacro that works regardless of how many arguments the embedded macros demand.

\documentclass{article}
\newcommand\charparse[1]{\charparsehelp#1\relax\relax\relax}
\def\charparsehelp#1#2\relax{%
  \ifcat\noexpand\relax\noexpand#1%
%   MACRO DETECTED IN INPUT STREAM
    \execmacro#1#2\relax% EXECUTE THE MACRO THEN RETURN TO PARSING
  \else\textbf{#1}% BOLDING IS JUST AN EXAMPLE TO SHOW THAT PARSING MACRO IS WORKING
  \ifx\relax#2\else
    \charparsehelp#2\relax
  \fi\fi
}
\begin{document}
% THIS ONLY WORKS IF THE EMBEDDED MACRO TAKES EXACTLY ZERO ARGUMENTS
\def\execmacro#1#2\relax{#1\ifx\relax#2\else\charparsehelp#2\relax\fi}Case 1\par
\charparse{0123\itshape456\upshape789}\par
I would like the output from the above character-parsing macro to be\par
\charparse{0123}\itshape\charparse{456}\upshape\charparse{789}\medskip

% THIS ONLY WORKS IF THE EMBEDDED MACRO TAKES EXACTLY ONE ARGUMENT
\def\execmacro#1#2#3\relax{#1{#2}\ifx\relax#3\else\charparsehelp#3\relax\fi}Case 2\par
\charparse{0123\textit{456}789}\par
I would like the output from the above character-parsing macro to be\par
\charparse{0123}\textit{456}\charparse{789}\medskip

% THIS ONLY WORKS IF THE EMBEDDED MACRO TAKES EXACTLY TWO ARGUMENTS
\def\execmacro#1#2#3#4\relax{#1{#2}{#3}\ifx\relax#4\else\charparsehelp#4\relax\fi}Case 3\par
\charparse{0123\rule{3ex}{1ex}456789}\par
I would like the output from the above character-parsing macro to be\par
\charparse{0123}\rule{3ex}{1ex}\charparse{456789}
\end{document}
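
The macro detection in \charparsehelp rests on the test \ifcat\noexpand\relax\noexpand#1, which compares the category of the token against that of \relax. A minimal sketch of that test in isolation follows, assuming a helper named \iscs that is purely illustrative and not part of the MWE:

```latex
\documentclass{article}
% \iscs is a hypothetical helper used only for this illustration.
% \noexpand stops the token from expanding, so \ifcat sees the token itself;
% an unexpanded control sequence compares equal in category to \relax.
\newcommand\iscs[1]{%
  \ifcat\noexpand\relax\noexpand#1%
    control sequence%
  \else
    character%
  \fi}
\begin{document}
\iscs{a}, \iscs{\itshape}, \iscs{\rule}
\end{document}
```

The test only reports that a control sequence was found. It says nothing about how many arguments that control sequence expects, which is exactly the gap that \execmacro has to fill.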


2 Answers

Nearly 5 years after asking the question, I realize that my recent tokcycle package can answer it directly. I have even made it handle optional arguments, which it assumes to be present whenever a [ follows a control sequence. (Thus, to place a normal [ after a control sequence, use \macro{}[ in the input stream, placing empty braces after the control sequence.)

It does so by creating the \Charparse macro, which leaves control sequences, spaces, and (importantly) group content untouched, thus operating only upon top-level character tokens. Additional screening could be added to restrict operations to particular catcodes, or to exclude particular character tokens.

Thus, as long as the arguments to a macro are enclosed in braces {}, they will be passed through unmodified.

In the MWE below, the first 3 examples are from the OP's question. The 4th example shows an optional argument being handled properly. The last example shows how to prevent a normal [ from accidentally being interpreted as the start of an optional argument.

\documentclass{article}
\usepackage{tokcycle}
\newcommand\Charparse[1]{%
  \def\macON{F}%
  \def\optON{F}%
  \tokcycle
    {\if T\macON\ifx[##1\def\optON{T}\fi\fi
     \if T\optON\addcytoks{##1}\else\addcytoks{\textbf{##1}}\fi
     \ifx]##1\def\optON{F}\fi
     \def\macON{F}%
    }% CHARACTER DIRECTIVE
    {\addcytoks{##1}\gdef\macON{F}}% GROUP DIRECTIVE
    {\addcytoks{##1}\if F\optON\def\macON{T}\fi}% MACRO (CS) DIRECTIVE
    {\addcytoks{##1}\def\macON{F}}% SPACE DIRECTIVE
  {#1}\the\cytoks
}
\begin{document}
Tokcycle can define an environment that leaves
 macros, spaces, and group content untouched.\par
\Charparse{
0123\itshape456\upshape789

0123\textit{456}789

0123\rule{3ex}{1ex}456789

0123\rule[-3pt]{3ex}{1ex}456789

\S[x] \S{}[x]}
\end{document}


For those wondering, tokcycle can be directed to process tokens inside of groups (using \processtoks{##1} instead of \addcytoks{##1} for the Group Directive). However, in this application, that would defeat the purpose, because the braced arguments of the embedded macros would then be altered as well.
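
As a rough illustration of that alternative, here is a minimal sketch that calls \tokcycle directly at document level (hence #1 rather than ##1) with \processtoks in the group directive. The input string is an assumption chosen so that recursing into the group is harmless:

```latex
\documentclass{article}
\usepackage{tokcycle}
\begin{document}
\tokcycle
  {\addcytoks{\textbf{#1}}}% character directive: embolden each character
  {\processtoks{#1}}% group directive: recurse into the group content
  {\addcytoks{#1}}% macro (control sequence) directive: pass through
  {\addcytoks{#1}}% space directive: pass through
  {01\textit{23}45}%
\the\cytoks
\end{document}
```

With \textit, the recursion merely emboldens the characters inside the argument as well. With something like \rule{3ex}{1ex}, the same recursion would wrap the characters of the lengths in \textbf and break the macro call, which is why \Charparse uses \addcytoks for groups instead.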

Correct answer by Steven B. Segletes on February 26, 2021

Well, since no one wants to touch this one, and all the gurus advise strongly against it, I should probably let it lie. But I won't. Just to show what might be done, I followed the hint mentioned by egreg, to the effect of "registering" allowable macros that can be processed.

At this time, optional arguments are not allowed, which is something of a drawback. And for demonstration purposes, I have set it up to handle macros with up to 2 arguments only.

So here is how it works. I wish to develop a \charparse{} macro that parses its argument 1 character at a time. But the key is to be able to execute macros in the argument and then resume the character-by-character parsing following their execution.

If a macro takes no arguments, it should not be registered. Otherwise, the macro has to be registered with \registerparsemacro{<macroname>}{<arguments>}, where <macroname> is the control sequence itself and <arguments> is the number of arguments it takes. To repeat, optional arguments are not allowed. So, for instance, I invoke

\registerparsemacro{\textit}{1}
\registerparsemacro{\rule}{2}

At that point I can include invocations of \textit and \rule (without optional argument) in the argument to \charparse. The macros \itshape and \upshape should not be registered, since they take no arguments.
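
To make the bookkeeping concrete, the following sketch isolates the registration helper from the MWE in this answer and prints what it stores. The two \meaning lines are added only for inspection and are not part of the answer's code:

```latex
\documentclass{article}
\newcounter{parsemacrocount}
% Same registration helper as in the MWE: the n-th registered macro is stored
% in \parsemacro<n> and its argument count in \parsemacroarguments<n>,
% where <n> is written as a lowercase roman numeral.
\def\registerparsemacro#1#2{%
  \stepcounter{parsemacrocount}%
  \expandafter\def\csname parsemacro\romannumeral\value{parsemacrocount}\endcsname{#1}%
  \expandafter\def\csname parsemacroarguments\romannumeral\value{parsemacrocount}%
    \endcsname{#2}%
}
\begin{document}
\registerparsemacro{\textit}{1}
\registerparsemacro{\rule}{2}
\ttfamily
% Inspection only: show the definitions created by the two registrations
\expandafter\meaning\csname parsemacroi\endcsname\par
\expandafter\meaning\csname parsemacroargumentsii\endcsname
\end{document}
```

\execmacro then loops over these numbered entries, compares each stored control sequence with the one found in the input stream, and dispatches on the stored argument count.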

The following MWE creates the same output as the question shows.

\documentclass{article}
\usepackage{ifthen}
\newcommand\charparse[1]{\charparsehelp#1\relax\relax\relax}
\def\charparsehelp#1#2\relax{%
  \ifcat\noexpand\relax\noexpand#1%
%   MACRO DETECTED IN INPUT STREAM
    \execmacro#1#2\relax% EXECUTE THE MACRO THEN RETURN TO PARSING
  \else\textbf{#1}% BOLDING IS JUST AN EXAMPLE TO SHOW THAT PARSING MACRO IS WORKING
  \ifx\relax#2\else
    \charparsehelp#2\relax
  \fi\fi
}
\newcounter{parsemacro}
\def\execmacro#1#2#3#4\relax{%
  \setcounter{parsemacro}{0}%
  \whiledo{\value{parsemacro} < \value{parsemacrocount}}{%
  \stepcounter{parsemacro}%
  \expandafter\expandafter\expandafter%
    \ifx\csname parsemacro\romannumeral\value{parsemacro}\endcsname#1%
      \if1\csname parsemacroarguments\romannumeral\value{parsemacro}\endcsname
        \execmacroONE#1{#2}#3#4\relax\else
      \if2\csname parsemacroarguments\romannumeral\value{parsemacro}\endcsname
        \execmacroTWO#1{#2}{#3}#4\relax\else
      \fi\fi
      \setcounter{parsemacro}{\numexpr\value{parsemacrocount}+1}%
    \fi
  }%
  \ifnum\value{parsemacro}=\value{parsemacrocount}\execmacroZERO#1#2#3#4\relax\fi
}
\def\execmacroZERO#1#2\relax{#1\ifx\relax#2\else
  \charparsehelp#2\relax\fi}
\def\execmacroONE#1#2#3\relax{#1{#2}\ifx\relax#3\else
  \charparsehelp#3\relax\fi}
\def\execmacroTWO#1#2#3#4\relax{#1{#2}{#3}\ifx\relax#4\else
  \charparsehelp#4\relax\fi}
\newcounter{parsemacrocount}
\setcounter{parsemacrocount}{0}
\def\registerparsemacro#1#2{%
  \stepcounter{parsemacrocount}%
  \expandafter\def\csname parsemacro\romannumeral\value{parsemacrocount}\endcsname{#1}%
  \expandafter\def\csname parsemacroarguments\romannumeral\value{parsemacrocount}%
    \endcsname{#2}%
}
\begin{document}
\registerparsemacro{\textit}{1}
\registerparsemacro{\rule}{2}
Case 1\par\charparse{0123\itshape456\upshape789}\par
I would like the output from the above character-parsing macro to be\par
\charparse{0123}\itshape\charparse{456}\upshape\charparse{789}\medskip

Case 2\par\charparse{0123\textit{456}789}\par
I would like the output from the above character-parsing macro to be\par
\charparse{0123}\textit{456}\charparse{789}\medskip

Case 3\par\charparse{0123\rule{3ex}{1ex}456789}\par
I would like the output from the above character-parsing macro to be\par
\charparse{0123}\rule{3ex}{1ex}\charparse{456789}
\end{document}

Answered by Steven B. Segletes on February 26, 2021
