Date: Mon 23 Dec 91 11:46:33-EST
From: Michael Downes
Subject: Answers to 'Around the bend' #2 Exercise 6
To: info-tex@shsu.edu
X-ListName: TeX-Related Network Discussion List

"*** Exercise 6 (hard):
"
"Define a macro \args that can be used to fill in the proper number
"in the following sentence no matter how \foo is defined (except
"you may assume it is not \outer).
"
"  The macro {\tt\string\foo} has {\args\foo} arguments.
"
"Is it possible to solve this if \foo is \outer also? Is it possible
"to make \args fully expandable, so that it could be used in a
"message:
"
"  \message{The macro \noexpand\foo has \args\foo\space arguments.}

This was a tough one. All who sent in answers to this exercise
(counting myself) used the approach of applying \meaning to \foo and
analyzing the resulting string. There are some drawbacks to this.

(1) In a \meaning string, all characters (other than spaces) have
catcode 12. This means that all occurrences in a \meaning string of the
character # are indistinguishable, regardless of their true
significance in the parameter text or replacement text of the macro in
question. Consequently, an occurrence of a # character, not category 6,
followed by a number, in the parameter text of \foo can potentially
make \args report an incorrect number of arguments. For example, in the
following definitions \foo has no arguments, only delimiter text, in
all three cases, but the \meaning string would appear to show that \foo
has one argument:

  \def\foo\#1{}
  \expandafter\def\expandafter\foo\string #1{}
  \catcode`\#=12 \def\foo#1{}

(2) The following two examples produce identical \meaning strings:

  \def\foo&1{} % no arguments
  \catcode`\&=6 \def\foo&1{} % one argument

(The string is "macro:&1->".) I.e., characters other than # can be used
to create parameter markers in a macro definition, and such a parameter
marker cannot be distinguished in a \meaning string from a normal use
of the character in question.

(3) There is no completely general way to isolate the parameter text of
an arbitrary macro from the replacement text. The best you can do is
remove the tail of the \meaning string---everything after the last
occurrence of -> in the string---and say 'This is not part of the
parameter text'. Likewise, anything preceding the first occurrence of
-> is certainly part of the parameter text. If there are two or more
occurrences of -> in the string, however, you cannot say for sure
whether anything between the first and last occurrences is parameter
text or replacement text. This raises a slight additional possibility
that pseudo 'parameter markers' in the replacement text could cause
\args to give an incorrect result. For example:

  \edef\foo #1{\string#2->}

defining \foo with one argument, produces a \meaning string of

  macro:#1->#2->

which is exactly the same as the \meaning string for

  \def\foo#1->#2{}

where \foo has two arguments.

Speaking practically, however, rather than theoretically, using
\meaning to analyze the number of arguments of an arbitrary macro works
fine. Donald Arseneau's solution, below, is admirably brief and
demonstrates an easy way of handling an outer argument that I had never
seen before.

>>Solution 1 (Donald Arseneau)
Here is my solution for counting arguments. It is totally expandable,
and relies on the fact that the parameter numbers must be in increasing
order, that they are only single digits, and that there is no parameter
zero. Also important is that \meaning of a macro defined by
\def\x#{...} reports a syntax of { rather than #.

{\catcode`\*=6 \catcode`\#=12 % use * for macro parameters while # is "other"
%
\gdef\args{\expandafter\Args\noexpand}% get rid of \outerness
%
\long\gdef\Args*1{\expandafter\countargs \meaning*1:->{}\end}%
% ... \meaning will display the parameter syntax (as "other" characters).
%
\gdef\countargs*1:*2->*3\end{\twoargs#0*2#0}% get just the parameter syntax
% ... in format #0junk#1junk...#njunk#0.  \twoargs processes the list to
% ... give "n", the last number before #0.
%
% Here's what tests the parameter numbers, two at a time. (Thus, the two
% #0's in \countargs, so there are always at least two #n's detected.)
% When the second number of a comparison isn't zero, \twoargs re-executes
% itself to test the next pair; when the second n is 0, the first n is the
% highest parameter number, so it is output.
\gdef\twoargs*1#*2*3#*4{\ifnum0=*4 *2\else % note the space to end the number
  \expandafter\twoargs\expandafter#\expandafter*4\fi}
}

Here is my test suite. The character ``:'' works in a funny way: it
confuses how \countargs reads its parameter list, and another colon
gets into the supposed syntax. But it works because there are no
parameters. The primitive \halign is reported to have no parameters
because it is not a macro. This could be confusing to someone. The same
confusion could arise with \args itself because it doesn't read the
parameter right away.

\def\test#1#{nothing}
\def\Test[#1]#2:{\##1,#2##}
\def\#{haha}
\show\test \show\Test

%>> I condensed this test suite---MJD
\long\def\msg#1{\message{The object \string#1 has \args#1 arguments.}}
\msg\mathpalette \msg\mathhexbox \msg\par \msg\halign \msg\args \msg\relax
\msg # \msg\# \msg\test \msg\Test \msg : \msg\: \msg\csname \msg t
\msg ~ \msg $ \msg ^

%>> Outer macros---MJD
\message{The object \string\bye\space has \args\bye\space arguments.}
\message{The object \string\newhelp\space has \args\newhelp\space arguments.}

\bye
% -- Donald Arseneau                    asnd@triumfcl
%                                       asnd@reg.triumf.ca
>>EndSolution

Although the problem statement only mentioned `macros', Arseneau earned
some thoroughness points by including primitives \halign, \relax, and
\csname, as well as characters # : t $ ^ in his tests. This is of some
interest because of the difference in \meaning strings between macros
and non-macros.

In my solution for this exercise, I amused myself by trying to pack
everything into as few control sequences as possible.
Although I got it down to two, that's really only one less than
Arseneau's four, because one control sequence in his solution is
expended to handle outer macros, something my solution didn't attempt
to do.

>>Solution 2 (mine)
% Use & instead of # temporarily.
\catcode`\&=6 \catcode`\#=12

\long\def\args &1{\expandafter\countargs\meaning &1#\args->\countargs 0}

% Analysis is restricted to the parameter text by chopping off everything
% after -> in the meaning string (this will leave possibly only part
% of the parameter text).
% Then we look in the parameter text for # followed by a number
% (checking to make sure that the thing after # is a number handles a
% few extra possibilities, such as \# followed by non-number in the
% parameter text). If we find # plus a number, we pass the number
% onward to the next invocation of \countargs, where it will end up as
% the returned value (argument #5) if the next \countargs determines
% that the remaining parameter text contains no more parameter markers.
\def\countargs &1#&2&3->&4\countargs &5{%
  \ifx\args&2&5%
  \else
    \ifodd0&21 % Then &2 is a number, carry forward.
      \countargs&3#\args->\countargs&2%
    \else % &2 not a number---ignore, carry forward last number instead
      \countargs&3#\args->\countargs&5%
    \fi
  \fi}

\catcode`\#=6

\def\test{\message{The macro \noexpand\foo has \args\foo\space
  arguments (\meaning\foo).}}

%\tracingmacros=2 \tracingcommands=2

% Success:
\def\foo{No args}\test
\def\foo#1{One arg}\test
\def\foo#1#2{Two args}\test
\def\foo./{No args, delimited}\test
\def\foo#1#2#3#4#5#6#7#8#9{Nine args}\test
\def\foo//#1#2#3#4#5#6#7#8#9//{Nine args, delimited}\test
\def\foo#{Weird}\test
\def\foo#1#{Weird, one arg}\test
\def\foo#1#2#3#4#5#6#7#8#9#{Weird, nine args}\test
\def\foo#1 {One arg, space delimited}\test
\def\foo#1 #2 #3 #4 #5 #6 #7 #8 #9 {Nine args, space delimited}\test
\def\foo/{\def\foo} \foo/ #1{Interesting}\test
\edef\foo#1#2{\string #3\string #4}\test
\edef\foo{\string #}\test
\expandafter\edef\expandafter\foo
  \csname 0\string #\string #\endcsname#1#2{#1#2}\test

% Failure:
\def\foo->#1->#2->#3->#4->#5->#6->#7->#8->#9->{Nine args, devious delimiter}\test
\expandafter\edef\expandafter\foo
  \csname 0\string #1\string #2\endcsname{...}\test
\let\foo=\bye \test % \outer bomb
>>EndSolution

When I originally posed this problem, I had seen far enough ahead to
suspect that the drawbacks of \meaning mentioned above would be
impossible to overcome. But \meaning is the only way to analyze a macro
that has a nonsimple parameter text---that is, one containing delimited
arguments. Another possibility I had in mind was restricting the
analysis to macros with simple parameter texts---empty or having only
nondelimited arguments---to see what might be done without \meaning.
The best that I could manage in my experiments along these lines was a
definition of \args with an unacceptably cumbersome call syntax. But it
does have the virtue of correctly identifying any number of
nondelimited arguments, no matter whether \foo was originally defined
using # (category 6) or some other category 6 character.
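
The \meaning drawbacks that motivate this restriction can be checked
directly. The following is only a minimal plain TeX sketch, not part of
any submitted solution; \meaningA and \meaningB are hypothetical scratch
names introduced here for illustration. It stores the \meaning strings
of the paired definitions from drawbacks (2) and (3) and compares them
with \ifx. If the strings really are identical, each \message should
end in "same".

```tex
% Plain TeX sketch. \meaningA and \meaningB are hypothetical scratch
% macros, not names used by any solution in this message.

% Drawback (2): & as delimiter text versus & as parameter character.
\def\foo&1{}                    % & is an alignment tab here: no arguments
\edef\meaningA{\meaning\foo}
\begingroup
  \catcode`\&=6                 % & is now a parameter character
  \def\foo&1{}                  % one argument
  \xdef\meaningB{\meaning\foo}  % global, so it survives \endgroup
\endgroup
\message{[\meaningA] vs. [\meaningB]:
  \ifx\meaningA\meaningB same\else different\fi}

% Drawback (3): a pseudo parameter marker and -> in the replacement text.
\edef\foo #1{\string#2->}       % one argument
\edef\meaningA{\meaning\foo}
\def\foo#1->#2{}                % two arguments
\edef\meaningB{\meaning\foo}
\message{[\meaningA] vs. [\meaningB]:
  \ifx\meaningA\meaningB same\else different\fi}

\bye
```

The \ifx test compares the two stored strings token by token, character
codes and category codes alike, so a "same" verdict means that no
analysis of the \meaning string alone could tell the two definitions
apart.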

>>Solution 3 (mine)
% This solution is not fully expandable, hence cannot be used
% inside a \message.
\def\args{\expandafter\argscontinue}
\def\argscontinue{\begingroup
% Make all digits have category 2 (= end of group) so that
% they will serve to end the token register assignment
% \global\toks1 ...
  \catcode`\0=2 \catcode`\1=2 \catcode`\2=2 \catcode`\3=2 \catcode`\4=2
  \catcode`\5=2 \catcode`\6=2 \catcode`\7=2 \catcode`\8=2
% We use \afterassignment to put an \endgroup after the
% token register assignment, so that numbers will revert to
% their ordinary catcodes. And we use \aftergroup to put
% a \finishup token after the \endgroup. Thus \finishup can
% look ahead to see what numbers are remaining; this information
% reveals how many arguments were used up by the \foo macro call.
  \aftergroup\finishup
  \afterassignment\endgroup
  \global\toks1\bgroup
}
% \finishup takes the first digit following it and returns it
% as the value of \args; any following numbers are discarded
% (note that #2 is delimited by a space).
\def\finishup#1#2 {%\showthe\toks1
  #1}

%\tracingmacros=2 \tracingcommands=2 \tracingonline=1

\def\foo{}
The macro {\tt\string\foo} has \args\foo 00123456789 \ arguments.

\def\foo#1{}
The macro {\tt\string\foo} has \args\foo 00123456789 \ arguments.

\edef\foo#1{\string #2\string #3\string #4->\string #4\string #3#1}
The macro {\tt\string\foo} has \args\foo 00123456789 \ arguments.

\def\foo#1#2#3{a#1b#2c#3}
The macro {\tt\string\foo} has \args\foo 00123456789 \ arguments.

\def\foo#1#2#3#4#5#6#7#8#9{#1#2#3#5#8bb#9}
The macro {\tt\string\foo} has \args\foo 00123456789 \ arguments.
>>EndSolution

The fourth solution for Exercise 6 is by Peter Schmitt; it gets the
robustness prize for carrying out a diligent analysis of \meaning
strings that enables it to correctly handle a greater variety of exotic
cases than the other solutions.
Schmitt's original method of handling outer macros was effective, but
more complicated than Arseneau's method, incorporated here as noted.
Even though my approach was rather different from Schmitt's, some of
the comments in Schmitt's solution inspired me in turn to improve my
solution [2] from its previous much inferior state.

>>Solution 4 (Peter Schmitt)
% \args\cs expands to:  -   if \cs is not a macro
%                      0..9 according to the number of parameters
%                           if \cs is a macro
% \args is fully expandable and accepts outer macros as well.
% It assumes, however, that the tested macro has been defined using the
% standard parameter symbol #,
% and that the current value of \escapechar is the standard backslash \.

% The definition of the macros uses the expansion of
%    \meaning\cs:
% It is of the form:
%    [..] macro: [parameter text] -> [replacement text]
% and consists of `other characters'.
% The macro \args checks:
% (1) if the expansion contains `macro':
%     - if not, then \cs is not a macro and \args yields `-'
% (2) if the expansion contains parameters #1 etc.
%     - if #n is the first that is not present
%       then \cs takes (n-1) arguments
%       and \args yields `n-1'

% The following special characters are chosen to make the definitions as
% readable as possible. Any characters having catcodes different from 12
% will serve the same purpose:

\catcode`\:3 \catcode`\/3 % : and / are used as parameter delimiters
\catcode`\^3              % ^ is used to detect empty arguments
\catcode`\?11             % ? is used to make the control sequences private

% Since the occurrences of # in the expansion of \meaning\cs have to be
% detected, # has to be used as an `other character'.
% To avoid confusion it has been replaced not only where necessary but
% throughout all the definitions:

\catcode`\#12 \catcode`\*6 % * is parameter character

% \?macro is defined to be `macro' consisting of `other characters'
% using the expansion of \meaning\TeX.
% \?DEF inserts these five characters into some definitions where they are
% needed as parameter delimiters:
%    \?DEF\cs {[parameter text]} {[replacement text]}
% where the texts may contain *1 and **1 .. **9
% yields \def\cs [parameter text]{[replacement text]}
% where *1 is replaced by `macro' and **1 yields *1 etc.

\def\?macro *1:*2:{*1}
\edef\?macro{\expandafter\?macro\meaning\TeX:}
\def\?DEF *1*2{\def*1**1:{\long\def*1*2}\expandafter*1\?macro:}

%%%%%%%%%%%%
% \args passes the unexpanded \cs to \args?
%%% (taken from the solution by Donald Arseneau)
% \args? takes one argument, expands its \meaning to TEXT
% and passes it to \macro? after appending macro^:
% \macro? checks the first token after the first occurrence of `macro':
%    if this is ^(3), then `macro' was not present in TEXT (output: -)
%    otherwise TEXT is further investigated.

\def\args{\expandafter\args?\noexpand}
\?DEF \args? {**1{\expandafter\macro?\meaning **1*1^:}}
\?DEF\macro? {**1*1**2:{\ifx^**2-\else\expandafter\purge? **2:\fi}}

% The parameters taken by a control sequence all appear (once and in numerical
% order) in the parameter text --- and no other occurrence of a pair #n is
% allowed in it. Moreover, only the same pairs #n may occur in the replacement
% text. It is, however, not possible to simply look for occurrences of these
% pairs since there are tokens that may - if followed by some number -
% be (wrongly) interpreted as parameters:
%  - the token ## in the replacement text, and
%%    (as pointed out by Michael Downes)
%  - the control symbol \# both in the parameter text and the replacement text.
% Since \\#n has to be distinguished from \#n the control symbol \\ is also
% important.
%
% Therefore \purge? is used to remove all occurrences of these tokens.
% After that the search-macro \head? is invoked, appending
% the sequence #n^(n-1) for every possible parameter #n.

% Since \purge? has to identify the character \(12) it is necessary to change
% the escape character:

\catcode`\!0 !catcode`!\=12 % ! is used as escape character

% \purge? appends ## \#^ and \\^ to the TEXT as a means to stop the search
% for these tokens, and : as delimiter:
%   (i) \backslash? looks for the first occurrence of the character pair \\
%       in TEXT (this must be a token \\) and replaces it by a space.
%       If it is followed by ^(3) then the search is completed,
%       otherwise the process is repeated.
%  (ii) \numbersign? looks for the first occurrence of the character pair \#
%       in the (in the meantime modified) TEXT
%       (since all \\ have been removed this must correspond to a token \#)
%       and replaces it by a space.
%       Again the process is stopped when it is followed by ^(3).
% (iii) \parametersign? truncates TEXT at the first occurrence of the
%       character pair ##. Note that this pair must correspond to a parameter
%       token ## in the replacement text and therefore the rest of TEXT is
%       not needed any more.

!def!purge? *1:{!backslash? *1##\#^\\^:}
% \purge? could be avoided - \macro? could call \backslash? directly
!def!backslash? *1\\*2*3:{!ifx^*2!expandafter!numbersign? !else
   !expandafter!backslash? !fi *1 *2*3:}
!def!numbersign? *1\#*2*3:{!ifx^*2!expandafter!parametersign? !else
   !expandafter!numbersign? !fi *1 *2*3:}

!catcode`!\0 \catcode`\!=12 % return to the normal use of backslash

\def\parametersign? *1##*2:{%
  \head? *1^#1^0#2^1#3^2#4^3#5^4#6^5#7^6#8^7#9^8#0^9:}

% For each n from 0 to 9 \head? extracts the characters contained in
% the (appended) TEXT between the first occurrence of #n and #(n+1)
% and investigates them with \used?.
% If #n is not present in TEXT, then the first of these characters is
% ^(3), taken from the appended string:
% When this happens for the first time \used? outputs the second character
% (the number of parameters) and calls \skip? to hide all the remaining
% parts of the appended TEXT, otherwise \used? checks the next item.
% Since eleven parameters are necessary to handle the ten cases (0..9) this
% duty has to be distributed on two macros:
% The appearance of the character /(3) is used to indicate that the second
% macro \tail? has to be invoked by \used?.

\def\head? *1#1*2#2*3#3*4#4*5#5*6:{%
  \used? *2..:*3..:*4..:*5..:/.:%
  \expandafter\tail? *6://}
\def\tail? *1#6*2#7*3#8*4#9*5#0*6:{\used? *2..:*3..:*4..:*5..:*6:}
\def\used? *1*2*3:{\ifx^*1*2\expandafter\skip?\else\ifx/*1\else
   \expandafter\expandafter\expandafter\used?\fi\fi}
\def\skip? *1//{}

%% Finally, catcodes are turned back to normal:

\catcode`\#6 \catcode`\*12 \catcode`\?12
\catcode`\:12 \catcode`\/12 \catcode`\^12

%%%%%%%%%%%%%%%%%%%%%%

\long\def\test#1{
The macro {\tt\string#1} has {\args#1} arguments.
\message{The macro \noexpand#1 has :\args#1:\space arguments.}
}
\def\exc#1\\#2\ #3{\#4\\#1\\\#4\\\\#2two arguments}
\test\exc
\end
>>EndSolution

Schmitt's solution assumes the use of Arseneau's test suite and mine as
well, because they had been shared between us before Schmitt sent in
the final version of his solution.

Answers for Exercise 7 will follow next week.

Michael Downes                              mjd@math.ams.com (Internet)