TeX - LaTeX Asked by mpr on November 23, 2020
I was just reviewing my “capitalization standards” for titles and such and was wondering if there’s a macro to do the same thing I’m forced to do by hand nowadays. My personal rules (feel free to disagree/comment on them) are as follows:
In other words: I’d like a capitalization command (like \MakeUppercase) that will capitalize every word not included in a list of words and that will always capitalize the first word of its argument.
Doable?
PS: one such “list” of closed class words (also known as “function words”) can be found here.
\documentclass[a4paper]{article}
\usepackage[latin1]{inputenc}
\usepackage{xparse}
\ExplSyntaxOn
\NewDocumentCommand{\capitalize}{>{\SplitList{~}}m}{
  \CapitalizeFirst#1\Capitalize\unskip
}
\ExplSyntaxOff
\def\Sentinel{\Capitalize}
\def\CapitalizeFirst#1{\MakeUppercase#1 \Capitalize}
\def\Capitalize#1{%
  \def\next{#1}%
  \ifx\next\Sentinel
    \expandafter\unskip
  \else
    \CheckInList{#1}\space\expandafter\Capitalize
  \fi}
\def\CheckInList#1{%
  \ifcsname List@\detokenize{#1}\endcsname
    #1%
  \else
    \MakeUppercase#1%
  \fi}
\makeatletter
\def\AppendToList#1{%
  \@for\next:=#1\do
    {\expandafter\let\csname List@\detokenize\expandafter{\next}\endcsname\empty}}
\makeatother
\AppendToList{a,is,of}
\begin{document}
\capitalize{here is a list of words école}
\end{document}
This won't work with UTF-8 input under pdflatex (it does under XeLaTeX or LuaLaTeX), because \MakeUppercase will be applied only to the first byte of a possible two-, three- or four-byte combination (for Western languages probably only two). For that to work, the whole block of bytes has to be fed to \MakeUppercase.
To be clearer: when we say \MakeUppercase, LaTeX uppercases the argument; in general the call is \MakeUppercase{word}. Here we're saying \MakeUppercase#1 instead (without braces), so only the first token (usually a character) will be uppercased. This is where it fails with input such as \'ecole: the token passed to \MakeUppercase would be \', which it doesn't know what to do with. Using école (and a one-byte encoding such as latin1), \MakeUppercase will process é and give the correct result.
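The braced versus unbraced behaviour described above can be seen in isolation. The following is only a minimal sketch with plain ASCII input, so it does not depend on the input encoding; it is not part of the code above.

```latex
\documentclass{article}
\begin{document}
% Braced: the whole word is the argument of \MakeUppercase.
\MakeUppercase{word}  % typesets WORD

% Unbraced: TeX grabs only the next token (the letter w) as the argument,
% which is exactly what \MakeUppercase#1 relies on in \CapitalizeFirst.
\MakeUppercase word   % typesets Word
\end{document}
```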
With UTF-8 this would fail: what we see as é on our screen when writing a LaTeX document is actually two bytes (C3 and A9, for é), and again \MakeUppercase would be passed only the first one. So a more complex routine is necessary.
In order to have this work with pdflatex and UTF-8, the definitions of \CheckInList and \CapitalizeFirst above can be changed into the following:
\def\CapitalizeFirst#1{\expandafter\UC@next#1 \Capitalize}
\def\CheckInList#1{%
  \ifcsname List@\detokenize{#1}\endcsname
    #1%
  \else
    \expandafter\UC@next#1%
  \fi}
\def\UC@next#1{%
  \ifx#1\UTFviii@two@octets
    \expandafter\@firstoffour
  \else
    \ifx#1\UTFviii@three@octets
      \expandafter\expandafter\expandafter\@secondoffour
    \else
      \ifx#1\UTFviii@four@octets
        \expandafter\expandafter\expandafter\expandafter\expandafter
        \@thirdoffour
      \else
        \expandafter\expandafter\expandafter\expandafter\expandafter
        \expandafter\expandafter\@fourthoffour
      \fi
    \fi
  \fi
  {\UC@two}{\UC@three}{\UC@four}{\MakeUppercase}#1}
\def\UC@two#1#2#3{\MakeUppercase{#1#2#3}}
\def\UC@three#1#2#3#4{\MakeUppercase{#1#2#3#4}}
\def\UC@four#1#2#3#4#5{\MakeUppercase{#1#2#3#4#5}}
\providecommand\@firstoffour[4]{#1}
\providecommand\@secondoffour[4]{#2}
\providecommand\@thirdoffour[4]{#3}
\providecommand\@fourthoffour[4]{#4}
However, accent commands are not allowed (nor are they in the other version).
After a few years, here's a better implementation, thanks to new expl3 features; it works for all engines.
\documentclass[a4paper]{article}
\usepackage{ifxetex}
\ifxetex
  \usepackage{fontspec}
\else
  \usepackage[T1]{fontenc}
  \usepackage[utf8]{inputenc}
\fi
\usepackage{xparse}
\ExplSyntaxOn
\NewDocumentCommand{\capitalize}{>{\SplitList{~}}m}
 {
  \seq_clear:N \l_capitalize_words_seq
  \ProcessList{#1}{\CapitalizeFirst}
  \seq_use:Nn \l_capitalize_words_seq { ~ }
 }
\NewDocumentCommand{\CapitalizeFirst}{m}
 {
  \capitalize_word:n { #1 }
 }
\sys_if_engine_pdftex:TF
 {
  \cs_set_eq:Nc \capitalize_tl_set:Nn { protected@edef }
 }
 {
  \cs_set_eq:NN \capitalize_tl_set:Nn \tl_set:Nn
 }
\cs_new_protected:Nn \capitalize_word:n
 {
  \capitalize_tl_set:Nn \l_capitalize_word_tl { #1 }
  \seq_if_in:NfTF \g_capitalize_exceptions_seq { \tl_to_str:n { #1 } }
   % exception word
   { \seq_put_right:Nn \l_capitalize_words_seq { #1 } } % exception word
   % to be uppercased
   { \seq_put_right:Nx \l_capitalize_words_seq { \tl_mixed_case:V \l_capitalize_word_tl } }
 }
\cs_generate_variant:Nn \tl_mixed_case:n { V }
\NewDocumentCommand{\AppendToList}{m}
 {
  \clist_map_inline:nn { #1 }
   {
    \seq_gput_right:Nx \g_capitalize_exceptions_seq { \tl_to_str:n { ##1 } }
   }
 }
\cs_generate_variant:Nn \seq_if_in:NnTF { Nf }
\seq_new:N \l_capitalize_words_seq
\seq_new:N \g_capitalize_exceptions_seq
\ExplSyntaxOff
\AppendToList{a,is,of,óf}
\begin{document}
X\capitalize{here is a list of words óf école}X
\end{document}
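The question also asks that the first word always be capitalized, even when it is on the exception list. The following is only an illustrative sketch of one way to add that, not part of the answer's code: every name with the demo prefix is hypothetical, and it assumes a recent LaTeX kernel that provides \text_titlecase_first:n and \NewDocumentCommand without extra packages.

```latex
\documentclass{article}
\ExplSyntaxOn
% All "demo" names are hypothetical; they only serve this sketch.
\seq_new:N \g_demo_exceptions_seq
\seq_new:N \l_demo_words_seq
\seq_new:N \l_demo_result_seq
\bool_new:N \l_demo_first_bool

% Words that stay lowercase unless they come first.
\seq_gset_from_clist:Nn \g_demo_exceptions_seq { a , is , of }

% Store the word with its first letter titlecased
% (assumes \text_titlecase_first:n is available).
\cs_new_protected:Npn \demo_add_capitalized:n #1
 { \seq_put_right:Nn \l_demo_result_seq { \text_titlecase_first:n { #1 } } }

% The first word is always capitalized; later words only if not exceptions.
\cs_new_protected:Npn \demo_add_word:n #1
 {
  \bool_if:NTF \l_demo_first_bool
   { \demo_add_capitalized:n { #1 } }
   {
    \seq_if_in:NnTF \g_demo_exceptions_seq { #1 }
     { \seq_put_right:Nn \l_demo_result_seq { #1 } }
     { \demo_add_capitalized:n { #1 } }
   }
  \bool_set_false:N \l_demo_first_bool
 }

\NewDocumentCommand{\democapitalize}{m}
 {
  \seq_set_split:Nnn \l_demo_words_seq { ~ } { #1 }
  \seq_clear:N \l_demo_result_seq
  \bool_set_true:N \l_demo_first_bool
  \seq_map_function:NN \l_demo_words_seq \demo_add_word:n
  \seq_use:Nn \l_demo_result_seq { ~ }
 }
\ExplSyntaxOff

\begin{document}
\democapitalize{a list of words is here}
\end{document}
```

Here the leading "a" is capitalized because it comes first, while "of" and "is" further along are left alone. The flag is a deliberately simple choice; the exception test itself is the same kind of sequence lookup as in the code above.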
Correct answer by egreg on November 23, 2020
A ConTeXt solution:
You can use the command \applytosplitstringwordspaced for this:
\def\IgnoredWords
  {a,is,to,of,or,and}
\define[1]\CapitalizeWithIgnoreWord
  {\doifinsetelse{#1}\IgnoredWords{#1}{\Words{#1}}}
\def\CapitalizeWithIgnore
  {\applytosplitstringwordspaced\CapitalizeWithIgnoreWord}
\starttext
\CapitalizeWithIgnore{This is some of my input or another and to the end.}
\stoptext
which gives
The \applytosplitstringwordspaced command divides the input into words and passes each word to the macro \CapitalizeWithIgnoreWord, which takes one argument. Then I simply test whether the given word is a member of the word list: if it is, it is printed unchanged; otherwise it is printed capitalized.
Answered by Marco on November 23, 2020
The titlecaps package is newly introduced and demonstrated here: Headings in uppercase. It will take care of titling diacritical marks (e.g., umlauts) and national symbols (e.g., \oe), and it is compatible with (i.e., can include in its argument) commands that change the font characteristics, such as \textit{}, \scshape, and \footnotesize. Further, it allows words to be designated as lower-cased, for example prepositions and conjunctions, which are then screened out and not titled. The presence of punctuation should not affect the ability of the package to either capitalize a word or detect it as a pre-designated lower-cased word.
Answered by Steven B. Segletes on November 23, 2020
The mfirstuc package provides \capitalisewords. You can specify the exceptions with \MFUnocap. For example:
\documentclass{article}
\usepackage{mfirstuc}
\begin{document}
\capitalisewords{the cat sat on the mat.}
\MFUnocap{on}
\MFUnocap{the}
\capitalisewords{the cat sat on the mat.}
\end{document}
The mfirstuc-english package (which automatically loads mfirstuc) provides some common exceptions:
\documentclass{article}
\usepackage{mfirstuc-english}
\begin{document}
\capitalisewords{the cat sat on the mat.}
\end{document}
It doesn't include disputed words or words that are exempt from case changes only under certain circumstances. You can localise the effect of \MFUnocap by using it inside a group:
\documentclass{article}
\usepackage{mfirstuc-english}
\begin{document}
{% scope
\MFUnocap{on}
\capitalisewords{the cat sat on the mat.}
}
\capitalisewords{the cat sat on the mat.}
\end{document}
The switches \MFUhyphenfalse and \MFUhyphentrue determine whether or not to change the case of parts of hyphenated words. The default is \MFUhyphenfalse:
\documentclass{article}
\usepackage{mfirstuc}
\begin{document}
\capitalisewords{server-side includes}
\MFUhyphentrue
\capitalisewords{server-side includes}
\end{document}
Answered by Nicola Talbot on November 23, 2020
Using csplain, you can implement this with a few lines of basic macros (they use only TeX primitives):
\def\capitalize#1{\capitA#1 {} }
\def\capitA#1 {\capitW#1 \capitB}
\def\capitB#1 {\ifx\relax#1\relax \else \space
   \isinlist\extrawords{ #1 }\iftrue #1\else \capitW#1 \fi
   \expandafter \capitB \fi
}
\def\capitW#1#2 {\uppercase{#1}#2}
\def\isinlist#1#2#3{\begingroup \long\def\tmp##1#2##2\end{\def\tmp{##2}%
   \ifx\tmp\empty \endgroup \csname iffalse\expandafter\endcsname \else
   \endgroup \csname iftrue\expandafter\endcsname \fi}%
   \expandafter\tmp#1\endlistsep#2\end
}
\def\extrawords{ a is of óf }
X\capitalize{here is a list of words óf école}X
\bye
Answered by wipet on November 23, 2020