TeX - LaTeX Asked by KramerTheCat on March 30, 2021
I’m having some problems with listings and UTF-8 in my document. Can someone help me? Some characters work, like é and ó, but á and others end up at the beginning of the word…
\documentclass[12pt,a4paper]{scrbook}
\KOMAoptions{twoside=false,open=any,chapterprefix=on,parskip=full,fontsize=14pt}
\usepackage[portuguese]{babel}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{listingsutf8}
\usepackage{inconsolata}
\lstset{
    language=bash, %% Change to PHP, C, Java, etc.; bash is the default
    basicstyle=\ttfamily\small,
    numberstyle=\footnotesize,
    numbers=left,
    backgroundcolor=\color{gray!10},
    frame=single,
    tabsize=2,
    rulecolor=\color{black!30},
    title=\lstname,
    escapeinside={\%*}{*)},
    breaklines=true,
    breakatwhitespace=true,
    framextopmargin=2pt,
    framexbottommargin=2pt,
    extendedchars=false,
    inputencoding=utf8
}
\begin{document}
\begin{lstlisting}
<?php
echo 'Olá mundo!';
print 'Olá mundo!';
\end{lstlisting}
\end{document}
One way to get around this limitation of listings is to use the option extendedchars=true and then to use the literate option for each accented character you're going to use (it's a bit tedious, but once you've covered all the accents of your language, you never have to worry about them again). The syntax is
literate={á}{{\'a}}1 {ã}{{\~a}}1 {é}{{\'e}}1
For each accented character, put the real character inside braces (e.g. {á}), then what you want it to be typeset as inside double braces (e.g. {{\'a}}), and finally the number one (1), which is the width of the output in characters; between two entries, you can put a space for clarity.
Here's your example modified to use this:
\documentclass[12pt,a4paper]{scrbook}
\KOMAoptions{twoside=false,open=any,chapterprefix=on,parskip=full,fontsize=14pt}
\usepackage[portuguese]{babel}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{listings}
\usepackage{xcolor}
\usepackage{inconsolata}
\lstset{
    language=bash, %% Change to PHP, C, Java, etc.; bash is the default
    basicstyle=\ttfamily\small,
    numberstyle=\footnotesize,
    numbers=left,
    backgroundcolor=\color{gray!10},
    frame=single,
    tabsize=2,
    rulecolor=\color{black!30},
    title=\lstname,
    escapeinside={\%*}{*)},
    breaklines=true,
    breakatwhitespace=true,
    framextopmargin=2pt,
    framexbottommargin=2pt,
    inputencoding=utf8,
    extendedchars=true,
    literate={á}{{\'a}}1 {ã}{{\~a}}1 {é}{{\'e}}1,
}
\begin{document}
\begin{lstlisting}
<?php
echo 'Olá mundo!';
print 'áãé';
\end{lstlisting}
\end{document}
Correct answer by Philippe Goutet on March 30, 2021
The way the inputenc package works with non-ASCII UTF-8-encoded characters (by making the first byte active and then reading the following ones as arguments) is fundamentally incompatible with the way the listings package works, which reads each byte individually and expects it to be an individual character. For example, á is stored as the two bytes C3 A1, which listings sees as two separate characters.
The listingsutf8 package tries to work around this for the case where your characters are convertible to some 8-bit encoding (and you are using pdfLaTeX), but this works only with \lstinputlisting (as Marc's answer pointed out), not with inline listings. For inline listings, the literate option (as pointed out by Philippe) sounds good. An alternative would be escaping to LaTeX (as pointed out by Gonzalo), but this makes simple cut-and-paste not work.
The last time I had to typeset code that included non-ASCII Unicode characters (stuff like ℤ as Java identifiers, which are not in any 8-bit encoding, AFAIK), I switched to XeLaTeX, which supports UTF-8 input out of the box, without needing the inputenc package. With this, it worked nicely. I suppose LuaLaTeX would work the same way (but it was not that mature then).
(But I later wanted the comments to be formatted, too, thus I started/revived my ltxdoclet project to include source code and formatted comments.)
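The XeLaTeX/LuaLaTeX route described above might look roughly like the following. This is a minimal sketch, not the setup actually used back then: the font name is an assumption (any installed monospaced font with the needed glyphs should do), and the listing only exercises á, which lies in the 128–255 range that extendedchars covers.

```latex
% Compile with xelatex or lualatex (not pdflatex).
\documentclass{article}
\usepackage{fontspec}            % Unicode font loading; inputenc/fontenc are not needed
\setmonofont{DejaVu Sans Mono}   % assumption: replace with any installed mono font
\usepackage{listings}
\lstset{
  basicstyle=\ttfamily\small,
  extendedchars=true             % accept characters with codes 128-255 in listings
}
\begin{document}
\begin{lstlisting}
<?php
echo 'Olá mundo!';
\end{lstlisting}
\end{document}
```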
Answered by Paŭlo Ebermann on March 30, 2021
With the listingsutf8 package and a traditional (not UTF-8) TeX engine, you have to use the \lstinputlisting command only, which properly displays a UTF-8 encoded file. You can't use the lstlisting environment, unless the code inside is plain ASCII.
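A minimal sketch of that setup, assuming pdfLaTeX and a hypothetical file hello.php saved as UTF-8 whose accented characters all exist in Latin-1. Note the two-part value utf8/latin1 that listingsutf8 expects for inputencoding: the part before the slash is the encoding of the file, the part after it is the 8-bit encoding it is converted to.

```latex
% Compile with pdflatex.
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{listingsutf8}      % loads listings and adds the utf8/<encoding> modes
\lstset{
  basicstyle=\ttfamily\small,
  inputencoding=utf8/latin1,   % file is UTF-8; converted to Latin-1 while reading
  extendedchars=true
}
\begin{document}
% hello.php is a hypothetical UTF-8 file containing, e.g.: echo 'Olá mundo!';
\lstinputlisting{hello.php}
\end{document}
```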
Answered by Marc Baudoin on March 30, 2021
Escape those characters to LaTeX, as the documentation (listings manual, page 14) suggests:
Similarly, if you are using UTF-8 extended characters in a listing, they must be placed within an escape to LaTeX.
\documentclass[12pt,a4paper]{scrbook}
\KOMAoptions{twoside=false,open=any,chapterprefix=on,parskip=full,fontsize=14pt}
\usepackage[portuguese]{babel}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{listingsutf8}
\usepackage{xcolor}
\usepackage{inconsolata}
\lstset{
    language=bash, %% Change to PHP, C, Java, etc.; bash is the default
    basicstyle=\ttfamily\small,
    numberstyle=\footnotesize,
    numbers=left,
    backgroundcolor=\color{gray!10},
    frame=single,
    tabsize=2,
    rulecolor=\color{black!30},
    title=\lstname,
    escapeinside={\%*}{*)},
    breaklines=true,
    breakatwhitespace=true,
    framextopmargin=2pt,
    framexbottommargin=2pt,
    extendedchars=false,
    inputencoding=utf8
}
\begin{document}
\begin{lstlisting}
<?php
echo '%*Olá mundo*)!';
print '%*Olá mundo*)!';
\end{lstlisting}
\end{document}
Answered by Gonzalo Medina on March 30, 2021
This is a modified version that adds support for Swedish and German characters (åäö üß) as well as Portuguese characters.
Put the following line in the preamble:
\usepackage{inconsolata} % Swedish encoding in lstlisting
and then, where you want the code listing, put the code below.
\lstset{
    language=bash, % Switch code language ... bash is the default
    basicstyle=\ttfamily\footnotesize,
    numberstyle=\tiny,
    numbers=left,
    backgroundcolor=\color{gray!10},
    frame=single,
    tabsize=2,
    rulecolor=\color{black!30},
    title=\lstname,
    escapeinside={\%*}{*)},
    breaklines=true,
    breakatwhitespace=true,
    framextopmargin=2pt,
    framexbottommargin=2pt,
    inputencoding=utf8,
    extendedchars=true,
    % Support for Swedish, German and Portuguese accented characters
    literate=%
    {Ö}{{\"O}}1
    {Ä}{{\"A}}1
    {Å}{{\AA{}}}1
    {Ü}{{\"U}}1
    {ß}{{\ss}}1
    {ü}{{\"u}}1
    {ö}{{\"o}}1
    {ä}{{\"a}}1
    {å}{{\aa{}}}1
    {á}{{\'a}}1
    {ã}{{\~a}}1
    {é}{{\'e}}1,
}
\lstinputlisting[language=bash]{your_code_file.txt}
Answered by Tobias Holm on March 30, 2021
Just to help people, here is a fairly complete literate statement for use with the listings package:
\lstset{
    inputencoding = utf8,  % Input encoding
    extendedchars = true,  % Extended ASCII
    literate      =        % Support additional characters
      {á}{{\'a}}1  {é}{{\'e}}1  {í}{{\'i}}1 {ó}{{\'o}}1  {ú}{{\'u}}1
      {Á}{{\'A}}1  {É}{{\'E}}1  {Í}{{\'I}}1 {Ó}{{\'O}}1  {Ú}{{\'U}}1
      {à}{{\`a}}1  {è}{{\`e}}1  {ì}{{\`i}}1 {ò}{{\`o}}1  {ù}{{\`u}}1
      {À}{{\`A}}1  {È}{{\`E}}1  {Ì}{{\`I}}1 {Ò}{{\`O}}1  {Ù}{{\`U}}1
      {ä}{{\"a}}1  {ë}{{\"e}}1  {ï}{{\"i}}1 {ö}{{\"o}}1  {ü}{{\"u}}1
      {Ä}{{\"A}}1  {Ë}{{\"E}}1  {Ï}{{\"I}}1 {Ö}{{\"O}}1  {Ü}{{\"U}}1
      {â}{{\^a}}1  {ê}{{\^e}}1  {î}{{\^i}}1 {ô}{{\^o}}1  {û}{{\^u}}1
      {Â}{{\^A}}1  {Ê}{{\^E}}1  {Î}{{\^I}}1 {Ô}{{\^O}}1  {Û}{{\^U}}1
      {œ}{{\oe}}1  {Œ}{{\OE}}1  {æ}{{\ae}}1 {Æ}{{\AE}}1  {ß}{{\ss}}1
      {ç}{{\c c}}1 {Ç}{{\c C}}1 {ø}{{\o}}1  {å}{{\r a}}1 {Å}{{\r A}}1
      {ã}{{\~a}}1  {õ}{{\~o}}1  {Ã}{{\~A}}1 {Õ}{{\~O}}1
      {ñ}{{\~n}}1  {Ñ}{{\~N}}1  {¿}{{?`}}1  {¡}{{!`}}1
      {°}{{\textdegree}}1 {º}{{\textordmasculine}}1 {ª}{{\textordfeminine}}1
      % ¿ and ¡ are not correctly displayed if inconsolata font is used
      % together with the lstlisting environment. Consider typing code in
      % external files and using \lstinputlisting to display them instead.
}
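One possible way to keep a list like this in a single place is to wrap it in a named style with \lstdefinestyle and select it per listing. The sketch below assumes pdfLaTeX; the style name utf8code is made up for illustration, and the literate list is shortened to the four characters the sample line actually uses.

```latex
% Compile with pdflatex.
\documentclass{article}
\usepackage[utf8]{inputenc}
\usepackage[T1]{fontenc}
\usepackage{listings}
% "utf8code" is a hypothetical style name; extend the literate list as needed.
\lstdefinestyle{utf8code}{
  basicstyle=\ttfamily\small,
  inputencoding=utf8,
  extendedchars=true,
  literate={á}{{\'a}}1 {ã}{{\~a}}1 {é}{{\'e}}1 {ç}{{\c c}}1
}
\begin{document}
% Only characters covered by the literate list above appear in this listing.
\begin{lstlisting}[style=utf8code]
echo 'Olá, maçã é fruta';
\end{lstlisting}
\end{document}
```

The same style can be passed to \lstinputlisting[style=utf8code]{...} for code kept in external files.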
Answered by Rmano on March 30, 2021