Single quotes are not working properly in XeLaTeX for Bengali language

TeX - LaTeX Asked by MKS on July 20, 2021

Please see MWE below:

\documentclass[12pt,a4paper]{article}
% For a bilingual document
\RequirePackage{fontspec}
\RequirePackage{polyglossia}
\setmainlanguage{english}
\defaultfontfeatures{Ligatures=TeX}
% Times New Roman used for English
\setmainfont[Mapping=tex-text, Ligatures=TeX]{Times New Roman}
\setmainlanguage[numerals=Devanagari]{bengali}
\setotherlanguage{english}
% Bengali
\newfontfamily\bengalifont[Script=Bengali,AutoFakeBold=4.0,AutoFakeSlant=0.4]{SolaimanLipi}
\newfontfamily\bengalifontbf[Script=Bengali,AutoFakeBold=4.0,AutoFakeSlant=0.4]{SolaimanLipi}
\newfontfamily\bengalifontsf[Script=Bengali,AutoFakeBold=4.0,AutoFakeSlant=0.4]{SolaimanLipi}
\title{LaTeX  ইংলিশ ডকুমেন্টে বাংলা বোল্ড এবং ইটালিক ফন্ট লেখাটি কীভাবে লিখবেন?}
\author{MKS}

\begin{document}
\maketitle
`সাধারন  স্টাইল',   \textbf{বোল্ড  ফন্ট স্টাইল }, \textit{ইটালিক ফন্ট স্টাইল । }
\end{document}

Single quotes do not work in the Bengali text. How can I make them work properly?

2 Answers

There are a few problems here: polyglossia is making ` active, and I suspect that the font you selected (you do not say which of several with similar names you were using) does not contain smart quotes at all. The most-likely match I could find does not.

Entering the Unicode characters ‘ and ’ directly, and selecting a font that contains them, makes that part work. However, many Bengali fonts lack the Devanagari numerals you requested.

\documentclass[12pt,a4paper]{article}
\tracinglostchars=2
% For a bilingual document
\RequirePackage{polyglossia}
\defaultfontfeatures{ Ligatures=TeX, Scale=MatchUppercase }
% Times New Roman used for English
\setmainfont{Times New Roman}
\setmainlanguage{bengali}
\setotherlanguage{english}

% Bengali
\newfontfamily\bengalifont{NotoSerifBengali}[
  Script=Bengali,
  Language=Bengali,
  AutoFakeBold = 0.2,
  AutoFakeSlant = 0.15  ]
\title{\textenglish{LaTeX}  ইংলিশ ডকুমেন্টে বাংলা বোল্ড এবং ইটালিক ফন্ট লেখাটি কীভাবে লিখবেন?}
\author{\textenglish{MKS}}

\begin{document}
\maketitle
‘সাধারন  স্টাইল’,   \textbf{বোল্ড  ফন্ট স্টাইল}, \textit{ইটালিক ফন্ট স্টাইল । }
\end{document}

If you do want to use this particular font in XeLaTeX, you could in theory use ucharclasses to take Bangla from one font, Latin and punctuation from another, and Devanagari from a third.
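A minimal sketch of that ucharclasses idea follows. It is not taken from the answer above: the font-switch names \banglafont and \devafont are made up for illustration, the block names Bengali and Devanagari are assumed to match the package's Unicode block naming, and the interaction with polyglossia has not been checked. Punctuation such as ‘ and ’ is left to the main font, because those characters lie outside both blocks.

```latex
% Sketch only: compile with XeLaTeX (ucharclasses relies on XeTeX character classes).
\documentclass{article}
\usepackage{fontspec}
\usepackage{ucharclasses}

\setmainfont{Times New Roman}                                    % Latin and punctuation
\newfontfamily\banglafont{SolaimanLipi}[Script=Bengali]          % hypothetical switch name
\newfontfamily\devafont{Noto Sans Devanagari}[Script=Devanagari] % hypothetical switch name

% On entering a block, open a group and switch font; on leaving it, close the group.
\setTransitionsFor{Bengali}{\begingroup\banglafont}{\endgroup}
\setTransitionsFor{Devanagari}{\begingroup\devafont}{\endgroup}

\begin{document}
‘সাধারন স্টাইল’ १२३
\end{document}
```

The quotes have to be typed as the Unicode characters ‘ and ’ for this to help; the ASCII ` and ' are still subject to whatever polyglossia and the TeX ligature mapping do with them.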

Correct answer by Davislor on July 20, 2021

This uses expl3 regex and replace functions, under XeLaTeX or LuaLaTeX, as a manual way of getting the ucharclasses functionality mentioned in @Davislor's answer.

quotes and digits

The colours are just for testing, to show that the glyphs are coming in from different fonts.

MWE

\documentclass[12pt,a4paper]{article}
\usepackage{xcolor}
% For a bilingual document
\usepackage{fontspec}
\usepackage{polyglossia}
\setmainlanguage{english}
\defaultfontfeatures{Ligatures=TeX}
% Times New Roman used for English
\setmainfont[Mapping=tex-text, Ligatures=TeX]{Times New Roman}
\setmainlanguage[numerals=Bengali,
changecounternumbering=true]{bengali}
\setotherlanguage{english}
%Punctuation (quotes) source:
\newfontfamily\ftpunct{Times New Roman}[Colour=red]%for testing
%Digits
\newfontfamily\ftdigits{Noto Sans Devanagari}[Colour=blue]%for testing
% Bengali
\newfontfamily\bengalifont[Script=Bengali,AutoFakeBold=4.0,AutoFakeSlant=0.4]{Kalpurush}%SolaimanLipi}
\newfontfamily\bengalifontbf[Script=Bengali,AutoFakeBold=4.0,AutoFakeSlant=0.4]{Kalpurush}%SolaimanLipi}
\newfontfamily\bengalifontsf[Script=Bengali,AutoFakeBold=4.0,AutoFakeSlant=0.4]{Kalpurush}%SolaimanLipi}
\title{LaTeX  ইংলিশ ডকুমেন্টে বাংলা বোল্ড এবং ইটালিক ফন্ট লেখাটি কীভাবে লিখবেন?}
\author{MKS}


%========================
\ExplSyntaxOn

\tl_new:N \l_myxuchar_tl
\tl_new:N \l_myxucharb_tl
% Environment: grab the whole body, run the replacements, then typeset it
\NewDocumentEnvironment{xuchare}{ +b }
{
    \tl_set:Nn \l_myxuchar_tl { #1 }
    \doxuchar
    \tl_use:N \l_myxuchar_tl
}{}
% Command version: digits only
\NewDocumentCommand{\xuchar}{ m }
{
    \tl_set:Nn \l_myxucharb_tl { #1 }
    \doxucharb
    \tl_use:N \l_myxucharb_tl
}

\newcommand\doxuchar{
    \regex_replace_all:nnN %opening quote
        {
            ([`]+)
            ([ঀ-৿]{1})   % Bengali glyphs: 0980 to 09FF
        }
        {
            \cB\{
            \c{formatquotes}
            \1
            \cE\}
            \2
        }
        \l_myxuchar_tl
    \regex_replace_all:nnN %closing quote
        {
            ([ঀ-৿]{1})   % Bengali glyphs: 0980 to 09FF
            ([']+)
        }
        {
            \1
            \cB\{
            \c{formatquotes}
            \2
            \cE\}
        }
        \l_myxuchar_tl
    %Bengali digits to Devanagari
    \tl_replace_all:Nnn \l_myxuchar_tl { ০ } { {\formatdigits ०} }
    \tl_replace_all:Nnn \l_myxuchar_tl { ১ } { {\formatdigits १} }
    \tl_replace_all:Nnn \l_myxuchar_tl { ২ } { {\formatdigits २} }
    \tl_replace_all:Nnn \l_myxuchar_tl { ৩ } { {\formatdigits ३} }
    \tl_replace_all:Nnn \l_myxuchar_tl { ৪ } { {\formatdigits ४} }
    \tl_replace_all:Nnn \l_myxuchar_tl { ৫ } { {\formatdigits ५} }
    \tl_replace_all:Nnn \l_myxuchar_tl { ৬ } { {\formatdigits ६} }
    \tl_replace_all:Nnn \l_myxuchar_tl { ৭ } { {\formatdigits ७} }
    \tl_replace_all:Nnn \l_myxuchar_tl { ৮ } { {\formatdigits ८} }
    \tl_replace_all:Nnn \l_myxuchar_tl { ৯ } { {\formatdigits ९} }
}

\newcommand\doxucharb{
    %Bengali digits to Devanagari
    \tl_replace_all:Nnn \l_myxucharb_tl { ০ } { {\formatdigits ०} }
    \tl_replace_all:Nnn \l_myxucharb_tl { ১ } { {\formatdigits १} }
    \tl_replace_all:Nnn \l_myxucharb_tl { ২ } { {\formatdigits २} }
    \tl_replace_all:Nnn \l_myxucharb_tl { ৩ } { {\formatdigits ३} }
    \tl_replace_all:Nnn \l_myxucharb_tl { ৪ } { {\formatdigits ४} }
    \tl_replace_all:Nnn \l_myxucharb_tl { ৫ } { {\formatdigits ५} }
    \tl_replace_all:Nnn \l_myxucharb_tl { ৬ } { {\formatdigits ६} }
    \tl_replace_all:Nnn \l_myxucharb_tl { ৭ } { {\formatdigits ७} }
    \tl_replace_all:Nnn \l_myxucharb_tl { ৮ } { {\formatdigits ८} }
    \tl_replace_all:Nnn \l_myxucharb_tl { ৯ } { {\formatdigits ९} }
}

\date{\exp_args:Ne \xuchar { \today } }


\ExplSyntaxOff

\newcommand\formatquotes{\ftpunct}%switch
\newcommand\formatdigits{\ftdigits}%switch

\AtBeginEnvironment{document}{\begin{xuchare}}
\AfterEndEnvironment{document}{\end{xuchare}}


\begin{document}
\maketitle
`সাধারন  স্টাইল',  ``সাধারন  স্টাইল'', \textbf{বোল্ড  ফন্ট স্টাইল }, \textit{ইটালিক ফন্ট স্টাইল । }

০১২৩৪৫৬৭৮৯

\end{document}

More detail:

The SolaimanLipi font does not have left and right quotes.
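One way to check such coverage for yourself is the \iffontchar primitive, which tests whether the current font has a glyph in a given slot. The following is a small sketch rather than part of the original answer: the helper name \checkglyph and the switch name \candidatefont are made up for illustration, and the family name should be replaced by whichever installed font you want to test.

```latex
% Compile with XeLaTeX or LuaLaTeX; the results are written to the terminal and log.
\documentclass{article}
\usepackage{fontspec}
\newfontfamily\candidatefont{SolaimanLipi} % hypothetical switch name; font under test

% \checkglyph{<hex code point>}: report whether the current font has that slot.
\newcommand\checkglyph[1]{%
  \iffontchar\font"#1
    \typeout{U+#1: present}%
  \else
    \typeout{U+#1: missing}%
  \fi}

\begin{document}
{\candidatefont
 \checkglyph{2018}% left single quotation mark
 \checkglyph{2019}% right single quotation mark
 \checkglyph{0967}% Devanagari digit one
 \checkglyph{09E7}% Bengali digit one
}
\end{document}
```

Setting \tracinglostchars=2, as in the other answer, gives similar information for characters that are actually typeset.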

Changing to a more up-to-date font, or one with more coverage, is the best solution.

The FreeSerif font, for example, has Bengali letters, left and right quotes, and Devanagari digits.
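A minimal single-font sketch along those lines, assuming FreeSerif (from GNU FreeFont) is installed and that the quotes and digits are typed directly as Unicode characters:

```latex
% Compile with XeLaTeX; assumes the FreeSerif font is installed.
\documentclass{article}
\usepackage{fontspec}
\setmainfont{FreeSerif}[Script=Bengali]
\begin{document}
‘সাধারন স্টাইল’ १२३
\end{document}
```

With one font supplying everything, no replacement machinery is needed; whether FreeSerif's Bengali design is acceptable for the document is a separate question.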

Keeping the original font, and adding in glyphs from other fonts, can be done in several ways.

The ucharclasses package for xelatex was designed for this.

A package-less xelatex/lualatex solution is also possible, in several ways.

Example manual methods

(A) Use expl3 regex commands to replace the grave accent ` and the single quote ' with left and right quotes in the appropriate font.

The implementation in the MWE wraps the whole document in an environment for the regex, so it is somewhat fragile: new combinations of characters introduced into the document may unintentionally match the regular expression.

The regex which searches for a grave accent directly followed by any Bengali character looks like this:

\regex_replace_all:nnN %opening quote
    {
      ([`]+) ([ঀ-৿]{1}) % Bengali glyphs: 0980 to 09FF
    }
    {
      \cB\{ \c{formatquotes} \1 \cE\} \2
    }
    \l_myxuchar_tl

(B) Use expl3 replace_all functionality to replace custom unique markup code (qmnl and qmnr in the example) with the formatted quotes.

The user must type the markup code in the correct position.

The equivalent replace function, producing the left quote (U+2018), looks like this:

 % Replace shortcut
    \tl_replace_all:Nnn \l_myxuchar_tl { qmnl } { {\formatquotes ^^^^2018 } }

(C) Create custom commands (\qmol and \qmor in the example) which expand to the formatted quotes.

\newcommand\qmol{{\formatquotes ^^^^2018}}

(D) Create a convenience command (\qenquote{}) which applies the two commands from (C) to its argument, namely the text to be quoted.

\newcommand\qenquote[1]{{\formatquotes ^^^^2018}#1{\formatquotes ^^^^2019}}

Much more practical, if a suitable font is available, is direct input (E: ‘অ’१). If the quotes cannot be typed directly, the \text... commands (such as \textquoteleft and \textquoteright) can be used (F), either in the base font or formatted with the imported font.

With mixing of fonts, much time can be spent searching for fonts that go together well enough to be usable outside of a designer's context.

Mixing fonts

MWE

\documentclass[12pt,a4paper]{article}
\usepackage[table]{xcolor}
\usepackage{fontspec}
\usepackage{polyglossia}
\defaultfontfeatures{Ligatures=TeX}
\setmainfont[Renderer=HarfBuzz]{Kalpurush}%Times New Roman}
\setmainlanguage[numerals=Bengali,
changecounternumbering=true]{bengali}
\setotherlanguage{english}
%Punctuation (quotes) source:
\newfontfamily\ftpunct{Times New Roman}[Colour=red]%for testing
%Digits: Devanagari
\newfontfamily\ftdigits{Noto Sans Devanagari}[Colour=blue]%for testing
% Bengali
\newfontfamily\bengalifont[Renderer=HarfBuzz,Script=Bengali,AutoFakeBold=4.0,AutoFakeSlant=0.4]{Kalpurush}
\newfontface\bfont{\detokenize{Charu_Chandan_3D_Unicode-Regular}}[Extension=.ttf,
Path=C:/Windows/Fonts/,
Renderer=HarfBuzz,
Script=Bengali,
UprightFont=*,]
\newfontfamily\fall{FreeSerif}[Colour=violet]


%========================

\ExplSyntaxOn

\tl_new:N \l_myxuchar_tl
\tl_new:N \l_myxucharb_tl
% Environment: grab the whole body, run the replacements, then typeset it
\NewDocumentEnvironment{xuchare}{ +b }
{
    \tl_set:Nn \l_myxuchar_tl { #1 }
    \doxuchar
    \tl_use:N \l_myxuchar_tl
}{}
% Command version: digits only
\NewDocumentCommand{\xuchar}{ m }
{
    \tl_set:Nn \l_myxucharb_tl { #1 }
    \doxucharb
    \tl_use:N \l_myxucharb_tl
}
%------------------------------- print and run #1
\tl_new:N \l_my_tl

\NewDocumentCommand{\cdr}{ m }{%
    \tl_set:Nn \l_my_tl { #1 }
    { \ttfamily\color{blue}
%  \token_to_str:N #1
  \detokenize{#1}
  }

  \enspace $\mapsto$ \enspace

  \colorbox{ blue!20 }{ \tl_use:N \l_my_tl }
}%

\newcommand\doxuchar{
    \regex_replace_all:nnN %opening quote
        {
            ([`]+)
            ([ঀ-৿]{1})   % Bengali glyphs: 0980 to 09FF
        }
        {
            \cB\{
            \c{formatquotes}
            \1
            \cE\}
            \2
        }
        \l_myxuchar_tl
    \regex_replace_all:nnN %closing quote
        {
            ([ঀ-৿]{1})   % Bengali glyphs: 0980 to 09FF
            ([']+)
        }
        {
            \1
            \cB\{
            \c{formatquotes}
            \2
            \cE\}
        }
        \l_myxuchar_tl
    % Replace shortcut
    \tl_replace_all:Nnn \l_myxuchar_tl { qmnl } { {\formatquotes ^^^^2018 } }
    \tl_replace_all:Nnn \l_myxuchar_tl { qmnr } { {\formatquotes ^^^^2019 } }
    %Bengali digits to Devanagari
    \tl_replace_all:Nnn \l_myxuchar_tl { ০ } { {\formatdigits ०} }
    \tl_replace_all:Nnn \l_myxuchar_tl { ১ } { {\formatdigits १} }
    \tl_replace_all:Nnn \l_myxuchar_tl { ২ } { {\formatdigits २} }
    \tl_replace_all:Nnn \l_myxuchar_tl { ৩ } { {\formatdigits ३} }
    \tl_replace_all:Nnn \l_myxuchar_tl { ৪ } { {\formatdigits ४} }
    \tl_replace_all:Nnn \l_myxuchar_tl { ৫ } { {\formatdigits ५} }
    \tl_replace_all:Nnn \l_myxuchar_tl { ৬ } { {\formatdigits ६} }
    \tl_replace_all:Nnn \l_myxuchar_tl { ৭ } { {\formatdigits ७} }
    \tl_replace_all:Nnn \l_myxuchar_tl { ৮ } { {\formatdigits ८} }
    \tl_replace_all:Nnn \l_myxuchar_tl { ৯ } { {\formatdigits ९} }
}

\newcommand\doxucharb{
    %Bengali digits to Devanagari
    \tl_replace_all:Nnn \l_myxucharb_tl { ০ } { {\formatdigits ०} }
    \tl_replace_all:Nnn \l_myxucharb_tl { ১ } { {\formatdigits १} }
    \tl_replace_all:Nnn \l_myxucharb_tl { ২ } { {\formatdigits २} }
    \tl_replace_all:Nnn \l_myxucharb_tl { ৩ } { {\formatdigits ३} }
    \tl_replace_all:Nnn \l_myxucharb_tl { ৪ } { {\formatdigits ४} }
    \tl_replace_all:Nnn \l_myxucharb_tl { ৫ } { {\formatdigits ५} }
    \tl_replace_all:Nnn \l_myxucharb_tl { ৬ } { {\formatdigits ६} }
    \tl_replace_all:Nnn \l_myxucharb_tl { ৭ } { {\formatdigits ७} }
    \tl_replace_all:Nnn \l_myxucharb_tl { ৮ } { {\formatdigits ८} }
    \tl_replace_all:Nnn \l_myxucharb_tl { ৯ } { {\formatdigits ९} }
}

\date{\exp_args:Ne \xuchar { \today } }


\ExplSyntaxOff

\newcommand\formatquotes{\ftpunct}%switch
\newcommand\formatdigits{\ftdigits}%switch
\newcommand\qmol{{\formatquotes ^^^^2018}}
\newcommand\qmor{{\formatquotes ^^^^2019}}
\newcommand\qenquote[1]{{\formatquotes ^^^^2018}#1{\formatquotes ^^^^2019}}

\AtBeginEnvironment{document}{\begin{xuchare}}
\AfterEndEnvironment{document}{\end{xuchare}}

\newcommand\testtext{সাধারন  স্টাইল}
\newcommand\eng[1]{\begin{english}#1\end{english}}


\begin{document}
\eng{Digits:}
০১২৩৪৫৬৭৮৯

\bigskip
\begin{tabular}{lllll}
\rowcolor{blue!5}
Method & Command & How & Input & Output \\
\hline
A & expl3 regex & find-replace & \textasciigrave  অ\textquotesingle & `অ' \\
B & expl3 replace & find-replace & qmn{}lঅqmn{}r & qmnlঅqmnr \\
C & macros & expand & \textbackslash qmol অ\textbackslash qmor & \qmol অ\qmor \\
D & command & argument & \textbackslash qenquote\{অ\} & \qenquote{অ} \\
E & in font &FreeSerif & direct input: \colorbox{yellow!40}{\fall{^^^^2018অ^^^^2019१}} & \colorbox{yellow!40}{\fall{^^^^2018অ^^^^2019१}} \\
F & commands & kernel & \textbackslash textquoteleft অ\textbackslash textquoteright & \textquoteleft অ\textquoteright \\
\hline
\end{tabular}

\bigskip
\begin{english}
\cdr{\symbol{39}} \quad quotesingle, {\textquotesingle\small becomes quoteright}

\cdr{\symbol{96}} \quad grave accent, {\textasciigrave\small becomes quoteleft}

\cdr{\symbol{8216}} \quad quoteleft

\cdr{\symbol{8217}} \quad quoteright

\cdr{\symbol{8220}} \quad quotedblleft

\cdr{\symbol{8221}} \quad quotedblright

\cdr{\textquotedblleft}

\cdr{\textquotedblright}

\cdr{\textquotedbl}

\cdr{\textquotesingle}

\cdr{\textasciigrave}

\end{english}


\section{কূল্বনন}
কূল্বনন  {\bfont `\testtext' \eng{versus} `সাধারন  স্টাইল'} \eng{versus} {\bfont `}সাধারন  স্টাইল{\bfont '}

\end{document}

Answered by Cicada on July 20, 2021
