Unix & Linux Asked by ssssteffff on November 21, 2021
I am migrating a system from a CentOS 6.9 VM to a Debian 10 Docker container, and I can’t explain why the thousands separator differs. Both use the same locale (fr_FR.UTF-8) at the same locale revision, yet the separator is different:
CentOS 6.9 VM:
[user@host ~]$ cat /etc/redhat-release
CentOS release 6.9 (Final)
[user@host ~]$ locale -v -a fr_FR.UTF-8 | grep -A10 fr_FR.utf8
locale: fr_FR.utf8 archive: /usr/lib/locale/locale-archive
-------------------------------------------------------------------------------
title | French locale for France
source | RAP
contact | Traduc.org
email | [email protected]
language | French
territory | France
revision | 1.0
date | 2008-03-15
codeset | UTF-8
[user@host ~]$ yum list installed | grep libc
glibc.x86_64 2.12-1.209.el6_9.2 @updates
glibc-common.x86_64 2.12-1.209.el6_9.2 @updates
glibc-devel.x86_64 2.12-1.209.el6_9.2 @updates
glibc-headers.x86_64 2.12-1.209.el6_9.2 @updates
[...]
[user@host ~]$ grep "thousands_sep" /usr/share/i18n/locales/fr_FR
mon_thousands_sep "<U0020>"
thousands_sep "<U0020>"
[user@host ~]$ LC_NUMERIC="fr_FR" printf "%'.f\n" 1234 | hexdump -C
00000000 31 20 32 33 34 0a |1 234.|
00000006
Debian 10 container:
root@240c7f7ca3a1:~# cat /etc/issue.net
Debian GNU/Linux 10
root@240c7f7ca3a1:~# locale -v -a fr_FR.UTF-8 | grep -A10 fr_FR.utf8
locale: fr_FR.utf8 archive: /usr/lib/locale/locale-archive
-------------------------------------------------------------------------------
title | French locale for France
source | RAP
contact | Traduc.org
email | [email protected]
language | French
territory | France
revision | 1.0
date | 2008-03-15
codeset | UTF-8
root@240c7f7ca3a1:~# apt list --installed | grep libc
libc-bin/stable,now 2.28-10 amd64 [installé, automatique]
libc-l10n/stable,now 2.28-10 all [installé, automatique]
libc6/stable,now 2.28-10 amd64 [installé]
[...]
root@240c7f7ca3a1:~# grep "thousands_sep" /usr/share/i18n/locales/fr_FR
mon_thousands_sep "<U202F>"
thousands_sep "<U202F>"
root@240c7f7ca3a1:~# LC_NUMERIC="fr_FR" printf "%'.f\n" 1234 | hexdump -C
00000000 31 e2 80 af 32 33 34 0a |1...234.|
00000008
As you can see, in the first case I get a normal space (<U0020> / 20), while in the second case I get a narrow no-break space, NNBSP (<U202F> / e2 80 af).
I understand that NNBSP is the correct character for the French locale (according to several sources, including Wikipedia), but it changes my application’s behavior when PDF reports are generated, because this character does not exist in every font.
I have seen a lot of debate on the GNU, glibc, and JDK mailing lists about which character it should be, but I can’t find where it was changed in the glibc changelog.
I could simply replace every NNBSP with a standard space (or a plain NBSP) directly in my code to fix the application, but that seems a bit messy to me.
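For reference, a minimal sketch of that workaround as a shell filter. The function name nnbsp_to_space is hypothetical, and it assumes the formatted numbers pass through a pipeline as UTF-8 text before reaching the PDF generator:

```shell
# Hypothetical filter: replace every U+202F (UTF-8 bytes e2 80 af)
# on stdin with a plain ASCII space.
nnbsp_to_space() {
    # Build the three-byte sequence from octal escapes so this
    # script itself stays pure ASCII.
    nnbsp=$(printf '\342\200\257')
    sed "s/${nnbsp}/ /g"
}

# Example: feed it the same bytes the Debian container produced.
printf '1\342\200\257234\n' | nnbsp_to_space | od -An -tx1
```

Replacing with NBSP instead would only mean swapping the replacement text for the bytes c2 a0.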
I guess I could modify the locale file and recompile it?
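A rough sketch of what I have in mind, not something I have validated. It assumes localedef and the locale sources are present in the container; the names fr_FR_space and patch_thousands_sep and the temporary work directory are hypothetical:

```shell
# Sketch only: fr_FR_space and the work directory are hypothetical names.
# Rewrites only the thousands_sep / mon_thousands_sep lines of a locale source.
patch_thousands_sep() {
    sed '/thousands_sep/ s/<U202F>/<U0020>/'
}

src=/usr/share/i18n/locales/fr_FR   # path taken from the grep output above
if [ -r "$src" ] && command -v localedef >/dev/null 2>&1; then
    workdir=$(mktemp -d)
    patch_thousands_sep < "$src" > "$workdir/fr_FR_space"
    mkdir -p "$workdir/locales"
    # An output name containing a slash makes localedef write a locale
    # directory there instead of touching the system locale archive.
    localedef -i "$workdir/fr_FR_space" -f UTF-8 "$workdir/locales/fr_FR.UTF-8" \
        || echo "localedef exited non-zero; check its messages" >&2
    echo "compiled locale (if any) is under $workdir/locales"
fi
```

To try the result, I would presumably point the LOCPATH environment variable at that locales directory while keeping LC_NUMERIC=fr_FR.UTF-8, but I have not checked how that interacts with the system locale archive.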
Is there a better solution?