TeX - LaTeX Asked by Maïeul on December 31, 2020
I know that TeX can’t write the content of hbox in an auxiliary file (Re-parse the content of a box register).
That means that
\newwrite\foo
\immediate\openout\foo=\jobname.txt
\setbox0=\hbox{bar}
\immediate\write\foo{\box0}
can’t write bar to the file.
However, can LuaTeX do it? I have found
\directlua{
  n = tex.getbox(0)
}
But I don’t understand what n represents, or whether I could use it to write the box content to a file.
Edit: here is new code that also works for ligatures:
\documentclass{article}
\usepackage{fontspec}
\begin{document}
\setbox0=\hbox{Příliš žluťoučký \textit{kůň} úpěl \hbox{ďábelské} ódy, diffierence, difference}
\directlua{
  % local fontstyles = require "l4fontstyles"
  local char = unicode.utf8.char
  local glyph_id = node.id("glyph")
  local glue_id = node.id("glue")
  local hlist_id = node.id("hlist")
  local vlist_id = node.id("vlist")
  local disc_id = node.id("disc")
  local minglue = tex.sp("0.2em")
  local usedcharacters = {}
  local identifiers = fonts.hashes.identifiers
  local function get_unicode(xchar,font_id)
    local current = {}
    local uchar = identifiers[font_id].characters[xchar].tounicode
    for i= 1, string.len(uchar), 4 do
      local cchar = string.sub(uchar, i, i + 3)
      print(xchar,uchar,cchar, font_id, i)
      table.insert(current,char(tonumber(cchar,16)))
    end
    return current
  end
  local function nodeText(n)
    local t = {}
    for x in node.traverse(n) do
      % glyph node
      if x.id == glyph_id then
        % local currentchar = fonts.hashes.identifiers[x.font].characters[x.char].tounicode
        local chars = get_unicode(x.char,x.font)
        for _, current_char in ipairs(chars) do
          table.insert(t,current_char)
        end
      % glue node
      elseif x.id == glue_id and node.getglue(x) > minglue then
        table.insert(t," ")
      % discretionaries
      elseif x.id == disc_id then
        table.insert(t, nodeText(x.replace))
      % recursively process hlist and vlist nodes
      elseif x.id == hlist_id or x.id == vlist_id then
        table.insert(t,nodeText(x.head))
      end
    end
    return table.concat(t)
  end
  local n = tex.getbox(0)
  print(nodeText(n.head))
  local f = io.open("hello.txt","w")
  f:write(nodeText(n.head))
  f:close()
}
\box0
\end{document}
Result in hello.txt:
Příliš žluťoučký kůň úpěl ďábelské ódy, diffierence, difference
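A note on why this handles ligatures: the loop in get_unicode in the code above treats tounicode as a run of four-digit hexadecimal groups, one group per character that the ligature stands for. The following standalone sketch isolates that decoding step so it can be tried outside TeX. It assumes Lua 5.3 or later for the built-in utf8 library, and the name decode_tounicode is made up for this illustration. It also joins UTF-16 surrogate pairs, on the assumption that code points above U+FFFF are stored that way; the code above does not do this.

```lua
-- Decode a tounicode-style string into UTF-8 text.
-- Assumption: `hex` is a sequence of 4-digit hexadecimal UTF-16 code units,
-- e.g. "00660066" for an "ff" ligature.
local function decode_tounicode(hex)
  local out = {}
  local i = 1
  while i + 3 <= #hex do
    local unit = tonumber(hex:sub(i, i + 3), 16)
    if not unit then break end          -- stop at anything that is not hex
    i = i + 4
    -- A high surrogate followed by a low surrogate forms one code point.
    if unit >= 0xD800 and unit <= 0xDBFF then
      local low = tonumber(hex:sub(i, i + 3), 16)
      if low and low >= 0xDC00 and low <= 0xDFFF then
        unit = 0x10000 + (unit - 0xD800) * 0x400 + (low - 0xDC00)
        i = i + 4
      end
    end
    out[#out + 1] = utf8.char(unit)
  end
  return table.concat(out)
end

print(decode_tounicode("00660066"))  -- prints "ff"
```

If the font table in your setup stores tounicode in some other form, this step has to be adapted accordingly.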
Original answer:
Variable n in your example is a node list. Various types of nodes exist, such as glyph for characters, glue for spacing, or hlist, which is the type you get for your hbox. An hlist contains child nodes, which are accessible through the n.head attribute. You can then loop over this child list to find glyphs and glues.
Each node type can be identified by the value of its id attribute (n.id). The node types and their attributes are described in chapter "8 Nodes" of the LuaTeX reference manual. In this particular example we only need to process glyph and glue nodes, but keep in mind that node lists are recursive and various nodes, such as hlist and vlist, can contain child lists. You can support them with a recursive call of nodeText on the head attribute of the current node.
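The recursion described above can be sketched as a small helper that lists node type names with indentation. This is only an illustration: outline and type_of are names invented here, and the helper walks the list through the next field instead of node.traverse so that it can also be tried on plain Lua tables. Inside LuaTeX, node.type can be passed as type_of.

```lua
-- Build an indented outline of a node list.
-- `type_of` maps a numeric node id to a type name; inside LuaTeX pass node.type.
local function outline(head, type_of, depth, lines)
  depth, lines = depth or 0, lines or {}
  local x = head
  while x do
    local name = type_of(x.id)
    lines[#lines + 1] = string.rep("  ", depth) .. name
    if name == "hlist" or name == "vlist" then
      -- the child list of a box node hangs off its head field
      outline(x.head, type_of, depth + 1, lines)
    end
    x = x.next
  end
  return lines
end

-- Inside LuaTeX, after \setbox0=\hbox{...}:
--   local box = tex.getbox(0)
--   print(table.concat(outline(box.head, node.type), "\n"))

-- Stand-in data for trying the helper outside LuaTeX (these ids are arbitrary).
local names = { [1] = "hlist", [2] = "glyph", [3] = "glue" }
local inner = { id = 1, head = { id = 2 } }
local head  = { id = 2, next = { id = 3, next = inner } }
print(table.concat(outline(head, function(id) return names[id] end), "\n"))
```

When placing such a helper inside \directlua, write comments with % as the listings here do, or keep the helper in a separate .lua file: TeX joins the lines of \directlua into one, so a Lua -- comment would swallow the rest of the code.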
Regarding glyph nodes, the char attribute contains a Unicode value only if you use OpenType or TrueType fonts. If you use old 8-bit fonts, it contains just an 8-bit value whose meaning depends on the font encoding in use, and it isn't easy to convert it to Unicode.
\documentclass{article}
\usepackage{fontspec}
\begin{document}
\setbox0=\hbox{Příliš žluťoučký \textit{kůň} úpěl \hbox{ďábelské} ódy}
\directlua{
  local fontstyles = require "l4fontstyles"
  local char = unicode.utf8.char
  local glyph_id = node.id("glyph")
  local glue_id = node.id("glue")
  local hlist_id = node.id("hlist")
  local vlist_id = node.id("vlist")
  local minglue = tex.sp("0.2em")
  local usedcharacters = {}
  local identifiers = fonts.hashes.identifiers
  local function get_unicode(xchar,font_id)
    return char(tonumber(identifiers[font_id].characters[xchar].tounicode,16))
  end
  local function nodeText(n)
    local t = {}
    for x in node.traverse(n) do
      % glyph node
      if x.id == glyph_id then
        % local currentchar = fonts.hashes.identifiers[x.font].characters[x.char].tounicode
        table.insert(t,get_unicode(x.char,x.font))
        local y = fontstyles.get_fontinfo(x.font)
        print(x.char,y.name,y.weight,y.style)
      % glue node
      elseif x.id == glue_id and node.getglue(x) > minglue then
        table.insert(t," ")
      elseif x.id == hlist_id or x.id == vlist_id then
        table.insert(t,nodeText(x.head))
      end
    end
    return table.concat(t)
  end
  local n = tex.getbox(0)
  print(nodeText(n.head))
  local f = io.open("hello.txt","w")
  f:write(nodeText(n.head))
  f:close()
}
\box0
\end{document}
The nodeText function returns the text contained in the node list. In this example it is used to print the hbox contents to the terminal and to write them to the file hello.txt.
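The example calls io.open without checking the result. io.open returns nil plus an error message when the file cannot be opened, so the following f:write would then fail with a less helpful error. A minimal, standalone sketch of a more defensive variant follows; the helper name write_text is invented for this illustration.

```lua
-- Write `text` to `path`, raising a readable error if the file cannot be opened.
local function write_text(path, text)
  local f, err = io.open(path, "w")
  if not f then
    error("cannot open " .. path .. ": " .. tostring(err))
  end
  f:write(text)
  f:close()
end

-- Inside the example above, the last three Lua lines would become:
--   write_text("hello.txt", nodeText(n.head))
```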
For basic information about the font style, you can try the l4fontstyles module, like this:
local fontstyles = require "l4fontstyles"
...
if x.id == glyph_id then
  table.insert(t,char(x.char))
  local y = fontstyles.get_fontinfo(x.font)
  print(y.name,y.weight,y.style)
Correct answer by michal.h21 on December 31, 2020