TransWikia.com

Getting MathML output that also keeps the LaTeX source with TeX4ht

TeX - LaTeX Asked on January 21, 2021

I would like to convert the file below to MathML and also keep the LaTeX source of each formula inside the MathML <semantics> element.

MWE:

\documentclass{article}
\usepackage[T1]{fontenc}

\begin{document}

\article{Article Title Here}
\author{Author Name Here}
\maketitle

\section{Introduction}

This is the sample paragraph.
 \begin{equation}\label{eq1-11}
T\,^{\prime}_{\mu \nu} = \left( \frac{\partial \xi^\alpha} {\partial\xi^{\prime\mu}}\right) \left( \frac{\partial \xi^\beta}{\partial \xi^{\prime\nu}} \right) T_{\alpha \beta}
\end{equation}

Please refer the equations \ref{eq1-11} for the further testing.
\end{document}

2 Answers

The MWE you provided contains several LaTeX coding errors. I have fixed them, and the corrected file is:

\documentclass{article}
\usepackage[T1]{fontenc}

\begin{document}

\title{Article Title Here}

\author{Author Name Here}

\maketitle

\section{Introduction}

This is the sample paragraph.
\begin{equation}\label{eq1-11}
T\,^{\prime}_{\mu \nu} = \left( \frac{\partial \xi^{\alpha}}
{\partial\xi^{\prime\mu}}\right) \left( \frac{\partial \xi^{\beta}}{\partial \xi^{\prime\nu}} \right) T_{\alpha \beta}
\end{equation}

Please refer the equations \ref{eq1-11} for the further testing.
\end{document}

After correcting the errors, I ran the command

htlatex test "xhtml,mathml,mathml-" " -cunihtf" "-cvalidate -p"

It converts nicely...

EDIT

If you need the LaTeX source to be displayed in the converted HTML, use the following .cfg file:

conversion.cfg

\RequirePackage{verbatim,etoolbox}

\Preamble{xhtml}
\def\AltMathOne#1${\HCode{\detokenize{\(#1\)}}$}
\Configure{$}{}{}{\expandafter\AltMathOne} 
\def\AltlMath#1\){\HCode{\detokenize{\(#1\)}}\)}
\Configure{()}{\AltlMath}{}
\def\AltlDisplay#1\]{\HCode{\detokenize{\[#1\]}}\]}
\Configure{[]}{\AltlDisplay}{}
\def\AltDisplayOne#1#2$${#1\HCode{\detokenize{$$#2$$}}$$}
\Configure{$$}{}{}{\AltDisplayOne}{}{}
\newcommand\VerbMath[1]{%
\ifcsdef{#1}{%
  \renewenvironment{#1}{%
    \NoFonts%
  \Configure{verbatim}{}{} % suppress <br /> tags
    \texttt{\string\begin{#1}}\HCode{\Hnewline}% we need to use \texttt to get all characters right
      \verbatim}{\endverbatim\texttt{\string\end{#1}}\EndNoFonts}%
}{}%
}
\VerbMath{align}
\VerbMath{equation}
\VerbMath{equation*}

\begin{document}

\EndPreamble

Then run the command:

htlatex sample "conversion" " " "-cvalidate -p"

Answered by MadyYuvi on January 21, 2021

There are several possible ways to achieve this:

  1. Configure TeX4ht to catch all math content and typeset it twice: once as MathML and a second time as verbatim text.
  2. Parse the MathML content and convert it back to LaTeX code.
  3. Pre-process the input TeX file and modify it so that it is easier to work with.

The first method could reuse the code that we use for the MathJax option in TeX4ht, see file mathjax-latex-4ht.4ht for details.

The second method won't reproduce the LaTeX code exactly as it was in the original input, which may be a problem for you. LuaXML can be used for the conversion.
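To illustrate why the round trip does not give back the original source, here is a minimal sketch in Python (not part of the TeX4ht or LuaXML toolchain, and not how LuaXML would do it). It converts a tiny subset of MathML back to LaTeX, using the MathML that TeX4ht emits for $a=b^2$ further down in this answer. The function name to_latex and the handled tag set are assumptions made for this illustration only.

```python
import xml.etree.ElementTree as ET

MATHML_NS = "{http://www.w3.org/1998/Math/MathML}"

def to_latex(node):
    """Convert a tiny subset of MathML back to LaTeX (illustration only)."""
    tag = node.tag.replace(MATHML_NS, "")
    children = [to_latex(child) for child in node]
    if tag in ("mi", "mn", "mo"):
        # leaf elements carry the symbol as text
        return (node.text or "").strip()
    if tag == "msup":
        base, exponent = children
        return "{%s}^{%s}" % (base, exponent)
    # math, mrow and anything unknown: concatenate the children
    return "".join(children)

source = (
    "<math xmlns='http://www.w3.org/1998/Math/MathML' display='inline'>"
    "<mi>a</mi> <mo class='MathClass-rel'>=</mo> "
    "<msup><mrow><mi>b</mi></mrow><mrow><mn>2</mn></mrow></msup></math>"
)
recovered = to_latex(ET.fromstring(source))
print(recovered)  # prints a={b}^{2}
```

The recovered string is valid LaTeX for the same formula, but it is not the a=b^2 that the author typed: the extra braces come from the MathML structure, and spacing, macros and labels from the original source are lost in the same way.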

I will present the third method in my answer. It consists of two components: an input filter that parses the input LaTeX file for math content and marks it with some additional macros, and a make4ht DOM filter that modifies the resulting HTML file to produce the correct MathML structure.
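The marking step can be sketched in a few lines of Python before looking at the full Lua filter. This is a simplified illustration, not the filter itself: it handles only $...$ inline math, and the function name mark_inline_math and the regular expression are assumptions made for the example. It wraps each formula in the \inlinemath command that the real filter uses, passing the formula twice (once to be typeset, once to be kept verbatim).

```python
import re

# Matches $...$ inline math only; $$...$$, \( \), \[ \] and math
# environments are deliberately left out of this sketch.
INLINE_MATH = re.compile(r"(?<![\\$])\$([^$]+)\$(?!\$)")

def mark_inline_math(tex):
    """Wrap each $...$ span in \\inlinemath{<math>}{<same math again>}."""
    def wrap(match):
        math = match.group(0)
        # a function replacement is inserted literally, so the backslash
        # does not need any extra escaping for re.sub
        return "\\inlinemath{%s}{%s}" % (math, math)
    return INLINE_MATH.sub(wrap, tex)

marked = mark_inline_math("Paragraph with $a=b^2$ inline math.")
print(marked)  # prints Paragraph with \inlinemath{$a=b^2$}{$a=b^2$} inline math.
```

A regular expression is enough for this reduced case. The Lua filter walks the input character by character instead, because it also has to track control sequences and environment names.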

Here is the input filter. It reads input from the standard input and prints the modified output.

File altmath.lua:

-- insert environments that should be handled by the script here
local math_environments = {
  equation = true,
  displaymath = true,
  ["equation*"] = true,

}

-- macros that will be inserted to the updated document
local macros = [[
\NewDocumentCommand\inlinemath {mv} {\HCode{<span class="inlinemath">}#1\HCode{<span class="alt">}\NoFonts #2\EndNoFonts\HCode{</span></span>}}
\NewDocumentEnvironment{altdisplaymath}{} {\ifvmode\IgnorePar\fi\EndP\HCode{<div class="altmath">}} {\ifvmode\IgnorePar\fi\EndP\HCode{</div>}}
]]

-- we will insert macros before the second control sequence (we assume that the first is \documentclass)
local cs_counter = 0

-- we will handle inline and display math differently
local inline  = 1
local display = 2

local function handle_math(input, nexts, stop, buffer, mathtype)
  local content = input:sub(nexts, stop)
  local format = "\\inlinemath{%s}{%s}" -- format used to insert math content back to the doc
  -- set format for display math
  if mathtype == display then
    format = [[
\begin{altdisplaymath}
%s
\begin{verbatim}
%s
\end{verbatim}
\end{altdisplaymath}
]]
  end
  buffer[#buffer + 1] =  string.format(format, content, content )
end

local function find_next(input, start, buffer)
  -- find next cs or math start
  local nexts, stop = input:find("[\\$]", start)
  local mathtype   
  if nexts then
    -- save current text chunk from the input buffer
    buffer[#buffer+1] = input:sub(start, nexts - 1)
    local kind, nextc = input:match("(.)(.)", nexts)
    if kind == "\\" then -- handle cs
      -- insert our custom TeX macros before second control sequence
      cs_counter = cs_counter + 1
      if cs_counter == 2 then
        buffer[#buffer+1] = macros
      end
      if nextc == "(" then -- inline math
        _, stop = input:find("\\)", nexts, true) -- plain-text search
        mathtype = inline
      elseif nextc == "[" then -- display math
        _, stop = input:find("\\]", nexts, true) -- plain-text search
        mathtype = display
      else -- maybe environment?
        -- find environment name
        local env_name = input:match("^begin%s*{(.-)}", nexts+1)
        -- it must be enabled as math environment
        if env_name and math_environments[env_name] then
          -- escape pattern magic characters, such as the * in equation*
          local escaped = env_name:gsub("%p", "%%%0")
          _, stop = input:find("\\end%s*{" .. escaped .. "}", nexts)
          mathtype = display
        else -- not math environment 
          buffer[#buffer+1] = "\\" -- save backslash that was eaten by the processor
          return stop + 1 -- return back to the main loop
        end
      end
    else -- handle $
      if nextc == "$" then -- display math
        _, stop = input:find("%$%$", nexts + 1)
        mathtype = display
      else -- inline math
        _, stop = input:find("%$", nexts + 1)
        mathtype = inline
      end
    end
    if not stop then -- something failed, move one char next
      return nexts + 1
    end
    -- save math  content to the buffer
    handle_math(input, nexts, stop, buffer, mathtype)
  else
    -- if we cannot find any more cs or math, we need to insert rest of the input 
    -- to the output buffer
    buffer[#buffer+1] = input:sub(start, string.len(input))
    return nil
  end
  return stop + 1
end

-- process the input buffer, detect inline and display math and also math environments
local function process(input)
  local buffer = {} -- buffer where text chunks are stored
  local start = 1
  start = find_next(input, start,buffer)
  while start do
    start = find_next(input, start, buffer)
  end
  return table.concat(buffer) -- convert output buffer to string
end


local content = io.read("*all")
print(process(content))

You can test it using the following command:

texlua altmath.lua < sample.tex

This is the modified version of your original TeX file:

\documentclass{article}
\NewDocumentCommand\inlinemath {mv} {\HCode{<span class="inlinemath">}#1\HCode{<span class="alt">}\NoFonts #2\EndNoFonts\HCode{</span></span>}}
\NewDocumentEnvironment{altdisplaymath}{} {\ifvmode\IgnorePar\fi\EndP\HCode{<div class="altmath">}} {\ifvmode\IgnorePar\fi\EndP\HCode{</div>}}
\usepackage[T1]{fontenc}

\begin{document}

\title{Article Title Here}
\author{Author Name Here}
\maketitle

\section{Introduction}

This is the sample paragraph with \inlinemath{$a=b^2$}{$a=b^2$} inline math. Different \inlinemath{\(a=c^2\)}{\(a=c^2\)} type of math.
 \begin{altdisplaymath}
\begin{equation}\label{eq1-11}
T\,^{\prime}_{\mu \nu} = \left( \frac{\partial \xi^\alpha} {\partial\xi^{\prime\mu}}\right) \left( \frac{\partial \xi^\beta}{\partial \xi^{\prime\nu}} \right) T_{\alpha \beta}
\end{equation}
\begin{verbatim}
\begin{equation}\label{eq1-11}
T\,^{\prime}_{\mu \nu} = \left( \frac{\partial \xi^\alpha} {\partial\xi^{\prime\mu}}\right) \left( \frac{\partial \xi^\beta}{\partial \xi^{\prime\nu}} \right) T_{\alpha \beta}
\end{equation}
\end{verbatim}
\end{altdisplaymath}


Please refer the equations \ref{eq1-11} for the further testing.
\end{document}

You can see that it inserts the macro definitions after the \documentclass command. They define the \inlinemath command and the altdisplaymath environment. The definitions contain code that inserts HTML tags directly into the converted file, so they are designed to be used only with TeX4ht.

You can convert your file to HTML using

texlua altmath.lua < sample.tex | make4ht -j sample - "mathml"

It produces the following code:

<span class='inlinemath'><!-- l. 14 --><math xmlns='http://www.w3.org/1998/Math/MathML' display='inline'><mi>a</mi> <mo class='MathClass-rel'>=</mo> <msup><mrow><mi>b</mi></mrow><mrow><mn>2</mn></mrow></msup></math><span class='alt'>$a=b^2$</span></span> 

or

<div class='altmath'> <!-- tex4ht:inline --><table class='equation'><tr><td>
<!-- l. 16 --><math xmlns='http://www.w3.org/1998/Math/MathML' display='block' class='equation'>
                       <mstyle class='label' id='x1-1001r1'></mstyle><!-- endlabel --><mi>T</mi><msubsup><mrow><mspace width='0.17em' class='thinspace'></mspace></mrow><mrow><mi mathvariant='italic'>μν</mi></mrow><mrow><mi>′</mi></mrow></msubsup> <mo class='MathClass-rel'>=</mo> <mrow><mo form='prefix' fence='true'> (</mo><mrow> <mfrac><mrow><mi>∂</mi><msup><mrow><mi>ξ</mi></mrow><mrow><mi>α</mi></mrow></msup></mrow>
<mrow><mi>∂</mi><msup><mrow><mi>ξ</mi></mrow><mrow><mi mathvariant='italic'>′μ</mi></mrow></msup></mrow></mfrac> </mrow><mo form='postfix' fence='true'>)</mo></mrow> <mrow><mo form='prefix' fence='true'> (</mo><mrow> <mfrac><mrow><mi>∂</mi><msup><mrow><mi>ξ</mi></mrow><mrow><mi>β</mi></mrow></msup></mrow>
<mrow><mi>∂</mi><msup><mrow><mi>ξ</mi></mrow><mrow><mi mathvariant='italic'>′ν</mi></mrow></msup></mrow></mfrac> </mrow><mo form='postfix' fence='true'>)</mo></mrow> <msub><mrow><mi>T</mi></mrow><mrow><mi mathvariant='italic'>αβ</mi></mrow></msub>
</math></td><td class='eq-no'>(1)</td></tr></table>
<!-- l. 18 --><p class='nopar'>

</p>
   <pre id='verbatim-1' class='verbatim'>
\begin{equation}\label{eq1-11}
T\,^{\prime}_{\mu \nu} = \left( \frac{\partial \xi^\alpha} {\partial\xi^{\prime\mu}}\right) \left( \frac{\partial \xi^\beta}{\partial \xi^{\prime\nu}} \right) T_{\alpha \beta}
\end{equation}
</pre>
<!-- l. 23 --><p class='nopar'> </p></div>

We need to use a make4ht DOM filter to create the correct MathML structure. Save the following file as build.lua and load it by adding the -e build.lua option to the make4ht command:

local domfilter = require "make4ht-domfilter"

-- find mathml and insert TeX as an alternative annotation
local function update_mathml(element, class)
  local alt_element_t = element:query_selector(class)
  -- nothing to do when the alternative text is missing
  if not alt_element_t or not alt_element_t[1] then return nil end
  -- save alt element contents and remove it from the document
  local alt_contents = alt_element_t[1]:get_children()
  alt_element_t[1]:remove_node()
  -- create a new structure of the mathml element ->
  -- mathml 
  --   semantics
  --     mrow -> math content
  --     annotation -> saved TeX
  local mathml = element:query_selector("math")[1]
  local mathml_contents = mathml:get_children()
  local semantics = mathml:create_element("semantics")
  local mrow = semantics:create_element("mrow")
  mrow._children = mathml_contents -- this trick places saved original mathml content into a new <mrow>
  semantics:add_child_node(mrow)
  local annotation = semantics:create_element("annotation", {encoding="application/x-tex"})
  annotation._children = alt_contents
  semantics:add_child_node(annotation)
  mathml._children = {semantics}
end

local process = domfilter {
  function(dom)
    for _, inline in ipairs(dom:query_selector(".inlinemath")) do
      update_mathml(inline, ".alt")
    end
    for _, display in ipairs(dom:query_selector(".altmath")) do
      update_mathml(display, ".verbatim")
    end
    return dom
  end
}

-- apply the filter to every generated HTML file
Make:match("html$", process)

It parses the HTML file for our custom <span> and <div> elements, gets the alternative text, and inserts it as an <annotation> element of the MathML code.
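The restructuring itself is independent of make4ht. As a rough sketch of the same idea in Python (using the standard xml.etree.ElementTree module rather than the LuaXML DOM that the filter uses), the following moves the children of a <math> element into semantics/mrow and appends the TeX source as an annotation. The function name add_tex_annotation and the sample formula are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

NS = "http://www.w3.org/1998/Math/MathML"
ET.register_namespace("", NS)  # serialize MathML elements without a prefix

def add_tex_annotation(math, tex):
    """Move the children of <math> into semantics/mrow and append the TeX."""
    semantics = ET.Element("{%s}semantics" % NS)
    mrow = ET.SubElement(semantics, "{%s}mrow" % NS)
    mrow.text = math.text
    for child in list(math):  # copy the list, we modify math while looping
        math.remove(child)
        mrow.append(child)
    annotation = ET.SubElement(
        semantics, "{%s}annotation" % NS, {"encoding": "application/x-tex"}
    )
    annotation.text = tex
    math.text = None
    math.append(semantics)
    return math

math = ET.fromstring(
    "<math xmlns='http://www.w3.org/1998/Math/MathML' display='inline'>"
    "<mi>a</mi><mo>=</mo><mi>b</mi></math>"
)
result = ET.tostring(add_tex_annotation(math, "$a=b$"), encoding="unicode")
print(result)
```

After the call, <math> has a single <semantics> child that holds the original content in an <mrow>, followed by the <annotation> with the TeX source, which is the same shape the DOM filter produces.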

This is the result:

   <h3 class='sectionHead'><span class='titlemark'>1   </span> <a id='x1-10001'></a>Introduction</h3>
<!-- l. 14 --><p class='noindent'>This  is  the  sample  paragraph  with
<span class='inlinemath'><!-- l. 14 --><math display='inline' xmlns='http://www.w3.org/1998/Math/MathML'><semantics><mrow><mi>a</mi> <mo class='MathClass-rel'>=</mo> <msup><mrow><mi>b</mi></mrow><mrow><mn>2</mn></mrow></msup></mrow><annotation encoding='application/x-tex'>$a=b^2$</annotation></semantics></math></span> inline math.
Different <span class='inlinemath'><!-- l. 14 --><math display='inline' xmlns='http://www.w3.org/1998/Math/MathML'><semantics><mrow><mrow><mi>a</mi> <mo class='MathClass-rel'>=</mo> <msup><mrow><mi>c</mi></mrow><mrow><mn>2</mn></mrow></msup></mrow></mrow><annotation encoding='application/x-tex'>\(a=c^2\)</annotation></semantics></math></span>
type of math. </p><div class='altmath'> <!-- tex4ht:inline --><table class='equation'><tr><td>
<!-- l. 16 --><math class='equation' xmlns='http://www.w3.org/1998/Math/MathML' display='block'><semantics><mrow>
                       <mstyle id='x1-1001r1' class='label'></mstyle><!-- endlabel --><mi>T</mi><msubsup><mrow><mspace width='0.17em' class='thinspace'></mspace></mrow><mrow><mi mathvariant='italic'>μν</mi></mrow><mrow><mi>′</mi></mrow></msubsup> <mo class='MathClass-rel'>=</mo> <mrow><mo fence='true' form='prefix'> (</mo><mrow> <mfrac><mrow><mi>∂</mi><msup><mrow><mi>ξ</mi></mrow><mrow><mi>α</mi></mrow></msup></mrow>
<mrow><mi>∂</mi><msup><mrow><mi>ξ</mi></mrow><mrow><mi mathvariant='italic'>′μ</mi></mrow></msup></mrow></mfrac> </mrow><mo fence='true' form='postfix'>)</mo></mrow> <mrow><mo fence='true' form='prefix'> (</mo><mrow> <mfrac><mrow><mi>∂</mi><msup><mrow><mi>ξ</mi></mrow><mrow><mi>β</mi></mrow></msup></mrow>
<mrow><mi>∂</mi><msup><mrow><mi>ξ</mi></mrow><mrow><mi mathvariant='italic'>′ν</mi></mrow></msup></mrow></mfrac> </mrow><mo fence='true' form='postfix'>)</mo></mrow> <msub><mrow><mi>T</mi></mrow><mrow><mi mathvariant='italic'>αβ</mi></mrow></msub>
</mrow><annotation encoding='application/x-tex'>
\begin{equation}\label{eq1-11}
T\,^{\prime}_{\mu \nu} = \left( \frac{\partial \xi^\alpha} {\partial\xi^{\prime\mu}}\right) \left( \frac{\partial \xi^\beta}{\partial \xi^{\prime\nu}} \right) T_{\alpha \beta}
\end{equation}
</annotation></semantics></math></td><td class='eq-no'>(1)</td></tr></table>
<!-- l. 18 --><p class='nopar'>

</p>
   
<!-- l. 23 --><p class='nopar'> </p></div>
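If you want to check that the TeX source really ended up in the output, the annotations can be read back with a few lines of Python. This is only a sketch under the assumption that the fragment being checked is well-formed XML. The function name tex_annotations and the shortened sample markup are made up for the example.

```python
import xml.etree.ElementTree as ET

MATHML = "{http://www.w3.org/1998/Math/MathML}"

def tex_annotations(xml_text):
    """Return the text of every application/x-tex annotation in the input."""
    root = ET.fromstring(xml_text)
    return [
        node.text.strip()
        for node in root.iter(MATHML + "annotation")
        if node.get("encoding") == "application/x-tex" and node.text
    ]

sample = (
    "<p>Different <span class='inlinemath'>"
    "<math display='inline' xmlns='http://www.w3.org/1998/Math/MathML'>"
    "<semantics><mrow><mi>a</mi></mrow>"
    "<annotation encoding='application/x-tex'>$a$</annotation>"
    "</semantics></math></span></p>"
)
print(tex_annotations(sample))  # prints ['$a$']
```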

Answered by michal.h21 on January 21, 2021
