
Is there a program similar to detex for Windows?

TeX - LaTeX Asked on February 4, 2021

Detex strips the TeX commands from a TeX file to turn it into a plain *.txt file. Is there a detex program for Windows?
I am looking for ways to use spelling and grammar checking tools in Word or other programs such as After the Deadline.
There is of course a spell checker in every editor, but occasionally you want other checks, as explained in the comparison of spell checking in TeXworks and other editors.

3 Answers

If you look at the source code of detex (written in C), you will see that the main job is done by lex (a lexical analyzer generator) with the help of a small sed script. I checked, and detex has unfortunately not been ported to Cygwin, but my feeling is that you should be able to compile it on Cygwin (you have flex (free lex), gcc, GNU sed, etc.).

Another, less sophisticated option is to write your own sed (or Perl) script. Obviously you need to run that in Cygwin. I am at work right now, but I am sure I have seen sed one-liners that do a decent job of detex-ing. I will try to find or write such a script and post it here. I will also try to post a 100-point bounty for such a sed one-liner. If you Google, you should be able to find a Perl script that does this.
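As a rough illustration of what such a sed script might look like, here is a minimal sketch. It is not the one-liner mentioned above and is far cruder than detex: the function name `crude_detex` is made up for this example, it assumes a sed that supports `-E` (GNU or BSD sed), and it only handles comments, `\begin`/`\end` lines, and simple `\command{text}` forms. Arguments that are not prose (for example the key in `\cite{key}`) are kept as text, and math is not treated specially.

```shell
#!/bin/sh
# crude_detex: hypothetical helper, a rough approximation of detex.
# Reads LaTeX on stdin, writes approximate plain text on stdout.
crude_detex() {
  # 1. drop comments (a % not preceded by a backslash)
  # 2. drop \begin{...} and \end{...}
  # 3. unwrap innermost \command[opt]{text} to just "text"
  # 4. drop any remaining bare \command
  # 5. drop leftover braces and dollar signs
  # 6. trim trailing whitespace
  sed -E \
    -e 's/(^|[^\\])%.*$/\1/' \
    -e 's/\\(begin|end)\{[^}]*\}//g' \
    -e 's/\\[A-Za-z]+\*?(\[[^]]*\])?\{([^{}]*)\}/\2/g' \
    -e 's/\\[A-Za-z]+\*?//g' \
    -e 's/[{}$]//g' \
    -e 's/[[:space:]]+$//'
}

# Example: prints "Intro Some bold claim."
printf '%s\n' '\section{Intro} Some \emph{bold} claim. % note' | crude_detex
```

On a real document you would run something like `crude_detex < file.tex > file.txt` and then open `file.txt` in the spell checker of your choice.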

Edit: Try this script, which uses DVI as an intermediate format and the catdvi tool to strip the LaTeX tags.

$ latex file.tex
$ catdvi -e 1 -U file.dvi | sed -re "s/\[U\+2022\]/*/g" \
  | sed -re "s/([^^[:space:]])\s+/\1 /g" > file.txt

For people who want to go the DVI route, I also checked dvi2tty: it does a marvelous job of converting a DVI file into a plain text file. No additional processing is needed.

There is another well-known sed script, tex2xml, written by Tilmann Bitterberg, for converting TeX to XML. I will try to fix it so that it converts to plain ASCII instead.

#! /bin/sed -f

# Try of a nested tag{value} parser:
# - handles multiline tags
# - can deal with quoted \{ and \}
# - handles nested tags
# Limitations:
# - tags are not allowed to have [{}<>| ] in the name.
# - doesn't detect unbalanced brackets
#
# b{foo} -> <b>foo</b>
# b{foo em{bar}} -> <b>foo <em>bar</em></b>

# Tue Nov 27 17:28:32 UTC 2001

# {1{2{3{4{5{6{7{8{9{a{b{c{d{e{f{g{h{i{{text0}}}}}}}}}}}}}}}}}}}text1}

# How it works
# We build a stack of unclosed tags in holdspace
# by appending always at the end (``H'').
# when a closing bracket is found, fetch tag
# from holdspace.
# Main focus is small memory usage

# escape Quoted and generate entities
s,&,\&amp;,g
s,<,\&lt;,g
s,>,\&gt;,g
s,\\{,\&obrace;,g
s,\\},\&cbrace;,g

# uninteresting line, jump to end
/[{}]/!b unescape

:open

/{/{
  s,\( *\)\([^|<>}{ ]*\){,\1\
\2\
,;           # Isolate tag
  # Patternspace: text \n newtag \n text
  H;         # append to holdspace
  s,\n\([^\n]*\)\n,<\1>,; # generate XML tag

  # Holdspace: ..tagN \n text \n newtag \n text
  # We only want oldtags + newtag
  x
  s,\(.*\n\)[^\n]*\n\([^\n]*\)\n[^\n]*$,\1\2,
  x

  /^[^{]*}/b close
  /{/b open
}

:close

/}/{
  s,},\
\
\
,
  # text1 \n\n\n text2 \n\n tag0 \n tag1 text2 may be empty
  G;
  s,\n\n\n\([^\n]*\)\n.*\n\([^\n]*\)$,</\2>\1,
  x
  s,\n[^\n]*$,,;   # delete tag from holdspace
  x

  /^[^}]*{/b open;   # if next bracket is an open one
  /}/b close;        # another one?
}

:unescape
s,&obrace;,{,g
s,&cbrace;,},g
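
Until the script itself is changed to produce plain ASCII, one possible workaround is to strip the generated tags in a second pass. The following is only a sketch under assumptions: `strip_tags` is a made-up helper name, and it assumes the output contains nothing but simple `<tag>` and `</tag>` markers plus the three entities (`&amp;`, `&lt;`, `&gt;`) that the script generates.

```shell
#!/bin/sh
# strip_tags: hypothetical helper that removes XML-style tags and
# decodes the entities produced by the escaping step of the script.
strip_tags() {
  # &amp; is decoded last so that "&amp;lt;" does not turn into "<".
  sed -e 's/<[^>]*>//g' \
      -e 's/&lt;/</g' \
      -e 's/&gt;/>/g' \
      -e 's/&amp;/\&/g'
}

# Example: prints "foo bar & more"
printf '%s\n' '<b>foo <em>bar</em></b> &amp; more' | strip_tags
```

Assuming the script above is saved as tex2xml.sed (the file name is an assumption), the two steps could be chained as `sed -f tex2xml.sed file.tex | strip_tags > file.txt`.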

Answered by Predrag Punosevac on February 4, 2021

LuaTeX users may want to have a look at the spelling package. It writes out a pure text file after the LaTeX run that can be checked by your favourite spell-checker.

Answered by Stephan Hennig on February 4, 2021

Answered by Jonas Stein on February 4, 2021
