
Running several PdfTeX processes in parallel from within PdfTeX

TeX - LaTeX Asked on July 25, 2021

I have a complex document setup where I have to produce several document versions. I want to start these runs in parallel to speed things up – and I want to control this from within PdfTeX. For this purpose I start these runs with a macro from within PdfTeX, as in

\immediate\write18{/Users/xyz/TEX/makeParallel \jobname\space #1\space}

This works nicely but produces a sequential execution, which is not what I want. So I tried

\immediate\write18{/Users/xyz/TEX/makeParallel \jobname\space #1\space &}

However, this leads to a log file that is truncated at an arbitrary point and a PDF file of about 2 KB which the PDF reader refuses to open.

Initially I suspected that some component has problems when PdfTeX is re-entered while it is already running; minted, for example, was a candidate. So I duplicated all source files and the build directory completely to separate the instances. Now, 30 hours later, I am starting to get a bit desperate. No matter what I do, sequential executions always work and parallelized versions consistently crash. I have already tried all kinds of tricks, such as sending PdfTeX into the background in the shell script rather than in \write18. It is still the same. It looks like some process in my build chain (comprising makeindex, BibTeX, etc.) takes offense at my idea and is not cleanly reentrant.

Any ideas what this could be?!?

More details: Commenters correctly point out that I am not giving enough information – which is due to the fact that I simply have tried out too many different variants which did not work.

Initially I did not use a script but started PdfTeX directly in \immediate\write18, which worked (but only when it was running in the foreground, i.e. without an & at the end). I then switched to the shell version, as this makes it easier to change the commands and get them right: the escaping needed in TeX is less familiar to me than shell.

The different runs do not share any files at all: The shell script first generates a complete copy of all files used in the tex run. In fact: minted and its cache system posed quite some problems so I decided to be that radical and just duplicate the entire environment.

I am running them from \write18 and not as parallel jobs at the end: I run this from TeXStudio, and it is there that I preview, edit and make my checks. My use case requires all the other 6 versions to be ready by the time the TeXStudio foreground job has completed. So I have to start them early.

\jobname is the first shell parameter, which I need to access the correct filename in the copied environment. #1 is the second shell parameter, which is filled in by a macro such as \backJob{shortVersion}. The shell script then uses this information to make a suitably named directory with all required source files and then starts the PdfTeX run there. If I start the run manually, everything is fine and, as far as I know, no other files are touched. If I start the script in the foreground, it is just the same. If I start the TeX run in the background (either by backgrounding the script or by backgrounding pdftex inside the script), it breaks.
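For readers who want something concrete, a script of the kind described above could look roughly like the following. This is only a sketch and not the asker's actual makeParallel script: the function name make_parallel, the SRC_DIR and TEX_CMD variables, the build-VERSION directory naming and the choice to copy only *.tex files are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch of a makeParallel-style script (not the original one).
# Usage: makeParallel JOBNAME VERSION
# SRC_DIR, TEX_CMD and the build-VERSION naming are assumptions of this sketch.

make_parallel() {
    jobname=$1                     # first parameter: \jobname of the main run
    version=$2                     # second parameter: e.g. shortVersion
    src_dir=${SRC_DIR:-.}          # where the original sources live (assumed)
    tex_cmd=${TEX_CMD:-pdflatex}   # engine to run (assumed)
    work_dir="$src_dir/build-$version"

    # Give this version its own private copy of the sources.
    mkdir -p "$work_dir" || return 1
    cp "$src_dir"/*.tex "$work_dir"/ || return 1

    # Run the engine inside the copy so that auxiliary files stay separate.
    ( cd "$work_dir" && $tex_cmd -interaction=nonstopmode "$jobname.tex" )
}

# Only act when called with arguments, so the file can also be sourced.
if [ $# -ge 2 ]; then
    make_parallel "$1" "$2"
fi
```

A real setup would also have to copy images, bibliography files and so on, and would probably run makeindex and BibTeX as well; those steps are left out here.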

There must be some file which gets reused somewhere. Maybe some fancy /tmp or some /hidden/somewhere/in/the/system which I missed. But currently my brain seems to be running out of suggestions. I also tried the -recorder command line option to catch any overlapping files, but there are none.

What makes debugging even more problematic is the Heisenbug nature of the phenomenon. The problem only shows up when I run this from within \write18 with shell backgrounding turned on somewhere.

One Answer

Found it! :-)

The problem seems to be connected with the way \immediate\write18 deals with output from the started process in the particular case of background activity.

\immediate\write18{/Users/cap/TEX/makeParallel \jobname\space #1\space > OUTPUT 2>ERROR &}

works fine.

The problem shows up with \immediate\write18{/Users/cap/TEX/makeParallel \jobname\space #1\space &}

So much for the therapy. As for the diagnosis, I suspect the following: the shell process wants to write to stdout and/or stderr and has no place to do so. As a result of the backgrounding there is no reasonable terminal connection for this either, and so something decides to kill the process. This leads to the unpredictable end: a corrupted PDF file, a log file terminating in the middle of an output line, etc.
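The same idea can be moved into the shell script itself, so that the \write18 line does not need to carry the redirections. The following is a minimal sketch under that assumption: the helper name run_detached and the .out/.err file names are made up for illustration, and redirecting stdin from /dev/null goes beyond what the fix above does.

```shell
#!/bin/sh
# Sketch: start a command in the background with its standard streams
# detached from the caller. run_detached and the .out/.err file names
# are hypothetical, chosen only for this example.

run_detached() {
    log_prefix=$1
    shift
    # stdin from /dev/null, stdout and stderr into files, then background.
    "$@" < /dev/null > "$log_prefix.out" 2> "$log_prefix.err" &
}
```

A call such as run_detached shortVersion pdflatex -interaction=nonstopmode main.tex would then return immediately and leave the output of the run in shortVersion.out and shortVersion.err. Whether detaching stdin is needed in this setup is untested; the redirection of stdout and stderr is what made the difference above.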

Answered by Nobody-Knows-I-am-a-Dog on July 25, 2021
