
Hide internal bibkeys in pdf

TeX - LaTeX Asked by Pixelf on January 19, 2021

When I open the PDF generated by pdflatex in a viewer such as Firefox, the URL shown when hovering over a reference reveals the internal bibkey. How can I prevent this?

Consider the following MWE:

mwe.tex

\documentclass{article}

\usepackage[backend=biber]{biblatex}
\addbibresource{bib.bib}

\usepackage{hyperref}

\begin{document}

Test\cite{secretbibkey}

\printbibliography

\end{document}

bib.bib

@article{secretbibkey,
    author = {Joe},
    title = {Just a title},
    number = {ABCD-E/2008/ab/1234},
    institution = {University},
    year = {2008}
}

Now look at the following screenshot:
[Screenshot: the link tooltip shows the internally used bibkey]
It shows the bibkey I used when hovering over the citation (apparently the screenshot does not include my cursor but it was above the green "1"). Is there any way to randomize/anonymize these URL-keys?

3 Answers

You can post-process the created pdf with some command-line tools to search for cite keys and replace them in the pdf source.

First you need to uncompress the pdf to allow textual search and replace, for example with pdftk.

In the uncompressed pdf the links look like this:

3 0 obj 
<<
/Border [0 0 1]
/Subtype /Link
/H /I
/Type /Annot
/C [0 1 0]
/Rect [169.08 653.748 176.054 665.704]
/A 
<<
/D (cite.0@secretbibkey)
/S /GoTo
>>
>>
endobj 

and the corresponding parts further down:

34 0 obj 
<<
/Names [(Doc-Start) 22 0 R (cite.0@secretbibkey) 21 0 R (page.1) 14 0 R (section*.1) 18 0 R]
/Limits [(Doc-Start) (section*.1)]
>>
endobj
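
To see which destination names a file exposes before changing anything, the link targets can be listed with grep. This is only a sketch: the helper name list_link_dests is made up for illustration, it assumes GNU grep with -P support, and the temporary file stands in for an uncompressed PDF like the excerpt above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each distinct named destination that appears
# as "/D (name)" in an uncompressed PDF (requires GNU grep with -P).
list_link_dests() {
  grep -aoP '/D \(\K[^)]+(?=\))' "$1" | sort -u
}

# Stand-in for a fragment of an uncompressed PDF (assumption: same layout
# as the link object shown above).
sample=$(mktemp)
cat > "$sample" <<'EOF'
/A
<<
/D (cite.0@secretbibkey)
/S /GoTo
>>
EOF

dests=$(list_link_dests "$sample")
rm -f "$sample"
echo "$dests"
```

On a real document you would point the function at the file written by pdftk's uncompress step instead of the sample.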

Now you can grep the source for the pattern /D (cite.0@[some key]) and store only the key part.

Then, looping over all keys, you can generate a replacement key, for example the md5 checksum (which can reasonably be expected to be unique for each key).

Next you can replace all occurrences of the key with the replacement using sed.

At the end of the loop you re-compress the pdf with pdftk and you are done.

Full script (call with bash myscript.sh mypdf.pdf):

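# Uncompress the PDF so that the link destinations become searchable text
pdftk "$1" output "raw$1" uncompress
# Find every destination of the form cite.0@<key> and replace it with its MD5 hash
grep -aoP "/D \(\Kcite.0@[^)]+(?=\))" "raw$1" | while read -r line ; do
   echo "$line"
   citehash=$(echo "$line" | md5sum | awk '{ print $1 }')
   sed -i "s/$line/$citehash/g" "raw$1"
done
# Re-compress the result under the original file name
pdftk "raw$1" output "$1" compress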

[Screenshot: resulting link tooltip in Firefox]

Added bonus: the link still works.
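
One caveat: sed treats the matched name as a regular expression, so a cite key containing characters such as / or . may not be replaced as intended. A possible workaround, sketched here with a made-up helper name (escape_for_sed) and a made-up key, is to escape the name before handing it to sed. It assumes GNU sed and md5sum, as the script does.

```shell
#!/usr/bin/env bash
# Hypothetical helper: backslash-escape characters that are special in a
# basic sed regex, plus the "/" delimiter, so the name is matched literally.
escape_for_sed() {
  printf '%s' "$1" | sed 's|[][\/.^$*]|\\&|g'
}

# Stand-in for one line of the uncompressed PDF; the key is invented.
sample=$(mktemp)
printf '%s\n' '/D (cite.0@doe/2008.a)' > "$sample"

key='cite.0@doe/2008.a'
hash=$(printf '%s' "$key" | md5sum | awk '{ print $1 }')
sed -i "s/$(escape_for_sed "$key")/$hash/g" "$sample"

result=$(cat "$sample")
rm -f "$sample"
echo "$result"
```

In the script, the same escaping would be applied to $line inside the loop before the sed call.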

Answered by Marijn on January 19, 2021

If you want to hide all internal labelling, Ulrike's answer is much nicer, but here is a way to obfuscate only the links created by biblatex via MD5 hashes (I believe MD5 hashes are no longer recommended for anything security-critical, but they might be enough for your purposes).

The command \blx@mdfivesum used here requires a relatively recent biblatex version. If you are stuck with an old biblatex, load \usepackage{pdftexcmds} yourself and say \let\blx@mdfivesum\pdf@mdfivesum.

\documentclass[british]{article}
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{babel}
\usepackage{csquotes}

\usepackage[style=authoryear, backend=biber]{biblatex}
\usepackage{hyperref}

\makeatletter
\AtBeginDocument{%
  \protected\def\blx@anchor{%
    \xifinlist{\the\c@refsection @\blx@mdfivesum{\abx@field@entrykey}}{\blx@anchors}
      {}
      {\listxadd\blx@anchors{\the\c@refsection @\blx@mdfivesum{\abx@field@entrykey}}%
       \hypertarget{cite.\the\c@refsection @\blx@mdfivesum{\abx@field@entrykey}}{}}}%
  \ifundef\hyper@natanchorstart
    {\long\def\blx@bibhyperref[#1]#2{%
       \blx@sfsave\hyperlink{cite.\the\c@refsection @\blx@mdfivesum{#1}}{\blx@sfrest
         #2%
       \blx@sfsave}\blx@sfrest}%
     \protected\long\def\blx@imc@bibhyperlink#1#2{%
       \blx@sfsave\hyperlink{cite.\the\c@refsection:\blx@mdfivesum{#1}}{\blx@sfrest
         #2%
       \blx@sfsave}\blx@sfrest}%
     \protected\long\def\blx@imc@bibhypertarget#1#2{%
       \blx@sfsave\hypertarget{cite.\the\c@refsection:\blx@mdfivesum{#1}}{\blx@sfrest
         #2%
       \blx@sfsave}\blx@sfrest}}%
    {\long\def\blx@bibhyperref[#1]#2{%
       \blx@sfsave\hyper@natlinkstart{\the\c@refsection @\blx@mdfivesum{#1}}\blx@sfrest
       #2%
       \blx@sfsave\hyper@natlinkend\blx@sfrest}%
     \protected\long\def\blx@imc@bibhyperlink#1#2{%
       \blx@sfsave\hyper@natlinkstart{\the\c@refsection:\blx@mdfivesum{#1}}\blx@sfrest
       #2%
       \blx@sfsave\hyper@natlinkend\blx@sfrest}%
     \protected\long\def\blx@imc@bibhypertarget#1#2{%
       \blx@sfsave\hyper@natanchorstart{\the\c@refsection:\blx@mdfivesum{#1}}\blx@sfrest
       #2%
       \blx@sfsave\hyper@natanchorend\blx@sfrest}}}
\makeatother

\addbibresource{biblatex-examples.bib}


\begin{document}
\autocite{sigfridsson,worman,geer,nussbaum}
\printbibliography
\end{document}

Screenshot of the PDF with obfuscated link.

Answered by moewe on January 19, 2021

You can try something like this. I'm not sure whether \int_from_alph:n handles all characters and input that can appear in a destination, or whether it can lead to identical destinations, but it was the best expandable function I found for now. It will scramble all destinations, including those for sections and labels. Scrambling only the bib keys would require a number of changes to the biblatex code.

\documentclass{article}

\usepackage[backend=biber]{biblatex}
\addbibresource{test.bib}

\usepackage{hyperref}
\ExplSyntaxOn
\cs_new:Npn \pix_scrample_dest:n #1 {\int_eval:n{\int_from_alph:n{#1} + 2}} % 2 = secret number
\def\HyperDestNameFilter#1{\exp_args:Ne\tl_map_function:nN {#1}\pix_scrample_dest:n}
\ExplSyntaxOff

\begin{document}

Test\cite{secretbibkey}

\printbibliography

\end{document}

Answered by Ulrike Fischer on January 19, 2021
