Mathematica Asked on October 2, 2021
I have a list of integers dims and a list of SparseArrays bdrs (representing a chain complex $\mathbb{Z}^{d_0}\overset{\partial_1}{\leftarrow}\mathbb{Z}^{d_1}\overset{\partial_2}{\leftarrow}\mathbb{Z}^{d_2}\leftarrow\ldots$). I wish to import/export such data from/to a file.txt (each line should be a matrix entry). For instance, the data
$$\mathbb{Z}^{2}\xleftarrow{\left[\begin{smallmatrix}5&0&0\\0&6&7\end{smallmatrix}\right]} \mathbb{Z}^{3}\xleftarrow{\left[\begin{smallmatrix}0&8&0&0\\9&0&0&0\\0&0&-1&-2\end{smallmatrix}\right]}\mathbb{Z}^{4}$$
corresponds to a file
2 3 4

1 1 5
2 2 6
2 3 7

1 2 8
2 1 9
3 3 -1
3 4 -2
and
$$\mathbb{Z}^{7}\xleftarrow{0} \mathbb{Z}^{0}\xleftarrow{0} \mathbb{Z}^{5}
\xleftarrow{\left[\begin{smallmatrix}0&0\\0&0\\0&0\\0&0\\0&0\end{smallmatrix}\right]}
\mathbb{Z}^{2}\xleftarrow{\left[\begin{smallmatrix}0&0&0&15\\21&0&0&0\end{smallmatrix}\right]} \mathbb{Z}^{4}$$
corresponds to a file
7 0 5 2 4




1 4 14
2 1 21
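To make the correspondence concrete, the first example can be built directly in Mathematica. This is only an illustration of the intended data, reusing the names dims and bdrs from above; it does not read or write any file.

```mathematica
(* Dimensions of the groups Z^2 <- Z^3 <- Z^4 *)
dims = {2, 3, 4};

(* One SparseArray per boundary map; each rule {i, j} -> v is one "i j v" line of the file *)
bdrs = {
   SparseArray[{{1, 1} -> 5, {2, 2} -> 6, {2, 3} -> 7}, dims[[1 ;; 2]]],
   SparseArray[{{1, 2} -> 8, {2, 1} -> 9, {3, 3} -> -1, {3, 4} -> -2}, dims[[2 ;; 3]]]
   };

Normal /@ bdrs
(* {{{5, 0, 0}, {0, 6, 7}}, {{0, 8, 0, 0}, {9, 0, 0, 0}, {0, 0, -1, -2}}} *)
```

The k-th matrix always has dimensions dims[[k ;; k + 1]], which is why the file only needs to store dims once, followed by the nonzero entries of each matrix.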
My solution is:
chcxIn[file_] := Module[{s, dims, bdrs = {}, k = 1, i = 1},
  s = Import["/home/" <> file, "List"];
  (* parse every line into a list of numbers; blank lines become {} *)
  s = Map[If[# == "", {}, ImportString[#, "Table"][[1]]] &, s];
  dims = s[[1]];
  (* skip the dims line and the blank line after it; turn {i, j, v} into {i, j} -> v *)
  s = ParallelMap[If[# == {}, {}, #[[;; 2]] -> #[[3]]] &, s[[3 ;;]], {1}];
  (* every blank line closes one matrix *)
  Do[If[s[[j]] == {},
    AppendTo[bdrs, SparseArray[s[[i ;; j - 1]], dims[[k ;; k + 1]]]]; k += 1; i = j + 1;],
   {j, Length@s}];
  Return@{bdrs, dims}];

chcxOut[bdrs_, dims_, file_] := Export["/home/" <> file,
  {StringReplace[ToString@dims, {"{" -> "", "}" -> "", "," -> ""}], ""} ~Join~
   Flatten[Table[
     ArrayRules[b][[;; -2]] ~Join~ {""} /.
      ({u_, v_} -> w_) :> (ToString[u] <> " " <> ToString[v] <> " " <> ToString[w]),
     {b, bdrs}], 1] ~Join~ {""},
  "List"];
However, this is hopelessly inefficient in both time and memory. For 50MB of data, chcxOut needs 65 seconds and 700MB of RAM, which seems excessive. I wish to deal with files of size 10GB. Is there an efficient way of doing this?
Edit: With the help of @HenrikSchumacher, here is an improvement.
chcxIn[fileName_] := Module[{s = OpenRead[fileName], r (*read*), l (*line*), dims, bdrs = {}, k = 0, e = {}},
  dims = ImportString[Read[s, String], "Table"][[1]];
  r := Read[s, Record, NullRecords -> True];
  Monitor[
   If[s =!= $Failed,
    While[l =!= EndOfFile,
     l = r;
     Which[
      l == "0", ,
      l == "", k += 1; AppendTo[bdrs, SparseArray[e, dims[[k ;; k + 1]]]]; e = {},
      True, l = ImportString[l, "Table"][[1]]; AppendTo[e, l[[1 ;; 2]] -> l[[3]]]];
     ]],
   k];
  Close[s];
  {bdrs, dims}];
chcxOut[bdrs_, dims_, fileName_] := Module[{f = OpenWrite[fileName], w (*write*)},
  w = WriteString[f, ExportString[#, "Table"]] &;
  w@{dims}; WriteString[f, "\n\n"];
  Monitor[
   Do[
    If[Times @@ dims[[k ;; k + 1]] == 0 || bdrs[[k]]["Density"] == 0,
     w@{0},
     w@Join[bdrs[[k]]["NonzeroPositions"], Partition[bdrs[[k]]["NonzeroValues"], 1], 2]];
    WriteString[f, "\n\n"],
    {k, Length@bdrs}],
   k];
  Close[f];];
For a 2MB file, the time and memory performance is: Export 0.1sec 4MB, Import 0.2sec 8MB, chcxOut 1.1sec 12MB, chcxIn 265sec 9MB. As we can see, importing from my custom format is still much slower. Hopefully, there is a better way to do this.
Something like this should work.
dims = Prepend[(Dimensions /@ bndrs)[[All, 2]], Dimensions[bndrs[[1]]][[1]]];
file = OpenWrite["a.txt"];
WriteString[file, ExportString[{dims}, "Table"]];
Do[
  (* end the previous line and leave one blank line before each matrix *)
  WriteString[file, "\n\n"];
  WriteString[
    file,
    ExportString[
      Join[A["NonzeroPositions"], Partition[A["NonzeroValues"], 1], 2],
      "Table"
    ]],
  {A, bndrs}];
Close[file]
The result is a human-readable file, so it is not particularly compact.
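Reading such a file back is not covered by the code above. A minimal sketch of one possible reader follows. It assumes the file has exactly the layout written above (a line of dimensions, then one block of "i j v" rows per matrix, with blocks separated by a single blank line and Unix line endings). The name readChainComplex and its local helper are hypothetical. The sketch parses one whole block per ReadList call instead of one ImportString call per line, and it avoids growing a list with AppendTo in a loop. It has not been benchmarked, and it loads the whole file into memory, so it is not by itself a solution for 10GB files.

```mathematica
(* Hypothetical reader for the text layout produced by the export code above. *)
readChainComplex[fileName_String] := Module[{blocks, numbers, dims, bdrs},
  (* Blocks are separated by one blank line; All keeps empty blocks (matrices with no entries). *)
  blocks = StringSplit[ReadString[fileName], "\n\n", All];

  (* Parse a block of whitespace-separated numbers into one list per line. *)
  numbers[block_String] := Module[{stream = StringToStream[block], rows},
    rows = ReadList[stream, Number, RecordLists -> True];
    Close[stream];
    rows];

  (* The first block is the single line of dimensions. *)
  dims = First[numbers[First[blocks]]];

  (* The k-th remaining block holds the entries of the k-th matrix, of size dims[[k ;; k + 1]]. *)
  bdrs = MapIndexed[
    SparseArray[
      (#[[;; 2]] -> #[[3]]) & /@ numbers[#1],
      dims[[First[#2] ;; First[#2] + 1]]] &,
    Rest[blocks]];

  {bdrs, dims}]
```

Under those assumptions, {bndrs, dims} = readChainComplex["a.txt"] would be the counterpart of the export above. For very large files, the same per-block parsing could presumably be driven by a stream opened with OpenRead rather than by ReadString, at the cost of more bookkeeping.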
Answered by Henrik Schumacher on October 2, 2021