I was surprised to see that, in the results of CompilePrint for a function made with Compile, calls to Functions seemed to be "actually" making a copy of the arguments, and the copies were perhaps not being optimized out. For example, compare the CompilePrint output of compiledFunction and compiledExpression below:
Needs["CompiledFunctionTools`"] (* provides CompilePrint *)
CompileSustitutable[vars_, body_] :=
 Hold[vars, body] /. Hold -> Compile
func = Function[x, Evaluate@Through@{# &, # + 1 &, #^2 &}@x]
expr = func[x]
With[
{func = func},
compiledFunction = Compile[{y}, func[y]];
];
compiledExpression = CompileSustitutable[{x}, expr];
CompilePrint@compiledFunction
CompilePrint@compiledExpression
The compiledFunction output is:
R0 = A1
I0 = 1
Result = T(R1)0
1 R1 = R0
2 R2 = I0
3 R2 = R2 + R1
4 R3 = Square[ R1]
5 T(R1)0 = {R1, R2, R3}
6 Return
Whereas the compiledExpression output is:
R0 = A1
I0 = 1
Result = T(R1)0
1 R1 = I0
2 R1 = R1 + R0
3 R2 = Square[ R0]
4 T(R1)0 = {R0, R1, R2}
5 Return
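For reference, the CompileSustitutable trick just reduces to a plain Compile call on the already-evaluated expression, so the compiled body contains {x, 1 + x, x^2} rather than a Function applied to x. A minimal check (the Evaluate below is needed only because, at top level, Hold would otherwise keep the symbol expr unevaluated):
Hold[{x}, Evaluate[expr]] /. Hold -> Compile
(* same CompiledFunction that CompileSustitutable[{x}, expr] produces *)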
Although modern CPUs may perform their own optimization (which may make any Mathematica-level optimization a bit of a moot point, excepting edge cases), I don't like relying on unknown downstream processes. Does anyone know whether the "lowest-level code output by Mathematica" is basically what CompilePrint shows, as-is? I see that there are options like CompilationTarget -> "C", so does a "normal/mainstream" C compiler still do things like inlining function calls?
Is my CompileSustitutable function, which ostensibly saves argument copying, even worth the bother?
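One way to probe this, as a sketch: ExportString accepts a CompiledFunction with the "C" format and returns the C source that would be handed to an external compiler, so the two variants above can be compared directly (cSourceFunction and cSourceExpression are just illustrative names):
cSourceFunction = ExportString[compiledFunction, "C"];
cSourceExpression = ExportString[compiledExpression, "C"];
StringLength /@ {cSourceFunction, cSourceExpression}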
FYI: a much-less-elegant but interesting first attempt, demonstrating the ability to bypass the "scoping" of Compile's arguments, is shown below (I forget who to credit for the idea of using Evaluate in this manner):
With[
{
directMethod = Unevaluated[Evaluate[x]*2]
},
C1 = Compile[
{Evaluate[x]},
directMethod
]
];
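For comparison, the more conventional way to splice a precomputed body into Compile is With, as already used above for compiledFunction; a minimal sketch (C2 is just an illustrative name, and x is assumed to have no global value):
With[{body = x*2},
 C2 = Compile[{x}, body] (* With substitutes x*2 before Compile's HoldAll sees the body *)
]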
I decided to get some quantitative results for this question, through brute force: I wrote my own version of BorderDimensions (which calculates the widths of the solid-color borders around an image) using Compile. It's non-trivial enough to actually demonstrate the point, but (I hope) small enough to post here.
Note that I'm aware my version could be "tricked" if an image contains horizontal/vertical lines of uniform color where each line differs from its adjacent line; that's not the point here, it's just a proof of concept. Also, I think there's a bug in the "function-based" version (I ran it through some test images), but the point is that you can see a ton of TensorCopy operations in its CompilePrint output, and the AbsoluteTiming results speak for themselves.
The results confirmed that manually inlining functions does in fact yield a performance boost.
I used Henrik Schumacher's suggestion of ExportString[cf, "C"] to view the C code that is actually generated, together with AbsoluteTiming. Results: the version built from function calls generated nearly twice as much C source (about 15.4k characters versus 8.9k) and took more than twice as long to execute, measured with AbsoluteTiming over 1000 iterations on a small test image.
First, here is the test data:
(* Rasterize an arbitrary expression (here just the symbol x) to get a small test image,
   then pull the raw integer pixel array out of the resulting Image object. *)
testImgRaster = Rasterize@x
testImgData =
  Rasterize[x] /. HoldPattern@Image[data__] -> List[data];
testImgDataArr = testImgData[[1]] // Normal;
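As an aside (not part of the original benchmark), ImageData would give an equivalent rank-3 integer array more directly, assuming the rasterized image uses byte-encoded channels:
testImgDataArrAlt = ImageData[testImgRaster, "Byte"]; (* integer channel values 0-255 *)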
Next, here is the benchmarking setup:
Style["PaddingCalculatorCompiledWithFunctions", Bold, Red]
codePaddingCalculatorCompiledWithFunctions =
ExportString[PaddingCalculatorCompiledWithFunctions, "C"];
StringLength[codePaddingCalculatorCompiledWithFunctions]
Do[PaddingCalculatorCompiledWithFunctions[testImgDataArr], {i, 0,
1000}] // AbsoluteTiming
Style["PaddingCalculatorCompiledWithExpressions", Bold, Red]
codePaddingCalculatorCompiledWithExpressions =
ExportString[PaddingCalculatorCompiledWithExpressions, "C"];
StringLength[codePaddingCalculatorCompiledWithExpressions]
Do[PaddingCalculatorCompiledWithExpressions[testImgDataArr], {i, 0,
1000}] // AbsoluteTiming
Style["Comparison - BorderDimensions", Bold, Red]
Do[BorderDimensions[testImgRaster], {i, 0, 1000}] // AbsoluteTiming
Here are the benchmarking results:
Version                                        C source (chars)   AbsoluteTiming (s)
PaddingCalculatorCompiledWithFunctions         15390              0.447371
PaddingCalculatorCompiledWithExpressions        8898              0.196994
BorderDimensions (built-in, for comparison)       --              0.180333
Finally, if you've read to here and want to see the actual code, here it is...
Note: I normally go way out of my way to make small, concise, purpose-built functions. But in addition to the performance benefits, I think I like the "look and feel" of the expression-based version better. The style takes some getting used to, though.
Function-based version:
PaddingCalculatorCompiledWithFunctions =
Module[{PaddingCalculatorGenerator, PaddingCalculatorParams, numRows,
numCols, PaddingCalculator, PaddingCalculatorInner},
  (* For one border: build a Function holding a Do loop that scans lines inward
     from that border, comparing each pixel of a line with that line's first
     pixel, and returns how many uniform lines it passes before a mismatch. *)
  PaddingCalculatorGenerator[primaryRange_, secondaryRange_,
comparePart1_, comparePart2_] =
Function[imageDataArgToCalculator,
Hold@Do[
Function[innerResult,
If[innerResult == -1,
Return@Abs@(primaryRange[[2]] - primaryRange[[1]])]
]@
Do[
If[
comparePart1@imageDataArgToCalculator ==
comparePart2@imageDataArgToCalculator,(*Null*)-1, Return@-1],
secondaryRange
],
primaryRange
]
];
  (* Iteration ranges and pixel accessors for the four borders:
     rows scanned forward and backward, then columns scanned forward and backward. *)
  PaddingCalculatorParams[numRows_, numCols_] =
{
{{rowIdx, 1, numRows, +1}, {colIdx, 1, numCols},
Part[#, rowIdx, colIdx] &, Part[#, rowIdx, 1] &},
{{rowIdx, numRows, 1, -1}, {colIdx, 1, numCols},
Part[#, rowIdx, colIdx] &, Part[#, rowIdx, 1] &},
{{colIdx, 1, numCols, +1}, {rowIdx, 1, numRows},
Part[#, rowIdx, colIdx] &, Part[#, 1, colIdx] &},
{{colIdx, numCols, 1, -1}, {rowIdx, 1, numRows},
Part[#, rowIdx, colIdx] &, Part[#, 1, colIdx] &}
};
PaddingCalculator =
ReleaseHold@
Function[{imageDataArgToCalculators, numRows, numCols},
imageDataArgToCalculators //
Apply[PaddingCalculatorGenerator] /@
PaddingCalculatorParams[numRows, numCols] // Through //
Evaluate
];
PaddingCalculatorInner =
With[{PaddingCalculator = PaddingCalculator},
ReleaseHold@Function[imageDataArgToMain,
Module[{numRows, numCols},
numRows = Hold@Length@imageDataArgToMain;
numCols = Hold@Length@First@imageDataArgToMain;
Hold@PaddingCalculator[imageDataArgToMain, numRows, numCols]
]
]
];
  (* Splice the assembled Function into Compile's held body via With. *)
  With[{PaddingCalculatorInner = PaddingCalculatorInner},
Compile[{{imageDataArgToCompile, _Integer, 3}},
PaddingCalculatorInner[imageDataArgToCompile]]
]
]
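To see the copying mentioned earlier, the WVM instructions of the function-based version can be listed with CompilePrint (it lives in CompiledFunctionTools`):
Needs["CompiledFunctionTools`"]
CompilePrint[PaddingCalculatorCompiledWithFunctions] (* note the many tensor-copy instructions *)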
Expression-based version:
(* Strip nested Hold wrappers from a built-up expression, keeping the result unevaluated for Compile. *)
ReleaseHoldUnevaluated[expr_] :=
ReplaceRepeated[HoldComplete[Unevaluated[expr]], Hold[x__] -> x] //
ReleaseHold
CompileSustitutable[vars_, body_] :=
Hold[vars, body] /. Hold -> Compile
Module[{PaddingCalculatorGenerator, PaddingCalculatorParams,
PaddingCalculator, PaddingCalculatorInner},
PaddingCalculatorGenerator[primaryRange_, secondaryRange_,
comparePart1_, comparePart2_] :=
With[{primaryIndex = primaryRange[[1]],
primaryStart = primaryRange[[2]]},
Hold@Do[
Function[innerResult,
If[innerResult == -1,
Return@Abs@(primaryStart - primaryIndex)] ]@
Do[
If[comparePart1 == comparePart2, Null, Return@-1],
secondaryRange
],
primaryRange
]
];
 (* Held Part access, spliced directly into the compiled body instead of a Function call. *)
 ImagePart[row_, col_] = Hold@Part[imageData, row, col];
PaddingCalculatorParams =
{
{{rowIdx, 1, numRows, +1}, {colIdx, 1, numCols},
ImagePart[rowIdx, colIdx], ImagePart[rowIdx, 1]},
{{rowIdx, numRows, 1, -1}, {colIdx, 1, numCols},
ImagePart[rowIdx, colIdx], ImagePart[rowIdx, 1]},
{{colIdx, 1, numCols, +1}, {rowIdx, 1, numRows},
ImagePart[rowIdx, colIdx], ImagePart[1, colIdx]},
{{colIdx, numCols, 1, -1}, {rowIdx, 1, numRows},
ImagePart[rowIdx, colIdx], ImagePart[1, colIdx]}
};
 (* Assemble the whole compiled body as held expressions: wrapping the heads in
    Hold keeps Module/CompoundExpression from evaluating early, until
    ReleaseHoldUnevaluated strips the Holds when building the Compile call. *)
 PaddingCalculator =
Hold[Module]
[{numRows, numCols},
Hold[CompoundExpression][
Hold[numRows = Length@imageData],
Hold[numCols = Length@First@imageData],
PaddingCalculatorGenerator @@@ PaddingCalculatorParams
]
];
PaddingCalculatorCompiledWithExpressions =
CompileSustitutable[{{imageData, _Integer, 3}},
ReleaseHoldUnevaluated[PaddingCalculator]]
]
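For a quick sanity check (not part of the timings above; the shape and order of the four widths may differ from the built-in's output):
PaddingCalculatorCompiledWithExpressions[testImgDataArr]
BorderDimensions[testImgRaster]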
Answered by Sean on January 24, 2021