When do decimal-coded binary numbers XOR to zero?

Puzzling Asked by A. Rex on January 6, 2021

Background definition: XOR on numbers

Given two non-negative integers $$x$$ and $$y$$, let $$xoplus y$$ denote the bitwise exclusive or (XOR) of the numbers $$x$$ and $$y$$. This is the result of writing $$x$$ and $$y$$ in binary notation, then XORing corresponding bits (also known as "addition without carry"), and finally interpreting the result as a number again. This operation is often represented by the expression x ^ y in programming languages and is also known as nim-addition. This operation is commutative and associative, so one can take the XOR of more than one number in any order and simply write $$xoplus yoplus z$$.

It’s important to note that we view $$xoplus y$$ as an operation on numbers here, not on strings, which is consistent with its behavior in various programming languages. For example, Python will happily report that 123 ^ 456 ^ 789 == 678, which you can see by examining the columns of the following table, where we use a subscript like $$123_{10}$$ to mean decimal notation and $$0001111011_2$$ to mean binary notation:

begin{align*} 123_{10} &= 0001111011_2 456_{10} &= 0111001000_2 789_{10} &= 1100010101_2 678_{10} &= 1010100110_2. end{align*}

[Hint: check that each column of bits has an even number of $$1$$s.]

Good collections of numbers have vanishing XOR

Motivated by a recent challenge on the Code Golf Stack Exchange, "For what block sizes is this checksum valid?" (which was the #1 Hot Network Question for a while!), let us call a collection of numbers good if their XOR is zero, and bad otherwise. For example, the set $${123_{10}, 456_{10}, 789_{10}, 678_{10}}$$ is good. In other words, a collection is good if "the checksum is valid".

Given a bitstring like $$101$$, one can interpret the string in binary as $$101_2 = 5_{10}$$, or in decimal notation as $$101_{10} = 1100101_2$$. By analog to binary-coded decimal, one might call the latter interpretation decimal-coded binary (of $$101$$), meaning we use an entire decimal digit to represent a single binary bit.

Given a collection of bitstrings, like $${10, 100, 110}$$, one can interpret the strings in binary like $${10_2, 100_2, 110_2}$$ or in decimal notation like $${10_{10}, 100_{10}, 110_{10}}$$. Sometimes, like in this case, it happens that both the interpretations are good, meaning $$10_2oplus 100_2oplus 110_2 = 0$$ and $$10_{10} oplus 100_{10} oplus 110_{10} = 0$$ as well:

begin{align*} 10_{10} &= 0001010_2 100_{10} &= 1100100_2 110_{10} &= 1101110_2. end{align*}

Unfortunately, sometimes the binary interpretation is good but the decimal interpretation is bad, like in the case $${100,1000,1100}$$ which is good in binary because $$100_2 oplus 1000_2 oplus 1100_2 = 0$$ but where unfortunately the decimal interpretation has $$100_{10} oplus 1000_{10} oplus 1100_{10} = 1984_{10}$$:

begin{align*} 100_{10} &= 00001100100_2 1000_{10} &= 01111101000_2 1100_{10} &= 10001001100_2 1984_{10} &= 11111000000_2. end{align*}

Of course, it is very often the case that both binary and decimal interpretations are bad; this is the usual situation for a random collection. (Checksums are unlikely to be valid by accident.) Thus we have examples of "binary good and decimal good", "binary good but decimal bad", and "binary bad and decimal bad". That leaves a final possibility: "binary bad but decimal good".

The puzzle

Find a collection of bitstrings that is good when interpreted as decimal numbers but is bad when interpreted as binary numbers.

That is, find a collection of bitstrings (ie, strings of $$0$$s and $$1$$s) $${abcdotsc def,quad dotsb,quad uvwdotsc xyz }$$ so that the "decimal-coded binary XOR" vanishes, meaning $$abcdotsc def_{10} oplus dotsb oplus uvwdotsc xyz_{10} = 0$$ but their "usual XOR" doesn’t, meaning $$abcdotsc def_{2} oplus dotsb oplus uvwdotsc xyz_{2} ne 0.$$

This was obtained with a computer search using Magma, with the following code:

n:=25;
V:=VectorSpace(GF(2),2*n);
W:=sub<V|{V.i:i in {1..n}}>;
S:={};
A:=[];
for i in {1..200} do
t:=IntegerToSequence(i,2);
s:=IntegerToSequence(SequenceToInteger(t,10),2);
Include(~S,V!(t cat [0:j in {1..n-#t}] cat s cat [0:j in {1..n-#s}]));
if (Dimension(sub<V|S> meet W) ge 1) then print i; break; end if;
end for;


This code searches for the set of integers with this property with the smallest maximum element. For each integer ?, we create the binary expansion of this integer as ?, and then ? is the "decimal expansion" of the same string of 0s and 1s, converted back to binary. By appending these strings together in a single vector, we can look for solutions where the last half of the vector (the decimal expansion) XORs to 0, but the first half does not. The first integer for which such a set occurs is 168, which gives us the above set of integers.

A little more detail

The code above shows that there is a solution set with largest element 10,101,000, but does not actually produce it. The following code actually produces the solution:

n:=25;
lim:=200;
V:=VectorSpace(GF(2),2*n);
W:=sub<V|{V.i:i in {1..n}}>;
A:=[];
S:={};
for i in {1..lim} do
t:=IntegerToSequence(i,2);
s:=IntegerToSequence(SequenceToInteger(t,10),2);
jst:=t cat [0:j in {1..n-#t}] cat s cat [0:j in {1..n-#s}];
Append(~A,jst);
Include(~S,V!jst);
if (Dimension(sub<V|S> meet W) ge 1) then
v:=Basis(sub<V|S> meet W)[1];
M:=KMatrixSpace(GF(2),i,2*n);
B:=M!A;
x:=Solution(B,v);
tt:={j:j in {1..i}|x[j] eq 1};
{SequenceToInteger(IntegerToSequence(i,2),10):i in tt};
break;
end if;
end for;


The extra code sets v as the target vector, and then uses linear algebra to find the combination of vectors calculated above that sum to v, and converts them to the appropriate integers.

Answered by Jeremy Dover on January 6, 2021

A friend of mine found another solution but, despite my cajoling, declines to post here.

As we know from the other answer, any solution must

The code below finds a couple solutions achieving this bound and

One such solution is

or in decimal notation,

M = 2**8
x = [int(bin(n)[2:]) for n in range(M)]

for a in range(M):
for b in range(a):
for c in range(b):
for d in range(c):
if x[a]^x[b]^x[c]^x[d] == 0 and a^b^c^d != 0:
print(a, b, c, d, x[a], x[b], x[c], x[d])


Try it online!

However, this leaves open a possibility: is there a solution with only three strings?

Answered by A. Rex on January 6, 2021