The previous behavior simply kept incrementing the lock,
which is wrong. Emulate the behavior of the x86 implementation
here by performing an atomic value exchange instead.
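A minimal sketch of the exchange-based approach, for illustration
only. The names lock_word, try_lock and unlock are hypothetical and
the 0 = unlocked / 1 = locked convention is an assumption; the actual
patched code may differ. It relies on the GCC/Clang __atomic builtins.

```c
#include <stdbool.h>

/* Hypothetical lock word; assumed convention: 0 = unlocked, 1 = locked. */
static int lock_word = 0;

/* Try to take the lock once; returns true if it was acquired. */
static bool try_lock(int *lock)
{
    /* Atomically store 1 and fetch the previous value. The stored
     * value is always exactly 1, so repeated failed attempts cannot
     * make it grow the way an increment-based scheme would. */
    return __atomic_exchange_n(lock, 1, __ATOMIC_ACQUIRE) == 0;
}

/* Release the lock by storing the unlocked value. */
static void unlock(int *lock)
{
    __atomic_store_n(lock, 0, __ATOMIC_RELEASE);
}
```

The caller still has to compare the returned old value against 0,
which is the value check that a test-and-set primitive would hide.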
We cannot use __atomic_test_and_set (which would eliminate
the value check), since it 1) operates on a single byte, which
is okay on little-endian systems but wrong on big-endian systems,
and 2) stores an implementation-defined 'true' value (only
guaranteed to be nonzero).
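To illustrate both points, the sketch below applies
__atomic_test_and_set to an int-sized lock word and returns the
resulting word. The helper name word_after_test_and_set is
hypothetical, and the lock being an int is an assumption.

```c
/* Hypothetical helper: run test-and-set on an int-sized lock word
 * and return what the whole word looks like afterwards. */
static int word_after_test_and_set(void)
{
    int lock = 0;

    /* Only the byte at the lowest address is written. That is the
     * least significant byte on little-endian systems, but the most
     * significant byte on big-endian systems. */
    (void)__atomic_test_and_set((unsigned char *)&lock, __ATOMIC_ACQUIRE);

    /* The result is nonzero, but its exact value depends on byte
     * order and on the implementation-defined 'set' value, so code
     * that compares the word against 1 cannot rely on it. */
    return lock;
}
```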
Fixes https://github.com/void-linux/void-packages/issues/26109