Background
One-bit Compressive Sensing ($1$-bit CS) examines the efficient acquisition of sparse (or more structured) signals via linear measurement systems followed by the most severe form of quantization: retaining only $1$ bit per measurement, representing its sign. While this concept is of significant theoretical interest, 1-bit CS also has several advantages in practice:
- a $1$-bit quantizer is a comparator -- these are easy to build and can operate at low power and high rates;
- $1$-bit quantization is robust to many forms of noise and non-linearities, e.g., positively or negatively saturated analog values still maintain the correct sign; and
- under certain conditions, $1$-bit CS can outperform conventional multi-bit CS.
1-bit CS shifts the CS discussion from measurements to bits. Since each measurement is a single bit, we can expect that, depending on the desired quality of the acquired signal, it may be desirable to obtain more bits, i.e., more measurements, than the dimension of the signal. Thanks to the simplicity of the 1-bit acquisition hardware, however, these measurements are inexpensive.
The $1$-bit CS framework has several other interesting features. For example, it uses a scalar quantizer, as opposed to a vector or Sigma-Delta quantizer, meaning that each measurement is quantized independently and the system can therefore operate in a distributed manner. It also provides a geometry-preserving embedding: the Hamming distance between the quantized measurements of two signals is approximately proportional to the angle between them, resembling Locality-Sensitive Hashing (LSH) methods.
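This embedding property is easy to check numerically: for Gaussian measurement rows, the probability that two unit-norm signals produce disagreeing signs on a given measurement equals the angle between them divided by $\pi$ (a standard fact about random hyperplanes). A minimal sketch, with illustrative dimensions not taken from the text:

```python
import numpy as np

rng = np.random.default_rng(0)
N, M = 50, 2000  # illustrative signal dimension and measurement count
Phi = rng.standard_normal((M, N))

# Two unit-norm signals with a known angle between them.
x = rng.standard_normal(N)
x /= np.linalg.norm(x)
z = rng.standard_normal(N)
z /= np.linalg.norm(z)

angle = np.arccos(np.clip(x @ z, -1.0, 1.0))             # angle in [0, pi]
hamming = np.mean(np.sign(Phi @ x) != np.sign(Phi @ z))  # normalized Hamming distance

# For Gaussian rows, P[sign disagreement] = angle / pi, so these should be close.
print(hamming, angle / np.pi)
```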
Acquisition Framework
We briefly describe the $1$-bit CS framework. Measurements of a signal $x \in \mathbb{R}^{N}$ are computed via \begin{equation} \label{eq:defh} y_{s} = A(x) := \mathrm{sign}(\Phi x), \end{equation} where $\Phi \in \mathbb{R}^{M\times N}$ is a matrix representing our measurement system [note: in CS we typically consider only $M\leq N$, but in the $1$-bit setting $M > N$ is also sensible]. Thus, the measurement operator $A(\cdot)$ is a mapping from $\mathbb{R}^{N}$ to the Boolean cube $\mathcal{B}^{M} := \{-1,1\}^{M}$. At best, we hope to recover signals $x \in \Sigma_{K}^{*} := \{ x \in S^{N-1}: \| x\|_{0} \leq K\}$, where $S^{N-1} := \{ x \in \mathbb{R}^{N}: \| x\|_{2} = 1\}$ is the unit sphere in $\mathbb{R}^{N}$. We restrict our attention to sparse signals on the unit sphere since the scale of the signal is lost during quantization.
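The acquisition map can be sketched in a few lines of NumPy. The dimensions $N$, $M$, $K$ below are arbitrary illustrative choices, and the zero-handling in `one_bit_measure` is a small implementation assumption (NumPy's `sign` returns $0$ at $0$, which never occurs for generic measurements but must map into the Boolean cube):

```python
import numpy as np

def one_bit_measure(x, Phi):
    """1-bit acquisition A(x) = sign(Phi x): R^N -> {-1, +1}^M."""
    y = np.sign(Phi @ x)
    y[y == 0] = 1  # np.sign maps 0 -> 0; fold this measure-zero case onto +1
    return y

rng = np.random.default_rng(1)
N, M, K = 100, 300, 5  # note: M > N is fine in the 1-bit setting
Phi = rng.standard_normal((M, N))

# A K-sparse signal on the unit sphere (the scale is lost at quantization).
x = np.zeros(N)
x[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
x /= np.linalg.norm(x)

y = one_bit_measure(x, Phi)
print(y.shape, sorted(set(y)))  # (M,) with values in {-1.0, 1.0}
```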
Signal Recovery
To reconstruct, we enforce consistency on the signs of the estimate's measurements, i.e., that $A( \widehat{x}) = A( x)$. Specifically, we define a general non-linear decoder $\Delta^{\rm 1bit}( y_s, \Phi, K)$ such that, for $ \widehat{x} = \Delta^{\rm 1bit}( y_s, \Phi, K)$, the solution $ \widehat{x} $ is
- i) sparse, i.e., satisfies $\| \widehat{x}\|_0\leq K=\| x\|_0$; and
- ii) consistent, i.e., satisfies $A( \widehat{x}) = y_s = A( x)$.
The program defined by (i) and (ii) is computationally intractable, and thus we relax the sparsity objective to the $\ell_{1}$-norm and enforce consistency via a linear constraint. Specifically, let the matrix $Y$ have the elements of $y_{s}$ along its diagonal and zeros elsewhere. Then we can instead solve \begin{equation} \label{eq:1bit} \widehat{x} \gets \arg\min_{x\in \mathbb{R}^{N}} \|x\|_{1} ~~\mathrm{s.t.}~~Y\Phi x \geq 0~~\mathrm{and}~~\|x\|_{2} = 1. \tag{2} \end{equation} The $\ell_{1}$ objective favors sparse solutions, while the first constraint enforces consistency between the $1$-bit quantized measurements and the solution. While $(2)$ remains non-convex due to the unit-sphere requirement, several algorithms have been developed to solve it (as well as convex formulations).
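One widely used iterative heuristic for this non-convex problem is binary iterative hard thresholding (BIHT). The sketch below is a minimal version under simplifying assumptions (a fixed step size, a fixed iteration count, and a final renormalization onto the sphere), not a tuned implementation from the literature:

```python
import numpy as np

def hard_threshold(v, K):
    """Keep the K largest-magnitude entries of v and zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-K:]
    out[idx] = v[idx]
    return out

def biht(y, Phi, K, iters=100, tau=None):
    """Minimal BIHT sketch: gradient-style step toward sign consistency,
    followed by projection onto K-sparse vectors. tau is a tuning assumption."""
    M, N = Phi.shape
    tau = 1.0 / M if tau is None else tau
    x = np.zeros(N)
    for _ in range(iters):
        a = x + tau * Phi.T @ (y - np.sign(Phi @ x))  # push toward consistency
        x = hard_threshold(a, K)                      # enforce K-sparsity
    return x / np.linalg.norm(x)                      # return to the unit sphere

# Usage sketch with illustrative dimensions (not from the text).
rng = np.random.default_rng(0)
N, M, K = 50, 1000, 3
Phi = rng.standard_normal((M, N))
x = np.zeros(N)
x[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
x /= np.linalg.norm(x)
xhat = biht(np.sign(Phi @ x), Phi, K)
print(x @ xhat)  # correlation near 1 indicates successful recovery
```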
Geometric Intuition
The above figure depicts the geometry of the components of $(2)$: in two dimensions on the left, and in three dimensions on the right (animated). In the left figure, the hyperplanes (lines) $\varphi_{1}$ and $\varphi_{2}$ correspond to the first and second rows of $\Phi$, respectively. They are drawn perfectly orthogonal here, but in general this need not be the case; if the rows were drawn randomly from a Gaussian distribution, they would be approximately orthogonal. The green shaded region depicts the feasible region, i.e., the set of all $x$ satisfying $Y\Phi x \geq 0$, whose measurement signs are therefore consistent with the diagonal of $Y$. The unit sphere is represented by the circle labelled $\|x\|_{2} = 1$, and thus the only unit-norm sparse solution in the feasible region lies at $[0, 1]$, denoted by the red dot. The key feature of this picture is that each row of $\Phi$ defines a hyperplane, and each measurement sign determines on which side of that hyperplane the solution lies. Our goal during reconstruction is to find the sparsest solution within the green region. The right figure demonstrates that as we take more measurements or bits (i.e., more hyperplanes), the width of the feasible region decreases, and thus the maximum reconstruction error decreases.
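The shrinking of the feasible region can also be observed numerically: sample many points on the sphere and count the fraction that remains sign-consistent with the signal's measurements as hyperplanes are added. A Monte Carlo sketch with illustrative dimensions (nested measurement sets, so the feasible set can only shrink as $m$ grows):

```python
import numpy as np

rng = np.random.default_rng(2)
N, M = 3, 30
Phi = rng.standard_normal((M, N))  # the first m rows form nested measurement sets
x = np.array([0.0, 0.0, 1.0])      # unit-norm signal to acquire
y = np.sign(Phi @ x)

# Sample many points on the unit sphere; record the fraction that stays
# sign-consistent with the first m measurements as m grows.
P = rng.standard_normal((100_000, N))
P /= np.linalg.norm(P, axis=1, keepdims=True)

ms = (1, 5, 10, 30)
fracs = [np.all(np.sign(P @ Phi[:m].T) == y[:m], axis=1).mean() for m in ms]
print(dict(zip(ms, fracs)))  # the feasible fraction shrinks as bits are added
```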
Applications
There are several benefits to obtaining 1-bit quantized measurements. First, efficient hardware quantizers can be built to operate at high speeds, since the quantizer can be a simple comparator that merely tests whether a measurement is above or below zero. Indeed, there is an inverse relationship between sample rate and quantization bit-depth: the achievable sample rate increases exponentially as the bit-depth decreases linearly.

Second, it has been shown that the program $(2)$ can be used to recover signals even when severe non-linearities are applied to the measurements. In particular, suppose a non-linearity $f(\cdot)$ is applied to the measurements. If $f(\cdot)$ preserves the sign of the measurements, then clearly $(2)$ can still be used to recover $x$ with the same performance as with non-linearity-free measurements. Additionally, if we assume the non-linearity is monotonic, i.e., preserves the relation $$ f(x_{i}) < f(x_{i+1}) ~~ \mathrm{if} ~~ x_{i} < x_{i+1}, $$ then the program \begin{equation} \label{eq:non-linearity} \widehat{x} \gets \Delta^{1\mathrm{bit}}(\mathrm{sign}(\mathrm{diff}(f(\Phi x))), D\Phi, K), \end{equation} can be used to recover $x$, where $D$ is the $(M-1)\times M$ difference matrix with $-1$'s along the diagonal and $1$'s along the first super-diagonal, so that $\mathrm{diff}(y)_{i} = y_{i+1} - y_{i}$ for $i=1,\ldots,M-1$.

Finally, it has been shown that in cases of significant signal noise (low input SNR), i.e., if \begin{equation} y_{\mathrm{noisy}} = \mathrm{sign}(\Phi(x+n)) ~~~\mathrm{with}~~~~\|n\|_2 > 0, \end{equation} then $1$-bit CS often outperforms multi-bit CS for a fixed total bit-budget.
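The monotonic non-linearity claim can be sanity-checked directly: differences of measurements preserve their ordering under any strictly increasing $f$, so the signs of $\mathrm{diff}(f(\Phi x))$ match those of $\mathrm{diff}(\Phi x)$, i.e., 1-bit measurements under the difference operator. A small sketch, using $f = \tanh$ as an arbitrary stand-in for an unknown monotonic distortion:

```python
import numpy as np

rng = np.random.default_rng(3)
N, M = 40, 200
Phi = rng.standard_normal((M, N))
x = rng.standard_normal(N)
x /= np.linalg.norm(x)

u = Phi @ x
f = np.tanh  # stand-in for an unknown but strictly increasing distortion

lhs = np.sign(np.diff(f(u)))     # computable from the distorted measurements
rhs = np.sign(np.diff(u))        # signs of the difference measurements of Phi x
print(np.array_equal(lhs, rhs))  # True: monotonicity preserves difference signs
```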