Learning Maximum Margin Channel Decoders
The success of machine learning (ML) based methods in various domains has spurred a great contemporary interest in the application of ML algorithms for communication problems. However, state-of-the art communication systems heavily rely on expert-based design, and so obtaining a significantly improved performance with ML algorithms typically necessitates the use of the most advanced ML algorithms, most notably, deep neural networks. In accordance, this also typically prevents a theoretical justification for the developed algorithm, as well as theoretical performance guarantees. Therefore, in order to address the theoretical aspects of learning in communication systems, in this talk an alternative route is taken, and the problem of learning a channel decoder is considered for two basic channel models. The first model is an additive noise channel whose noise distribution is unknown and nonparametric. The learner is provided with a fixed codebook and a dataset comprised of independent samples of the noise, and is required to select a precision matrix for a nearest neighbor decoder in terms of the Mahalanobis distance. The second model is a non-linear channel with additive white Gaussian noise and unknown channel transformation. The learner is provided with a fixed codebook and a dataset comprised of independent input-output samples of the channel, and is required to select a matrix for a nearest neighbor decoder with a linear kernel. For both models, the objective of maximizing the margin of the decoder is addressed. Accordingly, for each channel model, a regularized loss minimization problem with a codebook-related regularization term and hinge-like loss function is developed, which is inspired by the support vector machine paradigm for classification problems. Expected generalization error bounds for the error probability loss function are provided for both models, under optimal choice of the regularization parameter. For the additive noise channel, a theoretical guidance for choosing the training signal-to-noise ratio is proposed based on this bound. In addition, for the non-linear channel, a high probability uniform generalization error bound is provided for the hypothesis class. For each channel, a stochastic sub-gradient descent algorithm for solving the regularized loss minimization problem is proposed, and an optimization error bound is stated. The performance of the proposed algorithms is demonstrated through several examples.
* M.Sc. student under the supervision of Professor Nir Weinberger.