Second Workshop on
Optimization for Image and Signal Processing
- Program -
- Abstracts -
- Dynamical behavior of a stochastic forward-backward algorithm
Maximal monotone operators are set-valued mappings which extend (but are not limited to) the notion of the subdifferential of a convex function. The proximal point algorithm and the forward-backward algorithm are among the most studied algorithms for iteratively finding a zero of a sum of two maximal monotone operators A and B. The purpose of this talk is to investigate the case where the operators A and B are either unobserved or difficult to handle numerically. We propose an online version of the forward-backward algorithm where, at each iteration, A and B are replaced with a realization of random maximal monotone operators whose expectations (in the Aumann sense) coincide with A and B respectively. Under the assumption of a decreasing step size, we prove that the iterates "shadow" the behavior of a continuous-time differential inclusion and, under suitable conditions, converge almost surely to a zero of A+B. Applications to constrained convex optimization are considered.
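A concrete Euclidean special case of this scheme can be sketched as follows (a toy illustration, not the analysis of the talk): A is the gradient of a least-squares loss observed only through noisy realizations, and B is the normal cone of a convex constraint set, whose backward (resolvent) step is a projection. The matrix, constraint, and step-size schedule below are all invented for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: minimize f(x) = 0.5*||M x - b||^2 over the unit ball C.
# Here A = grad f (seen only through noisy realizations) and B is the
# normal cone of C, whose backward step is the projection onto C.
M = rng.standard_normal((20, 5))
b = rng.standard_normal(20)

def proj_ball(x):
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n

x = np.zeros(5)
for k in range(1, 5001):
    gamma = 1.0 / k**0.75                                    # decreasing step size
    noisy_grad = M.T @ (M @ x - b) + rng.standard_normal(5)  # random realization of A
    x = proj_ball(x - gamma * noisy_grad)                    # forward step, then backward step
```

With the decreasing steps, the random perturbations average out and the iterates settle near the constrained minimizer.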
Jose Bioucas Dias
- Convex and Nonconvex Formulations for Image Segmentation Using Hidden Fields
Image segmentation is fundamentally a discrete problem. It consists in finding a partition of the image domain such that the pixels in each element of the partition exhibit some kind of similarity. Very often, the partitions are obtained via integer optimization, which is NP-hard apart from a few exceptions. We sidestep the discrete nature of image segmentation by formulating the problem in the Bayesian framework and introducing a set of hidden real-valued random fields informative with respect to the probability of the partitions. Armed with this model, and assuming a supervised scenario, the original discrete optimization is converted into a convex problem, which is solved efficiently using the SALSA solver. In the semi-supervised scenario, we are led to a nonconvex problem, which is addressed with alternating optimization. The effectiveness of the proposed methodology is illustrated on simulated and real segmentation problems.
- Accelerated and preconditioned Douglas-Rachford: algorithms for the solution of variational imaging problems
We present and discuss accelerated and preconditioned versions of the
Douglas-Rachford (DR) splitting method for the solution of convex-concave
saddle-point problems which often arise in variational imaging.
These algorithms solely depend on implicit steps such as proximal operator
evaluation as well as solution of linear equations and do not require
step-size constraints. While the basic DR iteration admits weak and
ergodic convergence with rate $O(1/k)$ for restricted primal-dual gaps,
acceleration yields, under appropriate strong convexity assumptions,
convergence rates of $O(1/k^2)$ and $O(q^k)$ for $0 < q < 1$.
Furthermore, all methods allow for the incorporation of preconditioners
for the implicit linear step while preserving the convergence properties.
Neither the error nor the step sizes have to be controlled when the implicit steps are computed inexactly.
The methods are applied to non-smooth and convex variational imaging
problems. We discuss in particular variational models with total variation
(TV) as well as total generalized variation (TGV) penalty. Preconditioners
which are specific to these penalties are presented, the results of
numerical experiments are shown and the benefits of the respective
accelerated and preconditioned algorithms are discussed.
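The basic (unaccelerated, unpreconditioned) DR iteration can be sketched on a toy composite problem; the choice of f, g, and the step size t below is purely illustrative.

```python
import numpy as np

def soft(v, t):
    """Proximal operator of t*||.||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# min_x f(x) + g(x) with f = 0.5*||x - b||^2 and g = lam*||x||_1.
b = np.array([3.0, -0.5, 1.2, 0.1])
lam, t = 1.0, 1.0

prox_f = lambda v: (v + t * b) / (1.0 + t)   # implicit step on f
prox_g = lambda v: soft(v, t * lam)          # implicit step on g

z = np.zeros_like(b)
for _ in range(200):
    x = prox_f(z)
    z = z + prox_g(2 * x - z) - x            # basic DR update
x = prox_f(z)
```

For this separable problem the minimizer is known in closed form (componentwise soft-thresholding of b), which makes the convergence easy to check.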
- Total Variation Regularization of manifold-valued data
We present a proximal algorithm to minimize total variation functionals of $\ell^p$ type for manifold-valued data. The algorithm is based on iterative geodesic averaging which makes it applicable to a large class of manifolds. A parallel version of the algorithm is also discussed. For the class of Cartan-Hadamard manifolds (which includes the data space in diffusion tensor imaging) we show the convergence of the proposed minimizing algorithms to a global minimizer. As an illustration, we consider the problem of denoising signals and images which take their values in a manifold: We show some examples on diffusion tensor images, interferometric SAR images, and also an interesting application to shape signals.
This iterative algorithm can also be used to compute approximate solutions of the Mumford-Shah functional for manifold-valued data. Convergence of the algorithm to a global minimizer is established in the one-dimensional case, and we show examples of its application to the edge-preserving regularisation of diffusion tensor images.
This is a joint work with Martin Storath (EPFL) and Andreas Weinmann (TU Munich).
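In the Euclidean case, geodesic averaging reduces to ordinary averaging along line segments, so the cyclic proximal scheme can be sketched for scalar signals (an illustrative toy, not the manifold-valued algorithm itself; the signal, noise level, and parameters below are assumptions).

```python
import numpy as np

rng = np.random.default_rng(1)
clean = np.concatenate([np.zeros(25), np.ones(25)])
noisy = clean + 0.2 * rng.standard_normal(50)
alpha = 0.3                                  # TV regularization weight

x = noisy.copy()
for k in range(1, 301):
    lam = 4.0 / k                            # decreasing proximal step
    # proximal step for the data term 0.5*||x - noisy||^2
    x = (x + lam * noisy) / (1.0 + lam)
    # proximal steps for each term alpha*|x[i+1] - x[i]|: both endpoints
    # move toward each other along the connecting segment ("geodesic")
    for i in range(len(x) - 1):
        d = x[i + 1] - x[i]
        m = min(lam * alpha, abs(d) / 2.0)
        x[i] += m * np.sign(d)
        x[i + 1] -= m * np.sign(d)

tv = lambda v: np.abs(np.diff(v)).sum()
```

On a manifold, the averaging steps would move points along geodesics instead of straight lines, which is what makes the scheme applicable to diffusion tensor or SAR data.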
- Optimality Conditions in State Constrained Optimal Control
Optimal control theory concerns optimization models where both the dynamics and the cost involve time-dependent parameters whose values may be restricted to a set (possibly time dependent). In addition, state trajectories may be required to remain in a prescribed domain. In this talk I will discuss some recent advances in necessary optimality conditions for constrained deterministic finite-dimensional control problems. Some of these results have been, or are being, extended to optimal control problems involving PDEs.
- Optimizing over surface area measures of convex bodies
We are interested in shape optimization problems under
convexity constraints. Using the notion of surface area measure
introduced by H. Minkowski and refined by A. D. Alexandrov,
we show that several well-known problems of this class (constant
width body of maximum volume, Newton's body of minimum resistance)
can be formulated as linear programming (LP) problems on Radon measures.
Then we show that these measure LPs can be solved approximately
with a hierarchy of semidefinite programming (SDP) relaxations.
The support function of the almost optimal convex body is then
recovered from its surface area measure by solving the Minkowski
problem with the SDP hierarchy.
Joint work with Terence Bayen (Montpellier),
Roxana Hess (Toulouse) and Jean-Bernard Lasserre (Toulouse).
- Quasi-isometric embedding of (convex) vector sets in quantized compressed sensing
Compressed sensing theory (CS) shows that a "signal" can be reconstructed from a few linear, and most often random, observations. Interestingly, this reconstruction is made possible if the number of observations (or measurements) is adjusted to the (expected) intrinsic complexity of the signal (e.g., its sparsity). This principle is thus a generalization of the Shannon-Nyquist sampling theorem where the sampling rate is set by the signal frequency bandwidth. However, a significant aspect of CS systems is quantization (or digitization) of the acquired observations before further processing or for the purpose of transmission and compression. This presentation will stress the impact of the non-linear transformation induced by (scalar) quantization on the signal reconstruction guarantees. In contrast to former attempts considering quantization distortion as an additive Gaussian measurement noise, i.e., promoting a Euclidean (l2) fidelity with the signal observations, better signal reconstruction methods are reached by forcing "consistency" between the re-observed signal estimate and the quantized observations. Inspired by recent discoveries in 1-bit quantized compressed sensing, our developments will lead us to new non-linear embeddings of (possibly convex) vector sets in the quantized projection domain. These embeddings differ from the common restricted isometry property (RIP) used in "linear" compressed sensing in that they contain an additive distortion term, that is, they display a quasi-isometric nature directly connected to the information loss induced by quantization. Moreover, quantized random sub-Gaussian projections define such a quasi-isometry with small distortions as soon as the number of measurements grows like the intrinsic complexity of the vector set, as measured by its Gaussian mean width. We will finally show how these quasi-isometric embeddings allow us to analyze consistent reconstruction methods based on basis pursuit relatives.
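The quasi-isometric behavior is easy to observe numerically in the 1-bit case, where the normalized Hamming distance between sign patterns of Gaussian projections concentrates around the angular distance between the signals (the dimensions and number of measurements below are arbitrary choices for the demo).

```python
import numpy as np

rng = np.random.default_rng(7)
m, n = 2000, 50                      # many 1-bit measurements, ambient dimension
A = rng.standard_normal((m, n))      # random Gaussian projections

x = rng.standard_normal(n); x /= np.linalg.norm(x)
y = rng.standard_normal(n); y /= np.linalg.norm(y)

# fraction of sign disagreements between the two 1-bit measurement vectors
hamming = np.mean(np.sign(A @ x) != np.sign(A @ y))
# angular distance between the signals, normalized to [0, 1]
angle = np.arccos(np.clip(x @ y, -1.0, 1.0)) / np.pi
```

For large m the two quantities match up to a small additive distortion, which is exactly the quasi-isometric (rather than isometric) character discussed above.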
- A Growth Model in Computational Anatomy based on Diffeomorphic Matching
The Large Deformation Diffeomorphic Metric Mapping (LDDMM) framework has proved to be highly efficient for addressing the problem of modelling and analysing the variability of populations of shapes, allowing for the direct comparison and quantification of diffeomorphic morphometric changes. However, the analysis of medical imaging data also requires the processing of more complex changes, which especially appear during growth or aging phenomena. The observed organisms are subject to transformations over time which are no longer diffeomorphic, at least in a biological sense. One reason might be the gradual creation of new material uncorrelated with the preexisting one. For this purpose, we propose to extend the LDDMM framework to address the problem of non-diffeomorphic structural variations in longitudinal scenarios during a growth or degenerative process. We keep the central geometric concept of a group of deformations acting on a shape space. However, the shapes will be encoded by a new, enriched mathematical object allowing, through partial mappings, an intrinsic evolution dissociated from external deformations. We focus on the specific case of the growth of animal horns.
- Multiscale adaptive learning of high-dimensional data
Many data sets in image analysis and signal processing are in a high-dimensional space but exhibit a low-dimensional structure. We will discuss a multiscale geometric method to build a dictionary which provides sparse representations for these data. Our method is based on a multiscale partition of the data and then constructing piecewise affine approximations. It features adaptivity in the sense that our algorithm automatically learns the distribution of the data and chooses the right partition to use.
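One refinement step of such a multiscale construction might look as follows (a hedged sketch: real multiscale methods use adaptive tree-structured partitions and stopping rules, and the half-circle data set here is only an illustration of a low-dimensional structure in a higher-dimensional space).

```python
import numpy as np

def affine_fit(X, d):
    """Best d-dimensional affine approximation of X: mean + top-d PCA directions."""
    mu = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
    P = Vt[:d]                                  # local dictionary directions
    return mu + (X - mu) @ P.T @ P              # projection onto the affine piece

def sse(X, d):
    return ((X - affine_fit(X, d)) ** 2).sum()  # squared approximation error

# Data on a 1-D manifold (half circle) embedded in R^2.
t = np.linspace(0.0, np.pi, 200)
X = np.c_[np.cos(t), np.sin(t)]

coarse_err = sse(X, 1)                             # one global affine piece
left, right = X[X[:, 0] < 0.0], X[X[:, 0] >= 0.0]  # one level of refinement
fine_err = sse(left, 1) + sse(right, 1)            # piecewise affine approximation
```

Refining the partition strictly reduces the approximation error for curved data, which is what drives the adaptive choice of scale.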
- Exact and inexact proximal algorithms for convex and non-convex optimization, with applications in image processing
Inverse problems in imaging are sometimes cast in the form of a non-smooth optimization problem involving an objective function that is the combination of a data misfit term and regularizing penalty or constraint. In this talk I will discuss iterative algorithms for the numerical solution of such problems. The algorithms are based on the availability of the proximal operator of the non-smooth part of the objective function.
In particular, I will discuss convergence when the proximal operator can only be calculated inexactly. Verifiable criteria for such an inexact computation are also given in some cases of practical interest. I will also discuss convergence results in the case of convex and non-convex optimization problems. Finally, some applications in imaging are presented.
This presentation is based on joint work with S. Bonettini, F. Porta, M. Prato and S. Rebegoldi.
- Hyperparameter optimization with approximate gradients
Most models in machine learning contain at least one hyperparameter to control for model complexity. Choosing the best hyperparameters is both crucial in terms of model accuracy and computationally challenging. In this talk I will describe an algorithm that performs hyperparameter optimization using approximate gradients that arise during early iterations of the training process. This translates into faster convergence than other gradient-based methods since the hyperparameters can be updated well before the training process has finished. By bounding the error in the gradient approximation, we can develop an algorithm that is both efficient and has convergence guarantees. We apply our framework to the estimation of regularization constants of $\ell_2$-regularized logistic regression.
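A drastically simplified illustration of the idea (not the algorithm of the talk): here a closed-form ridge regression stands in for the inner training process, and a finite-difference quotient stands in for the approximate hypergradient; all data and parameters are invented.

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 60, 10
Xtr, Xval = rng.standard_normal((n, p)), rng.standard_normal((n, p))
w_true = rng.standard_normal(p)
ytr = Xtr @ w_true + 0.5 * rng.standard_normal(n)
yval = Xval @ w_true + 0.5 * rng.standard_normal(n)

def val_loss(log_lam):
    """Validation loss of the ridge solution for a given log-regularization."""
    lam = np.exp(log_lam)
    w = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(p), Xtr.T @ ytr)  # inner problem
    return 0.5 * np.mean((Xval @ w - yval) ** 2)

log_lam, eps, step = 2.0, 1e-4, 1.0
for _ in range(100):
    # approximate hypergradient via central finite differences
    g = (val_loss(log_lam + eps) - val_loss(log_lam - eps)) / (2 * eps)
    log_lam -= step * g
```

The point of the talk is that the inner solve need not be exact: cheap, early-iteration approximations already give usable hypergradients with convergence guarantees.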
Clarice MHS Poon
- On the use of total variation for accurate reconstructions from highly incomplete Fourier data
The seminal work of Candes, Romberg and Tao in 2006 proved that one can perfectly recover a gradient sparse vector from a highly incomplete subset of its Fourier coefficients, chosen uniformly at random. However, in practice, a uniform random choice of the Fourier coefficients is highly sub-optimal, and variable density sampling schemes, where one samples more densely at low frequencies, have been shown to be far more effective. My talk will present results from the work cited below that provide some theoretical justification for the use of variable density sampling schemes.
 C. Poon. On the role of total variation in compressed sensing. SIAM J. Imag. Sci. 8(1): 682-720, 2015
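A variable density sampling mask of the kind mentioned above can be sketched as follows; the 1/(1+|k|) density is one arbitrary illustrative choice, not the scheme analyzed in the cited paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m = 256, 64                       # signal length, number of Fourier samples

freqs = np.fft.fftfreq(n) * n        # integer frequencies -n/2 .. n/2 - 1
p = 1.0 / (1.0 + np.abs(freqs))      # sampling density decaying with |frequency|
p /= p.sum()
picked = rng.choice(n, size=m, replace=False, p=p)

low = np.abs(freqs[picked]) <= n // 8  # which picks are low-frequency
```

Such a mask samples the low frequencies far more densely than a uniform random choice of the same size would.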
- On local convergence of the method of alternating projection
The method of alternating projections is a classical tool to solve feasibility problems. Here we prove local convergence of alternating projections between subanalytic sets under a mild regularity hypothesis on one of the sets. We also show that the speed of convergence is $O(k^{-\rho})$ for some $\rho \in (0, \infty)$. As these hypotheses are satisfied in practical situations, we obtain a theoretical explanation for the fact, observed in practice, that even without convexity alternating projections converge well in the neighborhood of $A \cap B$. As an application, we obtain a local convergence proof for the classical Gerchberg-Saxton error reduction algorithm in phase retrieval.
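The method itself is elementary; an illustrative convex instance with two sets in the plane (the particular sets are assumptions for the demo, and in the convex case convergence is classical).

```python
import numpy as np

# Two convex sets in R^2: the unit disc A and the line B = {x : x[0] + x[1] = 1}.
def proj_A(x):
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n          # projection onto the disc

nvec = np.array([1.0, 1.0])
def proj_B(x):
    return x - (nvec @ x - 1.0) / (nvec @ nvec) * nvec  # projection onto the line

x = np.array([5.0, -3.0])
for _ in range(100):
    x = proj_A(proj_B(x))                     # alternate the two projections
```

Since the two sets intersect, the iterates converge to a point of the intersection; the nonconvex results of the talk establish the analogous local behavior for subanalytic sets.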
- A simple algorithm for nonconvex and nonsmooth minimization problems
We introduce a new algorithm for a broad class of nonconvex and nonsmooth problems.
It relies on an elementary mixture of first order methods and data information. We outline
a self contained convergence analysis framework describing the main tools and methodology
to prove that the sequence generated by the proposed scheme globally converges to a critical
point. The resulting scheme involves elementary iterations and is particularly adequate for solving
many problems arising in a wide variety of fundamental applications.
- Modern Lagrangian Relaxation
The classical Lagrangian relaxation
is a useful tool to solve difficult optimization problems,
especially those that are large-scale with separable structure. The approach, however, provides a solution to a problem that is bidual to the original one, and as such does not guarantee primal feasibility. To address this issue, extended Lagrangians such as the so-called ``Sharp'' and ``Proximal'' ones can be used, but there is a loss of separability.
To avoid having to choose between two evils, we
combine the recent on-demand accuracy bundle methods with
those extended Lagrangians in a manner that
offers a good compromise between separability and primal convergence.
We outline basic ideas and computational questions, highlighting the main features and challenges with examples from energy optimization. Credit to various co-authors will be given during the presentation.
- Application of a generalized iterative soft-thresholding algorithm to the MEG inverse problem
Magnetoencephalography allows one to determine the neural activity in the brain in a non-invasive way and with a high time resolution ($\sim$ 1ms). Magnetic fields outside the scalp, induced by the electrical activity inside the brain, are measured. The reconstruction of this electrical activity involves the resolution of a highly ill-posed inverse problem. In order to solve this problem, a sparsity-based regularization can be used. Although the results of this method are satisfactory with respect to the spatial dependence, they introduce unrealistic discontinuities in time. Here, we propose another regularization which also promotes a smooth time variation. We show how the resulting optimization problem can be solved with a generalized iterative soft-thresholding algorithm.
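The plain (ungeneralized) iterative soft-thresholding step for a sparsity-regularized least-squares problem reads as below; the MEG-specific forward operator and the temporal smoothness term of the talk are not modeled here, and all data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((40, 100))           # stand-in for the forward operator
x_true = np.zeros(100)
x_true[[5, 37, 80]] = [2.0, -1.5, 1.0]       # sparse source configuration
y = A @ x_true                               # measurements

lam = 0.1
L = np.linalg.norm(A, 2) ** 2                # Lipschitz constant of the gradient
soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# ISTA for min 0.5*||A x - y||^2 + lam*||x||_1:
# gradient step on the data term, then soft-thresholding.
x = np.zeros(100)
for _ in range(500):
    x = soft(x - (A.T @ (A @ x - y)) / L, lam / L)
```

The "generalized" algorithm of the talk replaces the simple soft-thresholding step with the proximal operator of a composite spatio-temporal penalty.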
- Phase retrieval for the wavelet transform
A phase retrieval problem is an inverse problem in which we aim at reconstructing an unknown vector belonging to a complex vector space from the magnitude of linear measurements. I will present a particular phase retrieval problem, with numerous applications in audio processing: the case where the linear measurements correspond to the wavelet transform. For a particular choice of wavelets, we can prove this problem to be well-posed: any function is uniquely determined by the magnitude of its wavelet transform, and the reconstruction satisfies a property of local stability to noise.
From an algorithmic point of view, this problem, as all phase retrieval problems, is hard to solve, because it is non-convex: most algorithms suffer from the presence of local optima. I will explain how to reformulate the problem to obtain a new algorithm, less sensitive to this phenomenon.
- Asynchronous parallel computing in signal processing and machine learning
We propose ARock, an asynchronous parallel algorithmic framework for finding a fixed point to a nonexpansive operator, which abstracts many models in signal processing and machine learning. In the framework, a set of agents (machines, processors, or cores) updates a sequence of randomly selected coordinates of the unknown variable in a parallel asynchronous fashion. As special cases of ARock, novel algorithms in linear algebra, convex optimization, machine learning, distributed and decentralized optimization are introduced. We show that if the nonexpansive operator has a fixed point, then with probability one the sequence of points generated by ARock converges to a fixed point. Very encouraging numerical performance of ARock is observed on solving linear equations, sparse logistic regression, and other large-scale problems in recent data sciences.
This is joint work with Zhimin Peng, Yangyang Xu, and Ming Yan.
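The coordinate-update template behind ARock can be imitated serially (a sketch only: true asynchrony, communication delays, and multiple agents are not simulated, and the nonexpansive operator T below is a toy choice whose fixed points solve a linear system).

```python
import numpy as np

rng = np.random.default_rng(5)
n = 20
B = rng.standard_normal((n, n))
Q = B @ B.T + n * np.eye(n)           # symmetric positive definite
b = rng.standard_normal(n)
beta = 1.0 / np.linalg.norm(Q, 2)     # makes T nonexpansive

T = lambda x: x - beta * (Q @ x - b)  # fixed points of T solve Q x = b

x = np.zeros(n)
eta = 1.0                             # relaxation parameter
for _ in range(20000):
    i = rng.integers(n)               # a randomly selected coordinate
    x[i] -= eta * (x - T(x))[i]       # update only coordinate i
```

In ARock, many agents perform such single-coordinate updates in parallel on possibly stale copies of x; the almost-sure convergence result covers that asynchronous regime.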
- Iteration Complexity Analysis for Some Operator Splitting Methods
We discuss the convergence rates of the Douglas-Rachford and Peaceman-Rachford schemes, two popular operator splitting methods originating from the PDE literature with wide applications in many areas, including image processing. Our main goal is to derive worst-case convergence rates measured by the iteration complexity. That is, we show that the k-th iterate of such an operator splitting scheme provides an approximate solution whose accuracy is at worst O(1/k). We focus on convex programming contexts and specify these rates for algorithms that are widely used in many application domains, such as the alternating direction method of multipliers and its variants. We also examine the possibility of accelerating these schemes to achieve higher rates such as O(1/k^2).
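As one widely used instance mentioned above, ADMM for an $\ell_1$-regularized least-squares problem can be sketched as follows (the problem data and parameters are illustrative and not tied to the complexity analysis of the talk).

```python
import numpy as np

rng = np.random.default_rng(6)
A = rng.standard_normal((30, 60))
y = rng.standard_normal(30)
lam, rho = 0.5, 1.0

# min_x 0.5*||A x - y||^2 + lam*||z||_1  subject to  x = z
soft = lambda v, t: np.sign(v) * np.maximum(np.abs(v) - t, 0.0)
P = np.linalg.inv(A.T @ A + rho * np.eye(60))  # factor once, reuse every iteration
Aty = A.T @ y

x = z = u = np.zeros(60)
for _ in range(500):
    x = P @ (Aty + rho * (z - u))   # quadratic subproblem (exact solve)
    z = soft(x + u, lam / rho)      # l1 subproblem (soft-thresholding)
    u = u + x - z                   # scaled dual update
```

The primal residual ||x - z|| measures how well the splitting constraint is enforced, and its decay reflects the O(1/k)-type worst-case rates discussed in the talk.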