## Friday, May 31, 2013

### Efficiently Sample from Normal (Gaussian) Distribution

The following code snippet samples from a standard normal distribution using the Box-Muller transform, which avoids nastiness like cropping to a fixed number of standard deviations or rejection sampling.  I have shamelessly cribbed this from Wikipedia's article on the transform, but I use it all the time, so I'm putting it up as a snippet.  Each call produces two independent samples, returned in x and y.

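#include <cmath>    // sqrt, log, cos, sin, M_PI
#include <cstdlib>  // drand48 (POSIX)

// Draws two independent standard-normal samples via the Box-Muller transform.
void utils_sample_normal( float &x, float &y ){
    // drand48() returns values in [0,1); use 1-u so that log() never sees zero
    double u=1.0-drand48(), v=drand48();
    double r = sqrt( -2.0*log(u) );
    x = r*cos(2.0*M_PI*v);
    y = r*sin(2.0*M_PI*v);
}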
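
For readers who want to experiment with the transform outside of C++, here is a minimal Python sketch of the same idea. The function names `sample_normal_pair` and `sample_normal` are made up for illustration, and the `1.0 - u` shift is an added assumption that keeps the logarithm away from zero.

```python
import math
import random


def sample_normal_pair(rng=random):
    """Return two independent standard-normal samples (Box-Muller transform)."""
    u = 1.0 - rng.random()  # random() is in [0, 1), so u is in (0, 1]
    v = rng.random()
    r = math.sqrt(-2.0 * math.log(u))
    return r * math.cos(2.0 * math.pi * v), r * math.sin(2.0 * math.pi * v)


def sample_normal(mu, sigma, rng=random):
    """Shift and scale one standard-normal sample to mean mu, std-dev sigma."""
    x, _ = sample_normal_pair(rng)
    return mu + sigma * x
```

Passing a seeded `random.Random` instance as `rng` makes the samples reproducible, which is handy when comparing runs.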


### pyPolyCSG library updated

I have updated the pyPolyCSG Python constructive solid geometry library to use the most recent version of the Carve CSG library. The updates to Carve appear to make it significantly more robust as well as much faster.

The update to pyPolyCSG:
• Improves the robustness, at least on my existing scripts
• Improves the speed of performing Boolean operations
• Fixes a bug where vertex normals and texture coordinates were treated as vertices (thanks to Ryan Rix for finding this)
• Adds the ability to transform polyhedra by a 3x3 or 4x4 matrix (again thanks to Ryan Rix!)
You can get the updated version from GitHub: https://github.com/jamesgregson/pyPolyCSG

## Sunday, May 26, 2013

### Smooth Feedrate Envelopes for Motion Control, Part II

In the previous post, I derived equations for smooth feedrate control of stepper motors, claiming that a smoother feedrate envelope lets the motors be driven faster with less chance of skipping steps and losing position.

In this post, I demonstrate this with a real stepper motor and show that the envelopes really do prevent the motor from losing steps.  My test setup is a single NEMA 17 stepper, driven by one of my A4988 driver breakouts, which is controlled by an Arduino sketch running on the Arduino Due.  I'm using half-stepping on the motor, driven by a 200 kHz timer interrupt whose step callback decides whether or not to step based on delays interpolated between the values supplied for the start and end of each move.  The move itself approximates a square wave: first accelerating from a slow feedrate, then holding a constant speed, then decelerating back to the initial feedrate.
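
To make the step-callback idea concrete, here is a rough Python simulation of a tick-driven stepper that interpolates its step delay over a move. This is only a sketch under my own assumptions, not the code from the Arduino sketch: the class name `StepTimer`, the linear interpolation of the delay, and the delays being expressed in timer ticks are all hypothetical.

```python
class StepTimer:
    """Simulates a fixed-rate timer callback that decides when to step.

    The delay between steps (measured in timer ticks) is linearly
    interpolated from start_delay to end_delay over the move. This is an
    illustrative assumption, not the scheme used in the original firmware.
    """

    def __init__(self, n_steps, start_delay, end_delay):
        self.n_steps = n_steps
        self.start_delay = float(start_delay)
        self.end_delay = float(end_delay)
        self.steps_done = 0
        self.elapsed = 0.0  # ticks accumulated since the last step

    def current_delay(self):
        # fraction of the move completed, in [0, 1]
        frac = self.steps_done / max(self.n_steps - 1, 1)
        return self.start_delay + frac * (self.end_delay - self.start_delay)

    def on_tick(self):
        """Call once per timer tick; returns True if a step pulse is due."""
        if self.steps_done >= self.n_steps:
            return False
        self.elapsed += 1.0
        delay = self.current_delay()
        if self.elapsed >= delay:
            self.elapsed -= delay  # carry the remainder to the next step
            self.steps_done += 1
            return True
        return False
```

Calling `on_tick()` in a loop stands in for the timer interrupt. A shorter delay means a higher step rate, so a move that ramps the delay down is accelerating.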

The video below shows the stepper being driven using a constant acceleration profile, which causes the kinks in the feedrate graph in my previous post. You can clearly see it moving around on the table and stalling frequently before it reaches the top speed.

In contrast, here is the result using the cubic feedrate envelope for the same set of moves. The stepper easily handles the top speed and jerks around considerably less on the table.  Of course this comes at a price: a higher pulse frequency must be used to resolve the acceleration profile.

The code I used for this is available at https://sites.google.com/site/jamesgregson/tmp/linear_move.zip and includes a multi-axis DDA implementation suitable for use with timer interrupts as well as the code for evaluating the feedrate envelopes.

## Thursday, May 23, 2013

### Smooth Feedrate Envelopes for Motion Control

When running motors it is desirable to run them as smoothly as possible to minimize vibrations and possible missed steps. This is why controllers for 3D printers and CNC machines typically incorporate some notion of acceleration rather than instantly switching from one feedrate to another.  Often this is done with a simple ramp of the feedrate, i.e. using a constant acceleration profile.

As an example, consider a machine starting at feedrate $f_0$ and then performing a very long linear move at a constant feedrate $f_1$. For this example, if $f(t)$ is the machine feedrate over time and $a$ is a constant acceleration, this profile would be defined mathematically as follows:

\begin{equation}
f(t) = \begin{cases}
f_0      & \mbox{if } t < t_0 \\
f_0 + a (t - t_0) \hspace{0.5cm} & \mbox{if } 0 \leq t - t_0 \leq \frac{f_1-f_0}{a} \\
f_1 & \mbox{otherwise}
\end{cases}
\end{equation}
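
As a quick sanity check, the piecewise definition above can be written as a small Python function. The name `constant_accel_feedrate` is made up for illustration, and the sketch assumes $a$ is nonzero and has the same sign as $f_1 - f_0$ so that the ramp duration is positive.

```python
def constant_accel_feedrate(t, t0, f0, f1, a):
    """Feedrate at time t for a constant-acceleration ramp from f0 to f1.

    The ramp starts at time t0. Assumes a is nonzero and has the same
    sign as (f1 - f0), so the ramp duration is positive.
    """
    if t < t0:
        return f0                    # before the ramp: hold the initial feedrate
    ramp_time = (f1 - f0) / a        # how long the acceleration phase lasts
    if t - t0 <= ramp_time:
        return f0 + a * (t - t0)     # during the ramp: linear in time
    return f1                        # after the ramp: hold the target feedrate
```

For a move starting at rest at $t=0$ and ramping to a feedrate of 2 with $a=2$, this gives a feedrate of 1 at $t=0.5$ and 2 from $t=1$ onwards.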

Graphically, here's a plot of the feedrate over time for a move starting at rest at $t=0$ and accelerating up to a feedrate of 2 over one unit of time:

The problem with a constant acceleration profile is that there are sharp kinks in the feedrate plotted over time.  These kinks imply instantaneous changes in acceleration, and therefore instantaneous changes in the force the motor must deliver (infinite jerk).  Of course, there is no mechanical way to change forces instantly, so what actually happens is the machine overshoots very slightly and the change in force is spread over a short time. For low feedrates with a light machine this actually works okay, but for a heavy machine at high feedrates, the overshoot can be more than a motor step, which causes the machine to lose position.  In an open-loop design, once the machine loses position it never recovers and, in all likelihood, the part is ruined.

There are a few ways to address this problem:
• Use lower feedrates
• Use higher torque motors
• Use a closed-loop control scheme, e.g. with encoders on the motors
• Make the acceleration smooth
The first is clearly not an option because it wastes time and feedrates may be chosen specifically for valid reasons such as minimizing local part heating or reducing machining time.  In an ideal world we'd do the remaining three items, but options two and three are expensive, particularly for hobby gear.  However the fourth option can be tackled in firmware with minimal hardware overhead.

In order to smoothly transition between feedrates we can simply use a different curve to interpolate them.  The conditions needed are that the feedrates match the desired rates at the beginning and end of the curve and that the slope of the feedrate curve (i.e. the acceleration) is zero at the endpoints.  In between the endpoints we want the curve to be smooth.

One of the simplest classes of functions that meets these requirements is the cubic polynomials.  These are defined by four coefficients $a$, $b$, $c$ and $d$ using the following equation, where $\tau$ is the fraction of the total time spent accelerating (note that $a$ is now a polynomial coefficient, not the acceleration from the constant-acceleration profile):

\begin{equation}
f(\tau) = a \tau^3 + b \tau^2 + c \tau + d
\end{equation}

We now want to solve for the coefficients needed to reproduce the move.  There are four coefficients so we need four equations.  Two come from the requirement that we match the feedrates at the curve endpoints:

\begin{eqnarray}
f(\tau=0) = a 0^3 + b 0^2 + c 0 + d &=& f_0 \\
f(\tau=1) = a 1^3 + b 1^2 + c 1 + d &=& f_1
\end{eqnarray}

From these, we see that $d=f_0$ and $a+b+c=f_1-f_0$. The remaining two equations can be found using the requirements that the slope of the feedrate curve is zero at the endpoints. To enforce these constraints we need the derivative of the cubic function:

\begin{equation}
f'(\tau) = 3 a \tau^2 + 2 b \tau + c
\end{equation}

The constraints can now be enforced by requiring that:

\begin{eqnarray}
f'(\tau=0) = 3 a 0^2 + 2 b 0 + c &=& 0 \\
f'(\tau=1) = 3 a 1^2 + 2 b 1 + c &=& 0
\end{eqnarray}

These equations make it clear that $c=0$ and $3 a + 2 b = 0$. Combining these with the previous conditions leaves two equations and two unknowns:

\begin{eqnarray}
a + b &=& f_1 - f_0 \\
3 a + 2b &=& 0
\end{eqnarray}

The second equation gives $a = -\frac{2 b}{3}$; substituting this into the first leaves $\frac{b}{3} = f_1 - f_0$, so $b = 3 (f_1-f_0)$ and $a = -2 (f_1 - f_0)$. This gives the following equation for the interpolating curve:

\begin{equation}
f(\tau) = -2(f_1-f_0)\tau^3 + 3(f_1-f_0)\tau^2 + f_0
\end{equation}
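
The derived cubic is easy to check numerically. The sketch below evaluates it in factored form, $f_0 + (f_1-f_0)\tau^2(3-2\tau)$, which is algebraically the same polynomial, together with its slope $6(f_1-f_0)\tau(1-\tau)$. The function names are made up for illustration.

```python
def cubic_feedrate(tau, f0, f1):
    """Cubic feedrate envelope for tau in [0, 1].

    Factored form of -2*(f1-f0)*tau**3 + 3*(f1-f0)*tau**2 + f0.
    """
    df = f1 - f0
    return f0 + df * tau * tau * (3.0 - 2.0 * tau)


def cubic_feedrate_slope(tau, f0, f1):
    """Derivative of the envelope with respect to tau: 6*(f1-f0)*tau*(1-tau)."""
    df = f1 - f0
    return 6.0 * df * tau * (1.0 - tau)
```

Evaluating at the endpoints recovers the four constraints: the feedrate is $f_0$ at $\tau=0$ and $f_1$ at $\tau=1$, and the slope is zero at both. The slope peaks at $\tau = 0.5$ with value $1.5 (f_1-f_0)$.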

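The only remaining thing is to define $\tau$ in terms of $t$.  This is a simple linear interpolation from the start of the acceleration $t_0$ to the end of the acceleration $t_1$.  If the cubic is given the same duration as the constant-acceleration ramp, then $t_1 = t_0 + \frac{f_1-f_0}{a}$, where $a$ here is the acceleration of the linear ramp rather than the cubic coefficient:

\begin{equation}
\tau = \frac{t-t_0}{t_1-t_0} = \frac{a (t-t_0)}{f_1-f_0}
\end{equation}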
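
Putting the pieces together, a full time-domain version of the envelope might look like the following sketch. It assumes the acceleration phase lasts $\frac{f_1-f_0}{a}$ after $t_0$, the same duration as the constant-acceleration ramp with acceleration $a$, and it clamps $\tau$ to $[0,1]$ so that the feedrate holds at $f_0$ before the ramp and at $f_1$ after it. The function name is hypothetical.

```python
def cubic_profile_feedrate(t, t0, f0, f1, a):
    """Cubic feedrate envelope as a function of time.

    Assumes the ramp starts at t0 and lasts (f1 - f0) / a, i.e. as long as
    a constant-acceleration ramp with acceleration a (same sign as f1 - f0).
    """
    if f1 == f0:
        return f0                        # nothing to interpolate
    duration = (f1 - f0) / a             # length of the acceleration phase
    tau = (t - t0) / duration            # fraction of the ramp completed
    tau = min(max(tau, 0.0), 1.0)        # hold f0 before and f1 after the ramp
    return f0 + (f1 - f0) * tau * tau * (3.0 - 2.0 * tau)
```

One consequence of keeping the same duration is that the peak acceleration, reached halfway through the ramp, is $1.5 a$. If the peak must stay at or below $a$, the ramp has to be stretched by a factor of 1.5.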

Plotting this for the same parameters as before gives a smooth, kink-free curve that considerably reduces the time-rate-of-change of acceleration:

In the next post I will demonstrate applying this to a real stepper motor being driven aggressively.  Although it looks complicated, the cubic envelope derived here costs only a few extra operations per step compared with the constant-acceleration profile, and it gives considerably smoother operation.

### Matlab Signal Deblurring & Denoising Example

To date my research has largely focused on inverse problems such as tomography or image deblurring.  These problems are often highly under-determined and so must include strong priors to obtain good solutions, and finding efficient solvers for these priors is challenging.  A labmate recently pointed me towards ADMM (the Alternating Direction Method of Multipliers), which splits the full problem into coupled sub-problems.  It has a number of advantages:
• Flexibility - ADMM handles a number of non-trivial problems in a common framework
• Efficiency - The subproblems often have efficient, frequently embarrassingly parallel, solvers
An excellent introduction to the method can be found in the paper: Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers by Boyd et al. It's a pretty gentle and practical introduction, with numerous example problems.

I've quickly implemented ADMM for combined deblurring and denoising of 1D input signals using the total-variation regularization in a Generalized-Lasso problem definition.  Unlike the Boyd paper, I've chosen to use Landweber iterations to solve the data subproblem as these are commonly used in large-scale deblurring and tomography.  My well-commented sample implementation (see the end of this post) allows all the method parameters to be tweaked to see the effect of using inexact solves, different penalty parameters and different regularization weights.
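
For reference, the problem and iteration that the sample code implements can be written out as follows. The notation is my own shorthand rather than anything fixed by the code: $M$ is the blur matrix, $b$ the blurred and noisy input (`blur` in the code), $A$ the difference matrix, $\lambda$ the total-variation weight, $\rho$ the penalty parameter, $z$ the splitting variable and $u$ the scaled Lagrange multipliers.

\begin{equation}
\min_{x,z} \; \frac{1}{2} \| M x - b \|_2^2 + \lambda \| z \|_1 \hspace{0.5cm} \mbox{subject to } A x - z = 0
\end{equation}

\begin{eqnarray}
x^{k+1} &=& \arg\min_x \; \frac{1}{2} \| M x - b \|_2^2 + \frac{\rho}{2} \| A x - z^k + u^k \|_2^2 \\
z^{k+1} &=& S_{\lambda/\rho}( A x^{k+1} + u^k ) \\
u^{k+1} &=& u^k + A x^{k+1} - z^{k+1}
\end{eqnarray}

Here $S_\kappa(v) = \mbox{sign}(v) \max( |v| - \kappa, 0 )$ is the elementwise shrinkage operator. The gradient of the first subproblem with respect to $x$ is $M^T (M x - b) + \rho A^T ( A x - z^k + u^k )$, which is what the Landweber iterations step along in place of an exact solve.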

As an example, the image below shows a 100-sample square-wave signal, blurred by a Gaussian with standard deviation of 3 and corrupted by Gaussian noise with a sigma of 5%.  Only two inner Landweber iterations were used for each outer iteration.  The red line shows the original blurred and noisy input, while the blue line shows the reconstructed result after 100 iterations.

Overall I find that the method works as advertised. It's fairly easy to implement, either in Matlab or in C/C++, can handle 1-norms of general linear regularizers and converges quickly.  It also allows matrix-free solvers to be used for the data-subproblem, while the solve for the prior is embarrassingly-parallel, so it should scale quite well too.
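
To illustrate those last two points outside of Matlab, here is a small pure-Python sketch of the two building blocks. `shrink` is the elementwise soft-thresholding step, where each entry is handled independently (hence the easy parallelism), and `landweber` is a plain gradient-descent loop that only needs a function returning the gradient, so no matrix has to be stored. Both function names and signatures are my own choices for illustration, not an existing library API.

```python
def shrink(kappa, values):
    """Elementwise soft-thresholding: sign(v) * max(|v| - kappa, 0)."""
    out = []
    for v in values:
        mag = max(abs(v) - kappa, 0.0)
        out.append(mag if v >= 0.0 else -mag)
    return out


def landweber(grad, x0, step, iters):
    """Run iters fixed-step gradient-descent (Landweber) updates.

    grad is any callable returning the gradient at x as a list, so the
    system matrix never needs to be formed explicitly.
    """
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x
```

For example, with `grad = lambda x: [xi - bi for xi, bi in zip(x, b)]` (the gradient of $\frac{1}{2}\|x-b\|_2^2$), the iterates converge to `b` for any step size between 0 and 2.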

You can get the implementation that produced this plot below. Note that you may get slightly different results since the noise is generated randomly.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%                                                                     %%%
%%% Sample signal denoising & deblurring using ADMM                     %%%
%%% code written by James Gregson (james.gregson@gmail.com), 2013       %%%
%%% Use the code for whatever you'd like                                %%%
%%%                                                                     %%%
%%% Demonstrates performing Total-Variation (TV) denoising and          %%%
%%% deblurring of a 1D input signal using the ADMM iterative scheme. The%%%
%%% approach can be found in [1].  The approach from [1] is modified    %%%
%%% slightly by using gradient-descent (Landweber iterations) to solve  %%%
%%% the first subproblem in place of a direct solver or iterative       %%%
%%% method such as conjugate gradient.  Matrix-free Landweber           %%%
%%% iterations are commonly used for large-scale linear inverse         %%%
%%% problems and this sample code allows the accuracy of the            %%%
%%% subproblem solves to be adjusted to see the effect on the final     %%%
%%% reconstructions.                                                    %%%
%%%                                                                     %%%
%%% [1] Boyd et al., Distributed Optimization and Statistical Learning  %%%
%%%     via the Alternating Direction Method of Multipliers, 2010       %%%
%%%                                                                     %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

close all;
clear all;

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Problem Parameters %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

N           = 100;     % Signal sample count
lambda      = 0.1;     % Total-variation weight
sigma       = 3.0;     % PSF sigma for generating blurred input
noise       = 0.05;    % Gaussian-noise sigma

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

rho         = 1.0;     % ADMM constraint weight 0 < rho <= 2
outer_iters = 100;     % Number of ADMM iterations
inner_iters = 2;       % Number of Landweber steps per ADMM iteration
relax       = 1.9;     % Under-relaxation factor for Landweber steps [0,2]

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Construct System PSF & Image Formation Model %%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

off = -ceil(sigma*3):ceil(sigma*3);  % psf pixel offsets
psf = exp( -off.^2/sigma^2 );        % psf values for offsets
psf = psf/sum(psf);                  % normalize to 1.0

% generate psf matrix by setting diagonals based on 1D psf above
M = zeros( N, N );
for i=1:numel(psf),
    M = M + diag( psf(i)*ones(N-abs(off(i)),1), off(i) );
end
M = sparse( M );

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Generate noisy and blurred synthetic input %%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

input = zeros( N, 1 );                  % start from an all-zero signal
input(floor(N/4):ceil(3*N/4),:) = 1.0;  % make a center region at 1.0
blur = M * input + noise*(randn(N,1));  % blur by the psf and add noise

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Construct the difference matrix that computes image gradients %%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

A = sparse( -diag( ones(N,1), 0 ) + diag( ones(N-1,1), 1 ) );
A(N,N-1)=1; % use a backward difference for the final point

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Setup ADMM and perform iterations %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

% compute the eigenvalues of the first subproblem
% system matrix and use the largest to define a
% Landweber step size; for large systems a power
% iteration should be used instead.
tmp = eigs(M'*M + rho*A'*A);
step = relax/tmp(1);

x = blur;             % intrinsic (sharp) signal
z = zeros( N, 1 );    % splitting variable
u = zeros( N, 1 );    % scaled Lagrange multipliers

% define an anonymous shrinkage operator to implement
% the second sub-problem solve
shrink = @(kappa,x) max( abs(x)-kappa, 0 ).*sign(x);

% perform outer_iters outer iterations, plot the
% solver progress as the iterations proceed
for k=1:outer_iters,
    fprintf( 1, 'iteration %d of %d\n', k, outer_iters );
    plot( x, 'b-+' );
    drawnow;

    % define an anonymous function returning the gradient
    % of the first subproblem w.r.t. x, holding z and u
    % fixed, then perform gradient-descent (Landweber
    % iterations).
    gradF = @(x) M'*(M*x) - M'*blur + rho*A'*( A*x - z + u );
    for i=1:inner_iters,
        x = x - step*gradF( x );
    end

    % solve the second sub-problem using the anonymous
    % shrinkage operator
    z = shrink( lambda/rho, A*x + u );

    % update the scaled Lagrange multipliers
    u = u + A*x - z;
end

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Setup ADMM and perform iterations %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

hold on;
plot( blur, 'r-o' );
plot( x, 'b-o' );