Discussion: [SciPy-User] fmin_slsqp exit mode 8
j***@gmail.com
2012-09-27 15:15:48 UTC
in statsmodels we have a case where fmin_slsqp ends with mode=8
"POSITIVE DIRECTIONAL DERIVATIVE FOR LINESEARCH"

Does anyone know what it means and whether it's possible to get around it?

the fortran source file doesn't have an explanation.
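
For reference, the message is the smode string that fmin_slsqp returns
with full_output; a minimal sketch with a stand-in objective (not our model):

import numpy as np
from scipy.optimize import fmin_slsqp

# imode/smode come back when full_output is requested;
# imode 8 pairs with "Positive directional derivative for linesearch"
out, fx, its, imode, smode = fmin_slsqp(lambda x: np.sum(x ** 2), np.ones(2),
                                        full_output=True, iprint=0)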

Thanks,

Josef
Gilles Rochefort
2012-09-28 17:04:11 UTC
Could you provide an example that produces such a mode?

Regards,
Gilles.
Post by j***@gmail.com
in statsmodels we have a case where fmin_slsqp ends with mode=8
"POSITIVE DIRECTIONAL DERIVATIVE FOR LINESEARCH"
Does anyone know what it means and whether it's possible to get around it?
the fortran source file doesn't have an explanation.
Thanks,
Josef
j***@gmail.com
2012-09-28 17:41:10 UTC
On Fri, Sep 28, 2012 at 1:04 PM, Gilles Rochefort
Post by Gilles Rochefort
Could you provide an example that produce such a mode ?
It will not be easy, and may not be possible, to extract the example as a standalone script.

The example is an L1-penalized maximum likelihood estimation for a
Poisson regression
https://github.com/statsmodels/statsmodels/pull/465/files#L2R94

the slsqp part is here
https://github.com/statsmodels/statsmodels/pull/465/files#diff-7

The full code for this is spread over 3 classes (using inheritance),
or 5 counting all inheritance levels.

fmin_slsqp works pretty well for Logit, Probit and Multinomial Logit,
but for Poisson with a large regularization parameter, we get exit
code 8. The optimized values also look reasonable in that case.
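
For orientation, the slsqp part boils down to something of this form; a
simplified sketch with a placeholder loglikelihood (not the actual
statsmodels code), where the absolute values in the L1 penalty are
handled through auxiliary variables and inequality constraints:

import numpy as np
from scipy.optimize import fmin_slsqp

k = 3           # number of parameters (illustrative)
alpha = 10.0    # regularization strength (illustrative)

def negloglike(beta):
    # placeholder for the unpenalized negative loglikelihood
    return np.sum((beta - 2.0) ** 2)

def objective(x):
    # x stacks [beta, t]; minimize negloglike(beta) + alpha * sum(t)
    beta, t = x[:k], x[k:]
    return negloglike(beta) + alpha * t.sum()

def ieqcons(x):
    # t_j - |beta_j| >= 0, written as two linear inequalities per parameter
    beta, t = x[:k], x[k:]
    return np.concatenate([t - beta, t + beta])

x0 = np.zeros(2 * k)
params, fx, its, imode, smode = fmin_slsqp(objective, x0, f_ieqcons=ieqcons,
                                           full_output=True, iprint=0)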

(the pull request will be merged within a few days.)

Josef
Post by Gilles Rochefort
Regards,
Gilles.
Post by j***@gmail.com
in statsmodels we have a case where fmin_slsqp ends with mode=8
"POSITIVE DIRECTIONAL DERIVATIVE FOR LINESEARCH"
Does anyone know what it means and whether it's possible to get around it?
the fortran source file doesn't have an explanation.
Thanks,
Josef
Pauli Virtanen
2012-09-28 18:09:27 UTC
Post by j***@gmail.com
in statsmodels we have a case where fmin_slsqp ends with mode=8
"POSITIVE DIRECTIONAL DERIVATIVE FOR LINESEARCH"
Does anyone know what it means and whether it's possible to get around it?
the fortran source file doesn't have an explanation.
Guessing without wading through the F77 goto spaghetti: it could mean
that the optimizer has wound up with a search direction in which the
function increases (or doesn't decrease fast enough). If it's a
termination condition, it probably also means that the optimizer is not
able to recover from this.

Some googling seems to indicate that this depends on the scaling of the
problem, so it may also be some sort of precision issue (or an issue
with wrong tolerances):

http://www.mail-archive.com/nlopt-***@ab-initio.mit.edu/msg00208.html
--
Pauli Virtanen
j***@gmail.com
2012-09-28 23:24:18 UTC
Post by Pauli Virtanen
Post by j***@gmail.com
in statsmodels we have a case where fmin_slsqp ends with mode=8
"POSITIVE DIRECTIONAL DERIVATIVE FOR LINESEARCH"
Does anyone know what it means and whether it's possible to get around it?
the fortran source file doesn't have an explanation.
Guessing without wading through the F77 goto spaghetti: it could mean
that the optimizer has wound up with a search direction in which the
function increases (or doesn't decrease fast enough). If it's a
termination condition, it probably also means that the optimizer is not
able to recover from this.
I had tried some randomization as new starting values, but in this
example this didn't help.
Post by Pauli Virtanen
Some googling seems to indicate that this depends on the scaling of the
problem, so it may also be some sort of precision issue (or an issue
with wrong tolerances):
Scaling might be a problem in this example.

The hessian (the second derivative of the unpenalized likelihood function):
np.linalg.eigvals(poisson_l1_res._results.model.hessian(poisson_l1_res.params))
array([-16078553.93225711, -1374997.42454279, -299647.67457668,
-138719.26843099, -15800.99493306, -1091.16078941,
-10258.71018359, -3800.22940286, -7530.7029302 ,
-6540.09128479])

Maybe it's just a bad example to use for L1 penalization.

----
I tried to scale down the objective function and gradient, and it works

np.linalg.eigvals(poisson_l1_res._results.model.hessian(poisson_l1_res.params))
array([-588.82869149, -64.89601886, -13.81251974, -6.90900488,
-0.74415772, -0.48190709, -0.03863475, -0.34855895,
-0.28063095, -0.16671642])

I can impose a high penalization factor and still get a successful
mode=0 convergence.
I'm not sure the convergence has actually improved in relative terms.


(Now I just have to figure out if we want to consistently change the
scaling of the loglikelihood, or just hack it into L1 optimization.)
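
The rescaling itself is just a constant factor on the objective and the
gradient before they go into fmin_slsqp; a sketch of what I mean (the
1/nobs factor is only one possible choice of scale):

import numpy as np

def rescaled(f, fprime, scale):
    # a constant factor does not move the minimizer, it only changes the
    # magnitude of the function values and gradients that slsqp sees
    def f_s(x, *args):
        return scale * f(x, *args)
    def fprime_s(x, *args):
        return scale * np.asarray(fprime(x, *args))
    return f_s, fprime_s

# hypothetical usage, with neg_loglike/score from the model and nobs observations:
# f_s, fprime_s = rescaled(neg_loglike, score, 1.0 / nobs)
# res = fmin_slsqp(f_s, start_params, fprime=fprime_s, full_output=True)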

Thanks for the hint,

Josef
Post by Pauli Virtanen
--
Pauli Virtanen
Pauli Virtanen
2012-09-29 11:05:27 UTC
On 29.09.2012 02:24, ***@gmail.com wrote:
[clip]
Post by j***@gmail.com
I tried to scale down the objective function and gradient, and it works
np.linalg.eigvals(poisson_l1_res._results.model.hessian(poisson_l1_res.params))
array([-588.82869149, -64.89601886, -13.81251974, -6.90900488,
-0.74415772, -0.48190709, -0.03863475, -0.34855895,
-0.28063095, -0.16671642])
I can impose a high penalization factor and still get a successful
mode=0 convergence.
I'm not sure the convergence has actually improved in relative terms.
(Now I just have to figure out if we want to consistently change the
scaling of the loglikelihood, or just hack it into L1 optimization.)
Ideally, the SLSQP algorithm itself would be scale invariant, but
apparently something inside the code assumes that the function values
(and maybe gradients) are "of the order of one".
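
A quick way to probe this is to minimize the same function at different
scales and compare the exit modes; whether this actually hits mode 8
depends on the version and tolerances, so take it as an illustration
rather than a reproduction:

import numpy as np
from scipy.optimize import fmin_slsqp

def make_objective(scale):
    def f(x):
        return scale * np.sum((x - 3.0) ** 2)
    return f

for scale in (1.0, 1e4, 1e8):
    x, fx, its, imode, smode = fmin_slsqp(make_objective(scale), np.zeros(5),
                                          full_output=True, iprint=0)
    print(scale, imode, smode)
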
--
Pauli Virtanen
j***@gmail.com
2012-09-29 15:31:18 UTC
Post by Pauli Virtanen
[clip]
Post by j***@gmail.com
I tried to scale down the objective function and gradient, and it works
np.linalg.eigvals(poisson_l1_res._results.model.hessian(poisson_l1_res.params))
array([-588.82869149, -64.89601886, -13.81251974, -6.90900488,
-0.74415772, -0.48190709, -0.03863475, -0.34855895,
-0.28063095, -0.16671642])
I can impose a high penalization factor and still get a successful
mode=0 convergence.
I'm not sure the convergence has actually improved in relative terms.
(Now I just have to figure out if we want to consistently change the
scaling of the loglikelihood, or just hack it into L1 optimization.)
Ideally, the SLSQP algorithm itself would be scale invariant, but
apparently something inside the code assumes that the function values
(and maybe gradients) are "of the order of one".
That sounds like the right explanation.

I was also surprised that it has only one precision parameter, acc. I
couldn't figure out exactly where it is used (maybe everywhere), but we
needed to make it smaller than the default.
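
For reference, acc (default 1e-6) is the single accuracy argument
fmin_slsqp exposes, separate from the iteration limit; tightening it
looks like this (the values are only illustrative):

import numpy as np
from scipy.optimize import fmin_slsqp

def objective(x):
    # stand-in objective
    return np.sum(x ** 2)

x, fx, its, imode, smode = fmin_slsqp(objective, np.ones(4),
                                      acc=1e-9, iter=500,
                                      full_output=True, iprint=0)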

Josef
Post by Pauli Virtanen
--
Pauli Virtanen