Discussion: [SciPy-User] fmin_slsqp exit mode 8
j***@gmail.com
2012-09-27 15:15:48 UTC
in statsmodels we have a case where fmin_slsqp ends with mode=8
"POSITIVE DIRECTIONAL DERIVATIVE FOR LINESEARCH"

Does anyone know what it means and whether it's possible to get around it?

the fortran source file doesn't have an explanation.
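
For reference, the message is the smode string that fmin_slsqp returns
with full_output; a minimal sketch with a stand-in objective (not our model):

import numpy as np
from scipy.optimize import fmin_slsqp

# imode/smode come back when full_output is requested;
# imode 8 pairs with "Positive directional derivative for linesearch"
out, fx, its, imode, smode = fmin_slsqp(lambda x: np.sum(x ** 2), np.ones(2),
                                        full_output=True, iprint=0)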

Thanks,

Josef
Gilles Rochefort
2012-09-28 17:04:11 UTC
Could you provide an example that produces such a mode?

Regards,
Gilles.
Post by j***@gmail.com
in statsmodels we have a case where fmin_slsqp ends with mode=8
"POSITIVE DIRECTIONAL DERIVATIVE FOR LINESEARCH"
Does anyone know what it means and whether it's possible to get around it?
the fortran source file doesn't have an explanation.
Thanks,
Josef
j***@gmail.com
2012-09-28 17:41:10 UTC
On Fri, Sep 28, 2012 at 1:04 PM, Gilles Rochefort
Post by Gilles Rochefort
Could you provide an example that produce such a mode ?
It will not be easy, and may not be possible, to extract the example as a standalone script.

The example is an L1-penalized maximum likelihood estimation for a
Poisson regression
https://github.com/statsmodels/statsmodels/pull/465/files#L2R94

the slsqp part is here
https://github.com/statsmodels/statsmodels/pull/465/files#diff-7

The full code for this is spread over 3 classes (using inheritance),
or 5 counting all inheritance levels.

fmin_slsqp works pretty well for Logit, Probit and Multinomial Logit,
but for Poisson with a large regularization parameter, we get exit
code 8. The optimized values also look reasonable in that case.
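
For orientation, the slsqp part boils down to something of this form; a
simplified sketch with a placeholder loglikelihood (not the actual
statsmodels code), where the absolute values in the L1 penalty are
handled through auxiliary variables and inequality constraints:

import numpy as np
from scipy.optimize import fmin_slsqp

k = 3           # number of parameters (illustrative)
alpha = 10.0    # regularization strength (illustrative)

def negloglike(beta):
    # placeholder for the unpenalized negative loglikelihood
    return np.sum((beta - 2.0) ** 2)

def objective(x):
    # x stacks [beta, t]; minimize negloglike(beta) + alpha * sum(t)
    beta, t = x[:k], x[k:]
    return negloglike(beta) + alpha * t.sum()

def ieqcons(x):
    # t_j - |beta_j| >= 0, written as two linear inequalities per parameter
    beta, t = x[:k], x[k:]
    return np.concatenate([t - beta, t + beta])

x0 = np.zeros(2 * k)
params, fx, its, imode, smode = fmin_slsqp(objective, x0, f_ieqcons=ieqcons,
                                           full_output=True, iprint=0)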

(the pull request will be merged within a few days.)

Josef
Post by Gilles Rochefort
Regards,
Gilles.
Post by j***@gmail.com
in statsmodels we have a case where fmin_slsqp ends with mode=8
"POSITIVE DIRECTIONAL DERIVATIVE FOR LINESEARCH"
Does anyone know what it means and whether it's possible to get around it?
the fortran source file doesn't have an explanation.
Thanks,
Josef
Pauli Virtanen
2012-09-28 18:09:27 UTC
Post by j***@gmail.com
in statsmodels we have a case where fmin_slsqp ends with mode=8
"POSITIVE DIRECTIONAL DERIVATIVE FOR LINESEARCH"
Does anyone know what it means and whether it's possible to get around it?
the fortran source file doesn't have an explanation.
Guessing without wading through the F77 goto spaghetti: it could mean
that the optimizer has wound up with a search direction in which the
function increases (or doesn't decrease fast enough). If it's a
termination condition, it probably also means that the optimizer is not
able to recover from this.

Some googling seems to indicate that this depends on the scaling of the
problem, so it may also be some sort of precision issue (or an issue
with wrong tolerances):

http://www.mail-archive.com/nlopt-***@ab-initio.mit.edu/msg00208.html
--
Pauli Virtanen
j***@gmail.com
2012-09-28 23:24:18 UTC
Post by Pauli Virtanen
Post by j***@gmail.com
in statsmodels we have a case where fmin_slsqp ends with mode=8
"POSITIVE DIRECTIONAL DERIVATIVE FOR LINESEARCH"
Does anyone know what it means and whether it's possible to get around it?
the fortran source file doesn't have an explanation.
Guessing without wading through the F77 goto spaghetti: it could mean
that the optimizer has wound up with a search direction in which the
function increases (or doesn't decrease fast enough). If it's a
termination condition, it probably also means that the optimizer is not
able to recover from this.
I had tried some randomization as new starting values, but in this
example this didn't help.
Post by Pauli Virtanen
Some googling seems to indicate that this depends on the scaling of the
problem, so it may also be some sort of precision issue (or an issue
with wrong tolerances):
Scaling might be a problem in this example.

The hessian (the second derivative of the unpenalized likelihood function):
np.linalg.eigvals(poisson_l1_res._results.model.hessian(poisson_l1_res.params))
array([-16078553.93225711, -1374997.42454279, -299647.67457668,
-138719.26843099, -15800.99493306, -1091.16078941,
-10258.71018359, -3800.22940286, -7530.7029302 ,
-6540.09128479])

Maybe it's just a bad example to use for L1 penalization.

----
I tried to scale down the objective function and gradient, and it works

np.linalg.eigvals(poisson_l1_res._results.model.hessian(poisson_l1_res.params))
array([-588.82869149, -64.89601886, -13.81251974, -6.90900488,
-0.74415772, -0.48190709, -0.03863475, -0.34855895,
-0.28063095, -0.16671642])

I can impose a high penalization factor and still get a successful
mode=0 convergence.
I'm not sure the convergence has actually improved in relative terms.


(Now I just have to figure out if we want to consistently change the
scaling of the loglikelihood, or just hack it into L1 optimization.)
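
The rescaling itself is just a constant factor on the objective and the
gradient before they go into fmin_slsqp; a sketch of what I mean (the
1/nobs factor is only one possible choice of scale):

import numpy as np

def rescaled(f, fprime, scale):
    # a constant factor does not move the minimizer, it only changes the
    # magnitude of the function values and gradients that slsqp sees
    def f_s(x, *args):
        return scale * f(x, *args)
    def fprime_s(x, *args):
        return scale * np.asarray(fprime(x, *args))
    return f_s, fprime_s

# hypothetical usage, with neg_loglike/score from the model and nobs observations:
# f_s, fprime_s = rescaled(neg_loglike, score, 1.0 / nobs)
# res = fmin_slsqp(f_s, start_params, fprime=fprime_s, full_output=True)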

Thanks for the hint,

Josef
Post by Pauli Virtanen
--
Pauli Virtanen
Pauli Virtanen
2012-09-29 11:05:27 UTC
On 29.09.2012 02:24, ***@gmail.com wrote:
[clip]
Post by j***@gmail.com
I tried to scale down the objective function and gradient, and it works
np.linalg.eigvals(poisson_l1_res._results.model.hessian(poisson_l1_res.params))
array([-588.82869149, -64.89601886, -13.81251974, -6.90900488,
-0.74415772, -0.48190709, -0.03863475, -0.34855895,
-0.28063095, -0.16671642])
I can impose a high penalization factor and still get a successful
mode=0 convergence.
I'm not sure the convergence has actually improved in relative terms.
(Now I just have to figure out if we want to consistently change the
scaling of the loglikelihood, or just hack it into L1 optimization.)
Ideally, the SLSQP algorithm itself would be scale invariant, but
apparently something inside the code assumes that the function values
(and maybe gradients) are "of the order of one".
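
A quick way to probe this is to minimize the same function at different
scales and compare the exit modes; whether this actually hits mode 8
depends on the version and tolerances, so take it as an illustration
rather than a reproduction:

import numpy as np
from scipy.optimize import fmin_slsqp

def make_objective(scale):
    def f(x):
        return scale * np.sum((x - 3.0) ** 2)
    return f

for scale in (1.0, 1e4, 1e8):
    x, fx, its, imode, smode = fmin_slsqp(make_objective(scale), np.zeros(5),
                                          full_output=True, iprint=0)
    print(scale, imode, smode)
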
--
Pauli Virtanen
j***@gmail.com
2012-09-29 15:31:18 UTC
Post by Pauli Virtanen
[clip]
Post by j***@gmail.com
I tried to scale down the objective function and gradient, and it works
np.linalg.eigvals(poisson_l1_res._results.model.hessian(poisson_l1_res.params))
array([-588.82869149, -64.89601886, -13.81251974, -6.90900488,
-0.74415772, -0.48190709, -0.03863475, -0.34855895,
-0.28063095, -0.16671642])
I can impose a high penalization factor and still get a successful
mode=0 convergence.
I'm not sure the convergence has actually improved in relative terms.
(Now I just have to figure out if we want to consistently change the
scaling of the loglikelihood, or just hack it into L1 optimization.)
Ideally, the SLSQP algorithm itself would be scale invariant, but
apparently something inside the code assumes that the function values
(and maybe gradients) are "of the order of one".
That sounds like the right explanation.

I was also surprised that it has only one precision parameter, acc. I
couldn't figure out exactly where it is used (maybe everywhere), but we
needed to make it smaller than the default.
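
For reference, acc (default 1e-6) is the single accuracy argument
fmin_slsqp exposes, separate from the iteration limit; tightening it
looks like this (the values are only illustrative):

import numpy as np
from scipy.optimize import fmin_slsqp

def objective(x):
    # stand-in objective
    return np.sum(x ** 2)

x, fx, its, imode, smode = fmin_slsqp(objective, np.ones(4),
                                      acc=1e-9, iter=500,
                                      full_output=True, iprint=0)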

Josef
Post by Pauli Virtanen
--
Pauli Virtanen