Cheers, ]]>

‘Note that while in these cases the Jacobian is a square matrix, in the event that it is not a square matrix, the determinant of $\textbf{J}(\textbf{q})\;\textbf{J}^T(\textbf{q})$ can be found instead.’

Can you please refer me to a source which talks about this in detail?
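
For anyone who wants to try the quoted idea numerically, here is a minimal sketch. It assumes two hypothetical 2x3 Jacobians with made-up entries (they are not taken from the post) and a helper I have named `manipulability`. For a non-square J the determinant of J itself is undefined, but J J^T is square, and the quantity sqrt(det(J J^T)) is commonly referred to as the Yoshikawa manipulability measure; it drops to zero when J loses row rank.

```python
import numpy as np

def manipulability(J):
    """Return sqrt(det(J J^T)) for a (possibly non-square) Jacobian J."""
    J = np.asarray(J, dtype=float)
    JJt = J @ J.T  # square (m x m) even when J is m x n
    # Clip tiny negative values that can appear from floating-point round-off.
    return float(np.sqrt(max(np.linalg.det(JJt), 0.0)))

# Hypothetical 2x3 Jacobian with full row rank (illustrative numbers only).
J_full = np.array([[1.0, 0.5, 0.2],
                   [0.0, 1.0, 0.3]])

# Hypothetical rank-deficient Jacobian: the second row is twice the first.
J_singular = np.array([[1.0, 0.5, 0.2],
                       [2.0, 1.0, 0.4]])

w_full = manipulability(J_full)
w_singular = manipulability(J_singular)
print(w_full, w_singular)
```

The first value should be clearly positive and the second should be zero up to round-off, which is the same singularity test the square-matrix determinant gives.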

]]>I look forward to perusing some more!

]]>Hmm, in what situations are you finding that it diverges? Running it with `Vxx = .5*(Vxx+Vxx')` adds a filter to the value function, so it would smooth out any quick dips or spikes, essentially averaging over an area of state space, if I’m thinking about it correctly. That seems fine to me in noisier systems, or where the value function is some crazy shape with a manifold that makes it difficult to follow the gradient…
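
In case it helps to see what that line does to the matrix itself, here is a rough NumPy sketch. The 4x4 `Vxx` and the noise level are made up for illustration, and `.T` plays the role of the MATLAB-style `'`. The operation keeps the symmetric part of the matrix and discards the antisymmetric part, so any asymmetry that crept in numerically is removed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical value-function Hessian: symmetric by construction.
A = rng.standard_normal((4, 4))
Vxx = A @ A.T

# Simulate accumulated numerical error that breaks the symmetry slightly.
Vxx_noisy = Vxx + 1e-3 * rng.standard_normal((4, 4))

# The symmetrization step from the comment above.
Vxx_sym = 0.5 * (Vxx_noisy + Vxx_noisy.T)

# Largest mismatch between mirrored entries, before and after.
asym_before = np.abs(Vxx_noisy - Vxx_noisy.T).max()
asym_after = np.abs(Vxx_sym - Vxx_sym.T).max()
print(asym_before, asym_after)
```

Comparing `asym_before` with `asym_after` shows how much asymmetry the step removed; whether that is enough to stop a divergence will depend on the system.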

]]>`(L0 + L1) * (-sin(q0))`, I’ve fixed it in the post. And you’re right, the determinant should be 0 for all q0; I suspect a calculation mistake.
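
A quick numerical check of the zero-determinant claim, as a sketch only. I am assuming the expression above is the top-left entry of the position Jacobian of a planar two-link arm evaluated at q1 = 0 (arm fully stretched out); that setup is my guess and is not stated in this thread. The function name `jacobian_two_link` and the unit link lengths are hypothetical too.

```python
import numpy as np

def jacobian_two_link(q0, q1, L0=1.0, L1=1.0):
    """Jacobian of the (x, y) end-effector position of a planar two-link arm.

    Assumed kinematics (not confirmed by the post):
        x = L0*cos(q0) + L1*cos(q0 + q1)
        y = L0*sin(q0) + L1*sin(q0 + q1)
    """
    return np.array([
        [-L0 * np.sin(q0) - L1 * np.sin(q0 + q1), -L1 * np.sin(q0 + q1)],
        [ L0 * np.cos(q0) + L1 * np.cos(q0 + q1),  L1 * np.cos(q0 + q1)],
    ])

# Sweep q0 with q1 fixed at 0; the two columns become parallel.
q0_values = np.linspace(-np.pi, np.pi, 9)
dets = np.array([np.linalg.det(jacobian_two_link(q0, 0.0)) for q0 in q0_values])
print(dets)
```

Under these assumptions the top-left entry at q1 = 0 reduces to `(L0 + L1) * (-sin(q0))`, and the determinant is zero (up to round-off) for every q0 in the sweep.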
]]>