Just a quick cheat sheet on derivatives (of scalars and vectors) with respect to a vector. This is borrowed from the Wikipedia page Matrix Calculus.
Vector Calculus CAS Tricks
The following section is borrowed from here.
Oftentimes we need a linear approximation of a complicated function. This is how you can get one with Maxima:
f(x,y) := exp(x^2) * sin(y);
taylor(f(x,y), [x,y], [1,2], 1);
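Building on the example above, a minimal sketch of how you might use the result (the evaluation point x = 1.1, y = 2.1 is just an illustrative choice): ratdisrep converts the truncated Taylor-series object into an ordinary expression, which you can then evaluate near the expansion point.

lin : ratdisrep(taylor(f(x,y), [x,y], [1,2], 1));  /* linear (first-order) expansion around (1,2) */
float(subst([x = 1.1, y = 2.1], lin));             /* approximate value near the expansion point */
float(f(1.1, 2.1));                                /* exact value, for comparison */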

Maxima Resources
- https://www2.palomar.edu/users/cchamberlin/Math%20205%20pages/Maxima/MaximaBook.pdf
- http://web.csulb.edu/~woollett/
- gkerns.people.ysu.edu/maxima/
Notations
Usually, the following notation is used in print:
A : matrix (capital and bold)
b : vector (lowercase and bold)
c : scalar (lowercase and not bold)
The rules for derivatives of a scalar by a vector
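The table of rules did not survive here; as a placeholder, these are a few of the most commonly used identities from the Wikipedia page, written in denominator layout (the gradient is a column vector; \(\mathbf{a}\), \(\mathbf{b}\), and \(\mathbf{A}\) do not depend on \(\mathbf{x}\)):

\[
\frac{\partial\,\mathbf{a}^{\top}\mathbf{x}}{\partial\mathbf{x}} = \mathbf{a},
\qquad
\frac{\partial\,\mathbf{x}^{\top}\mathbf{x}}{\partial\mathbf{x}} = 2\mathbf{x},
\qquad
\frac{\partial\,\mathbf{x}^{\top}\mathbf{A}\mathbf{x}}{\partial\mathbf{x}} = (\mathbf{A}+\mathbf{A}^{\top})\,\mathbf{x},
\qquad
\frac{\partial\,\mathbf{b}^{\top}\mathbf{A}\mathbf{x}}{\partial\mathbf{x}} = \mathbf{A}^{\top}\mathbf{b}
\]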


The rules for derivatives of a vector by a vector
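Again as a placeholder for the missing table, a few standard identities. Note that the layout convention matters here: these are in denominator layout, and the numerator-layout results are their transposes.

\[
\frac{\partial\,\mathbf{x}}{\partial\mathbf{x}} = \mathbf{I},
\qquad
\frac{\partial\,\mathbf{A}\mathbf{x}}{\partial\mathbf{x}} = \mathbf{A}^{\top},
\qquad
\frac{\partial\,a\,\mathbf{u}(\mathbf{x})}{\partial\mathbf{x}} = a\,\frac{\partial\,\mathbf{u}}{\partial\mathbf{x}}
\quad (a \text{ a constant scalar})
\]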

Using all these rules, I have derived the commonly used least-squares equation. The error function is a scalar, and the optimization variable (the unknown) is a vector. Since we are trying to minimize the error, we take the gradient (the vector derivative) of the error with respect to the unknown and set it equal to the zero vector. The value that makes the gradient zero is also the optimal (here, minimal) value for the error.
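A sketch of that derivation, assuming the standard least-squares setup \(E(\mathbf{x}) = \lVert \mathbf{A}\mathbf{x} - \mathbf{b} \rVert^{2}\) with \(\mathbf{A}^{\top}\mathbf{A}\) invertible. Expanding the error,

\[
E(\mathbf{x}) = (\mathbf{A}\mathbf{x}-\mathbf{b})^{\top}(\mathbf{A}\mathbf{x}-\mathbf{b})
= \mathbf{x}^{\top}\mathbf{A}^{\top}\mathbf{A}\,\mathbf{x} - 2\,\mathbf{b}^{\top}\mathbf{A}\,\mathbf{x} + \mathbf{b}^{\top}\mathbf{b}
\]

The quadratic-form rule with the symmetric matrix \(\mathbf{A}^{\top}\mathbf{A}\) gives the first term's gradient, and the \(\mathbf{b}^{\top}\mathbf{A}\mathbf{x}\) rule gives the second's:

\[
\nabla_{\mathbf{x}} E = 2\,\mathbf{A}^{\top}\mathbf{A}\,\mathbf{x} - 2\,\mathbf{A}^{\top}\mathbf{b} = \mathbf{0}
\;\Longrightarrow\;
\mathbf{A}^{\top}\mathbf{A}\,\mathbf{x} = \mathbf{A}^{\top}\mathbf{b}
\;\Longrightarrow\;
\hat{\mathbf{x}} = (\mathbf{A}^{\top}\mathbf{A})^{-1}\mathbf{A}^{\top}\mathbf{b}
\]

which is the familiar normal-equations solution.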

Hope this helps!