Week 5: Root Finding
Rahman notes:
In the theory lectures we looked at three different root finding techniques and applied them to a simple model function. Here let's look at a concrete, but nontrivial, function. Consider f(x) = x cos(x) on the interval [2, 5]. Suppose we are looking for the local extrema of this function; i.e., we look for the roots of f'(x) = cos(x) - x sin(x).
Let's first plot the two functions to give us a graphical idea of what is happening.
x = linspace(2, 5, 1000); %a fine discretization of [2, 5]
f = x.*cos(x);
df = cos(x) - x.*sin(x);
plot(x, f, x, df, 'linewidth', 4);
Section Search: Iterate and Pray
Notice that we already have a discretization of x in the plot above, and from it a discretization of f'. Now all we have to do is find the entry of x at which df is closest to zero. In order to do that, all we have to do is find the entry that gives us the minimum of the absolute value of df.
[~, root_entry] = min(abs(df));
root = x(root_entry);
plot(x, f, x, df, 'linewidth', 4)
hold on
plot(root, 0, 'k.', root, f(root_entry), 'g.', 'Markersize', 40)
hold off
Remember, these are just approximations. That one is not far off! But that is only because we used quite a fine discretization. Let's try it with a much coarser discretization. This time the result is quite far off the root, and since it is quite far off, can we even trust the local minimum?
x = linspace(2, 5, 10); %a much coarser discretization
f = x.*cos(x);
df = cos(x) - x.*sin(x);
[~, root_entry] = min(abs(df));
root = x(root_entry);
plot(x, f, x, df, 'linewidth', 4)
hold on
plot(root, 0, 'k.', root, f(root_entry), 'g.', 'Markersize', 40)
hold off
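Since a grid search can only ever return grid points, its error is bounded by the grid spacing. A quick experiment (a sketch of my own, not from the lecture; the "exact" root value is the converged result found later in these notes by bisection and Newton's method) makes this concrete:

```matlab
% Grid-search accuracy vs. grid spacing (illustrative sketch).
% The "exact" root below is the converged value from bisection/Newton later on.
exact = 3.425618459481728;
for n = [10 100 1000 10000]
    x  = linspace(2, 5, n);
    df = cos(x) - x.*sin(x);
    [~, k] = min(abs(df));
    fprintf('n = %5d, spacing = %.1e, error = %.1e\n', ...
            n, x(2) - x(1), abs(x(k) - exact));
end
```

The error shrinks roughly in proportion to the spacing, which is exactly why the fine grid above looked so good and the coarse grid did not.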
Now what if we weren't restricted to a domain where the function only has one root? Say we take a much wider interval (the exact domain used in class isn't recorded here, so let's take -20 to 20 for illustration).
x = linspace(-20, 20, 1000);
df = cos(x) - x.*sin(x);
plot(x, df, 'linewidth', 4);
Notice that this has quite a few roots. How do we find them all? This requires us to be a bit clever. Let's write a for loop that goes through the entirety of x and picks out the points where abs(df) is decreasing and then suddenly changes direction to increasing. Right after it changes direction we pick the entry just before it. This is like going over the edge, and then figuring out the edge was just one step behind you.
roots = [];
for i = 1:length(x)-2
    if abs(df(i+2)) > abs(df(i+1)) && abs(df(i)) > abs(df(i+1))
        roots(end+1) = x(i+1); %the turning point of abs(df) is one step behind
    end
end
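An alternative bookkeeping trick, sketched here as my own variant rather than what was done in class: instead of watching abs(df) turn around, look for sign changes of df between consecutive grid points. Each sign change brackets exactly one root, which is precisely the setup the bisection method below needs.

```matlab
% Bracket roots by sign changes of df between consecutive grid points.
x  = linspace(-20, 20, 1000);   %illustrative wide domain
df = cos(x) - x.*sin(x);
idx = find(sign(df(1:end-1)) ~= sign(df(2:end))); %df crosses zero in [x(i), x(i+1)]
brackets = [x(idx); x(idx+1)]'  %each row is an interval containing one root
```

Each row of brackets can then be handed to bisection to polish the root to full precision.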
Bisection Method
With bisection we must know that there is a root between two points a and b. If there are multiple roots, it will only find one.
For the interval [2, 5], we know for sure that there is a root between 2 and 5. So we will bisect our interval each time until we hone in on a root. Here we take a fixed number of bisections (11 steps reproduces the output below):
a = 2; b = 5;
df_a = cos(a) - a*sin(a);
df_b = cos(b) - b*sin(b);
for k = 1:11
    xmid = (a + b)/2;
    df_mid = cos(xmid) - xmid*sin(xmid);
    if sign(df_mid) == sign(df_a)
        a = xmid; df_a = df_mid;
    else
        b = xmid; df_b = df_mid;
    end
end
[xmid, cos(xmid) - xmid*sin(xmid)]
3.425292968750000 -0.001252652288744
If instead we keep bisecting until the bracket shrinks to machine precision, we land essentially on the root:
a = 2; b = 5;
df_a = cos(a) - a*sin(a);
df_b = cos(b) - b*sin(b);
while b - a > 1e-15
    xmid = (a + b)/2;
    df_mid = cos(xmid) - xmid*sin(xmid);
    if sign(df_mid) == sign(df_a)
        a = xmid; df_a = df_mid;
    else
        b = xmid; df_b = df_mid;
    end
end
[xmid, cos(xmid) - xmid*sin(xmid)]
3.425618459481728 0.000000000000000
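The two runs above repeat the same logic, so it is natural to package it as a function. The sketch below is my own generalization (the function name and tolerance argument are my choices, not from the lecture); it takes a function handle and a bracketing interval, and would live in its own file bisect.m:

```matlab
function root = bisect(g, a, b, tol)
% BISECT  Find a root of g in [a, b] by repeated halving.
% Assumes g(a) and g(b) have opposite signs.
    ga = g(a);
    while b - a > tol
        xmid = (a + b)/2;
        gmid = g(xmid);
        if sign(gmid) == sign(ga)
            a = xmid; ga = gmid;  %root is in the right half
        else
            b = xmid;             %root is in the left half
        end
    end
    root = (a + b)/2;
end
```

For example, bisect(@(x) cos(x) - x*sin(x), 2, 5, 1e-15) reproduces the converged value above.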
Newton's method
With Newton's method we only need a point near the root; however, we had better be sure it's close to that root and not some other root. And even then it may not converge.
With a poor initial guess far from our bracket (the exact starting value isn't recorded in these notes), repeating the Newton update converges to a completely different root:
% x = ... a poor initial guess far from [2, 5] (value not recorded)
for k = 1:50
    x = x - (cos(x) - x*sin(x))/(-2*sin(x) - x*cos(x)); %We do have to calculate the derivative here
end
[x, cos(x) - x*sin(x)]
-15.771284874815882 -0.000000000000007
With a guess inside our bracket, say x = 3 (an illustrative choice), Newton converges to the root we want in just a few iterations:
x = 3; %an initial guess inside [2, 5]
for k = 1:10
    x = x - (cos(x) - x*sin(x))/(-2*sin(x) - x*cos(x)); %We do have to calculate the derivative here
end
[x, cos(x) - x*sin(x)]
3.425618459481728 0.000000000000000
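As with bisection, the Newton update can be wrapped in a small reusable function; the name, tolerance, and iteration cap below are my own choices. Note that, unlike bisection, it needs both the function and its derivative:

```matlab
function root = newton(g, dg, x0, tol, maxit)
% NEWTON  Find a root of g starting from x0, given the derivative dg.
    root = x0;
    for k = 1:maxit
        step = g(root)/dg(root);  %Newton step: g(x)/g'(x)
        root = root - step;
        if abs(step) < tol        %stop once updates are negligible
            return
        end
    end
    warning('Newton did not converge in %d iterations', maxit);
end
```

For our problem this would be called as newton(@(x) cos(x) - x*sin(x), @(x) -2*sin(x) - x*cos(x), 3, 1e-15, 50); the iteration cap guards against the non-convergence mentioned above.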
Brief word on built-in functions
Built-in functions are useful, and they let you avoid thinking too much about the problem at hand, but we should make sure not to be lulled into a false sense of comfort. After all, these are just computer programs, not magic. I personally, for my work, only use built-in functions if I know for sure they are optimized (e.g., on MATLAB most Linear Algebra functions are more optimized than anything I could write, especially since I am not an expert in Linear Algebra). I am an expert in Differential Equations, and in my work I rarely ever use a built-in differential equation solver because I often need it to do something specialized that I know for sure the creators of the solver did not optimize for. Further, as we have seen with the backslash operator, we need to know how a built-in function works. If we don't, we can very easily do irreparable damage to our codes (or worse, someone else's code that we are contributing to).
You will probably not have a need to use the following function, but it is there on the off chance you need it. The following function looks for a local minimum of our original f(x) = x cos(x) between two points, similar to bisection. Here we use 2 and 5. It takes in an anonymous function and the two bounds:
fminbnd(@(x)(x * cos(x)), 2, 5)
The next function searches for a local minimum near a particular point, similar to Newton's method. Here we switch to a function of two variables and start from x = 5, y = 7. It takes in an anonymous function and the nearby x and y values:
f = @(v)((v(1) - 1)^2 + 3*(v(2) + 2)^2);
fminsearch(f, [5; 7])
1.000016192604872
-1.999979249216309
They work fine, but since we have to use anonymous functions, which are ridiculously slow for large operations, we are probably better off just writing our own code.