I found this explanation of how to calculate a sine/cosine function interesting. At a quick glance, they apply multiple rotations to the vector (1,0) until the total rotation angle is close enough to the desired angle. The description also includes a technique that does this very fast, but I don't fully understand that part yet (it's late, I'm going to sleep, and I don't really care about it right now).
The interesting thing, though, is that a similar method can be used to find the inverse cosine. First, for simplicity, assume the angle lies between 0 and Pi/2. Start with the vector (1,0). At each iteration, if the x component of the vector is larger than the given cosine (the input), the vector is rotated counter-clockwise; otherwise it is rotated clockwise. Each rotation is half the size of the previous one (the first being Pi/4 counter-clockwise), and you have to keep a running total of the angle rotated, since that is the desired output (+ for counter-clockwise rotations, - for clockwise rotations). Do as many iterations as you need for the accuracy you want, keeping in mind round-off error and that, without the techniques mentioned in that link, you would have to precalculate cos(Pi/(2^n)) and sin(Pi/(2^n)) for every rotation size (a table going up to Pi/(2^n) covers at most n-1 iterations, since the first rotation is Pi/4). I think you also have to consider the accuracy problems of using integers with either method.
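Here's a rough sketch of the rotation idea above in Python, just to make the steps concrete. The name `acos_by_rotation` and the default of 30 iterations are my own choices, and it uses floats and calls `math.cos`/`math.sin` where a real implementation would read precalculated table values, so treat it as an illustration under those assumptions rather than a tuned routine.

```python
import math

def acos_by_rotation(c, iterations=30):
    """Approximate acos(c) for 0 <= c <= 1 by repeatedly rotating the vector (1, 0)."""
    if not 0.0 <= c <= 1.0:
        raise ValueError("this sketch only handles 0 <= c <= 1")
    x, y = 1.0, 0.0      # current vector, starts on the x axis
    angle = 0.0          # signed total of the rotations applied so far
    step = math.pi / 4   # first rotation size; halved after every iteration
    for _ in range(iterations):
        # x too large means the angle is still too small, so rotate counter-clockwise.
        # Ties also go counter-clockwise, so the first rotation is always +pi/4.
        direction = 1.0 if x >= c else -1.0
        # Assumption: these two values would come from a precalculated table.
        cs = math.cos(step)
        sn = direction * math.sin(step)
        x, y = x * cs - y * sn, x * sn + y * cs
        angle += direction * step
        step /= 2
    return angle
```

Calling `acos_by_rotation(0.5)` should land very close to `math.acos(0.5)`; fewer iterations trade accuracy for speed.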
TBH I think a lookup table holding the value of cos-1 along with its first two derivatives, for a second-order series, would be the easiest way to go. Oh wait..... I just looked and the derivative of cos-1(x) is -1/sqrt(1-x^2), which goes to -infinity as x approaches 1.
Did a quick plot of the second-order approximation against cos-1 and found that the approximation doesn't stay good for very long once the expansion point is brought near 1. At x0=.995 the approximation holds up reasonably until x=.9975. At x0=.998 it only lasts until about x=.999. The rotation method is starting to look A LOT faster considering the singularity of the derivative of cos-1(x) at x=1.
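For anyone who wants to repeat that comparison, here's a minimal sketch of the second-order series around an expansion point x0. The function names `acos_taylor2` and `taylor_error` are made up for this example; the derivatives used are cos-1'(x) = -1/sqrt(1-x^2) and cos-1''(x) = -x/(1-x^2)^(3/2).

```python
import math

def acos_taylor2(x, x0):
    """Second-order Taylor series of acos about x0, evaluated at x (needs |x0| < 1)."""
    d = 1.0 - x0 * x0
    f0 = math.acos(x0)          # value at the expansion point
    f1 = -1.0 / math.sqrt(d)    # first derivative, blows up as x0 -> 1
    f2 = -x0 / d ** 1.5         # second derivative, blows up even faster
    h = x - x0
    return f0 + f1 * h + 0.5 * f2 * h * h

def taylor_error(x, x0):
    """Absolute error of the series against math.acos."""
    return abs(acos_taylor2(x, x0) - math.acos(x))

# Print the error at a few points past each expansion point.
for x0, x in [(0.995, 0.9975), (0.995, 0.9999), (0.998, 0.999), (0.998, 0.9999)]:
    print(f"x0={x0} x={x} error={taylor_error(x, x0):.6f}")
```

The error grows quickly as x moves from x0 toward 1, because both derivatives are singular there.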
EDIT: Figure out wth the techniques in that link are for speeding up the rotations. The method itself is sound as far as I can see, but those speed-up techniques are probably very important (unless you don't care about speed).
EDIT#2: The error (not including round-off error) for the rotation method goes something like Pi/(2^(n+1)), where n is the number of iterations (n=1 being the Pi/4 rotation), so each iteration gains about one bit of accuracy. A method that converges that fast would be hard to pass up.
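A quick way to sanity-check that error estimate is sketched below. To leave round-off in the rotations out of it, this version skips the vector entirely and takes the x component to be cos(angle) directly. The names `acos_bisect` and `worst_error` and the sample count are assumptions of this example.

```python
import math

def acos_bisect(c, n):
    """n rotations of size pi/4, pi/8, ... with the x component taken as cos(angle)."""
    angle, step = 0.0, math.pi / 4
    for _ in range(n):
        # Counter-clockwise (+) while x is still at least the target cosine.
        angle += step if math.cos(angle) >= c else -step
        step /= 2
    return angle

def worst_error(n, samples=1001):
    """Largest absolute error over evenly spaced inputs in [0, 1]."""
    inputs = [i / (samples - 1) for i in range(samples)]
    return max(abs(acos_bisect(c, n) - math.acos(c)) for c in inputs)

# Compare the measured worst case against Pi/(2^(n+1)).
for n in (1, 2, 5, 10, 20):
    print(n, worst_error(n), math.pi / 2 ** (n + 1))
```

The measured worst case should stay at or below the estimate for each n, up to floating-point noise.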