I understand why the A* algorithm always finds an optimal path to a goal state when the heuristic never overestimates, but I can't create a formal proof for it.
As far as I understand, as each path is extended deeper and deeper, f(n) becomes a more accurate estimate of that path's total cost, until at the goal state it is exact.
Also, no path that could turn out to be cheaper is ever discarded, because each estimate is at most the actual cost; this leads to the optimal path. But how should I create a proof for it?
The main idea of the proof is that when A* finds a path to the goal, the cost of that path is no higher than the estimate of any other path still under consideration. Since the estimates are optimistic, the other paths can be safely ignored.
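To make that idea concrete, here is a minimal A* sketch in Python. The graph, the heuristic tables and the function name a_star are made up for illustration and do not come from any particular library. With the admissible table h the search returns the cheapest path; with h_bad, which overestimates at B, it stops at a goal entry whose cost is higher than the true optimum.

```python
import heapq

def a_star(graph, h, start, goal):
    """Return (cost, path) for the first goal entry popped, or None."""
    # Frontier entries are (f, g, node, path), ordered by f = g + h[node].
    frontier = [(h[start], 0, start, [start])]
    best_g = {start: 0}  # cheapest known cost from start to each node
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if g > best_g.get(node, float("inf")):
            continue  # stale entry: a cheaper route to node was found later
        for succ, step in graph[node].items():
            new_g = g + step
            if new_g < best_g.get(succ, float("inf")):
                best_g[succ] = new_g
                heapq.heappush(
                    frontier, (new_g + h[succ], new_g, succ, path + [succ])
                )
    return None

# Hypothetical example graph: the cheapest S-to-G path is S-A-B-G with cost 6.
graph = {
    "S": {"A": 1, "B": 4},
    "A": {"B": 2, "G": 6},
    "B": {"G": 3},
    "G": {},
}
h = {"S": 5, "A": 4, "B": 2, "G": 0}       # never overestimates
h_bad = {"S": 5, "A": 4, "B": 10, "G": 0}  # overestimates at B (true cost 3)

print(a_star(graph, h, "S", "G"))      # (6, ['S', 'A', 'B', 'G'])
print(a_star(graph, h_bad, "S", "G"))  # (7, ['S', 'A', 'G'])
```

In the second call the path through B looks more expensive than it really is, so the goal is reached through the direct edge from A before the cheaper route is ever examined.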
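Also, A* is guaranteed to be optimal when two conditions are met:
The heuristic is admissible, that is, it never overestimates the cost of reaching the goal.
The heuristic is monotonic (also called consistent), that is, h(n) <= c(n, n') + h(n') for every node n and each of its successors n', where c(n, n') is the cost of the step from n to n', and h(goal) = 0. Monotonicity implies admissibility. Admissibility alone is enough if A* is allowed to revisit a node when it finds a cheaper route to it; monotonicity is what makes it safe never to expand a node twice.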
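Both conditions can be checked mechanically on a small explicit graph. The following is a rough sketch under some assumptions: the graph is a dict in which every node appears as a key, edge costs are non-negative, and the helper names true_costs_to_goal, is_admissible and is_consistent are hypothetical, chosen for this example.

```python
import heapq

def true_costs_to_goal(graph, goal):
    """Cheapest cost from every node to goal (Dijkstra on reversed edges)."""
    reverse = {n: {} for n in graph}
    for n, edges in graph.items():
        for m, cost in edges.items():
            reverse[m][n] = cost
    dist = {goal: 0}
    queue = [(0, goal)]
    while queue:
        d, n = heapq.heappop(queue)
        if d > dist.get(n, float("inf")):
            continue  # stale queue entry
        for m, cost in reverse[n].items():
            if d + cost < dist.get(m, float("inf")):
                dist[m] = d + cost
                heapq.heappush(queue, (d + cost, m))
    return dist

def is_admissible(graph, h, goal):
    # h(n) must not exceed the true remaining cost for any node n.
    h_star = true_costs_to_goal(graph, goal)
    return all(h[n] <= h_star.get(n, float("inf")) for n in graph)

def is_consistent(graph, h, goal):
    # h(goal) = 0 and h(n) <= c(n, n') + h(n') on every edge.
    return h[goal] == 0 and all(
        h[n] <= cost + h[m]
        for n, edges in graph.items()
        for m, cost in edges.items()
    )

# Hypothetical example graph and heuristics.
graph = {
    "S": {"A": 1, "B": 4},
    "A": {"B": 2, "G": 6},
    "B": {"G": 3},
    "G": {},
}
h_consistent = {"S": 5, "A": 4, "B": 2, "G": 0}
h_admissible_only = {"S": 5, "A": 1, "B": 2, "G": 0}  # h(S) > c(S, A) + h(A)

print(is_admissible(graph, h_consistent, "G"), is_consistent(graph, h_consistent, "G"))
print(is_admissible(graph, h_admissible_only, "G"), is_consistent(graph, h_admissible_only, "G"))
```

The second heuristic shows that the two conditions are different: it never overestimates, yet it drops by more than the step cost along the edge from S to A.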
You can prove optimality by contradiction: assume the opposite and work through the implications.
Assume that the path given by A* is not optimal even though the heuristic is admissible and monotonic, and think about what that implies. You will soon reach a contradiction, so the assumption is reduced to absurdity.
From that you can conclude that your original assumption was false, that is, A* is optimal with the above conditions. Q.E.D.
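For reference, one way the contradiction can be written out is sketched here. The notation is introduced for this sketch only: C^* is the cost of an optimal path, G_2 is the suboptimal goal entry that A* is assumed to return, h^*(n) is the true cheapest cost from n to the goal, and n is a node on an optimal path that is still on the frontier when G_2 is selected. The sketch relies on the standard lemma that such a node exists and that A* has already found the cheapest route to it.

```latex
\begin{align*}
f(n)   &= g(n) + h(n) \\
       &\le g(n) + h^*(n)
          && \text{(admissibility: } h(n) \le h^*(n)\text{)} \\
       &= C^*
          && \text{(} n \text{ lies on an optimal path, reached at its cheapest cost)} \\[4pt]
f(G_2) &= g(G_2) + h(G_2) = g(G_2) > C^*
          && \text{(} h(G_2) = 0 \text{ and } G_2 \text{ is assumed suboptimal)}
\end{align*}
```

Together these give f(n) <= C^* < f(G_2). A* always selects the frontier entry with the smallest f, so it would have selected n before G_2, which contradicts the assumption that G_2 was returned.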