Hello Everyone,

In class, I was asked to show that maximizing f(x1,x2) over x1 first and then over x2 gives the same result as maximizing over x2 first and then over x1, assuming f(x1,x2) is continuous everywhere.

Graphically, this statement seems clear to me. Maximizing over x1 first means finding a curve on the surface f(x1,x2) that is a function of x2 only, and the maximum point of f(x1,x2) lies on that curve. Maximizing over x2 along this curve then means finding its highest point. On the other hand, maximizing over x2 first means finding a curve on the surface that is a function of x1 only, and the maximum point of f(x1,x2) again lies on that curve. Maximizing over x1 along this curve means finding its highest point. Either way, the two methods are two sides of the same coin, in the sense that they both look for the highest point(s) of f(x1,x2).
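A quick numerical sanity check of the graphical picture, as a sketch only. The function f below and the box [-1, 1] x [-1, 1] are made-up choices for illustration (they are not from the class problem), and a finite grid only approximates the true maxima. On the grid, both orders of maximization pick out the same highest value as a joint search over all points.

```python
import math

def f(x1, x2):
    # Hypothetical continuous test function, chosen only for illustration.
    return math.sin(3 * x1) * math.cos(2 * x2) + 0.5 * x1 * x2

# 201 evenly spaced grid points on [-1, 1] (assumed domain).
grid = [-1 + 2 * i / 200 for i in range(201)]

# Inner max over x1 for each fixed x2, then outer max over x2.
max_x1_then_x2 = max(max(f(x1, x2) for x1 in grid) for x2 in grid)

# Inner max over x2 for each fixed x1, then outer max over x1.
max_x2_then_x1 = max(max(f(x1, x2) for x2 in grid) for x1 in grid)

# Joint max over every grid point at once.
joint_max = max(f(x1, x2) for x1 in grid for x2 in grid)

print(max_x1_then_x2, max_x2_then_x1, joint_max)
```

All three numbers come from the same finite set of function values, so they agree on the grid. This illustrates the claim but does not prove it for the continuous problem.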

Algebraically, I'm not sure what my conclusion means. Here's my approach:

(max over x1) (max over x2) f(x1,x2) = (max over x1) g(x1), where x2* = g(x1) is the point of x2 which maximizes f(x1,x2). Then (max over x1) g(x1) = x1* = k for some k in R.

On the other hand,

(max over x2) (max over x1) f(x1,x2) = (max over x2) h(x2), where x1* = h(x2) is the point of x1 which maximizes f(x1,x2). Then (max over x2) h(x2) = x2* = c for some c in R.

Then, in order for (max over x1) (max over x2) f(x1,x2) = (max over x2) (max over x1) f(x1,x2) to be true, the following needs to be true:

g(c) = k

h(k) = c

In particular, this has to be true:

h(g(c)) = c, which implies h = g^(-1).
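One way to probe this condition numerically, as a sketch under assumptions: read g as the map taking x1 to the maximizing x2, and h as the map taking x2 to the maximizing x1. The quadratic f below is a hypothetical example (strictly concave, so each one-variable maximizer has a closed form); it is not the f from the problem.

```python
def f(x1, x2):
    # Hypothetical strictly concave example, not the original f.
    return -x1**2 - x2**2 + x1 * x2 + x1

def g(x1):
    # argmax over x2 for fixed x1: df/dx2 = -2*x2 + x1 = 0
    return x1 / 2

def h(x2):
    # argmax over x1 for fixed x2: df/dx1 = -2*x1 + x2 + 1 = 0
    return (x2 + 1) / 2

x1_star = 2 / 3        # solves x1 = h(g(x1)) for this example
x2_star = g(x1_star)   # 1/3

print("at the maximizer:", h(g(x1_star)), "vs", x1_star)
print("at x1 = 0:       ", h(g(0.0)), "vs", 0.0)
```

In this example h(g(x1)) returns x1 only at the maximizer, so the condition behaves like a fixed-point statement at that one point and not like h being the inverse of g everywhere.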

I can't make sense of what h = g^(-1) means graphically or geometrically. I'm not even sure I did this right. Could anyone shed some light on this problem and my approach?

Any ideas or suggestions would be appreciated.

Thanks,

I couldn't find the right solution on the internet.

References: http://www.thescienceforum.com/mathematics/46480-maximization-problem.html

A critical point is, by definition, a point where both partial derivatives (with respect to x and with respect to y) equal 0 at the same time. You cannot take the derivatives separately. If you have a radially symmetric surface, you will luck out and be OK, but for a surface lacking that symmetry, you will most likely not find any maximum, let alone a global max.

For instance, plot f(x,y) = 4 + x^3 + y^3 - 3xy (making level curves with Desmos or an equivalent tool works well). There is a local min at (1,1). Start at the point (2,3) and proceed as you suggest. You will not end up at the same endpoint with the two orderings you suggest, and certainly not at (1,1).
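A rough sketch of that experiment. It assumes "proceed as you suggest" means one pass of setting each partial derivative to zero in turn while holding the other variable fixed, and it takes the positive root each time; both are my assumptions about the procedure, and the helper names are made up for this example.

```python
import math

def f(x, y):
    return 4 + x**3 + y**3 - 3 * x * y

def x_from_fx_zero(y):
    # f_x = 3x^2 - 3y = 0  =>  x = +sqrt(y)  (positive root, by assumption)
    return math.sqrt(y)

def y_from_fy_zero(x):
    # f_y = 3y^2 - 3x = 0  =>  y = +sqrt(x)  (positive root, by assumption)
    return math.sqrt(x)

x0, y0 = 2.0, 3.0

# Order A: update x with y fixed, then update y with the new x fixed.
xa = x_from_fx_zero(y0)
ya = y_from_fy_zero(xa)

# Order B: update y with x fixed, then update x with the new y fixed.
yb = y_from_fy_zero(x0)
xb = x_from_fx_zero(yb)

print("x first:", (xa, ya))
print("y first:", (xb, yb))

# Along the slice y = 3 the values keep growing with x,
# so that slice has no maximum at all.
print([f(x, 3.0) for x in (2.0, 10.0, 100.0)])
```

Under these assumptions the two orders stop at different points, neither of them (1,1), and the slice through y = 3 is unbounded above.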

Makes you question all those science teachers who always said "only change one variable at a time", doesn't it? (Another lecture for another day.)