Lagrange multipliers

Lagrange multipliers are a mathematical tool for optimizing functions subject to constraints. They allow solving problems like:

"Find the maximum/minimum of $f (x)$ , given $g (x) = c$ ."

The method folds the objective and the constraint into a single function, the Lagrangian:

$$L (x, λ) = f (x) - λ (g (x) - c)$$

We look for points $(x, λ)$ where $d L = 0$ , i.e. (summing over the repeated index $i$ ):

$$\frac{\partial f}{\partial x^{i}} d x^{i} - λ \frac{\partial g}{\partial x^{i}} d x^{i} - (g (x) - c) d λ = 0$$

$$\left( \frac{\partial f}{\partial x^{i}} - λ \frac{\partial g}{\partial x^{i}} \right) d x^{i} - (g (x) - c) d λ = 0$$
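
As a small worked example (the specific $f$ and $g$ are chosen here for illustration and are not part of the original problem statement), take $f (x, y) = x y$ with the constraint $g (x, y) = x + y = c$ . Since $d x$ , $d y$ and $d λ$ are independent, each coefficient of $d L$ must vanish separately:

$$
\begin{aligned}
L (x, y, λ) &= x y - λ (x + y - c) \\
\frac{\partial L}{\partial x} &= y - λ = 0 \\
\frac{\partial L}{\partial y} &= x - λ = 0 \\
\frac{\partial L}{\partial λ} &= - (x + y - c) = 0
\end{aligned}
$$

The first two equations give $x = y = λ$ , and the third then gives $x = y = λ = c / 2$ , so the constrained maximum is $f = c^{2} / 4$ .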

Geometric Interpretation
At the solution point:

$\frac{\partial f}{\partial x^{i}} - λ \frac{\partial g}{\partial x^{i}} = 0$ , so $d f$ and $d g$ (equivalently $\nabla f$ and $\nabla g$ ) are parallel: the objective's level set is tangent to the constraint surface.
$g (x) - c = 0$ , so the point satisfies the constraint.
$λ$ is the rate at which the optimal value of $f$ changes as $c$ is relaxed: $λ = d f^{*} / d c$ , where $f^{*}$ is the constrained optimum.
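
The sensitivity reading of $λ$ can be checked numerically. The sketch below is only an illustration under stated assumptions: the problem (maximize $f (x, y) = x y$ on the line $x + y = c$ ) and the helper names `argmax_on_line` and `optimal_value` are made up for this example. It compares $λ$ , read off from $\nabla f = λ \nabla g$ , with a finite-difference estimate of how the optimal value moves when $c$ changes.

```python
def f(x, y):
    """Illustrative objective (an assumption, not taken from the notes)."""
    return x * y


def argmax_on_line(c, lo=-100.0, hi=100.0, iters=200):
    """Ternary search for the x that maximizes f(x, c - x) on x + y = c.

    This works here because x * (c - x) is concave in x.
    """
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1, c - m1) < f(m2, c - m2):
            lo = m1  # the maximum lies to the right of m1
        else:
            hi = m2  # the maximum lies to the left of m2
    return 0.5 * (lo + hi)


def optimal_value(c):
    """Constrained maximum of f as a function of the constraint level c."""
    x = argmax_on_line(c)
    return f(x, c - x)


c = 2.0
x_star = argmax_on_line(c)
y_star = c - x_star

# grad f = (y, x) and grad g = (1, 1), so grad f = lam * grad g gives lam = y.
lam = y_star / 1.0

# Central finite difference of the optimal value with respect to c.
h = 1e-4
sensitivity = (optimal_value(c + h) - optimal_value(c - h)) / (2.0 * h)

print(f"lambda = {lam:.6f}, d f*/dc = {sensitivity:.6f}")
```

For this example both numbers should agree closely with $c / 2$ , which matches the interpretation of $λ$ as the marginal gain from relaxing the constraint.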


Possible motivation for Lagrange's definition of $L$

Step 1: The Problem, Visualized

Imagine you are a hiker trying to find the highest point on a mountain, but you are forced to stay on a specific trail.

The Mountain: This is your objective function, $f (x, y)$ . The altitude at any point is given by $f$ .
The Trail: This is your constraint, $g (x, y) = c$ . It's a specific path on the map.
Level Curves: The contour lines on a map, where the altitude is constant (e.g., $f (x, y) = 1000 m$ , $f (x, y) = 1100 m$ , etc.).
You are walking along the trail ( $g = c$ ). When are you at a local maximum or minimum altitude?
You are at an optimum point when the trail becomes tangent to a contour line of the mountain.
Why? Think about it: If your trail crosses a contour line, it means you are moving from a lower altitude to a higher one (or vice-versa). If you can still move along the trail and increase your altitude, you're not at the maximum yet!
The only place where you can't increase or decrease your altitude by moving a tiny step along the trail is the exact point where the trail direction is momentarily parallel to the contour line direction—the point of tangency.

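A gradient is always perpendicular to its own level curve: $\nabla f$ is perpendicular to the contour line of $f$ , and $\nabla g$ is perpendicular to the trail $g = c$ . Where the trail and the contour line are tangent they share a tangent direction, so their perpendiculars line up as well. Hence, at a constrained optimum, the gradient of the objective function ( $\nabla f$ ) must be parallel to the gradient of the constraint function ( $\nabla g$ ).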

If two vectors are parallel, one is a scalar multiple of the other (here we assume $\nabla g \neq 0$ at the point in question). We'll call that scalar $λ$ (lambda).

$$\nabla f (x) = λ \nabla g (x)$$
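
To make the tangency condition concrete, here is a minimal sketch that "walks the trail" numerically. The mountain $f (x, y) = x + 2 y$ and the circular trail $x^{2} + y^{2} = 1$ are assumptions picked for illustration, as are all function and variable names. The code finds the highest point on the trail by brute force and then checks that the two gradients are parallel there (their 2D cross product is close to zero) but not at an arbitrary other point of the trail.

```python
import math


def f(x, y):
    """Illustrative altitude function (an assumption for this sketch)."""
    return x + 2.0 * y


def grad_f(x, y):
    return (1.0, 2.0)


def grad_g(x, y):
    """Gradient of the trail function g(x, y) = x^2 + y^2 (trail: g = 1)."""
    return (2.0 * x, 2.0 * y)


def cross_2d(u, v):
    """Scalar 2D cross product; it is zero exactly when u and v are parallel."""
    return u[0] * v[1] - u[1] * v[0]


# Walk around the trail in small steps and keep the highest point.
n = 200_000
best_theta = max(
    (2.0 * math.pi * k / n for k in range(n)),
    key=lambda t: f(math.cos(t), math.sin(t)),
)
x, y = math.cos(best_theta), math.sin(best_theta)

gf, gg = grad_f(x, y), grad_g(x, y)
cross_at_top = cross_2d(gf, gg)

# Scalar lam that best fits grad f = lam * grad g (projection onto grad g).
lam = (gf[0] * gg[0] + gf[1] * gg[1]) / (gg[0] ** 2 + gg[1] ** 2)

# For contrast: the point (1, 0) is on the trail but is not an optimum.
cross_elsewhere = cross_2d(grad_f(1.0, 0.0), grad_g(1.0, 0.0))

print(f"highest point on trail: ({x:.4f}, {y:.4f})")
print(f"cross product at top:   {cross_at_top:.2e}")
print(f"lambda at top:          {lam:.4f}")
print(f"cross product at (1,0): {cross_elsewhere:.1f}")
```

The cross product at the top is only approximately zero because the trail is sampled at a finite number of points; refining `n` shrinks it further.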

Step 3: The Stroke of Genius – Creating $L$

So, at this point, Lagrange knows he has to solve a system of two conditions for a simple 2D problem:

$\nabla f (x, y) = λ \nabla g (x, y)$ (The geometric tangency condition)
$g (x, y) = c$ (The original constraint)

The first condition can be rewritten as $\nabla f (x, y) - λ \nabla g (x, y) = 0$ . In terms of components, this is:

$\frac{\partial f}{\partial x} - λ \frac{\partial g}{\partial x} = 0$
$\frac{\partial f}{\partial y} - λ \frac{\partial g}{\partial y} = 0$

And the constraint is:

$g (x, y) - c = 0$

Lagrange's genius was to see that this entire system of equations is exactly what you get if you define a single function and find its unconstrained critical points.

He asked: "What function, if I take its partial derivatives with respect to $x$ , $y$ , and even $λ$ , would give me this exact system of equations?"

This leads directly to the definition of the Lagrangian:

$$L (x, y, λ) = f (x, y) - λ (g (x, y) - c)$$
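
Taking partial derivatives confirms this: $\partial L / \partial x$ and $\partial L / \partial y$ reproduce the two tangency equations, and $\partial L / \partial λ = - (g (x, y) - c)$ reproduces the constraint. The sketch below checks this numerically on an assumed example ( $f = x + 2 y$ , $g = x^{2} + y^{2}$ , $c = 1$ , with known solution $x = 1 / \sqrt{5}$ , $y = 2 / \sqrt{5}$ , $λ = \sqrt{5} / 2$ ); the names `lagrangian` and `numerical_gradient` are hypothetical helpers, not from any library.

```python
import math

C = 1.0  # constraint level c (illustrative value)


def lagrangian(x, y, lam):
    """L(x, y, lam) = f - lam * (g - c) for the assumed f and g."""
    f = x + 2.0 * y
    g = x * x + y * y
    return f - lam * (g - C)


def numerical_gradient(func, point, h=1e-6):
    """Central-difference gradient of func at point (a tuple of floats)."""
    grad = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += h
        down[i] -= h
        grad.append((func(*up) - func(*down)) / (2.0 * h))
    return grad


# Known constrained maximum of x + 2y on the unit circle, with its multiplier.
solution = (1.0 / math.sqrt(5.0), 2.0 / math.sqrt(5.0), math.sqrt(5.0) / 2.0)
grad_at_solution = numerical_gradient(lagrangian, solution)

# A point on the constraint that is not an optimum: dL/dy does not vanish.
grad_elsewhere = numerical_gradient(lagrangian, (1.0, 0.0, 0.5))

print("grad L at solution:", [f"{v:.1e}" for v in grad_at_solution])
print("grad L at (1, 0, 0.5):", [f"{v:.3f}" for v in grad_elsewhere])
```

All three components vanish (up to rounding) at the solution, while at the second point the constraint is satisfied but the tangency condition fails, so that point is not a critical point of $L$ .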

2. Coming from Mercury perihelion problem: