
Mathematical Programming Glossary - B
Backtracking. A procedure to return to a node in a heuristic search that has descendants not yet expanded (see branch and bound). A common procedure for a depth-first search is LIFO: go back along the path from the current node toward the root until the first node is found with a descendant still requiring expansion.

Backward transformation (abbr. BTRAN). This applies to the factored system, u[E_1*E_2*...*E_n] = c, where each E_i is an elementary matrix. The recursion is: solve u_n E_n = c, then u_k E_k = u_{k+1} for k = n-1, ..., 1. In the end, u = u_1 is the solution. Each elementary system is solved as follows. For notational convenience, suppose the system is uE = w, and v is the distinguished p-th column of E. Then,

  u(i) = w(i) for i not= p;
  u(p) = (w(p) - Sum_i{w(i)v(i): i not= p}) / v(p).

This is what is done in the (revised) simplex method to update the pricing vector.

Backward triangularization. An algorithm to rearrange a matrix by recursively assigning a singleton column to the rear (with its only row as its pivot). The recursion applies to the submatrix defined by omitting the assigned column and row (and reducing other column counts accordingly). This results in a lower triangular rearrangement if, and only if, such a rearrangement exists.

Balance equation. A constraint of the form I(x) - O(x) = 0, where I(x) is the input and O(x) is the output, each a function of the decision variables, x. Typically this is linear, conserving flows, measured in units of volume or mass, through something. In an ordinary network, N=[V,A], where there are no gains or losses, this appears as:

  Sum{x(j,i): (j,i) in A} - Sum{x(i,j): (i,j) in A} = 0 at node i,

where the first sum is the flow into node i and the second is the flow out of node i.

Barrier function. P is a barrier function on the strict interior, say I, of the set S if P is continuous, and P(x) -> infinity as x approaches any point in S\I. This arises in penalty function methods for certain classes that include convex programs of the form Min{f(x): g(x) >= 0} (where I = {x: g(x) > 0}). Two popular choices are P(x) = Sum_i{1/g_i(x)} and P(x) = -Sum_i{log(g_i(x))}.

Barycenter. From physics, this is the center of mass: the point at which a mass (body) can be considered as being concentrated without altering the effect of the attraction that the earth has upon it. This is the average position ({x^i}) of particles, weighted by their masses ({m_i}): Sum_i{m_i x^i}/Sum_i{m_i}. In mathematical programming, this is usually used to mean the simple average of the extreme points of a polytope: Sum_i{x^i}/m, where {x^i} = {x^1, ..., x^m} is the set of extreme points. More generally, this could be a weighted average:
Sum_i{w_i x^i}, where w is in S_m.
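The BTRAN recursion described under Backward transformation can be sketched in a few lines. This is an illustrative sketch (the function names are mine, not part of the glossary), representing each elementary matrix by its distinguished column v and position p:

```python
def solve_elementary(w, v, p):
    # Solve uE = w, where E is the identity except its p-th column equals v.
    u = list(w)
    u[p] = (w[p] - sum(w[i] * v[i] for i in range(len(w)) if i != p)) / v[p]
    return u

def btran(etas, c):
    # Solve u E_1 E_2 ... E_n = c: back-solve u_n E_n = c first, then
    # u_k E_k = u_{k+1} down to u = u_1.  Each eta is a pair (v, p).
    u = list(c)
    for v, p in reversed(etas):
        u = solve_elementary(u, v, p)
    return u
```

For example, with E_1 having column 1 equal to (1, 3) and E_2 having column 0 equal to (2, 1), btran([([1, 3], 1), ([2, 1], 0)], [9, 7]) recovers u = [1, 2].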
Benders' decomposition. This separates complicating variables. The (original) semilinear model is

  Max{cx + f(y): Ax + F(y) <= b, x >= 0, y in Y},

which is a linear program in x once y is fixed. Generalized Benders decomposition considers the following mathematical program:
Max{f(x, y): g(x, y) <= 0, h(x, y) = 0, x in X, y in Y}.
Suppose this is not a
convex program in total, but that with each fixed y in Y, the
remaining mathematical program (with x the only variable) is convex.
The idea is to consider
v(y) = Sup{f(x, y): g(x, y) <= 0, h(x, y) = 0, x in X},
and solve Max{v(y): y in Y}, with evaluations of v requiring the
solution to the convex program.
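As a toy illustration of the generalized Benders idea (the particular f, g, and Y below are invented for this sketch, not taken from the glossary), fix f(x, y) = -(x-2)^2 - (y-1)^2, g(x, y) = x - y, and Y = {0, 1, 2, 3}; for each fixed y the inner problem is a one-variable concave maximization, solved here in closed form:

```python
def v(y):
    # v(y) = Sup{ f(x, y) : g(x, y) <= 0 } with f(x, y) = -(x-2)^2 - (y-1)^2
    # and g(x, y) = x - y; the inner concave problem is solved by projecting
    # the unconstrained maximizer x = 2 onto the feasible set {x : x <= y}.
    x = min(y, 2.0)
    return -(x - 2.0) ** 2 - (y - 1.0) ** 2

# Outer problem: Max{ v(y) : y in Y }, here by enumeration over a finite Y.
Y = [0, 1, 2, 3]
y_star = max(Y, key=v)
```

Each outer candidate y triggers one inner solve; in a real application the inner problem is a numerical convex program rather than a closed form.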
Biconcave function. Its negative is biconvex.

Biconvex function (F). F(x, y) on X*Y is convex on X for each y in Y and convex on Y for each x in X. An example is xy on R² (which is not convex in x and y jointly).

Biconvex program. A nonlinear program with each function biconvex or biconcave, such that if x or y is fixed, the resulting mathematical program in y or x, resp., is a convex program. A special case is the bilinear program.

Bidding algorithm (or auction algorithm). This proceeds by generating a sequence of partial solutions (e.g., a partial assignment), terminating when the solution is complete. There is a pricing vector and a bidding increment whereby the solution is revised and extended by a myopic optimization scheme (which depends on the particular problem).

Bilevel program. A multilevel program with two levels (sets of variables).

Bilinear program. A nonlinear program whose objective and/or constraint functions are bilinear. An example is the pooling problem.

Big-M method. An alternative to using Phase I: a large number (M) is used as a linear penalty in the composite objective:
Min cx + M ev: Ax + v = b, x, v >= 0,
where v is a vector of artificial variables and e = (1, 1, ..., 1).

Binary relation on sets S and T. A subset of S*T, say R. When T=S, we refer to just the relation on the set S.

Binary variable. A decision variable with two feasible values, usually 0 or 1.

Binding constraint. A constraint whose removal changes the optimality region. (Some authors require that the removal results in a strict improvement in the objective value.)

Bland's rule. This is for pivot selection in the simplex method to avoid cycling: among the candidates to enter the basis, choose the one with the smallest index; if there is a tie in the ratio test for the leaving variable, again choose the one with the smallest index.

Blending problem. A blend is some combination of materials to make another material. Given raw materials, their blends make intermediate materials, called stocks, and/or final materials, called products. (There can be other blending activities that make products from stocks.) The raw materials have purchase costs, and the blending activities have costs of operation and maintenance. The products have either fixed demands or selling prices, and the problem is to find blends that minimize total net cost (or maximize profit). This arises in refinery operations, in the food industry, and in other process industries. The problem can sometimes be solved with linear programming, but there can be complicating relations that require nonlinear programming, such as the pooling problem.

Block pivot. Given a detached coefficient form, a pivot is performed on a nonsingular submatrix of nonbasic variables (rather than just one element). Suppose the detached coefficient form is partitioned into 2 row blocks and 5 column blocks (with the last column block corresponding to the righthand side):

  A = [ I 0 P R e ]
      [ 0 I Q S f ]

(Note: each I is an identity matrix of appropriate dimension.) If P is nonsingular, the basis can change by entering all (nonbasic) variables associated with the columns of P in exchange for all basic variables associated with the rows of P.

Blocking polyhedron (of a polyhedron, P). {y in R^n_+: y'x >= 1 for all x in P}.

Bolzano-Weierstrass theorem.
This is a fundamental existence theorem: a continuous function on a nonempty compact set achieves a minimum and a maximum there. A useful corollary is: suppose f is continuous on its nonempty feasible region, F, and that F is a closed set; then f achieves a maximum on F if there exists x^{0} such that {x in F: f(x) >= f(x^{0})} is bounded. In particular, if we have a quadratic objective of the form f(x) = x'Qx + cx, it is sufficient for Q to be negative definite. (Although often credited to Weierstrass alone, it was proven by Bolzano in 1817 and was also known to Cauchy.)

Bordered hessian. The complete hessian of the Lagrangian:

  [ H_x[L(x,u,v)]   grad_g(x)'   grad_h(x)' ]
  [ grad_g(x)       0            0          ]
  [ grad_h(x)       0            0          ]

where H_x[L(x,u,v)] is the hessian of the Lagrangian (L) with respect to x only. The southwest and northeast blocks are the constraint gradients because the cross derivative of L between x and the Lagrange multipliers (u,v) equals the constraint gradient.

Bottleneck assignment problem. This is the assignment problem, where the "cost", c(i, j), is the rate at which person i can perform job j. The objective is to maximize the rate of production of the entire assembly line, i.e., to maximize the minimum rate over the assignment.

Bottleneck transportation problem. This is the transportation problem with a traversal time instead of an arc cost, and the objective is to minimize the time it takes to satisfy the demands:
minimize max{t_ij: x_ij > 0}: x >= 0,
sum_j{x_ij} <= s_i, sum_i{x_ij} >= d_j,
where x_ij is the flow from source i to destination j, and t_ij is the time it takes to ship across arc (i, j) (independent of the quantity shipped).

Bottleneck TSP problem. Is there a TSP tour whose value is less than some given value?

Box constraint. Simple bounds of the form L <= x <= U.

Braess paradox. The phenomenon that, in a traffic network in which travel times depend on link (arc) flows, the addition of a link can increase travel time for every user when all users minimize their own travel time. Hence the paradox: building a new road can make traffic congestion worse. See Myths and Counterexamples for an example.

Branch and bound (B&B). Born in integer programming (IP), B&B is used more generally in global optimization, including over continuous functions. A prototype (for minimization) is as follows: initialize f* = inf and a list containing the original problem; repeatedly remove a subproblem from the list and compute a lower bound on its optimal value; fathom it if that bound is at least f* - t; otherwise, update x* and f* if a better feasible solution is found, and split the subproblem into smaller ones (branching), which are appended to the list. B&B terminates when the list becomes empty, and the best feasible solution obtained is x* with value f*, which is within t of optimality. (If f* = inf, there is no feasible solution.)
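The prototype can be sketched concretely. The following minimal sketch is my own illustration (stated for a 0-1 knapsack maximization, with a deliberately simple bound); it shows the list of subproblems, the fathoming test, and the branching step:

```python
def branch_and_bound(values, weights, capacity):
    # Minimal B&B for a 0-1 knapsack (maximization): keep a list of
    # subproblems, fathom those whose bound cannot beat the incumbent,
    # and branch on the next item otherwise.
    n = len(values)

    def bound(i, value):
        # Optimistic bound: pretend every remaining item fits.
        return value + sum(values[i:])

    best = 0
    todo = [(0, 0, capacity)]  # (next item, value so far, remaining room)
    while todo:                # B&B terminates when the list becomes empty
        i, value, room = todo.pop()
        best = max(best, value)          # every partial solution is feasible
        if i == n or bound(i, value) <= best:
            continue                     # fathomed by the bound test
        todo.append((i + 1, value, room))              # branch: skip item i
        if weights[i] <= room:                         # branch: take item i
            todo.append((i + 1, value + values[i], room - weights[i]))
    return best
```

A sharper bound (e.g., the LP relaxation) fathoms more subproblems; the logic of the list, bound, and branch steps is unchanged.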
Branch and cut. A hybrid of branch and bound and cutting plane methods.

Branch and price. The "price" operation is the same as column generation.

Broyden family. This is a family of algorithms to solve an unconstrained nonlinear program whose objective function is in C². The algorithms in this family differ in how they update the approximation of the hessian inverse (assumed to exist). These are of the form:

  H = aH^(BFGS) + (1-a)H^(DFP),

where the two matrices are the Broyden-Fletcher-Goldfarb-Shanno (BFGS) update and the Davidon-Fletcher-Powell (DFP) update, respectively. If a is in [0, 1], this is called the Broyden convex family.
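The family can be sketched in NumPy (my own illustration; p and q are the position and gradient differences defined under the BFGS update entry, and the DFP formula is the standard one defined under its own entry):

```python
import numpy as np

def dfp_update(H, p, q):
    # Davidon-Fletcher-Powell update of the inverse-hessian approximation.
    Hq = H @ q
    return H + np.outer(p, p) / (p @ q) - np.outer(Hq, Hq) / (q @ Hq)

def bfgs_update(H, p, q):
    # Broyden-Fletcher-Goldfarb-Shanno update of the inverse-hessian
    # approximation (the inverse form quoted in the BFGS update entry).
    pq = p @ q
    Hq = H @ q
    return (H + (1.0 + q @ Hq / pq) * np.outer(p, p) / pq
              - (np.outer(p, Hq) + np.outer(Hq, p)) / pq)

def broyden_update(H, p, q, a):
    # Broyden family: a = 1 gives BFGS, a = 0 gives DFP; a in [0, 1] is
    # the Broyden convex family.
    return a * bfgs_update(H, p, q) + (1.0 - a) * dfp_update(H, p, q)
```

Every member of the family satisfies the secant (quasi-Newton) condition H'q = p, which is a quick sanity check on any implementation.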
Broyden-Fletcher-Goldfarb-Shanno (BFGS) method. This is a method to solve an unconstrained nonlinear program that proceeds as follows: at iterate x^k, solve Bd = -grad_f(x^k) for the direction d, where B is the current approximation of the hessian; take a step along d (usually determined by line search) to obtain x^{k+1}; then revise B with the BFGS update. Compare this with the DFP method, and note a key difference: DFP estimates the inverse hessian, whereas BFGS estimates the hessian (and solves the linear system). There is a correspondence, and empirical evidence suggests BFGS has less of a problem with error.

Broyden-Fletcher-Goldfarb-Shanno (BFGS) update. This is a way to update an approximation of a hessian, used for unconstrained optimization:

  B' = B + (q - Bp)(q - Bp)' / p'(q - Bp),

where B' is the update from B. The primes on the right denote transposes. Taking the inverse to compare with the DFP update,

  H' = H + (1 + q'Hq/p'q) pp'/p'q - (pq'H + Hqp')/q'p,

where H' is the update from H, and

  p = change in position = x^{k+1} - x^k (originally = -aH grad_f, where a is a positive scalar determined by line search);
  q = change in gradient = grad_f(x^{k+1}) - grad_f(x^k).

(Note: pp', pq'H and Hqp' are matrices, and all denominators are inner products, presumed to be nonzero.)

Bundle method, bundle algorithm. A class of nonsmooth-optimization methods for the numerical minimization of a convex function, θ : R^{n} → R. At the k^{th} iteration the technique uses the available information contained in a bundle (θ(v_{1}), g_{1}, ... , θ(v_{k}), g_{k}) that is obtained from an oracle; here each g_{i} is a subgradient of θ at v_{i}. Important applications are Lagrangian relaxation and column generation applied to a primal problem.
In this context, for a given v ∈ R^{n}, the oracle returns the value θ(v) and a subgradient g of θ at v.
When minimizing θ(v), a basic method to construct the next iterate v_{k+1} uses the cutting-plane paradigm: one minimizes the piecewise-linear function

  θ_{k}(v) = max{θ(v_{i}) + g_{i}'(v − v_{i}) : i = 1, ..., k},

which can be viewed as a model of θ (in fact θ_{k} ≤ θ). Minimizing θ_{k} is a linear program, which is called the restricted master program in the context of column generation. One of the deficiencies of this method is instability, i.e. the sequence {v_{k}} usually oscillates uncontrollably. Several possibilities exist to stabilize the process; one, adopted in the bundle approach, replaces the minimization of θ_{k} by the quadratic program

  Min{θ_{k}(v) + |v − C|²/(2t) : v ∈ R^{n}}.
Here t > 0 is a stability parameter that is usually adjusted at each iteration, and the stability center C is some v_{i} (i ≤ k) taken from the bundle, typically one that gave the best θ-value.

Send questions and comments to icsMPGlossary@mail.informs.org.

Copyright © 1996 – 2014, Mathematical Programming Glossary, by the INFORMS Computing Society