Karmarkar–Karp bin packing algorithms
The Karmarkar-Karp (KK) bin packing algorithms are several related approximation algorithms for the bin packing problem.[1] The bin packing problem is a problem of packing items of different sizes into bins of identical capacity, such that the total number of bins is as small as possible. Finding the optimal solution is computationally hard. Karmarkar and Karp devised an algorithm that runs in polynomial time and finds a solution with at most OPT + O(log^2(OPT)) bins, where OPT is the number of bins in the optimal solution. They also devised several other algorithms with slightly different approximation guarantees and run-time bounds.
The KK algorithms were considered a breakthrough in the study of bin packing: the previously-known algorithms found multiplicative approximations, where the number of bins was at most r*OPT + s for some constants r > 1 and s, or at most (1+ε)*OPT + 1.[2] The KK algorithms were the first ones to attain an additive approximation.
Input
The input to a bin-packing problem is a set of items of different sizes, a1,...,an. The following notation is used:
- n - the number of items.
- m - the number of different item sizes. For each i in 1,...,m:
- si is the i-th size;
- ni is the number of items of size si.
- B - the bin size.
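For concreteness, this notation can be represented directly in code. The following minimal sketch (illustrative variable names, not from the paper) extracts m, the distinct sizes si and the counts ni from a raw item list, using the sizes of the worked example given later in the article:

```python
from collections import Counter

# Raw item list and bin size (the worked example used later in the article).
items = [3, 3, 3, 3, 3, 4, 4, 4, 4, 4]
B = 12

counts = Counter(items)              # size -> number of items of that size
sizes = sorted(counts)               # s_1, ..., s_m: the distinct sizes
m = len(sizes)                       # number of different item sizes
n = sum(counts.values())             # total number of items

print(sizes, [counts[s] for s in sizes], m, n)   # [3, 4] [5, 5] 2 10
```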
High-level idea
The KK algorithms essentially solve the configuration linear program:
minimize 1·x subject to A·x ≥ n, x ≥ 0, and x integral, where n denotes the vector (n1,...,nm) of item counts and 1 is the all-ones vector.
Here, A is a matrix with m rows. Each column of A represents a feasible configuration - a multiset of item-sizes, such that the sum of all these sizes is at most B. The set of configurations is C. x is a vector of size C. Each element xc of x represents the number of times configuration c is used.
- Example:[3] suppose the item sizes are 3,3,3,3,3,4,4,4,4,4, and B=12. Then there are C=10 possible configurations: 3333; 333; 33, 334; 3, 34, 344; 4, 44, 444. The matrix A has two rows: [4,3,2,2,1,1,1,0,0,0] for s=3 and [0,0,0,1,0,1,2,1,2,3] for s=4. The vector n is [5,5] since there are 5 items of each size. A possible optimal solution is x=[1,0,0,0,0,0,1,0,0,1], corresponding to using three bins with configurations 3333, 344, 444.
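The configuration matrix for such a small example can be generated mechanically. The sketch below (illustrative code, not part of the KK algorithm) enumerates every feasible configuration as a vector of per-size counts and assembles A:

```python
from itertools import product

sizes = [3, 4]            # the distinct item sizes of the example
counts = [5, 5]           # n_i: number of available items of each size
B = 12                    # bin capacity

# A feasible configuration is a count-vector (c_1, ..., c_m) with
# sum_i c_i * s_i <= B; counts are also capped by the available items.
ranges = [range(min(counts[i], B // sizes[i]) + 1) for i in range(len(sizes))]
configs = [c for c in product(*ranges)
           if 0 < sum(ci * si for ci, si in zip(c, sizes)) <= B]

# A has one row per distinct size; its columns are the configurations.
A = [[c[i] for c in configs] for i in range(len(sizes))]
print(len(configs))       # 10 feasible configurations for this instance
for row in A:
    print(row)
```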
There are two main difficulties in solving this problem. First, it is an integer linear program, which is computationally hard to solve. Second, the number of variables is C - the number of configurations, which may be enormous. The KK algorithms cope with these difficulties using several techniques, some of which were already introduced by de la Vega and Lueker.[2] Here is a high-level description of the algorithm (where I is the original instance):
- 1-a. Let J be an instance constructed from I by removing small items.
- 2-a. Let K be an instance constructed from J by grouping items and rounding the size of items in each group to the highest item in the group.
- 3-a. Construct the configuration linear program for K, without the integrality constraints.
- 4. Compute a (fractional) solution x for the relaxed linear program.
- 3-b. Round x to an integral solution for K.
- 2-b. "Un-group" the items to get a solution for J.
- 1-b. Add the small items to get a solution for I.
Below, we describe each of these steps in turn.
Step 1. Removing and adding small items
The motivation for removing small items is that, when all items are large, the number of items in each bin must be small, so the number of possible configurations is (relatively) small. We pick some constant g in (0,1), and remove from the original instance all items smaller than g*B. Let J be the resulting instance. Note that in J, each bin can contain at most 1/g items. We pack J and get a packing with some b_J bins.
Now, we add the small items into the existing bins in an arbitrary order, as long as there is room. When there is no more room in the existing bins, we open a new bin (as in next-fit bin packing). Let b_I be the number of bins in the final packing. Then:
b_I ≤ max(b_J, (1+2g)*OPT(I) + 1).
Proof. If no new bins are opened, then the number of bins remains b_J. If a new bin is opened, then all bins except maybe the last one contain a total size of at least (1-g)*B, so the total instance size is at least (1-g)*B*(b_I - 1). So the optimal solution needs at least (1-g)*(b_I - 1) bins, that is, OPT(I) ≥ (1-g)*(b_I - 1). Therefore, b_I ≤ OPT(I)/(1-g) + 1 ≤ (1+2g)*OPT(I) + 1 (using 1/(1-g) ≤ 1+2g for g ≤ 1/2).
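A minimal sketch of this re-insertion step (the helper name and the representation of bins as lists of item sizes are illustrative assumptions):

```python
def add_small_items(bins, small_items, B):
    """Insert small items into the existing bins as long as there is room,
    opening a new bin only when no existing bin can hold the item."""
    bins = [list(b) for b in bins]           # copy the packing of the large items
    for item in small_items:
        for b in bins:
            if sum(b) + item <= B:           # room in an existing bin
                b.append(item)
                break
        else:                                # no room anywhere: open a new bin
            bins.append([item])
    return bins

# Large items already packed into two bins; four small items of size 1 are added.
print(add_small_items([[7, 4], [6, 5]], [1, 1, 1, 1], B=12))
# [[7, 4, 1], [6, 5, 1], [1, 1]]
```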
Step 2. Grouping and un-grouping items
The motivation for grouping items is to reduce the number of different item sizes, which in turn reduces the number of constraints in the configuration LP. The general grouping process is:
- Order the items by descending size.
- Partition the items into groups.
- For each group, modify the size of all items in the group to the largest size in the group.
There are several different grouping methods.
Linear grouping
Let k ≥ 1 be an integer parameter. Put the k largest items in group 1; the next-largest k items in group 2; and so on (the last group might have fewer than k items). Let J be the original instance. Let K' be the first group (the group of the k largest items), and K the grouped instance without the first group. Then:
- OPT(K) ≤ OPT(J) - since group 1 in J dominates group 2 in K (all k items in group 1 are at least as large as the k items in group 2); similarly, group 2 in J dominates group 3 in K, etc.
- OPT(K') ≤ k - since it is possible to pack each item in K' into a single bin.
Therefore, OPT(J) ≤ OPT(K) + k; given a solution to K with b_K bins, we can get a solution to J with at most b_K + k bins.
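A minimal sketch of linear grouping over an explicit item list (the function name and the item-level representation are illustrative assumptions, not from the paper):

```python
def linear_grouping(items, k):
    """Set aside the k largest items (K'), and round every other item up to
    the largest size in its group of k (producing the instance K)."""
    items = sorted(items, reverse=True)       # descending size
    first_group = items[:k]                   # K': later packed one item per bin
    rounded = []
    for start in range(k, len(items), k):
        group = items[start:start + k]
        rounded += [group[0]] * len(group)    # group[0] is the group's largest size
    return first_group, rounded

print(linear_grouping([9, 8, 7, 7, 6, 5, 5, 4, 3, 2], k=3))
# ([9, 8, 7], [7, 7, 7, 5, 5, 5, 2])
```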
Geometric grouping
Let k ≥ 1 be an integer parameter. Geometric grouping proceeds in two steps:
- Partition the instance J into several instances J_0, J_1, ..., such that, in each instance J_r, all sizes are in the interval [B/2^(r+1), B/2^r). Note that, if all items in J have size at least g*B, then the number of instances is at most log2(1/g), rounded up.
- On each instance J_r, perform linear grouping with parameter k*2^r. Let K_r, K'_r be the resulting instances.
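A sketch of geometric grouping built on the linear grouping above (the size-class boundaries and the per-class parameter k*2^r follow the description above; all names are illustrative, and linear_grouping is repeated so the snippet is self-contained):

```python
def linear_grouping(items, k):
    """As in the previous sketch: set aside the k largest items and round
    the rest up by groups of k."""
    items = sorted(items, reverse=True)
    first_group, rounded = items[:k], []
    for start in range(k, len(items), k):
        group = items[start:start + k]
        rounded += [group[0]] * len(group)
    return first_group, rounded

def geometric_grouping(items, B, k):
    """Split the items into size classes [B/2^(r+1), B/2^r) and apply
    linear grouping with parameter k * 2^r within class r."""
    classes = {}
    for size in items:
        r = 0
        while size < B / 2 ** (r + 1):        # find r with B/2^(r+1) <= size < B/2^r
            r += 1
        classes.setdefault(r, []).append(size)
    return {r: linear_grouping(class_items, k * 2 ** r)
            for r, class_items in sorted(classes.items())}

print(geometric_grouping([9, 8, 7, 5, 5, 4, 2, 2], B=12, k=1))
# {0: ([9], [8, 7]), 1: ([5, 5], [4]), 2: ([2, 2], [])}
```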
Alternative geometric grouping
TODO
Step 3. Constructing and rounding the LP
The main tool used by the KK algorithms is the fractional configuration linear program:
minimize 1·x subject to A·x ≥ n and x ≥ 0.
This is the same configuration LP as above, but without the integrality constraint on x. If x is integral then the solution to this problem is exactly OPT. Since x is allowed to be fractional, the solution might be smaller; denote it by LOPT. Moreover, let FOPT = (a1+...+an)/B be the theoretically-optimal number of bins, when all bins are completely filled with items or item fractions. The following relations are obvious:
- FOPT(I) ≤ LOPT(I), since FOPT(I) is the (possibly fractional) number of bins when all bins are completely filled with items or fractions of items. Clearly, no solution can be more efficient.
- LOPT(I) ≤ OPT(I), since LOPT(I) is a solution to a minimization problem with fewer constraints.
- OPT(I) < 2*FOPT(I), since in any packing with at least 2*FOPT(I) bins, the sum of the two least-full bins is at most B, so they can be combined into a single bin.
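The third relation corresponds to a simple repair step: as long as a packing uses at least 2*FOPT(I) bins, the two least-full bins together hold at most B and can be merged. A small illustrative sketch of this merging (it keeps merging while the two lightest bins fit together, which is even more than the argument needs):

```python
def merge_light_bins(bins, B):
    """Repeatedly merge the two least-full bins while their contents fit
    together in one bin of capacity B."""
    bins = [list(b) for b in bins]
    while len(bins) >= 2:
        bins.sort(key=sum)                     # lightest bins first
        if sum(bins[0]) + sum(bins[1]) <= B:
            bins[1] += bins[0]                 # pour the lightest bin into the next one
            bins.pop(0)
        else:
            break
    return bins

print(merge_light_bins([[2], [3], [4], [10], [11]], B=12))
# [[3, 2, 4], [10], [11]]
```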
Rounding the fractional LP
Given an optimal solution to the fractional LP, it can be rounded into a solution for the integral ILP, proving that OPT(I) ≤ LOPT(I) + m/2:
- Let x be an optimal basic feasible solution of the fractional LP. By definition, the value of x is LOPT(I). Since the fractional LP has m constraints, x has at most m nonzero variables, that is, at most m different configurations are used. We construct from x an integral packing consisting of a principal part and a residual part.
- The principal part contains floor(xc) bins of each configuration c for which xc > 0.
- For the residual part (denoted by R), we construct two candidate packings:
- A single bin of each configuration c for which xc > 0; all in all, at most m bins are needed.
- A greedy packing, with fewer than 2*FOPT(R) bins (since if there are at least 2*FOPT(R) bins, the two smallest ones can be combined).
- The smallest of these packings requires min(m, 2*FOPT(R)) ≤ average(m, 2*FOPT(R)) = FOPT(R) + m/2 bins.
- Adding to this the rounded-down bins of the principal part yields at most LOPT(I) + m/2 bins, since the number of principal bins plus FOPT(R) is at most the total (fractional) value of x, which is LOPT(I).
- The execution time of this conversion algorithm is O(n log n).
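The rounding scheme can be sketched as follows (illustrative: configurations are tuples of per-size counts, the residual demand is computed explicitly, and the second candidate packing uses plain first-fit; none of these names come from the paper):

```python
import math

def round_fractional_solution(configs, x, sizes, demands, B):
    """Principal part: floor(x_c) bins of each configuration c.
    Residual part: the demand not covered by the principal part, packed either
    with one bin per configuration having a fractional part, or greedily
    (first-fit) -- whichever needs fewer bins.  Returns the total bin count."""
    m = len(sizes)
    principal_bins = sum(int(math.floor(xc)) for xc in x)

    # Demand left uncovered by the principal part (the residual instance R).
    covered = [sum(int(math.floor(xc)) * c[i] for c, xc in zip(configs, x))
               for i in range(m)]
    residual_items = []
    for i in range(m):
        residual_items += [sizes[i]] * max(0, demands[i] - covered[i])

    # Candidate 1: one bin per configuration with a nonzero fractional part.
    candidate1 = sum(1 for xc in x if xc - math.floor(xc) > 1e-9)
    # Candidate 2: greedy first-fit packing of the residual items.
    greedy_bins = []
    for item in sorted(residual_items, reverse=True):
        for b in greedy_bins:
            if sum(b) + item <= B:
                b.append(item)
                break
        else:
            greedy_bins.append([item])

    return principal_bins + min(candidate1, len(greedy_bins))

configs = [(4, 0), (1, 2), (0, 3)]      # 3+3+3+3, 3+4+4, 4+4+4 with B = 12
x = [1.2, 0.2, 1.6]                     # a hypothetical feasible fractional solution
print(round_fractional_solution(configs, x, sizes=[3, 4], demands=[5, 5], B=12))  # 3
```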
Step 4. Solving the fractional LP
The dual LP
The dual linear program of the fractional LP is:
maximize n·y subject to A^T·y ≤ 1 and y ≥ 0, where n = (n1,...,nm) is the vector of item counts and 1 is the all-ones vector.
It has m variables y1,...,ym, and C constraints - one for each configuration. It has the following economic interpretation. For each size s, we should determine a nonnegative price ys. Our profit is the total price of all items. We want to maximize the profit n·y subject to the constraints that the total price of the items in each configuration is at most 1.
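For the small worked example, the dual LP can be written out and solved directly; the sketch below uses numpy and scipy.optimize.linprog (assumed standard libraries; the KK algorithm itself never writes the LP out explicitly, precisely because the number of configurations is too large):

```python
import numpy as np
from scipy.optimize import linprog

# Columns: the 10 feasible configurations of the example (sizes 3 and 4, B = 12).
A = np.array([[4, 3, 2, 2, 1, 1, 1, 0, 0, 0],     # items of size 3 per configuration
              [0, 0, 0, 1, 0, 1, 2, 1, 2, 3]])    # items of size 4 per configuration
n_vec = np.array([5, 5])                           # number of items of each size

# Dual: maximize n·y subject to A^T·y <= 1, y >= 0.
# linprog minimizes, so the objective is negated; the default bounds give y >= 0.
res = linprog(c=-n_vec, A_ub=A.T, b_ub=np.ones(A.shape[1]), method="highs")
print(res.x, -res.fun)   # prices ~[0.25, 0.333] and dual value 35/12 ~ 2.917
```

By LP duality the optimal dual value equals LOPT(I); for this instance it coincides with FOPT(I) = 35/12, since the bins can be filled exactly using the configurations 3333 and 444.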
Solving the fractional LP
A linear program with no integrality constraints can be solved in time polynomial in the number of variables and constraints. The problem is that the number of variables in the fractional configuration LP is equal to the number of possible configurations, which might be huge. Karmarkar and Karp present an algorithm that, for any tolerance factor h, finds a basic feasible solution of cost at most LOPT(I) + h, and runs in time polynomial in m, n, 1/h and 1/e, where m is the number of different sizes, n is the number of items, and eB is the size of the smallest item (e is a fraction in (0,1)). In particular, if e ≥ 1/n and h = 1, the algorithm finds a solution with at most LOPT+1 bins in time polynomial in n.
A randomized variant of this algorithm attains a smaller expected run-time.
Their algorithm uses the ellipsoid method with an approximate separation oracle for the dual LP: the separation problem - given prices y, decide whether some feasible configuration has total price more than 1 - is a knapsack problem, which they solve approximately.
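A sketch of an exact separation oracle for the dual LP, written as an unbounded-knapsack dynamic program over the bin capacity (illustrative and assuming integer item sizes; the KK algorithm only needs an approximate version of this oracle):

```python
def separation_oracle(sizes, y, B):
    """Search for a violated dual constraint: a multiset of item sizes with
    total size <= B whose total price exceeds 1.  Dynamic program over the
    capacity (unbounded knapsack); assumes integer sizes.  Returns the
    violating configuration and its price, or None if y is dual-feasible."""
    m = len(sizes)
    # best[cap] = (value, counts) of the most valuable configuration of size <= cap
    best = [(0.0, [0] * m)]
    for cap in range(1, B + 1):
        value, counts = best[cap - 1]                 # option: leave the extra capacity unused
        for i, s in enumerate(sizes):
            if s <= cap and best[cap - s][0] + y[i] > value:
                value = best[cap - s][0] + y[i]
                counts = best[cap - s][1].copy()
                counts[i] += 1                        # option: add one item of size s
        best.append((value, counts))
    value, counts = best[B]
    return (counts, value) if value > 1 + 1e-9 else None

# Prices that over-value size-4 items: the configuration 4+4+4 is violated.
print(separation_oracle([3, 4], y=[0.25, 0.4], B=12))   # ([0, 3], 1.2...)
```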
Guarantees
Karmarkar and Karp presented four different algorithms. The run-time of all these algorithms depends on a function T, which is a polynomial function describing the time it takes to solve the (fractional) configuration linear program. The algorithms attain the following guarantees:
- At most bins, with run-time in .
- At most bins, with run-time in .
- At most bins, with run-time in , where is a constant.
- At most bins, with run-time in , where is a constant.
Improvements
The KK techniques were improved later, to provide even better approximations: an algorithm by Rothvoss[4] uses at most OPT + O(log(OPT)*log(log(OPT))) bins, and an algorithm by Hoberg and Rothvoss[5] uses at most OPT + O(log(OPT)) bins.
References
- ^ Karmarkar, Narendra; Karp, Richard M. (November 1982). "An efficient approximation scheme for the one-dimensional bin-packing problem". 23rd Annual Symposium on Foundations of Computer Science (SFCS 1982): 312–320. doi:10.1109/SFCS.1982.61. S2CID 18583908.
- ^ a b Fernandez de la Vega, W.; Lueker, G. S. (1981). "Bin packing can be solved within 1 + ε in linear time". Combinatorica. 1 (4): 349–355. doi:10.1007/BF02579456. ISSN 1439-6912. S2CID 10519631.
- ^ Claire Mathieu. "Approximation Algorithms Part I, Week 3: bin packing". Coursera.
- ^ Rothvoß, T. (2013-10-01). "Approximating Bin Packing within O(log OPT * Log Log OPT) Bins". 2013 IEEE 54th Annual Symposium on Foundations of Computer Science: 20–29. arXiv:1301.4010. doi:10.1109/FOCS.2013.11. ISBN 978-0-7695-5135-7. S2CID 15905063.
- ^ Hoberg, Rebecca; Rothvoss, Thomas (2017-01-01), "A Logarithmic Additive Integrality Gap for Bin Packing", Proceedings of the 2017 Annual ACM-SIAM Symposium on Discrete Algorithms, Proceedings, Society for Industrial and Applied Mathematics, pp. 2616–2625, doi:10.1137/1.9781611974782.172, ISBN 978-1-61197-478-2, S2CID 1647463, retrieved 2021-02-10