Multiway number partitioning
In computer science, multiway number partitioning is the problem of partitioning a multiset of numbers into a fixed number of subsets, such that the sums of the subsets are as similar as possible. It was first presented by Ronald Graham in 1969 in the context of the Identical-machines scheduling problem.[1]: sec.5 The problem is parametrized by a positive integer k, and called k-way number partitioning.[2] The input to the problem is a multiset S of numbers (usually integers), whose sum is k*T.
The associated decision problem is to decide whether S can be partitioned into k subsets such that the sum of each subset is exactly T. There is also an optimization problem: find a partition of S into k subsets, such that the k sums are "as near as possible". The exact optimization objective can be defined in several ways:
- Minimize the difference between the largest sum and the smallest sum. This objective is common in papers about multiway number partitioning, as well as papers originating from physics applications.[2]
- Minimize the largest sum. This objective is equivalent to one objective for Identical-machines scheduling. There are k identical processors, and each number in S represents the time required to complete a single-processor job. The goal is to partition the jobs among the processors such that the makespan (the finish time of the last job) is minimized.
- Maximize the smallest sum. This objective corresponds to the application of fair item allocation, particularly the maximin share. It also appears in voting manipulation problems,[3] and in sequencing of maintenance actions for modular gas turbine aircraft engines.[4][5] Suppose there are some k engines, which must be kept working for as long as possible. An engine needs a certain critical part in order to operate. There is a set S of parts, each of which has a different lifetime. The goal is to assign the parts to the engines, such that the shortest engine lifetime is as large as possible.
These three objective functions are equivalent when k=2, but they are all different when k≥3.[6][7]
All these problems are NP-hard,[8] but there are various algorithms that solve it efficiently in many cases.
Some closely-related problems are:
- The partition problem - a special case of multiway number partitioning in which the number of subsets is 2.
- The 3-partition problem - a different and harder problem, in which the number of subsets is not considered a fixed parameter, but is determined by the input (the number of sets is the number of integers divided by 3).
- The bin packing problem - a dual problem in which the total sum in each subset is bounded, but k is flexible; the goal is to find a partition with the smallest possible k. The optimization objectives are closely related: the optimal number of d-sized bins is at most k, iff the optimal size of a largest subset in a k-partition is at most d.[9]
- The uniform-machines scheduling problem - a more general problem in which different processors may have different speeds.
Approximation algorithms
There are various algorithms that obtain a guaranteed approximation of the optimal solution in polynomial time. There are different approximation algorithms for different objectives.
Minimizing the largest sum
The approximation ratio in this context is the largest sum in the solution returned by the algorithm, divided by the largest sum in the optimal solution (the ratio is larger than 1). Most algorithms below were developed for identical-machines scheduling.
- Greedy number partitioning (also called the Largest Processing Time in the scheduling literature) loops over the numbers, and puts each number in the set whose current sum is smallest. If the numbers are not sorted, then the runtime is and the approximation ratio is at most . Sorting the numbers increases the runtime to and improves the approximation ratio to 7/6 when k=2, and in general. If the numbers are distributed uniformly in [0,1], then the approximation ratio is at most almost surely , and in expectation.
- Largest Differencing Method (also called the Karmarkar-Karp algorithm ) sorts the numbers in descending order and repeatedly replaces numbers by their differences. The runtime complexity is . In the worst case, its approximation ratio is similar - at most 7/6 for k =2, and at most in general. However, in the average case it performs much better than the greedy algorithm: for k =2, when numbers are distributed uniformly in [0,1], its approximation ratio is at most in expectation. It also performs better in simulation experiments.
- The Multifit algorithm uses binary search combined with an algorithm for bin packing . In the worst case, its makespan is at most 8/7 for k =2, and at most 13/11 in general.
Several polynomial-time approximation schemes (PTAS) have been developed:
- Graham[1]: sec.6 presented the following algorithm. For any integer r>0, choose the r largest numbers in S and partition them optimally. Then allocate the remaining numbers arbitrarily. This algorithm has approximation ratio and it runs in time .
- Sahni[10] presented a PTAS that attains (1+ε)OPT in time . It is an FPTAS if k is fixed. For k=2, the run-time improves to . The algorithm uses a technique called interval partitioning.
- Hochbaum and Shmoys[9] presented the following algorithms, which work even when k is part of the input.
- For any r >0, an algorithm with approximation ratio at most (6/5+2−r ) in time .
- For any r >0, an algorithm with approximation ratio at most (7/6+2−r ) in time .
- For any ε>0, an algorithm with approximation ratio at most (1+ε) in time . This is a PTAS.
Maximizing the smallest sum
The approximation ratio in this context is the smallest sum in the solution returned by the algorithm, divided by the smallest sum in the optimal solution (the ratio is less than 1).
- For greedy number partitioning , if the numbers are not sorted then the worst-case approximation ratio is 1/k.[11] Sorting the numbers increases the approximation ratio to 5/6 when k=2, and in general, and it is tight.[12]
- Woeginger[11] presented a PTAS that attains an approximation factor of in time , where a huge constant that is exponential in the required approximation factor ε. The algorithm uses Lenstra's algorithm for integer linear programming.
- The FPTAS of Sahni[10] works for this objective too.
Other objective functions
Let si (for i between 1 and k) be the sum of subset i in a given partition. Instead of minimizing the objective function max(si), one can minimize the objective function max(f(si)), where f is any fixed function. Similarly, one can minimize the objective function sum(f(si)), or maximize min(f(si)), or maximize sum(f(si)). Alon, Azar, Woeginger and Yadid[13] presented a general PTAS, that works for any f which satisfies a certain continuity condition, and is convex (for the minimization problems) or concave (for the maximization problems). Their algorithm generalizes the PTAS-s of Sanhi, Hochbaum and Shmoys, and Woeginger. The runtime of their PTAS is linear in n, but exponential in the approximation precision. They show that, for some functions f that do not satisfy these conditions, no PTAS exists unless P=NP.
Cardinality constraints
In the balanced number partitioning problem, the cardinality of each subset must be either floor(n/k) or ceiling(n/k), i.e., the cardinality of all subsets must be the same up to 1. This is important, for example, in manufacturing of VLSI chips, and in assigning tools to machines in flexible manufacturing systems.[14]
- Frenk and Rinnooy Kan[15] prove that the difference between the LPT largest sum and the optimal largest sum is at most almost surely (for uniform or negative exponential distributions), and at most in expectation (for uniform distribution).
- Karmarkar and Karp[16] developed the Largest differencing method (LDM), and proved that, for k=2, its work-difference (the difference between largest and smallest sum) is almost surely when the distribution is smooth.
- Yakir[17] presents another algorithm for balanced partitioning with the same rate of convergence to zero, .
- Lueker[18] presented the Pairing-differencing-method (PDM), a simpler variant of LDM. He proved that, for k=2, its work-difference is .
- Coffman, Frederickson and Lueker[19] show that a restricted version of the LPT algorithm (called RLPT or RLF), in which inputs are assigned in pairs, attains a work-difference of .
- Tsai[14] presents an algorithm called Restricted Largest Difference (RLD), attaining a work-difference of almost surely when inputs are independent uniform random variables. The algorithm also minimizes the total flow time.
There are variants of multiway number partitioning in which there are cardinality constraints on the sets.
- Kellerer and Woeginger[20] study a variant in which there are at most 3*k items, and each set must contain at most 3 items (a generalization of 3-partition problem). They adapt the LPT algorithm to this setting: their algorithm (called modified-LPT or MLPT) orders the input items from large to small, and puts each item in turn into the bin with the smallest sum among those bins that contain fewer than 3 items. They show that the MLPT algorithm attains at most of the minimum largest sum, which the same approximation ratio that LPT attains for the unconstrained problem. The bound is tight for MLPT. It is conjectured[21] that MLPT has the same approximation ratio for more general cardinality constraints.
- Chen, He and Lin[22] show that, for the same problem, MLPT attains at least of the maximum smallest sum, which is again the same ratio that LPT attains for the unconstrained problem.
- Kellerer and Kotov[23] present a different algorithm (for the case with exactly 3*k items), that attains at most of the minimum largest sum.
- Babel, Kellerer and Kotov[21] study a variant in which there are c*k items (for some integer c), and each of the k sets must contain exactly c items. They prove that the ratio between the unconstrained optimum and the constrained optimum is at most , and it is tight. They also present several heuristic algorithms for approximating the minimum largest sum.
- Folding algorithm: optimal for k=2, and in general has tight approximation ratio .
- Exchange algorithm: tight approximation ratio . It is not known if it runs in polynomial time.
- Primal-dual algorithm (similar to MultiFit): approximation ratio at most . It is not known if this is tight.
Exact algorithms
There are exact algorithms, that always find the optimal partition. Since the problem is NP-hard, such algorithms might take exponential time in general, but may be practically usable in certain cases.
- The pseudopolynomial time number partitioning takes O(n(k − 1)mk − 1) memory, where m is the largest number in the input. It is practical only when k=2, or when k=3 and the inputs are small integers.[24]
- The Complete Greedy Algorithm (CGA) considers all partitions by constructing a k-ary tree. Each level in the tree corresponds to an input number, where the root corresponds to the largest number, the level below to the next-largest number, etc. Each of the k branches corresponds to a different set in which the current number can be put. Traversing the tree in depth-first order requires only O(n) space, but might take O(kn) time. The runtime can be improved by using a greedy heuristic: in each level, develop first the branch in which the current number is put in the set with the smallest sum. This algorithm finds first the solution found by greedy number partitioning, but then proceeds to look for better solutions.
- The Complete Karmarkar-Karp algorithm (CKK) considers all partitions by constructing a tree of degree . Each level corresponds to a pair of k-tuples, and each of the branches corresponds to a different way of combining these k-tuples. This algorithm finds first the solution found by the largest differencing method, but then proceeds to find better solutions. For k =2 and k =3, CKK runs substantially faster than CGA on random instances. The advantage of CKK over CGA is much larger in the latter case (when an equal partition exists), and can be of several orders of magnitude. In practice, with k=2, problems of arbitrary size can be solved by CKK if the numbers have at most 12 significant digit s; with k=3, at most 6 significant digits.[25] CKK can also run as an anytime algorithm : it finds the KK solution first, and then finds progressively better solutions as time allows (possibly requiring exponential time to reach optimality, for the worst instances).[26] For k ≥ 4, CKK becomes much slower, and CGA performs better.
- Recursive Number Partitioning (RNP) uses CKK for k=2, but for k>2 it recursively splits S into subsets and splits k into halves. It performs much better than CKK.[24]
- Korf, Schreiber and Moffitt presented hybrid algorithms, combining CKK, CGA and other methods from the subset sum problem and the bin packing problem to achieve an even better performance.[27][28][29][30]
Reduction to bin packing
The bin packing problem has many fast solvers. A BP solver can be used to find an optimal number partitioning.[31] The idea is to use binary search to find the optimal makespan. To initialize the binary search, we need a lower bound and an upper bound:
- Some lower bounds on the makespan are: (sum S)/k - the average value per subset, s1 - the largest number in S, and sk + sk+1 - the size of a bin in the optimal partition of only the largest k+1 numbers.
- Some upper bounds can be attained by running heuristic algorithms, such as the greedy algorithm or KK.
Given a lower and an upper bound, run the BP solver with bin size middle := (lower+upper)/2.
- If the result contains more than k bins, then the optimal makespan must be larger: set lower to middle and repeat.
- If the result contains at most k bins, then the optimal makespan may be smaller set higher to middle and repeat.
Variants
A recent variant is the multidimensional multiway number partitioning.[32]
Applications
One application of the partition problem is for manipulation of elections. Suppose there are three candidates (A, B and C). A single candidate should be elected using the veto voting rule, i.e., each voter vetos a single candidate and the candidate with the fewest vetos wins. If a coalition wants to ensure that C is elected, they should partition their vetoes among A and B so as to maximize the smallest number of vetoes each of them gets. If the votes are weighted, then the problem can be reduced to the partition problem, and thus it can be solved efficiently using CKK. For k=2, the same is true for any other voting rule that is based on scoring. However, for k>2 and other voting rules, some other techniques are required.[3]
References
- ^ a b Graham, Ron L. (1969-03-01). "Bounds on Multiprocessing Timing Anomalies". SIAM Journal on Applied Mathematics. 17 (2): 416–429. doi:10.1137/0117039. ISSN 0036-1399.
- ^ a b Mertens, Stephan (2006), "The Easiest Hard Problem: Number Partitioning", in Allon Percus; Gabriel Istrate; Cristopher Moore (eds.), Computational complexity and statistical physics, Oxford University Press US, p. 125, arXiv:cond-mat/0310317, Bibcode:2003cond.mat.10317M, ISBN 978-0-19-517737-4
- ^ a b Walsh, Toby (2009-07-11). "Where Are the Really Hard Manipulation Problems? The Phase Transition in Manipulating the Veto Rule" (PDF). Written at Pasadena, California, USA. Proceedings of the Twenty-First International Joint Conference on Artificial Intelligence. San Francisco, California, USA: Morgan Kaufmann Publishers Inc. pp. 324–329. Archived (PDF) from the original on 2020-07-10. Retrieved 2021-10-05.
- ^ Friesen, D. K.; Deuermeyer, B. L. (1981-02-01). "Analysis of Greedy Solutions for a Replacement Part Sequencing Problem". Mathematics of Operations Research. 6 (1): 74–87. doi:10.1287/moor.6.1.74. ISSN 0364-765X.
- ^ Deuermeyer, Bryan L.; Friesen, Donald K.; Langston, Michael A. (1982-06-01). "Scheduling to Maximize the Minimum Processor Finish Time in a Multiprocessor System". SIAM Journal on Algebraic and Discrete Methods. 3 (2): 190–196. doi:10.1137/0603019. ISSN 0196-5212.
- ^ Korf, Richard Earl (2010-08-25). "Objective Functions for Multi-Way Number Partitioning". Third Annual Symposium on Combinatorial Search.
- ^ Walter, Rico (2013-01-01). "Comparing the minimum completion times of two longest-first scheduling-heuristics". Central European Journal of Operations Research. 21 (1): 125–139. doi:10.1007/s10100-011-0217-4. ISSN 1613-9178.
- ^ Garey, Michael R.; Johnson, David S. (1979). Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman and Company. p. 238. ISBN 978-0716710448.
- ^ a b Hochbaum, Dorit S.; Shmoys, David B. (1987-01-01). "Using dual approximation algorithms for scheduling problems theoretical and practical results". Journal of the ACM. 34 (1): 144–162. doi:10.1145/7531.7535. ISSN 0004-5411. S2CID 9739129.
- ^ a b Sahni, Sartaj K. (1976-01-01). "Algorithms for Scheduling Independent Tasks". Journal of the ACM. 23 (1): 116–127. doi:10.1145/321921.321934. ISSN 0004-5411.
- ^ a b Woeginger, Gerhard J. (1997-05-01). "A polynomial-time approximation scheme for maximizing the minimum machine completion time". Operations Research Letters. 20 (4): 149–154. doi:10.1016/S0167-6377(96)00055-7. ISSN 0167-6377.
- ^ Csirik, János; Kellerer, Hans; Woeginger, Gerhard (1992-06-01). "The exact LPT-bound for maximizing the minimum completion time". Operations Research Letters. 11 (5): 281–287. doi:10.1016/0167-6377(92)90004-M. ISSN 0167-6377.
- ^ Alon, Noga; Azar, Yossi; Woeginger, Gerhard J.; Yadid, Tal (1998). "Approximation schemes for scheduling on parallel machines". Journal of Scheduling. 1 (1): 55–66. doi:10.1002/(SICI)1099-1425(199806)1:1<55::AID-JOS2>3.0.CO;2-J. ISSN 1099-1425.
- ^ a b Tsai, Li-Hui (1992-02-01). "Asymptotic Analysis of an Algorithm for Balanced Parallel Processor Scheduling". SIAM Journal on Computing. 21 (1): 59–64. doi:10.1137/0221007. ISSN 0097-5397.
- ^ Frenk, J. B. G.; Rinnooy Kan, A. H. G. (1987-05-01). "The Asymptotic Optimality of the LPT Rule". Mathematics of Operations Research. 12 (2): 241–254. doi:10.1287/moor.12.2.241. ISSN 0364-765X.
- ^ Narendra Karmarkar and Richard M. Karp, "The differencing method of set partitioning", Tech report UCB/CSD 82/113, Computer science division, University of California, Berkeley, 1982
- ^ Yakir, Benjamin (1996-02-01). "The Differencing Algorithm LDM for Partitioning: A Proof of a Conjecture of Karmarkar and Karp". Mathematics of Operations Research. 21 (1): 85–99. doi:10.1287/moor.21.1.85. ISSN 0364-765X.
- ^ Lueker, George S (1987-12-01). "A note on the average-case behavior of a simple differencing method for partitioning". Operations Research Letters. 6 (6): 285–287. doi:10.1016/0167-6377(87)90044-7. ISSN 0167-6377.
- ^ Coffman, E. G.; Frederickson, G. N.; Lueker, G. S. (1984-05-01). "A Note on Expected Makespans for Largest-First Sequences of Independent Tasks on Two Processors". Mathematics of Operations Research. 9 (2): 260–266. doi:10.1287/moor.9.2.260. ISSN 0364-765X.
- ^ Kellerer, Hans; Woeginger, Gerhard (1993-09-07). "A tight bound for 3-partitioning". Discrete Applied Mathematics. 45 (3): 249–259. doi:10.1016/0166-218X(93)90013-E. ISSN 0166-218X.
- ^ a b Babel, Luitpold; Kellerer, Hans; Kotov, Vladimir (1998-02-01). "The k-partitioning problem". Mathematical Methods of Operations Research. 47 (1): 59–82. doi:10.1007/BF01193837. ISSN 1432-5217.
- ^ Chen, Shi Ping; He, Yong; Lin, Guohui (2002-03-01). "3-Partitioning Problems for Maximizing the Minimum Load". Journal of Combinatorial Optimization. 6 (1): 67–80. doi:10.1023/A:1013370208101. ISSN 1573-2886.
- ^ Kellerer, Hans; Kotov, Vladimir (1999-02-01). "A 7/6–Approximation Algorithm For 3-Partitioning And Its Application To Multiprocessor Scheduling". INFOR: Information Systems and Operational Research. 37 (1): 48–56. doi:10.1080/03155986.1999.11732368. ISSN 0315-5986.
- ^ a b Korf, Richard E. (2009). Multi-Way Number Partitioning (PDF). IJCAI.
- ^ Korf, Richard E. (1995-08-20). "From approximate to optimal solutions: a case study of number partitioning". Proceedings of the 14th International Joint Conference on Artificial Intelligence - Volume 1. IJCAI'95. Montreal, Quebec, Canada: Morgan Kaufmann Publishers Inc.: 266–272. ISBN 978-1-55860-363-9.
- ^ Korf, Richard E. (1998-12-01). "A complete anytime algorithm for number partitioning". Artificial Intelligence. 106 (2): 181–203. doi:10.1016/S0004-3702(98)00086-1. ISSN 0004-3702.
- ^ Korf, Richard E. (2011-07-16). "A hybrid recursive multi-way number partitioning algorithm". Proceedings of the Twenty-Second International Joint Conference on Artificial Intelligence - Volume Volume One. IJCAI'11. Barcelona, Catalonia, Spain: AAAI Press: 591–596. ISBN 978-1-57735-513-7.
- ^ Schreiber, Ethan L.; Korf, Richard E. (2014-07-27). "Cached iterative weakening for optimal multi-way number partitioning". Proceedings of the Twenty-Eighth AAAI Conference on Artificial Intelligence. AAAI'14. Québec City, Québec, Canada: AAAI Press: 2738–2744.
- ^ Richard E. Korf, Ethan L. Schreiber, and Michael D. Moffitt (2014). "Optimal Sequential Multi-Way Number Partitioning" (PDF).
{{cite web}}
: CS1 maint: multiple names: authors list (link) - ^ Schreiber, Ethan L.; Korf, Richard E.; Moffitt, Michael D. (2018-07-25). "Optimal Multi-Way Number Partitioning". Journal of the ACM. 65 (4): 24:1–24:61. doi:10.1145/3184400. ISSN 0004-5411. S2CID 63854223.
- ^ Schreiber, Ethan L.; Korf, Richard E. (2013-08-03). "Improved bin completion for optimal bin packing and number partitioning". Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence. IJCAI '13. Beijing, China: AAAI Press: 651–658. ISBN 978-1-57735-633-2.
- ^ Pop, Petrică C.; Matei, Oliviu (2013-11-01). "A memetic algorithm approach for solving the multidimensional multi-way number partitioning problem". Applied Mathematical Modelling. 37 (22): 9191–9202. doi:10.1016/j.apm.2013.03.075. ISSN 0307-904X.