×

A case for on-machine load balancing. (English) Zbl 1219.68078

Summary: This paper diverges from the traditional load balancing, and introduces a new principle called the on-machine load balance rule. The on-machine load balance rule leads to resource allocations that are better in tolerating uncertainties in the processing times of the tasks allocated to the resources when compared to other resource allocations that are derived using the conventional “across-the-machines” load balancing rule. The on-machine load balance rule calls for the resource allocation algorithms to allocate similarly sized tasks on a machine (in addition to optimizing some primary performance measures such as estimated makespan and average response time). The on-machine load balance rule is very different from the usual across-the-machines load balance rule that strives to balance load across resources so that all resources have similar finishing times.
We give a mathematical justification for the on-machine load balance rule requiring only liberal assumptions about task processing times. Then we validate with extensive simulations that the resource allocations derived using on-machine load balance rule are indeed more tolerant of uncertain task processing times.

MSC:

68M99 Computer system organization
68M20 Performance evaluation, queueing, and scheduling in the context of computer systems
Full Text: DOI

References:

[1] Agrawala, A.; Jr., E. Coffman; Garey, M.; Tripathi, S.: A stochastic optimization algorithm minimizing expected flow times on uniform processors, IEEE transactions on computers 33, 351-356 (1984) · Zbl 0528.68022 · doi:10.1109/TC.1984.1676440
[2] Ahmad, I.; Ghafoor, A.: Semi-distributed load balancing for massively parallel multicomputer systems, IEEE transactions on software engineering 17, No. 10, 987-1004 (1991)
[3] Ali, S.; Braun, T. D.; Siegel, H. J.; Beck, N.; Bölöni, L.; Maheswaran, M.; Reuther, A. I.; Robertson, J. P.; Theys, M. D.; Yao, B.: Characterizing resource allocation heuristics for heterogeneous computing systems, Advances in computers 63, 91-128 (2005)
[4] Ali, S.; Maciejewski, A. A.; Siegel, H. J.; Kim, J. -K.: Measuring the robustness of a resource allocation, IEEE transactions on parallel and distributed systems 15, No. 7, 630-641 (2004)
[5] I. Banicescu, V. Velusamy, Performance of scheduling scientific applications with adaptive weighted factoring, in: 10th IEEE Heterogeneous Computing Workshop, HCW 2001, in the Proceedings of the 15th International Parallel and Distributed Processing Symposium, IPDPS 2001, April 2001.
[6] H. Barada, S.M. Sait, N. Baig, Task matching and scheduling in heterogeneous systems using simulated evolution, in: 10th IEEE Heterogeneous Computing Workshop, HCW 2001, in the Proceedings of the 15th International Parallel and Distributed Processing Symposium, IPDPS 2001, April 2001.
[7] L. Bölöni, D.C. Marinescu, On the robustness of metaprogram schedules, in: 8th IEEE Heterogeneous Computing Workshop, HCW’99, April 1999, pp. 146–155.
[8] Bruno, J.; Downey, P.; Fredrickson, G.: Sequencing tasks with exponential service times to minimize the expected flow time or makespan, Journal of the ACM 28, 100-113 (1981) · Zbl 0454.68016 · doi:10.1145/322234.322242
[9] Chang, C. -S.; Nelson, R.; Pinedo, M.: Scheduling two classes of exponential jobs on parallel processors: structural results and worst case analysis, Journal of applied probability (1991) · Zbl 0741.90032
[10] Chang, C. -S.; Righter, R.: The optimality of LEPT in parallel machine scheduling, Journal of applied probability 31, 788-796 (1994) · Zbl 0833.90067 · doi:10.2307/3215156
[11] Cybenko, G.: Dynamic load balancing for distributed memory multiprocessors, Journal of parallel and distributed computing 7, No. 2, 279-301 (1989)
[12] Daniels, R. L.; Carrillo, J. E.: {\(\beta\)}-robust scheduling for single-machine systems with uncertain processing times, IIE transactions 29, No. 11, 977-985 (1997)
[13] , Heterogeneous computing (1996)
[14] B. Eslamnour, S. Ali, A probabilistic approach to measuring robustness in computing systems, in: 21st International Parallel and Distributed Processing Symposium, IPDPS 2007, 2007.
[15] Eslamnour, B.; Ali, S.: Measuring robustness of computing systems, Simulation modelling practice and theory (2009)
[16] S. Ghosh, Guaranteeing fault tolerance through scheduling in real-time systems, Ph.D. Thesis, Faculty of Arts and Sciences, Univ. of Pittsburgh, 1996.
[17] A. Goel, P. Indyk, Stochastic load balancing and related problems, in: 40th Annual Symposium on Foundations of Computer Science, 1999, pp. 579–586.
[18] S.D. Gribble, Robustness in complex systems, in: 8th Workshop on Hot Topics in Operating Systems, HotOS-VIII, May 2001, pp. 21–26.
[19] Harchol-Balter, M.: Task assignment with unknown duration, Journal of the ACM 49, No. 2, 260-288 (2002) · Zbl 1323.68033
[20] Harchol-Balter, M.; Downey, A. B.: Exploiting process lifetime distributions for dynamic load balancing, ACM transactions on computer systems 15, No. 3, 253-285 (1997)
[21] D.E. Hastings, A.L. Weigel, M.A. Walton, Incorporating uncertainty into conceptual design of space system architectures, in: ESD Internal Symposium, Working Paper Series, ESD-WP-2003-01.01, May 2002.
[22] Hordijk, A.; Koole, G.: On the assignment of customers to parallel queues, Probability in the engineering and informational sciences 6, 495-512 (1992) · Zbl 1134.90341 · doi:10.1017/S0269964800002692
[23] Huang, C.; Weiss, G.: Preemptive scheduling of stochastic jobs with a two stage processing time distribution, Probability in the engineering and informational sciences 6 (1992) · Zbl 1134.90407 · doi:10.1017/S0269964800002436
[24] Jain, R.: The art of computer systems performance analysis, (1991) · Zbl 0824.68013
[25] E. Jen, Stable or robust? What is the difference? Santa Fe Institute Working Paper No. 02-12-069, 2002.
[26] Jensen, M.: Improving robustness and flexibility of tardiness and total flow time job shops using robustness measures, Journal of applied soft computing 1, No. 1, 35-52 (2001)
[27] Kämpke, T.: Optimal scheduling of jobs with exponential service times on identical parallel processors, Operations research 37, 126-133 (1989) · Zbl 0673.90058 · doi:10.1287/opre.37.1.126
[28] Khokhar, A.; Prasanna, V. K.; Shaaban, M.; Wang, C. L.: Heterogeneous computing: challenges and opportunities, IEEE computer 26, No. 6, 18-27 (1993)
[29] Leon, V. J.; Wu, S. D.; Storer, R. H.: Robustness measures and robust scheduling for job shops, IEE transactions 26, No. 5, 32-43 (1994)
[30] Maheswaran, M.; Ali, S.; Siegel, H. J.; Hensgen, D.; Freund, R. F.: Dynamic mapping of a class of independent tasks onto heterogeneous computing systems, Journal of parallel and distributed computing 59, No. 2, 107-131 (1999)
[31] G. Malewicz, Parallel scheduling of complex DAGs under uncertainty, in: 17th Annual ACM Symposium on Parallelism in Algorithms and Architectures, 2005, pp. 66–75.
[32] Michalewicz, Z.; Fogel, D. B.: How to solve it: modern heuristics, (2000) · Zbl 0943.90002
[33] Nino-Mora, J.: Stochastic scheduling, Encyclopedia of optimization, 367-372 (2001)
[34] A.J. Oliner, L. Rudolph, R.K. Sahoo, J.E. Moreira, M. Gupta, Probabilistic QoS guarantees for supercomputing systems, in: The 2005 International Conference on Dependable Systems and Networks, DSN 2005, June–July 2005, pp. 634–643.
[35] Pinedo, M.: Minimizing the expected makespan in stochastic flow shops, Operations research 30, 148-162 (1982) · Zbl 0481.90047 · doi:10.1287/opre.30.1.148
[36] Pinedo, M.; Weiss, G.: The ”largest variance first” policy in some stochastic scheduling problems, Operations research 35, 884-891 (1987) · Zbl 0643.90036 · doi:10.1287/opre.35.6.884
[37] I.A. Rai, G. Urvoy-Keller, E.W. Biersack, Analysis of LAS scheduling for job size distributions with high variance, in: The 2003 ACM SIGMETRICS International Conference on Measurement and Modeling of Computer Systems, 2003, pp. 218–228.
[38] S. Ray, R. Ungrangsi, F.D. Pellegrinin, A. Trachtenberg, D. Starobinski, Robust location detection in emergency sensor networks, in: The 22nd Annual Joint Conference of the IEEE Computer and Communications Societies, INFOCOM 2003, April 2003.
[39] Righter, R.; Xu, S.: Scheduling jobs on heterogeneous processors, Annals of operations research 28, 587-602 (1991) · Zbl 0717.90036 · doi:10.1007/BF02283615
[40] Righter, R.; Xu, S.: Scheduling jobs on nonidentical IFR processors to minimize general cost functions, Advances in applied probability 23, 909-924 (1991) · Zbl 0745.90040 · doi:10.2307/1427683
[41] Schopf, J. M.; Berman, F.: Stochastic scheduling, IEEE supercomputing (1999)
[42] M. Sevaux, K. Sörensen, Genetic algorithm for robust schedules, in: 8th International Workshop on Project Management and Scheduling, PMS 2002, April 2002, pp. 330–333.
[43] M. Skutella, M. Uetz, Scheduling precedence-constrained jobs with stochastic processing times on parallel machines, in: 12th Annual ACM–SIAM Symposium on Discrete Algorithms, SODA 2001, 2001, pp. 589–590. · Zbl 1018.90016
[44] Sotskov, Y. N.; Tanaev, V. S.; Werner, F.: Stability radius of an optimal schedule: a survey and recent developments, Industrial applications of combinatorial optimization, 72-108 (1998) · Zbl 0936.90030
[45] M. Stastny, Mahalanobis distance, 2001. http://www.wu-wien.ac.at/usr/h99c/h9951826/distance.pdf.
[46] Weiss, G.; Pinedo, M.: Scheduling tasks with exponential service times on non-identical processors to minimize various cost functions, Journal of applied probability 17, 187-202 (1980) · Zbl 0427.90051 · doi:10.2307/3212936
[47] Whitt, W.: Deciding which queue to join: some counter examples, Operations research 34, 55-62 (1986) · Zbl 0622.90035 · doi:10.1287/opre.34.1.55
[48] M.-Y. Wu, W. Shu, H. Zhang, Segmented min–min: a static mapping algorithm for meta-tasks on heterogeneous computing systems, in: 9th IEEE Heterogeneous Computing Workshop, HCW 2000, May 2000, pp. 375–385.
[49] Xu, S.: Stochastically minimizing total delay of jobs subject to random deadlines, Probability in the engineering and informational sciences 5, 333-348 (1991) · Zbl 1134.68326
[50] L. Yang, J.M. Schopf, I. Foster, Conservative scheduling: using predicted variance to improve scheduling decisions in dynamic environments, in: The 2003 ACM/IEEE Conference on Supercomputing, 2003, pp. 1–16.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.