A POLYNOMIAL-TIME ALGORITHM FOR THE DISCRETE FACILITY LOCATION PROBLEM WITH LIMITED DISTANCES AND CAPACITY CONSTRAINTS

The paper addresses a facility location problem with capacity constraints regarding the number of served clients. These constraints are relevant for introducing


INTRODUCTION
Location Analysis is one of the most active fields in Operations Research. It deals with the decision of optimally placing facilities in order to minimize operational costs (Nickel & Puerto, 2005). Although solved in several situations by intuitive methods, facility location decisions usually demand more in-depth studies. Regardless of the type of business in which the company is involved, decisions about location are strategic and belong to the core of any management process. Furthermore, these decisions lead to long-term commitments (Moreira, 2010) due to the high costs involved in installing facilities.
Each location problem depends on the interests of the organization. Thus, some companies prefer to be closer to clients (like supermarkets, hospitals or police stations), while others are attracted by the proximity to raw materials or components (such as cement factories or potteries), or even by places where labor is plentiful, well trained or cheap, depending on the type of business. This type of problem has been the subject of scientific studies for a long time. It traces back to Fermat in the 17th century, who proposed the following problem: for three given points in the plane, find a fourth one such that the sum of its distances to the three given points is minimum. This geometrical problem was solved by Torricelli in 1647 (see more in Wesolowsky, 1993; Smith et al., 2009).
In 1909, Alfred Weber proposed a generalization of this problem. The minisum problem, also known as the Weber problem (cf. Wesolowsky, 1993; Fekete et al., 2005), is a central problem in location theory. It refers to a situation in which there exists a set of demand points and the location of a facility must be chosen such that the total sum of the weighted distances from the points to the facility is minimized. The function to be minimized is presented below:

    min_{y ∈ R²} Σ_{i=1}^{n} w_i ||p_i − y||    (1)

where n is the number of client points p_1, p_2, ..., p_n in R²; the w_i are weights; and ||·|| is the Euclidean norm. The Weber problem is convex with a non-differentiable objective function, since the facility location may coincide with a client location. Real-life applications of this problem are many (e.g. Correa et al., 2004; Johnson et al., 2005; Pelegrín et al., 2006; Beresnev, 2009; Marín, 2011).
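As a concrete reading of (1), the objective for a fixed candidate location can be evaluated in a few lines. The sketch below uses our own naming (`weber_cost`, `clients` are illustrative, not from the paper):

```python
import math

def weber_cost(y, points, weights):
    """Objective (1): weighted sum of Euclidean distances from facility y to all clients."""
    return sum(w * math.dist(y, p) for p, w in zip(points, weights))

# Three unit-weight clients at the corners of a 3-4-5 right triangle (illustrative data).
clients = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
cost_at_origin = weber_cost((0.0, 0.0), clients, [1.0, 1.0, 1.0])  # 0 + 4 + 3
```

Minimizing this non-differentiable convex function over y ∈ R² is the classical Weber problem.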
The choice of the location may influence the relations between the company and its clients. If the client must be physically present in the process, it is unlikely that a location is acceptable when the travel time or distance between the provider and the client is large (Krajewski et al., 2009). In particular, for some emergency services (e.g. ambulances, police calls), the service provided by the facility has no effect after a certain threshold time/distance. For example, a house on fire would be completely destroyed after a given period of time. In the case of a police call, criminals would likely be untraceable after a time limit. This model characteristic can be approached in different ways. Evans et al. (1997) proposed a min-max model to determine the location of these types of emergency units. The model has the objective of locating a facility among a group of candidates such that the distance to the farthest client is minimized. Drezner et al. (1991) proposed a variation of the Weber problem to model location problems in which the service provided by the facility is insensitive after a threshold distance (directly related to a maximum time limit). To illustrate their model, let us use an example provided by Drezner et al.
(1991) to locate a fire station. In this context, each property has a limit distance after which the service provided by the firemen would be useless and the property would be completely destroyed. A certain damage occurs in a property located at p_i, for i = 1, ..., n, at a null distance from the fire station (located at y ∈ R²), increasing linearly up to a distance λ_i, where the damage is 100%. By denoting d(p_i, y) the distance between the point p_i and the facility located at y, and Ω the proportion of damage at zero distance, the proportion of damage at p_i is given by Ω + (1 − Ω) d(p_i, y)/λ_i in the case d(p_i, y) < λ_i, and 1 otherwise. The corresponding facility location problem is then expressed as:

    min_{y ∈ R²} Σ_{i=1}^{n} [Ω + (1 − Ω) min{d(p_i, y), λ_i} / λ_i]    (2)

The first term of the summation is constant and the factor (1 − Ω) is irrelevant to the minimization of the second term. By introducing binary variables v_i that select between d(p_i, y) and λ_i in the summation of the objective function, we obtain the minimization problem defined in (3):

    min_{y, v} Σ_{i=1}^{n} [v_i d(p_i, y) + (1 − v_i) λ_i] / λ_i,   v_i ∈ {0, 1}, i = 1, ..., n    (3)

Fernandes et al. (2011) extended the model of Drezner et al. (1991) by adding capacity constraints to the service provided by the facility. In some real applications, during the operation of a facility, there is a maximum number of clients who can be served without affecting the service quality. In other situations, it is necessary to add to the model a minimum number of users that would justify the existence of the facility.
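A minimal sketch of the damage proportion described above, with our own function name (`damage_proportion` is illustrative), shows the piecewise-linear behaviour up to the limit distance:

```python
def damage_proportion(d, lam, omega):
    """Damage model of Drezner et al. (1991): proportion omega at distance 0,
    growing linearly to 1 at the limit distance lam, and 1 beyond it."""
    if d >= lam:
        return 1.0  # beyond the threshold the property is completely destroyed
    return omega + (1.0 - omega) * d / lam
```

For instance, with Ω = 0.2 and λ_i = 10, a property at distance 5 from the station suffers 60% damage, while any property farther than 10 is a total loss.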
The structure of the paper is as follows. The discrete facility location problem with limited distances and capacity constraints is mathematically formulated in the next section. Then, our solution algorithm is described and a formal proof of its optimality is presented. After that, we report our test results concerning the efficiency and effectiveness of the algorithm. Finally, conclusions are given in the last section.

MATHEMATICAL DEFINITION OF THE PROBLEM
Let us define d(p_i, y) as the distance between point p_i and the facility y located in R². Given n service points p_1, p_2, ..., p_n in the plane, with limited distances λ_i > 0 and weights w_i ≥ 0 for i = 1, ..., n, the discrete limited-distance minisum problem with capacity constraints may be expressed as:

    min_{y, v} Σ_{i=1}^{n} w_i [v_i d(p_i, y) + (1 − v_i) λ_i]    (4)
    subject to:
      LB ≤ Σ_{i=1}^{n} v_i ≤ UB
      d(p_i, y) v_i ≤ λ_i,   i = 1, ..., n
      v_i ∈ {0, 1},   i = 1, ..., n
      y ∈ Y

The first set of constraints imposes bounds LB and UB on the number of variables v_i that can be equal to 1. The second set of constraints assures that v_i can be equal to 1 only if the distance between p_i and the facility located at y is less than or equal to the limit distance λ_i. This prevents setting v_i = 1 merely to satisfy the constraint Σ_i v_i ≥ LB. The objective function of (4) and its feasible set are non-convex, which demands more sophisticated solution methods. In this model, the variable y assumes a value from a discrete finite set of candidate locations Y.
The objective function of (4) may be rewritten by removing its constant terms. Since w_i [v_i d(p_i, y) + (1 − v_i) λ_i] = w_i λ_i + w_i v_i (d(p_i, y) − λ_i) and the term Σ_i w_i λ_i is constant, it is then expressed as:

    min_{y, v} Σ_{i=1}^{n} w_i v_i (d(p_i, y) − λ_i)    (5)

Fernandes et al. (2011) approached the continuous version of the problem. The only difference between the continuous and the discrete version lies in the fact that, in the former, the facility location y is constrained to be in R² instead of within a finite set Y ⊂ R². The authors propose a global optimization algorithm for the problem based on a decomposition that initially selects for evaluation only the subregions of the plane that may contain the optimal solution y*. Next, each subproblem is convexified and solved by convex optimization solvers.
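That (5) differs from (4) only by the constant Σ_i w_i λ_i can be checked numerically. The following sketch (function names and data are ours, for illustration only) evaluates both objectives for a fixed assignment v and fixed distances d:

```python
def cost_full(v, d, lam, w):
    """Objective of (4) for fixed facility distances d: sum_i w_i [v_i d_i + (1 - v_i) lam_i]."""
    return sum(wi * (vi * di + (1 - vi) * li) for vi, di, li, wi in zip(v, d, lam, w))

def cost_reduced(v, d, lam, w):
    """Objective of (5): sum_i w_i v_i (d_i - lam_i)."""
    return sum(wi * vi * (di - li) for vi, di, li, wi in zip(v, d, lam, w))

# Illustrative data (ours): the two objectives differ exactly by the constant sum_i w_i lam_i.
v, d, lam, w = [1, 0, 1], [0.5, 2.0, 0.8], [1.0, 1.0, 1.0], [2.0, 1.0, 3.0]
constant = sum(wi * li for li, wi in zip(lam, w))
```

Hence minimizing (5) and minimizing (4) over the same feasible set yield the same optimal (y, v).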
Discrete location models are very common in real applications, where there is often a finite number of potential places in which the facility can be installed. The p-median problem (cf. Galvão, 1980; Mladenović et al., 2007) is a popular example of a discrete facility location model.

Proposed polynomial-time algorithm
Problem (4) has lower complexity than its continuous version, since the finite set of candidate locations is known beforehand. The pseudo-code of the algorithm proposed here for the problem is presented in Figure 1. In lines 1 to 7, the variables that define a problem instance are created, while lines 8 and 9 create two variables to store the best solution y* and its corresponding cost (BestCost). The main loop in lines 10-24 computes, at each iteration, the facility installation cost for a candidate location and keeps the best alternative. Returning to the proof of Proposition 1, suppose first that |V| < B. Since B = min{|SP|, UB} and |V| < B ≤ |SP|, we can take an element i' ∈ SP with i' ∉ V and set v_i'* = 1. This way, we obtain a better solution than the one previously assumed to be best (a contradiction).

Consider now the case |V| > B. If B = UB, the solution assumed to be optimal is infeasible, since more than UB clients would be served. Otherwise, B = |SP|, and there is an element i' in V that does not belong to SP. This element can be removed from the solution (v_i'* = 0) in order to obtain a better solution than v* (a contradiction).
So |V| = B. It remains to prove that the elements of V correspond to the first B elements of SP. Suppose this is not true: there is an element i1 among the first B elements of SP that is not in V. Without loss of generality, assume w_i(d(p_i, y*) − λ_i) ≠ w_j(d(p_j, y*) − λ_j) for all i, j ∈ I, i ≠ j. Since |V| = B, there is an element i2 in V that is not among the first B elements of SP. Then V is not optimal, because setting v_i1* = 1 and v_i2* = 0 improves the cost of v* (a contradiction).
Hence, the algorithm proposed in Figure 1 finds the optimal solution for the discrete single facility location problem with limited distances and capacity constraints.
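To make the procedure of Figure 1 concrete, here is a compact sketch in Python (our own naming; the paper's pseudo-code line numbers do not map one-to-one onto this sketch): for each candidate location, build SP, discard candidates with |SP| < LB, sort the reduced costs w_i(d(p_i, y) − λ_i), and serve the B = min{|SP|, UB} most profitable clients.

```python
import math

def solve_discrete(points, candidates, lam, w, LB, UB):
    """Sketch of the proposed algorithm for problem (4); O(|Y| * |I| log |I|) overall."""
    # Constant term sum_i w_i * lambda_i of objective (4).
    const = sum(wi * li for wi, li in zip(w, lam))
    best_y, best_cost = None, math.inf
    for y in candidates:
        # SP: clients within their threshold distance of y,
        # stored as reduced costs w_i (d_i - lambda_i) <= 0.
        sp = [wi * (math.dist(p, y) - li)
              for p, li, wi in zip(points, lam, w)
              if math.dist(p, y) <= li]
        if len(sp) < LB:      # lower bound cannot be met at this candidate
            continue
        sp.sort()             # most profitable (most negative) clients first
        b = min(len(sp), UB)  # B = min{|SP|, UB} clients are served
        cost = const + sum(sp[:b])
        if cost < best_cost:
            best_y, best_cost = y, cost
    return best_y, best_cost
```

The dominant cost per candidate is the sort of SP, which matches the O(|Y| · |I| log |I|) bound discussed with Figure 1.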

COMPUTATIONAL TESTS
To evaluate the performance of the proposed algorithm, two sets of experiments were devised. The first one tests the efficiency of the algorithm and its computational limits, while the second tests its accuracy and efficacy in providing an approximation to the continuous version of the problem.
The algorithm was run on an Intel Core 2 Duo E4500 platform with 2.20 GHz and 2 GB of RAM. It was implemented in C++ and compiled with Dev-C++.

First set of experiments: assessing the algorithm's performance
In this set of tests, we detail the performance of the algorithm under different scenarios, which may help the decision maker to assess its application and usability for real-world problems. These tests were designed to investigate the influence on the algorithm's performance of: (i) the threshold distance values; (ii) the lower and upper bounds LB and UB; and (iii) the number of candidate/client points for facility location. The instances are listed in Table 1. The first column shows the number of points in each instance, the second column gives the threshold distance, and the next two columns define the capacity constraint values LB and UB. The last two columns report CPU times (in seconds) and the obtained cost, respectively. Table 1 shows that variations in the threshold distance value directly affect the computing times. For instances with the same number of points and the same lower and upper bounds, increasing the threshold distances is likely to increase the cardinality of SP. Indeed, candidate points may then be within reach of more client points, eventually serving more points within the allowed distance. The most expensive part of the algorithm is line 12, where the set SP is sorted: the bigger the set, the longer it takes to sort. For example, in the instances with 100,000 points and capacity constraint values 2 and 4, the computing time varies from 17,198.27 to 43,495.27 seconds as the threshold distances change.
CPU times also increase as the number of candidate points for facility location grows, as shown in Figure 2 for the instances with LB = 2 and UB = 4. Other parameters that may affect the algorithm's efficiency are the capacity constraint values LB and UB. Regarding the upper bound, as UB grows, more points of the set SP may have to be examined in lines 15-17 of the algorithm. Lower bound values usually have a considerable impact on the algorithm's performance, with an intensity that depends on the dispersion of the points of the instance. As larger lower bounds LB are used, the algorithm takes less time, since more candidate points fail the condition in line 11. Figure 4 shows the computing time spent by the algorithm on the generated instance with 10,000 points, λ = 0.1 and UB = +∞, as the lower bound LB varies.

Second set of experiments: approximating the continuous problem
The second set of experiments compares our discrete algorithm with the decomposition algorithm of Fernandes et al. (2011) for the continuous version of the problem. To perform this comparison, we constructed a grid with a mesh that adjusts the precision of our approximation. The grid is bounded by the locations of the most distant points, and the bounded plane is divided into an equal number of cells, as shown in Figure 6. The points at the intersections of the cells are the candidate points for facility location. The quality of the approximation depends directly on the size of the mesh: a mesh of size g = 3 has 3 × 3 = 9 cells. This value influences the algorithm's performance due to its direct relation with the number of candidate points. As g increases, more candidate points are put in Y, thus improving the quality of the approximation. Fernandes et al.
(2011) proposed 9 instances with different numbers of client points, threshold distances and capacity constraint values. The instances can be found at http://www.gerad.ca/~aloise/publications.html. They may be divided into three groups according to the number of clients: the first 3 instances have 10 client points, the next 3 have 100 and the last 3 have 1000. The lower and upper bounds LB and UB were kept constant within each group: LB = 2, UB = 4 for the instances with n = 10; LB = 4, UB = 8 for n = 100; and LB = 6, UB = 12 for n = 1000.
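The grid construction described above can be sketched as follows (a minimal version under our naming assumptions; the candidate set Y consists of the (g + 1)² mesh intersections over the clients' bounding box):

```python
def grid_candidates(points, g):
    """Candidate set Y: intersections of a g x g cell mesh over the clients' bounding box."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    xmin, xmax = min(xs), max(xs)
    ymin, ymax = min(ys), max(ys)
    # (g + 1) grid lines per axis give (g + 1)^2 intersection points.
    return [(xmin + i * (xmax - xmin) / g, ymin + j * (ymax - ymin) / g)
            for i in range(g + 1) for j in range(g + 1)]
```

With g = 3, this yields the 9-cell grid of Figure 6 with 16 candidate points; larger g refines the approximation at the cost of a larger Y.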
Table 2 shows that, for the instances tested, the ratio between the costs obtained by the grid search and decomposition algorithms was never inferior to 0.99, which attests to the quality of our approximation in the tested instances (this conclusion cannot be directly generalized to all possible instances due to the limitations of grid search global optimization algorithms; cf. Hendrix and Tóth, 2010).
Figures 7 and 8 illustrate how the grid search and decomposition algorithms are influenced by the threshold distance values λ in the instances with 100 and 1000 points, respectively. In both figures, we note that the grid algorithm is insensitive to λ (indeed, this value is only slightly increased in this set of experiments) and that its CPU time increases as g grows. In Figure 7, the decomposition algorithm is outperformed by the grid search algorithm, regardless of the mesh precision, whenever λ ≥ 73.23. For the instances with 1000 points, this happens for λ ≥ 27.02. Furthermore, we can also conclude from the figures that the decomposition algorithm is always outperformed by the grid search algorithm in the tested instances when g ≤ 50.
Figures 9 and 10 show the influence of the value of g on the performance of the grid search algorithm in the instances l2_1000_1 and l2_1000_001. The curves in the figures present the CPU time spent by the grid search algorithm for different values of g, while the dotted lines represent the CPU time spent by the decomposition algorithm on the same instances. As observed, the performance of the grid search algorithm deteriorates as more candidate points are searched. This deterioration is accompanied by an improvement in the approximation, representing a tradeoff for the decision maker between quality and performance when using the grid search algorithm. Indeed, the decomposition algorithm is preferred whenever it presents a better performance than the grid search algorithm for a particular value of g (for instance, for g ≥ 250 in Figure 9).

CONCLUSIONS
In this paper, we proposed a polynomial-time algorithm for the discrete problem. The algorithm was tested in different scenarios by varying the instance parameters that influence its performance, such as the number of candidate points, the threshold distance values, and the lower and upper bounds on the number of served points. Moreover, we compared our algorithm, within a grid search framework, with the decomposition algorithm of Fernandes et al. (2011). These computational experiments showed that the quality of the approximation provided by the grid search algorithm in the tested instances is high, always above 99%. While the quality of the approximation can be further improved by increasing the precision of the mesh, the performance of the algorithm is compromised, since its computational complexity depends directly on the number of candidate points for facility installation.
As shown by the experiments, the proposed algorithm can handle instances with thousands of demand points. It is worth mentioning that neither the mathematical formulation of the problem nor the algorithm proposed here depends on the distance norm adopted; they only require that a distance matrix be provided.

Figure 1. Polynomial-time algorithm proposed for the optimization of problem (4).

The proposed algorithm is polynomial, and its complexity can be analyzed from its structure. The loop in lines 10-24 is executed once for each location that is a candidate for facility installation (O(|Y|)). The construction of the set SP at line 11 takes O(|I|) time, since all client points are evaluated for inclusion in SP. The sorting of the elements of SP in line 13 has complexity O(|I| log |I|) and is the most expensive operation inside the main loop. Hence, the overall complexity of the algorithm is O(|Y| · |I| log |I|). The optimality of the algorithm is assured by the proof below.

Proposition 1: The proposed algorithm finds the optimal solution for the discrete single-facility location problem with limited distances and capacity constraints.

Proof: Let y* be the optimal place for the installation of the facility. The algorithm selects the first B elements of SP, where B = min{|SP|, UB}, with the elements i ∈ SP sorted in increasing order of w_i(d(p_i, y*) − λ_i) ≤ 0. Let us assume an optimal solution v*, which corresponds to the set V = {i ∈ I : v_i* = 1}.

Problem instances were artificially generated from a uniform distribution in the unit square. The threshold distances were defined as fractions of the diagonal of the unit square. In this set of experiments, all generated points are candidates for installing the facility.

Figure 2. Computing times for instances with LB = 2 and UB = 4 and different threshold distances. The curves are approximated by quadratic functions.

Figure 3. Decay in the algorithm's efficiency due to an increase in the upper bound value in the instance with 10,000 points, λ = √2/2 and LB = 2. The curve is approximated by a quadratic function.

Figure 4. Computing-time decay caused by an increase in the lower bound value in the instance with 10,000 points, λ = 0.1 and UB = +∞.

Figure 5 presents the variation in the number of subproblems solved (i.e., candidates with |SP| ≥ LB) in the same experiment, as the lower bound value increases. We note that the number of subproblems decreases at the same rate as the curve in Figure 4. The two quantities actually show the same behaviour, confirming the importance of the LB parameter for the performance of the algorithm.

Figure 5. Number of subproblems solved as LB grows, with UB = +∞, in the instance with 10,000 points and λ = 0.1.

Figure 6. Grid for the problem divided into nine cells. The black dots represent the clients and the red ones represent candidate facility locations.
Table 2 reports the results obtained by the grid search algorithm proposed here in comparison with the results of Fernandes et al. (2011) for the continuous model. The first column presents the name of each instance. The second column shows the threshold distance of the instance, which is equal for all client points. The next six columns present cost results and computing times for different g values. Each cost value is reported as the ratio between the optimal solution value in the continuous model, obtained by the decomposition algorithm, and the solution value obtained by the grid search algorithm. Finally, the last two columns report the optimal solutions and the CPU time spent by the decomposition algorithm of Fernandes et al. (2011) on the continuous model. The continuous model is a relaxation of model (4) (since Y ⊂ R²). Consequently, the solution values obtained by the grid search algorithm are always greater than or equal to those obtained by the decomposition algorithm of Fernandes et al. (2011).

Figure 7. Computing time for instances with 100 points spent by the grid search algorithm with g values of 50, 250 and 500, and by the decomposition algorithm (T_decomposition).

Figure 8. Computing time for instances with 1000 points spent by the grid search algorithm with g values of 50, 250 and 500, and by the decomposition algorithm (T_decomposition).

Figure 9. CPU time for the instance l2_1000_001 spent by the grid search algorithm with g values of 50, 250 and 500, and by the decomposition algorithm (T_decomposition).

Table 1. Results for the first set of experiments.