Wednesday, February 23, 2011

Sokoban Planner

State Representation
The Sokoban world can naturally be divided into a static part and a dynamic part. The static part consists of the positions of the walls (including obstacles) and the positions of the goals. The dynamic part consists of the position of the robot and the positions of the boxes. In a state, it is sufficient to describe only the parameters that change. In light of these observations, our state includes only the position of the robot and the positions of the boxes. We think this is a lightweight and efficient representation. It is important to note here that we do not tag the boxes and goals by hand.
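As a sketch of this representation (the names `State`, `robot`, and `boxes` are our own, not taken from our actual code), the dynamic part might look like:

```cpp
#include <set>
#include <utility>

// Hypothetical sketch of the state described above: only the dynamic
// part of the world (robot + boxes) is stored; walls and goals live in
// a separate static structure shared by all states.
struct State {
    std::pair<int, int> robot;            // robot position (row, col)
    std::set<std::pair<int, int>> boxes;  // box positions, untagged

    // Two states are equal iff the robot and the box positions coincide.
    // Because boxes are deliberately untagged, a std::set compares them
    // as a set of positions rather than by box identity.
    bool operator==(const State& other) const {
        return robot == other.robot && boxes == other.boxes;
    }
    bool operator<(const State& other) const {
        if (robot != other.robot) return robot < other.robot;
        return boxes < other.boxes;
    }
};
```

Using a set rather than an ordered list of boxes is what makes two states with the same box positions (but the boxes "swapped") compare equal, which shrinks the state graph.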

Method of Planning
Our planner works in two phases: first a graph is formed, and then a search is performed over it. In the first phase, given an initial world configuration (the static part plus the initial state), the planner generates a sparse graph in which states are nodes and actions are edges. The graph is generated as follows: given a state (i.e., a node), the available actions are evaluated and applied, each action becoming an edge leading to a new state. This continues until no more new nodes can be generated. Each newly created node is checked against previously generated nodes to eliminate duplicates. During graph generation, nodes are assigned heuristic values excluding their path costs. Once the graph is constructed, A* search is applied to find a path from the initial state to a goal state.
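The graph-generation phase described above can be sketched as a breadth-first expansion with duplicate elimination. The sketch below uses a toy integer state and a caller-supplied successor function purely for illustration (the real planner expands Sokoban states; all names here are our own):

```cpp
#include <queue>
#include <set>
#include <vector>

// Sketch of the graph-generation phase: expand every reachable state,
// eliminating states that were generated before. The toy `int` state
// stands in for the Sokoban state described above.
std::set<int> GenerateGraph(int initial,
                            std::vector<int> (*successors)(int)) {
    std::set<int> seen = {initial};  // duplicate elimination
    std::queue<int> frontier;
    frontier.push(initial);
    while (!frontier.empty()) {
        int state = frontier.front();
        frontier.pop();
        for (int next : successors(state)) {
            if (seen.insert(next).second)  // true only for new states
                frontier.push(next);
        }
    }
    return seen;
}
```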

Heuristics
We calculate the heuristic as the sum of Manhattan distances between the boxes and their corresponding goals, where a box's corresponding goal is its position in a goal state. Recall that we expand all the nodes. An important property to note here is that we do not explicitly match the boxes with the goals. Mathematically, the heuristic of a state s is

h(s) = Σ_i ( |x_i − x*_i| + |y_i − y*_i| ),

where (x_i, y_i) is the position of box i and (x*_i, y*_i) is its position in the goal state.
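A minimal sketch of this heuristic (the function name and index-based pairing are our own; the goal list comes from the static part of the world):

```cpp
#include <cstddef>
#include <cstdlib>
#include <utility>
#include <vector>

// Sum of Manhattan distances between each box and the goal it is
// paired with by index (boxes[i] <-> goals[i]); no explicit matching
// of boxes to goals is performed, as described above.
int heuristic(const std::vector<std::pair<int, int>>& boxes,
              const std::vector<std::pair<int, int>>& goals) {
    int h = 0;
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        h += std::abs(boxes[i].first - goals[i].first) +
             std::abs(boxes[i].second - goals[i].second);
    }
    return h;
}
```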



Tuesday, February 22, 2011

Improving the Optimality of RRT: RRT*

1. Introduction

Motion planning is the problem of finding a path from a given start to a given goal without colliding with obstacles. Many researchers have proposed diverse algorithms: algebraic planners, cell decomposition, potential fields, and sampling-based methods. These algorithms have played an important role in robotics, animation, virtual environments, and computer games.

In robotics, sampling-based methods are popular, and the Rapidly-exploring Random Tree (RRT) in particular is now widely used. One reason for its popularity is that as robot workspaces have become larger and more complex, it is almost impossible to build a map for such a huge search space in advance. Moreover, our environments change continuously, so obstacles cannot be known a priori.

Even though the RRT can find a path to a given goal quickly, the quality of the final path is not acceptable. The final path zigzags like a branch of the tree, so a post-smoothing method is required to reduce the cost to the goal. Furthermore, as the number of samples increases, the RRT algorithm converges to a sub-optimal solution. To solve this critical problem of sub-optimality, we use RRT* [3]. The RRT* is almost the same as the RRT except that it adjusts the new connections in the Extend procedure [3]. By doing so, the RRT* can find a more optimal path than the RRT.

2. Related Work

The RRT [1][2] was proposed by Steven M. LaValle and James Kuffner, and it has been widely used in robotic path planning. The algorithm is designed for efficiently searching high-dimensional spaces. The tree is constructed in such a way that any sample in the space is added by connecting it to the nearest sample already in the tree. Since vertex selection is based on nearest neighbors, vertices with large Voronoi regions are more likely to be selected for expansion; therefore, the main advantage of the RRT is that the tree grows toward unexplored regions. However, as the number of samples increases significantly, several paths to a given goal exist, and the probability that the RRT converges to a sub-optimal solution increases. Thus, a huge number of samples reduces the optimality of the RRT.

3. Notation

A graph G is composed of a vertex set V and an edge set E. Nearest(x) returns the vertex that is closest to x. ObstacleFree(a, b) returns true if there are no obstacles between a and b; otherwise, it returns false. Steer(x, y) returns a point that is closer to y than x is. The straight continuous path between two vertices a and b is denoted by Line(a, b).
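As an illustration of the Steer primitive (a sketch under our own assumptions: we add an explicit step-size parameter, which the notation above leaves implicit):

```cpp
#include <cmath>

struct Point { double x, y; };

// Hypothetical Steer(x, y): move from x toward y by at most step_size,
// returning y itself if it is already within reach. The result is a
// point closer to y than x is, matching the notation above.
Point Steer(const Point& x, const Point& y, double step_size) {
    double dx = y.x - x.x, dy = y.y - x.y;
    double dist = std::sqrt(dx * dx + dy * dy);
    if (dist <= step_size) return y;
    return {x.x + dx / dist * step_size, x.y + dy / dist * step_size};
}
```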

4. Approach

Both the RRT* and the RRT algorithms follow the same incremental sampling-based skeleton, as Algorithm 1 describes. Initially, each algorithm starts with a graph that includes the initial state as its single vertex and no edges; then it incrementally grows the graph on the sample space X by sampling a state x_rand from X at random and extending the graph towards x_rand. As Algorithm 1 illustrates, both the RRT and the RRT* choose x_rand from the sample space randomly. After choosing x_rand, each algorithm creates a new vertex x_new according to its own Extend procedure.

Figure 1. The Extend procedure of RRT

The RRT* differs from the RRT only in the way it handles the Extend procedure. The Extend procedures of the RRT and the RRT* are given in Algorithms 2 and 3, respectively. The Extend procedure of the RRT is the simpler of the two, as Algorithm 2 describes. First, it finds the vertex x_nearest nearest to the sample x_rand, and then creates x_new by moving an incremental distance v from x_nearest in the direction of x_rand. If the connection Line(x_nearest, x_new) does not collide with obstacles, the vertex x_new and the edge between x_nearest and x_new are added to the graph G. Figure 1 illustrates the Extend procedure of the RRT.

Algorithm 1. Body of RRT and RRT* Algorithms

Algorithm 2. Extend Procedure of RRT Algorithm
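The RRT Extend step just described can be sketched as follows (a sketch under our own naming; the `ObstacleFree` stub always returns true, whereas a real planner would do collision checking, and `parent` records the tree edges):

```cpp
#include <cmath>
#include <limits>
#include <vector>

struct Point { double x, y; };

double Dist(const Point& a, const Point& b) {
    return std::hypot(a.x - b.x, a.y - b.y);
}

// Stub for the free-space check; a real planner tests Line(a, b)
// against the obstacle map.
bool ObstacleFree(const Point&, const Point&) { return true; }

// Sketch of the RRT Extend step: find the nearest vertex, steer an
// incremental distance eta toward x_rand, and add the new vertex and
// edge if the connection is collision-free.
bool Extend(std::vector<Point>& vertices, std::vector<int>& parent,
            const Point& x_rand, double eta) {
    // 1. Find the vertex nearest to x_rand.
    int nearest = 0;
    double best = std::numeric_limits<double>::max();
    for (std::size_t i = 0; i < vertices.size(); ++i) {
        double d = Dist(vertices[i], x_rand);
        if (d < best) { best = d; nearest = static_cast<int>(i); }
    }
    // 2. Steer: move at most eta from x_nearest toward x_rand.
    Point x_near = vertices[nearest];
    double d = Dist(x_near, x_rand);
    Point x_new = (d <= eta) ? x_rand
        : Point{x_near.x + (x_rand.x - x_near.x) / d * eta,
                x_near.y + (x_rand.y - x_near.y) / d * eta};
    // 3. Add the vertex and the edge to the graph.
    if (!ObstacleFree(x_near, x_new)) return false;
    vertices.push_back(x_new);
    parent.push_back(nearest);
    return true;
}
```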

The RRT* algorithm first extends the nearest neighbor towards the sample (lines 2-3 in Algorithm 3). However, it connects the new vertex x_new to the vertex that incurs the minimum accumulated cost up to x_new and lies within the set of vertices X_near returned by the Near procedure (lines 6-13 in Algorithm 3). The RRT* also extends the new vertex to the vertices in X_near in order to "rewire" the vertices that can be accessed through x_new with smaller cost (lines 14-17 in Algorithm 3). Figure 2 describes the rewire procedure.

Algorithm 3. Pseudo Code for RRT* Extend
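The "choose parent, then rewire" logic of the RRT* Extend can be sketched as below. This is our own simplification: collision checks are omitted, vertices are referred to by index, `cost[i]` is the accumulated cost from the root to vertex i, and `edge_cost(i, j)` is the cost of the straight connection between i and j.

```cpp
#include <vector>

// Sketch of the RRT* choose-parent + rewire step for a new vertex
// x_new (index n) and its near set (indices returned by Near).
void ChooseParentAndRewire(int n, const std::vector<int>& near,
                           std::vector<int>& parent,
                           std::vector<double>& cost,
                           double (*edge_cost)(int, int)) {
    // Connect n to the near vertex that minimizes accumulated cost.
    for (int i : near) {
        double c = cost[i] + edge_cost(i, n);
        if (c < cost[n]) { cost[n] = c; parent[n] = i; }
    }
    // Rewire: reroute near vertices through n when that is cheaper.
    for (int i : near) {
        double c = cost[n] + edge_cost(n, i);
        if (c < cost[i]) { cost[i] = c; parent[i] = n; }
    }
}
```

The second loop is what distinguishes RRT* from RRT: existing vertices may change parent when the newly added vertex offers a cheaper route from the root.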

5. Evaluation

In the previous section, we looked into the RRT and the RRT* and saw that the Extend procedure of the RRT* is more complex than that of the RRT. Thus, we compare the two algorithms in terms of time efficiency, cost to goal, and the number of extended vertices. In addition, we check whether the RRT* finds a more optimal path than the RRT in a map that has several paths to a given goal by computing the cost to the goal.

Figure 2. The rewire procedure of RRT*

To evaluate the RRT and the RRT*, we implemented both algorithms in C++ and ran them on a dual-core 2.66 GHz machine with 4 GB RAM running Windows 7 64-bit. We also used the same seed when generating random numbers so that both algorithms see the same test environment. The map size is 24x24, and the step size is 0.4.

Figure 3. Extended vertices when each algorithm reaches a given goal in 2D path planning: (a) RRT in a bug trap map (b) RRT* in a bug trap map (c) RRT in a maze map (d) RRT* in a maze map

To compare the efficiency of the two algorithms, as Figure 3 depicts, we evaluate them in two different maps: the bug trap map and the maze map. Figure 3 shows the area that each algorithm explored as well as the final path from start to goal. Intuitively, the RRT* is more efficient since it explores less of the map than the RRT, and the resulting path of the RRT* is shorter than that of the RRT.

Figure 4. 2D path planning results of RRT and RRT* (a) 15000 iterations, RRT (b) 15000 iterations, RRT* (c) 2000 iterations, RRT* (d) 4000 iterations, RRT* (e) 8000 iterations, RRT*
Figure 5. Cost to a goal vs. iteration for RRT and RRT* in the map of Figure 4

Table 1. The number of vertices, cost to a goal, and elapsed time of the RRT and the RRT* in the bug trap map and the maze map

                Vertices   Elapsed Time   Cost
Bug Trap Map
  RRT             6027       320 ms      40.80
  RRT*            3099       383 ms      31.50
Maze Map
  RRT              887        77 ms      26.00
  RRT*             205        20 ms      21.19

As Table 1 shows, the RRT* is more optimal in both maps and is more memory-efficient because it uses significantly fewer vertices than the RRT. However, the elapsed time of the RRT* in the bug trap map is longer than that of the RRT because the rewire procedure is time-consuming: it makes the RRT* search for vertices in a kd-tree and rearrange edges, which takes more time in finding a path.

To evaluate the optimality of the two algorithms, we made a map which has several paths to a goal and compared their final paths. As Figure 4 (a) and (b) depict, the RRT* found an optimal path. As the RRT* extends its vertices, it rewires its edges based on the cost from the start to each new vertex and consequently finds a more optimal path by rewiring; Figure 4 (a), (b), and (c) show each step on the way to the optimal path. In Figure 5, the RRT* is clearly superior to the RRT in path cost: as the iteration count increases, the cost of the RRT* decreases consistently, but the cost of the RRT does not. Consequently, even where the RRT algorithm converges to a sub-optimal solution, the RRT* can find a more optimal path.

6. Discussion

As the number of samples increases, the RRT algorithm inevitably converges to a sub-optimal solution. To solve this problem, we presented the RRT* algorithm, a way to find a more optimal path. The RRT* adjusts the new connections in the Extend procedure and can find a more optimal path than the RRT algorithm.

However, it is less time-efficient than the RRT in some cases even though it finds a more optimal path. This is mainly because of the rewire procedure: as newly added vertices accumulate, the time to search for vertices in the kd-tree during rewiring increases. This results in the inefficiency of the RRT* in several cases.

Even though the RRT* can find a more optimal path than the RRT, it is hard to decide when to finish the search and how to know whether the algorithm has found an optimal path. As Figure 5 illustrates, the cost to a goal decreases as the map is explored further, and it drops significantly when the RRT* finds a more optimal path. The optimality of the RRT* increases as vertices are added, but its efficiency decreases. Thus, we should consider a policy that stops the search when the cost to a goal decreases by less than a certain amount, or when the cost has not decreased for a certain number of iterations.
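Such a stopping policy might be sketched as follows (names and thresholds are our own; `epsilon` is the minimum cost improvement that counts as progress, and `patience` the number of stagnant checks allowed before stopping):

```cpp
#include <limits>

// Hypothetical stopping policy: stop when the cost to the goal has not
// improved by more than `epsilon` for `patience` consecutive checks.
struct StopPolicy {
    double epsilon;   // minimum improvement that counts as progress
    int patience;     // allowed number of stagnant checks
    double best_cost = std::numeric_limits<double>::infinity();
    int stagnant = 0;

    bool ShouldStop(double current_cost) {
        if (best_cost - current_cost > epsilon) {
            best_cost = current_cost;  // real progress: reset the counter
            stagnant = 0;
        } else {
            ++stagnant;                // no meaningful improvement
        }
        return stagnant >= patience;
    }
};
```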

In this paper, the one-directional RRT* algorithm is considered and tested in two-dimensional space. Thus, it remains to implement a bi-directional RRT* using the method [4] that the bi-directional RRT used, and to test its feasibility in high-dimensional spaces.


7. References

[1] S. M. LaValle. Rapidly-exploring random trees: A new tool for path planning. TR 98-11, Computer Science Dept., Iowa State Univ., Oct. 1998.

[2] J. J. Kuffner and S. M. LaValle. RRT-Connect: An efficient approach to single-query path planning. In IEEE International Conference on Robotics and Automation, 2000, pp. 995-1001.

[3] S. Karaman and E. Frazzoli. Incremental sampling-based algorithms for optimal motion planning. In Proceedings of Robotics: Science and Systems, 2010.

[4] S. M. LaValle and J. J. Kuffner. Rapidly-exploring random trees: Progress and prospects. In Workshop on the Algorithmic Foundations of Robotics, 2000.