- Home
- Documents
*Advanced Algorithm Design and Analysis (Lecture 1) simas/aalg2011/slides/aalg8.pdf¢ ...*

prev

next

out of 23

View

0Download

0

Embed Size (px)

Advanced Algorithm Design and Analysis (Lecture 8)

SW8 spring 2011 Simonas Šaltenis 3.2.12 simas@cs.aau.dk

AALG, lecture 8, © Simonas Šaltenis, 2011 2

Range Searching in 2D • Main goals of the lecture:

to understand and to be able to analyze ◆ the kd-trees and the range trees;

to see how data structures can be used to trade the space used for the running time of queries

AALG, lecture 8, © Simonas Šaltenis, 2011 3

Range queries • How do you efficiently find points that are inside of a

rectangle? Orthogonal range query ([x1, x2], [y1, y2]): find all points

(x, y) such that x1

AALG, lecture 8, © Simonas Šaltenis, 2011 4

Preprocessing • How much time such a query would take? • Rules of the game:

We preprocess the data into a data structure Then, we perform queries and updates on the data structure Analysis:

◆ Preprocessing time ◆ Efficiency of queries (and updates) ◆ The size of the structure

Assumption: no two points have the same x- coordinate (the same is true for y-coordinate).

AALG, lecture 8, © Simonas Šaltenis, 2011 5

1D range query • How do we do a 1D range query [x1, x2]?

Balanced BST where all data points are stored in the leaves ◆ Internal nodes store copies of points:

▲ the left subtree ≤ node < the right subtree ◆ The size of it?

Where do we find the answer to a query?

T

… b1 q1 a1 … b2 q2 a2 …

Search path for x2Search path for x1

Total order of data points

AALG, lecture 8, © Simonas Šaltenis, 2011 6

1D range query • How do we find all these leaf nodes?

A possibility: have a linked list of leaves and traverse from q1 to q2

◆ but, will not work for more dimensions… Sketch of the algorithm:

◆ Find the split node ◆ Continue searching for x

1 , report all right-subtrees

◆ Continue searching for x 2 , report all left-subtrees

◆ When leaves q1 and q2 are reached, check if they belong to the range

AALG, lecture 8, © Simonas Šaltenis, 2011 7

1DRangeSearch

• Why is this correct?

1DRangeSearch(T, x1, x2) 01 v ← FindSplit(T, x1, x2) 02 if v is a leaf then 03 if x1 ≤ v.key ≤ x2 then return v 04 else return DoLeft(v.leftChild, x1, x2) ∪ DoRight(v.rightChild, x1, x2)

DoLeft(v, x1, x2) 01 if v is a leaf then 02 if x1 ≤ v.key ≤ x2 then return v 03 else 04 if x1 ≤ v.key then return ReportSubtree(v.rightChild) ∪ DoLeft(v.leftChild, x1, x2) 05 else return DoLeft(v.rightChild, x1, x2)

DoRight(v, x1, x2) // similar to DoLeft, but with modified lines 04-05

AALG, lecture 8, © Simonas Šaltenis, 2011 8

Analysis of 1D range query • What is the worst-case running time of a query?

It is output-sensitive: two traversals down the tree plus the O(k), where k is the number of reported data points:

O(log n + k)

• What is the time of construction? Sort, construct by dividing into two, creating the root and

conquering the two parts recursively O(n log n)

• Size: O(n)

AALG, lecture 8, © Simonas Šaltenis, 2011 9

2D range query • How can we solve a 2D range query?

Observation – 2D range query is a conjunction of two 1D range queries: x1

AALG, lecture 8, © Simonas Šaltenis, 2011 10

Range tree • Idea: when performing search on x-coordinate, we need to

start filtering points on y-coordinate earlier! Canonical subset P(v) of a node v in a BST is a set of points

(leaves) stored in a subtree rooted at v

BST on y-coords

P(v)

Ta(v) T

P(v)

v

BST on x-coords

Range tree is a multi-level data structure:

◆ The main tree is a BST T on the x-coordinate of points

◆ Any node v of T stores a pointer to a BST Ta(v) (associated structure of v), which stores canonical subset P(v) organized on the y-coordinate

◆ 2D points are stored in all leaves!

AALG, lecture 8, © Simonas Šaltenis, 2011 11

Querying the range tree • How do we query such a tree?

Use the 1DRangeSearch on T, but replace ReportSubtree(w) with 1DRangeSearch(Ta(w), y1, y2)

• What is the worst-case running time? Worst-case: We query the associated structures on all nodes

on the path down the tree On level j, the depth of the associated structure is

log n 2 j

=logn− j

• Total running time: O(log2 n + k)

AALG, lecture 8, © Simonas Šaltenis, 2011 12

Size of the range tree • What is the size of the range tree?

At each level of the main tree associated structures store all the data points once (with constant overhead) (Why?) : O(n)

There are O(log n) levels Thus, the total size is O(n log n)

AALG, lecture 8, © Simonas Šaltenis, 2011 13

Building the range tree • How do we efficiently build the range tree?

Sort the points on x and on y (two arrays: X,Y) Take the median v of X and create a root, build its

associated structure using Y Split X into sorted XL and XR, split Y into sorted YL and YR

(s.t. for any p ∈XL or p ∈YL, p.x < v.x and for any p ∈XR or p ∈YR, p.x ≥ v.x)

Build recursively the left child from XL and YL and the right child from XR and YR

• What is the running time of this? O(n log n)

AALG, lecture 8, © Simonas Šaltenis, 2011 14

Range trees: summary • Range trees

Building (preprocessing time): O(n log n) Size: O(n log n) Range queries: O(log2 n + k)

• Running time can be improved to O(log n + k) without sacrificing the preprocessing time or size

Layered range trees (uses fractional cascading) Priority range trees (uses priority search trees as associated

structures)

AALG, lecture 8, © Simonas Šaltenis, 2011 15

Kd-trees • What if we want linear space?

Idea: partition trees – generalization of binary search trees Kd-tree: a binary tree

◆ Data points are at leaves ◆ For each internal node v:

▲ x-coords of left subtree ≤ v < x-coords of right subtree, if the depth of v is even (split with vertical line)

▲ y-coords of left subtree ≤ v < y-coords of right subtree, if the depth of v is odd (split with horizontal line)

Space: O(n) – points are stored once.

AALG, lecture 8, © Simonas Šaltenis, 2011 16

Example kd-tree

1 2 3 4 5 6 7 8 1

6 7

5 4 3 2

8

d e b a d

e

5 4 3

2 3 6 b

a

c c f

gf

g

x

y

x

AALG, lecture 8, © Simonas Šaltenis, 2011 17

Draw a kd-tree

1 2 3 4 5 6 7 8 1

6 7

5 4 3 2

8

a

b

c

d

e

f

g

h

• Draw a kd-tree storing the following data points

AALG, lecture 8, © Simonas Šaltenis, 2011 18

Querying the kd-tree • How do we answer a range query?

Observation: Each internal node v corresponds to a region(v) (where all its children are included).

We can maintain region(v) as we traverse down the tree

1 2 3 4 5 6 7 8 1

6 7

5 4 3 2

8

d e b a d

e

5 4 3

2 3 6 b

a

c c f

gf

g

AALG, lecture 8, © Simonas Šaltenis, 2011 19

Querying the kd-tree • The range query algorithm (range R):

If region(v) does not intersect R, do not go deeper into the subtree rooted at v

If region(v) is fully contained in R, report all points in the subtree rooted at v

If region(v) only intersects with R, go recursively into v’s children.

AALG, lecture 8, © Simonas Šaltenis, 2011 20

Analysis of the search alg. • What is the worst-case running time of the search?

Traversal of subtrees v, such that region(v) is fully contained in R adds up to O(k).

We need to find the number of regions that intersect R – the regions which are crossed by some border of R

◆ As an upper bound for that, let’s find how many regions are crossed by a vertical (or horizontal) line

◆ What recurrence can we write for it?

T(n) = 2 + 2T(n/4)

O n O nk Solution: Total time:

AALG, lecture 8, © Simonas Šaltenis, 2011 21

Building the kd-tree • How do we build the kd-tree?

Sort the points on x and on y (two arrays: X,Y) Take the median v of X (if depth is even) or Y (if depth is odd) and

create a root Split X into sorted XL and XR, split Y into sorted YL and YR, s.t.

◆ for any p ∈XL or p ∈YL, p.x < v.x (if depth is even) or p.y < v.y (if depth is odd)

◆ for any p ∈XR or p ∈YR, p.x ≥ v.x (if depth is even) or p.y ≥ v.y (if depth is odd)

Build recursively the left child from XL and YL and the right child from XR and YR

• What is the running