
Data Structures in Java

Midterm Review
3/10/2015

Daniel Bauer

Midterm

Midterm on Thursday (in-class)

Similar format to sample questions.

Closed books/notes/electronic devices (except calculators).

Bring a pen, water, and nothing else.

60 minutes. Be on time!

If you are taking the midterm tonight (only if signed up!):


5pm in 620 CEPSR/Shapiro

Topics - Overview

Series, Proofs.

Running Time Analysis of Algorithms. Big-Oh Notation.

Abstract Data Types.

Data Structure Implementations.

Applications.

Implementations in Java, Java Concepts.

Types of Proofs
Remember some examples?

Proof by Induction

Proof by Contradiction

Proof by Counterexample

Goals of Algorithm Analysis

Does the algorithm terminate?

Does the algorithm solve the problem? (correctness)

What resources does the algorithm use?

Time / Space

Comparing Function Growth: Big-Oh Notation

$T(N) = O(f(N))$ if there are positive constants $c$ and $n_0$ such that $T(N) \le c \cdot f(N)$ when $N \ge n_0$.

T(N) is in the order of f(N): f(N) is an upper bound on T(N).

Example: $T(N) = 10N + 100$ and $f(N) = N^2 + 2$, e.g. $c = 1$, $n_0 = 16.1$.
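The example constants can be sanity-checked numerically. The sketch below is only an illustration (the class and method names are made up for this review, and a finite scan is not a proof): it looks for the first integer N from which 10N + 100 <= c * (N^2 + 2) keeps holding up to a chosen limit.

```java
// Numeric sanity check for the Big-Oh example: T(N) = 10N + 100, f(N) = N^2 + 2.
// Hypothetical helper class; scanning a finite range illustrates the bound, it does not prove it.
class BigOhCheck {
    static long t(long n) { return 10 * n + 100; }
    static long f(long n) { return n * n + 2; }

    // Returns the smallest N in [1, limit] such that T(M) <= c * f(M) for every M in [N, limit],
    // or -1 if the bound still fails at N = limit.
    static long firstHoldingN(long c, long limit) {
        long first = -1;
        for (long n = 1; n <= limit; n++) {
            if (t(n) <= c * f(n)) {
                if (first == -1) first = n;   // bound starts to hold here
            } else {
                first = -1;                   // bound violated, start over
            }
        }
        return first;
    }

    public static void main(String[] args) {
        // With c = 1 the bound holds from N = 17 on, consistent with n0 = 16.1.
        System.out.println(firstHoldingN(1, 100_000));
    }
}
```

With c = 1 the first integer that works is 17, which matches the threshold n0 = 16.1 quoted above.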

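Comparing Function Growth: Additional Notations

Lower Bound: $T(N) = \Omega(f(N))$ if there are positive constants $c$ and $n_0$ such that $T(N) \ge c \cdot f(N)$ when $N \ge n_0$.

Tight Bound: T(N) and f(N) grow at the same rate. $T(N) = \Theta(f(N))$ if $T(N) = O(f(N))$ and $T(N) = \Omega(f(N))$.

Strict Upper Bound: $T(N) = o(f(N))$ if for all positive constants $c$ there is some $n_0$ such that $T(N) < c \cdot f(N)$ when $N \ge n_0$.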

Typical Growth Rates (slowest to fastest growing)

constant       $c$
logarithmic    $\log N$
log-squared    $\log^2 N$
linear         $N$
quadratic      $N^2$
cubic          $N^3$
exponential    $2^N$

Data Structures for Sequences

List
  Array
  Simple Linked List
  Doubly Linked List
Queue (FIFO)
  Linked Lists as Queue
  Circular Array Queue
Stack (LIFO)
  Linked Lists as Stacks
  Array Stack

Tree Data Structures

Tree
  Fixed number of children (Binary, N-Ary Tree)
  Sibling List Representation
Ordered Sets/Maps
  Search Tree
    Binary Search Tree
    N-Ary Search Tree
    Balanced Search Tree
      AVL Tree
      B-Tree

Sets and Maps


Set

Map

Ordered Set
Balanced Search Tree

Ordered Map
Balanced Search Tree

Hash Table
Linked List entries
Probing Hash Tables

The List ADT

A list L is a sequence of N objects $A_0, A_1, A_2, \ldots, A_{N-1}$.

N is the length/size of the list. A list with length N=0 is called the empty list.

$A_i$ follows/succeeds $A_{i-1}$ for $i > 0$.

$A_i$ precedes $A_{i+1}$ for $i < N-1$.

A0 A1 A2 A3 A4 A5 A6

Array List

[Figure: an array holding a list of N=7 items.]

Examples (N=7):
insert(5,7): O(1)
remove(7): O(1)
insert(5,0): O(N) (7 moves)
remove(0): O(N)

Worst Case Running Times
printList      O(N)
find(x)        O(N)
findKth(k)     O(1)
insert(x,k)    O(N)
remove(x)      O(N)

Need to copy entire list to larger array if array becomes full.
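The running times above follow from how items are shifted inside the array. A minimal sketch, assuming a list of ints and a hypothetical class name IntArrayList (this is not java.util.ArrayList), could look like this:

```java
import java.util.Arrays;

// Minimal array-backed list of ints (illustrative class, not java.util.ArrayList).
class IntArrayList {
    private int[] items = new int[4];
    private int size = 0;

    int size() { return size; }

    // findKth(k): O(1), a direct index into the array.
    int get(int k) {
        if (k < 0 || k >= size) throw new IndexOutOfBoundsException();
        return items[k];
    }

    // insert(x, k): O(N) in the worst case, because items k..size-1 move one cell to the right.
    void insert(int x, int k) {
        if (k < 0 || k > size) throw new IndexOutOfBoundsException();
        if (size == items.length) {
            items = Arrays.copyOf(items, 2 * items.length); // copy everything to a larger array
        }
        System.arraycopy(items, k, items, k + 1, size - k);
        items[k] = x;
        size++;
    }

    // remove(k): O(N) in the worst case, because items k+1..size-1 move one cell to the left.
    int remove(int k) {
        int removed = get(k);
        System.arraycopy(items, k + 1, items, k, size - k - 1);
        size--;
        return removed;
    }
}
```

Inserting or removing at the end moves no items, which is why insert(5,7) on a 7-item list is O(1) while insert(5,0) moves all 7 items.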

Simple Linked Lists

[Figure: head -> 42 -> 23 -> null]

Sequence of nodes linked by next pointers.

Worst Case Running Times
printList      O(N)
find(x)        O(N)
findKth(k)     O(N)
insert(x,k)    O(N)
remove(k)      O(N)
next()         O(1)

In many applications we can use an iterator instead of findKth(k).

Doubly Linked Lists

[Figure: head <-> A0 <-> A1 <-> A2 <-> A3 <-> tail]

Sequence of nodes linked by next and prev pointers.

Worst Case Running Times
printList      O(N)
find(x)        O(N)
findKth(k)     O(N)
insert(x,k)    O(N)
remove(k)      O(N)
next()         O(1)

Position-based operations are actually a little faster in practice than in a simple linked list, because we only have to search at most half the list (starting from whichever end is closer).
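The "at most half the list" remark can be sketched as follows. This is an illustration under assumptions: the class name, the use of sentinel head and tail nodes, and the int item type are choices made here, not details taken from the slides.

```java
// Minimal doubly linked list of ints with sentinel head and tail nodes (illustrative names).
class IntDoublyLinkedList {
    private static class Node {
        int item;
        Node prev, next;
        Node(int item) { this.item = item; }
    }

    private final Node head = new Node(0); // sentinel before the first item
    private final Node tail = new Node(0); // sentinel after the last item
    private int size = 0;

    IntDoublyLinkedList() { head.next = tail; tail.prev = head; }

    int size() { return size; }

    // Walks from whichever end is closer: at most about N/2 steps, which is still O(N).
    private Node nodeAt(int k) {
        if (k < 0 || k >= size) throw new IndexOutOfBoundsException();
        Node cur;
        if (k < size / 2) {
            cur = head.next;
            for (int i = 0; i < k; i++) cur = cur.next;
        } else {
            cur = tail.prev;
            for (int i = size - 1; i > k; i--) cur = cur.prev;
        }
        return cur;
    }

    int findKth(int k) { return nodeAt(k).item; }

    // insert(x, k): O(N) to locate position k, then O(1) to relink.
    void insert(int x, int k) {
        if (k < 0 || k > size) throw new IndexOutOfBoundsException();
        Node after = (k == size) ? tail : nodeAt(k);
        Node node = new Node(x);
        node.prev = after.prev;
        node.next = after;
        after.prev.next = node;
        after.prev = node;
        size++;
    }

    // remove(k): O(N) to locate the node, then O(1) to unlink it.
    int remove(int k) {
        Node node = nodeAt(k);
        node.prev.next = node.next;
        node.next.prev = node.prev;
        size--;
        return node.item;
    }
}
```

The sentinels keep insert and remove free of special cases for the first and last positions.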

The Stack ADT

Last In First Out (LIFO).

Operations have the same running time in all implementations:
push(x)     O(1)
pop()       O(1)
peek()      O(1)
empty()     O(1)

[Figure: a stack with its top marked "Top".]

Implementations discussed:
Using an Array List, Using a LinkedList
Hardware Stacks (memory abstraction, stack machine)

Stack Applications

Method call stacks.

Evaluating postfix expressions.

Converting infix to postfix notation.

Constructing an expression tree from a postfix expression.

Perform a tree traversal without recursion (relation to recursion).

Implementing Queue.

Re-arranging subway cars.
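As a reminder of how one of these applications works, here is a small sketch of postfix evaluation with a stack. The class name, the space-separated token format, and the restriction to integer operands with + - * / are assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Evaluates space-separated postfix expressions over integers (illustrative sketch).
class PostfixEvaluator {
    static int evaluate(String expression) {
        Deque<Integer> stack = new ArrayDeque<>();
        for (String token : expression.trim().split("\\s+")) {
            switch (token) {
                case "+": case "-": case "*": case "/": {
                    if (stack.size() < 2) throw new IllegalArgumentException("missing operand");
                    int right = stack.pop(); // pushed last, so it is the right operand
                    int left = stack.pop();
                    stack.push(apply(token, left, right));
                    break;
                }
                default:
                    stack.push(Integer.parseInt(token)); // operand: push it
            }
        }
        if (stack.size() != 1) throw new IllegalArgumentException("malformed expression");
        return stack.pop();
    }

    private static int apply(String op, int left, int right) {
        switch (op) {
            case "+": return left + right;
            case "-": return left - right;
            case "*": return left * right;
            default:  return left / right; // only "/" remains
        }
    }
}
```

For example, "1 2 3 * + 4 5 * 6 + 7 * +" has the same shape as the expression a b c * + d e * f + g * + and evaluates to 189.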

The Queue ADT

First In First Out (FIFO) storage.

Operations have the same running time in all implementations:
enqueue(x)     O(1)
dequeue()      O(1)
empty()        O(1)

[Figure: a queue with its ends marked "front" and "back".]

Implementations discussed:
Using a linked list
Using a circular array

Circular Array Implementation of Queue

Problem: In a naive array implementation, dequeues leave empty space at the beginning of the array.

A circular array re-uses that empty space by allowing the back pointer to wrap around.

[Figure: array cells holding 5, 17, 23, with front and back pointers.]

Need to copy entire queue to larger array if array becomes full.
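The wrap-around can be sketched with modular arithmetic. The class below is a minimal illustration for ints; its name, the initial capacity of 4, and the doubling strategy are assumptions made here.

```java
import java.util.NoSuchElementException;

// Minimal circular-array queue of ints (illustrative sketch).
class IntCircularQueue {
    private int[] items = new int[4];
    private int front = 0; // index of the oldest item
    private int size = 0;

    boolean empty() { return size == 0; }

    // enqueue(x): O(1), except when the array is full and has to be copied.
    void enqueue(int x) {
        if (size == items.length) grow();
        int back = (front + size) % items.length; // wraps around to reuse freed cells
        items[back] = x;
        size++;
    }

    // dequeue(): O(1), only advances front (with wrap-around).
    int dequeue() {
        if (size == 0) throw new NoSuchElementException();
        int x = items[front];
        front = (front + 1) % items.length;
        size--;
        return x;
    }

    // Copies the queue, in FIFO order, into an array twice as large.
    private void grow() {
        int[] bigger = new int[2 * items.length];
        for (int i = 0; i < size; i++) {
            bigger[i] = items[(front + i) % items.length];
        }
        items = bigger;
        front = 0;
    }
}
```

Storing front and size (rather than front and back) is one way to tell a full array from an empty one without wasting a cell.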

Tree ADT

A tree T consists of
A root node r.
Zero or more nonempty subtrees T1, T2, ..., Tn, each connected by a directed edge from r.

[Figure: root r with edges to the subtrees T1, T2, ..., Tn.]

Support typical collection operations: size, get, set, add, remove, find, ...

Representing Trees

Option 1: Every node has a fixed number of references to children.

[Figure: node n0 with child references to n1, n2, n3.]

Problem: Only reasonable for a small or constant number of children.

Representing Trees

Option 2: Organize siblings as a linked list.

[Figure: nodes n0, n1, n2, n3 linked by "1st child" and "next sibling" references.]

Problem: Takes longer to find a node from the root.

M-ary Trees

Each node can have up to M children.

Height of a complete M-ary tree with N nodes is approximately $\log_M N$.

Binary Trees

For binary trees, the number of children is at most two.

Binary trees are very common in data structures and algorithms.

They are convenient to analyze.

Tree Traversals: In-order

1. Process left child
2. Process root
3. Process right child

(a + b * c) + (d * e + f) * g

[Figure: expression tree for (a + b * c) + (d * e + f) * g.]

Tree Traversals: Post-order

1. Process left child
2. Process right child
3. Process root

abc*+de*f+g*+

[Figure: expression tree for (a + b * c) + (d * e + f) * g.]

Tree Traversals: Pre-order

1. Process root
2. Process left child
3. Process right child

++a*bc*+*defg

[Figure: expression tree for (a + b * c) + (d * e + f) * g.]
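The three traversals can be checked against the example expression with a short recursive sketch. The ExprNode class and its method names are made up for this illustration; the in-order output omits the parentheses.

```java
// Binary expression tree node with the three recursive traversals (illustrative sketch).
class ExprNode {
    final String label;
    final ExprNode left, right;

    ExprNode(String label) { this(label, null, null); }

    ExprNode(String label, ExprNode left, ExprNode right) {
        this.label = label;
        this.left = left;
        this.right = right;
    }

    static void preOrder(ExprNode n, StringBuilder out) {
        if (n == null) return;
        out.append(n.label);       // 1. root
        preOrder(n.left, out);     // 2. left subtree
        preOrder(n.right, out);    // 3. right subtree
    }

    static void inOrder(ExprNode n, StringBuilder out) {
        if (n == null) return;
        inOrder(n.left, out);      // 1. left subtree
        out.append(n.label);       // 2. root
        inOrder(n.right, out);     // 3. right subtree
    }

    static void postOrder(ExprNode n, StringBuilder out) {
        if (n == null) return;
        postOrder(n.left, out);    // 1. left subtree
        postOrder(n.right, out);   // 2. right subtree
        out.append(n.label);       // 3. root
    }

    // Builds the tree for (a + b * c) + (d * e + f) * g.
    static ExprNode example() {
        ExprNode left = new ExprNode("+", new ExprNode("a"),
                new ExprNode("*", new ExprNode("b"), new ExprNode("c")));
        ExprNode inner = new ExprNode("+",
                new ExprNode("*", new ExprNode("d"), new ExprNode("e")),
                new ExprNode("f"));
        ExprNode right = new ExprNode("*", inner, new ExprNode("g"));
        return new ExprNode("+", left, right);
    }
}
```

On the example tree, preOrder yields ++a*bc*+*defg and postOrder yields abc*+de*f+g*+, the strings shown on the slides.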

Binary Search Trees

BST property (for a root r with left subtree $T_l$ and right subtree $T_r$):
For all nodes s in $T_l$, $s_{item} < r_{item}$.
For all nodes t in $T_r$, $t_{item} > r_{item}$.

contains(x)    O(height(T))
insert(x)      O(height(T))
findMin()      O(height(T))
findMax()      O(height(T))
remove()       O(height(T))
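A minimal sketch of an unbalanced BST shows why every operation is bounded by the height. The class name, the int item type, ignoring duplicates, and counting height in edges are assumptions of this example; remove is left out for brevity.

```java
import java.util.NoSuchElementException;

// Minimal unbalanced binary search tree of ints (illustrative sketch, no remove).
class IntBst {
    private static class Node {
        int item;
        Node left, right;
        Node(int item) { this.item = item; }
    }

    private Node root;

    // contains(x): follows one root-to-leaf path, so O(height(T)).
    boolean contains(int x) {
        Node cur = root;
        while (cur != null) {
            if (x < cur.item) cur = cur.left;
            else if (x > cur.item) cur = cur.right;
            else return true;
        }
        return false;
    }

    // insert(x): also follows one path; duplicates are ignored.
    void insert(int x) { root = insert(root, x); }

    private Node insert(Node n, int x) {
        if (n == null) return new Node(x);
        if (x < n.item) n.left = insert(n.left, x);
        else if (x > n.item) n.right = insert(n.right, x);
        return n;
    }

    // findMin(): keep going left.
    int findMin() {
        if (root == null) throw new NoSuchElementException();
        Node cur = root;
        while (cur.left != null) cur = cur.left;
        return cur.item;
    }

    // findMax(): keep going right.
    int findMax() {
        if (root == null) throw new NoSuchElementException();
        Node cur = root;
        while (cur.right != null) cur = cur.right;
        return cur.item;
    }

    // Height counted in edges; the empty tree has height -1.
    int height() { return height(root); }

    private static int height(Node n) {
        return n == null ? -1 : 1 + Math.max(height(n.left), height(n.right));
    }
}
```

Inserting 1, 2, 3, 4, 5 in sorted order produces a chain of height 4, while inserting 2, 1, 4, 3, 5 produces a tree of height 2 over the same keys.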

Worst and Best Case Height of a Binary Search Tree

Assume we have a BST with N nodes.

Worst case: T does not branch. height(T) = O(N)

Best case: height(T) = O(log N)

[Figure: example trees for the worst case and the best case.]

AVL Tree Condition

An AVL Tree is a Binary Search Tree in which the following balance condition holds after each operation:

For every node, the heights of the left and right subtrees differ by at most 1.

[Figure: example trees, one of them labeled "not an AVL tree".]

Maintaining Balance in an AVL Tree

Assume the tree is balanced.

After each insertion, find the lowest node k that violates the balance condition (if any).

Perform a rotation to re-balance the tree.

The rotation restores the subtree under k to the height it had before the insertion, so no further rotations are needed.

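Single Rotation

[Figure, before: k2 is the subtree root, with child k1 and subtrees y and z.]
[Figure, after: k1 is the subtree root, with k2 as its child.]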

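Double Rotation

[Figure, before: k3 is the subtree root, with child k1 and subtree z; k2 sits below k1; subtrees x, yl, yr.]
[Figure, after: k2 is the subtree root, with k1 and k3 as its children; subtrees yl and yr.]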
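The rotations can be written as a few pointer updates. The sketch below covers the left-side cases only and follows the k1/k2/k3 naming of the figures; the class layout, the stored height field, and the method names are assumptions made for this example.

```java
// AVL rotations for the left-side cases (illustrative sketch; the right-side cases mirror these).
class AvlRotations {
    static class Node {
        int item;
        int height; // height of a leaf is 0
        Node left, right;
        Node(int item) { this.item = item; }
    }

    static int height(Node n) { return n == null ? -1 : n.height; }

    private static void update(Node n) {
        n.height = 1 + Math.max(height(n.left), height(n.right));
    }

    // Single rotation with the left child: k1 replaces k2 as the subtree root.
    static Node rotateWithLeftChild(Node k2) {
        Node k1 = k2.left;
        k2.left = k1.right; // the middle subtree moves under k2
        k1.right = k2;
        update(k2);
        update(k1);
        return k1;
    }

    // Mirror image: the right child k2 replaces k1 as the subtree root.
    static Node rotateWithRightChild(Node k1) {
        Node k2 = k1.right;
        k1.right = k2.left;
        k2.left = k1;
        update(k1);
        update(k2);
        return k2;
    }

    // Double rotation: k2 (right child of k3's left child k1) becomes the subtree root.
    static Node doubleWithLeftChild(Node k3) {
        k3.left = rotateWithRightChild(k3.left);
        return rotateWithLeftChild(k3);
    }
}
```

A double rotation is just two single rotations, first at k1 and then at k3.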

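B-Trees

A B-Tree is an M-ary search tree.

Every internal node (except for the root) has between $\lceil M/2 \rceil$ and $M$ children and contains one value fewer than it has children.

All leaves contain between $\lceil L/2 \rceil$ and $L$ values (usually L=M-1).

All leaves have the same depth.

Often used to store large tables on hard disk drives (databases, file systems).

[Figure: example B-Tree with the values 16, 25, 33, 27, 38, 34, 36, 41, 46, 48.]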

OrderedSet ADT

A set with a total order defined on the items (every pair of distinct items is in a > or < relation to each other).

Supported operations: all Set operations and
findMin()
findMax()

[Figure: Venn diagram of two sets A and B containing the items 1, 2, 5, 6, 8, 9.]

Set ADT

A Set is a collection of data that does not allow duplicates.

Supported operations:
insert(x)       addAll(s) / union(s)
remove(x)       removeAll(s)
contains(x)     retainAll(s) / intersection(s)
isEmpty()
size()

[Figure: Venn diagram of two sets A and B containing the items 1, 2, 5, 6, 8, 9.]

Map ADT

A map is a collection of (key, value) pairs.

Keys are unique, values need not be (keys are a Set!).

Two operations:
get(key)           returns the value associated with this key
put(key, value)    stores the pair (overwrites the value of an existing key)

[Figure: keys key1, key2, key3, key4 mapped to values value1, value2, value3.]

Hash Tables

Define a table (an array) of some length TableSize.

Define a function hash(key) that maps key objects to an integer index in the range 0 ... TableSize - 1.

Assuming hash(key) takes constant time and keys do not collide, get and put run in O(1).

[Figure: hash(key) maps the entry "Alice 555-341-1231" to a cell between 0 and TableSize - 1.]

Separate Chaining

Keep all items with the same hash value on a linked list.

Slow if load factor becomes > 1.

[Figure: table cells 0 to TableSize - 1 holding linked lists of the entries "Anna 555-521-2973", "Bob 555-341-1231", and "Alice 555-341-1231".]
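A minimal sketch of separate chaining for a String-to-String map is shown below. The class name, the use of String.hashCode() as the hash function, and the loadFactor() helper (items divided by TableSize) are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Minimal separate-chaining hash map from String to String (illustrative sketch).
class ChainedHashMap {
    private static class Entry {
        final String key;
        String value;
        Entry(String key, String value) { this.key = key; this.value = value; }
    }

    private final List<LinkedList<Entry>> table = new ArrayList<>();
    private int size = 0;

    ChainedHashMap(int tableSize) {
        for (int i = 0; i < tableSize; i++) table.add(new LinkedList<>());
    }

    // Maps a key to an index in 0 ... TableSize - 1 (floorMod avoids negative indices).
    private int hash(String key) {
        return Math.floorMod(key.hashCode(), table.size());
    }

    // get(key): scans only the chain for hash(key); returns null if the key is absent.
    String get(String key) {
        for (Entry e : table.get(hash(key))) {
            if (e.key.equals(key)) return e.value;
        }
        return null;
    }

    // put(key, value): overwrites the value of an existing key, otherwise adds to the chain.
    void put(String key, String value) {
        LinkedList<Entry> chain = table.get(hash(key));
        for (Entry e : chain) {
            if (e.key.equals(key)) { e.value = value; return; }
        }
        chain.addFirst(new Entry(key, value));
        size++;
    }

    // Load factor: number of stored items divided by TableSize.
    double loadFactor() { return (double) size / table.size(); }
}
```

With a table of size 1 every key lands on the same chain, so get and put degrade to a linear scan, which is the slowdown the slide warns about for large load factors.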

Hash Tables without Linked Lists: Probing

When a collision occurs, put the item in an empty cell of the hash table itself.

[Figure: table with cells 0 to 10 and hash(key) = x % 11; the key 40 is inserted into cell 40 % 11 = 7.]

Linear Probing

Can always find an alternative cell if there is still space.
Search becomes slow because of primary clustering.

[Figure: table with cells 0 to 10 and hash(key) = x % 11; cells 6 to 9 hold 39, 40, 51, 18, and the key 17 (17 % 11 = 6) is inserted into cell 10.]
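The clustering in this example can be reproduced with a few lines. The sketch below assumes int keys, hash(x) = x % TableSize as in the figure, and no support for deletion (which would need lazy deletion); the class name is made up.

```java
// Linear probing over int keys with hash(x) = x mod TableSize (illustrative sketch, no deletion).
class LinearProbingTable {
    private final Integer[] cells; // null marks an empty cell

    LinearProbingTable(int tableSize) { cells = new Integer[tableSize]; }

    private int hash(int x) { return Math.floorMod(x, cells.length); }

    // Returns the cell index where x ends up, or -1 if the table is full.
    int insert(int x) {
        int start = hash(x);
        for (int i = 0; i < cells.length; i++) {
            int pos = (start + i) % cells.length; // f(i) = i: try the next cell, wrapping around
            if (cells[pos] == null || cells[pos] == x) {
                cells[pos] = x;
                return pos;
            }
        }
        return -1;
    }

    boolean contains(int x) {
        int start = hash(x);
        for (int i = 0; i < cells.length; i++) {
            int pos = (start + i) % cells.length;
            if (cells[pos] == null) return false; // an empty cell ends the probe sequence
            if (cells[pos] == x) return true;
        }
        return false;
    }
}
```

Inserting 39, 40, 51, 18 into a table of size 11 fills cells 6 to 9, so 17 has to probe past the whole cluster before it reaches cell 10.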

Quadratic Probing

No primary clustering.
If the table size is not prime or the table is more than half full, it is possible that no empty cell can be found for a key, even if there is still space in the table.

[Figure: table with cells 0 to 10 and hash(key) = x % 11, holding 25, 3, 14; the key 47 is inserted on the third probe, f(3) = 9.]
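The probe sequence for this example can be traced with a short sketch. It assumes int keys, hash(x) = x % TableSize, and the probe function f(i) = i^2; the class and method names are made up, and the insertion order 25, 3, 14, 47 is an assumption consistent with the figure.

```java
// Quadratic probing insert with f(i) = i^2 (illustrative sketch).
class QuadraticProbing {
    // Returns the cell chosen for x, or -1 if the probe sequence never reaches an empty cell.
    // Probing more than cells.length times is pointless because i^2 mod TableSize repeats.
    static int insert(Integer[] cells, int x) {
        int start = Math.floorMod(x, cells.length);
        for (int i = 0; i < cells.length; i++) {
            int pos = (start + i * i) % cells.length;
            if (cells[pos] == null) {
                cells[pos] = x;
                return pos;
            }
        }
        return -1;
    }
}
```

With a table of size 11, the keys 25, 3, 14 and 47 all hash to cell 3; they end up in cells 3, 4, 7 and (3 + 9) % 11 = 1 respectively.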

Double Hashing

Compute a second hash function to determine a linear offset for this key.

[Figure: table with cells 0 to 10, hash(key) = x % 11 and hash2(key) = 5 - x % 5, holding 40 and 84; the key 62 is inserted with f(1) = 1 · hash2(x) = 3.]
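The same example can be traced in code. The sketch assumes int keys, the two hash functions from the figure, and the probe function f(i) = i * hash2(x); the class and method names are made up.

```java
// Double hashing insert with hash(x) = x mod TableSize and hash2(x) = 5 - (x mod 5) (illustrative sketch).
class DoubleHashing {
    // Second hash: always in 1..5, so the step is never 0.
    static int hash2(int x) { return 5 - Math.floorMod(x, 5); }

    // Returns the cell chosen for x, or -1 if no empty cell is found within TableSize probes.
    static int insert(Integer[] cells, int x) {
        int start = Math.floorMod(x, cells.length);
        int step = hash2(x);
        for (int i = 0; i < cells.length; i++) {
            int pos = (start + i * step) % cells.length; // f(i) = i * hash2(x)
            if (cells[pos] == null) {
                cells[pos] = x;
                return pos;
            }
        }
        return -1;
    }
}
```

In a table of size 11, 40 goes to cell 7; 84 collides there and moves by hash2(84) = 1 to cell 8; 62 also collides at cell 7 and moves by hash2(62) = 3 to cell 10.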
