Assignments for Data Structures II
Important Note: All homework problems must be submitted as files in your class directory. They will be graded on-line. Unless otherwise stated all "written" assignments are due one week after they are assigned in class and all programming assignments are due two weeks after they are assigned in class.
- Review the contents of the Getting Started web page.
- You must create a folder for this class. The folder will be located on your P: drive and have the name CS-242.
After creating CS-242, right click on the folder name and then click on Properties.
Click on Security
In the look-in box select lynchburg-edu
In the users box type in roussos_c and click on Check Names
Select roussos_c and give full control to this CS-242 folder
Under CS-242 create three more folders called Programs, Labs and Homework.
- Program #1 (and lab 1) - HeapSort
HeapSort is a worst-case N*log2(N) sort.
Your program will have a driver similar to the one described for MergeSort above. After
the numbers are read into an array your program will
- insert each data item, one by one, into a heap and then
- dequeue (remove) each item and place it back into the array
Since a heap is an implementation of a priority queue the data will be extracted in
order; thus sorting the data.
Multiway Trees
- Read Chapter 7 - Multiway Trees
(omit Bit-Trees, 2-4 Trees, vh-Trees)
Do exercises 1, 2, 3, 5, 8, 10, 14 and 19 on pages 369 - 370
- Lab 2 - Create a balanced binary tree- Top to bottom and left to right
Your program will
- Read in randomly generated integers into an array
- Sort them
- Create a balanced binary tree - Top to bottom and left to right
You can accomplish this by writing a function called rootindex that takes as input the total number of data values, N, and returns the number of nodes less than the root. This is the index of the root.
For example, if the total number of nodes is 13 and they are numbered 0 - 12 then there are 7 nodes less than the root node so the root of the tree is in position 7.
You can then perform this same operation recursively with the left subtree and the right subtree to complete the construction of the tree.
The trick here is, of course, to be able to write the function that returns the root index given the number of values.
Let the Binary Search Tree we are creating be named BST.
Observe that a perfectly balanced, top to bottom, left to right tree, consists of a complete binary tree (call it CBT) plus one more level containing 0 or more nodes/values.
Let r be the root of BST (and CBT). Name the left subtree of CBT, LS, its right subtree RS and let N be the number of values in the full data set. Finally let |X| be the number of nodes in the tree, X and let lg be log2.
|CBT| = 2⌊lg(N)⌋-1
|LS| = 2⌊lg(N)⌋-1-1 = |RS|.
The number of nodes at the bottom level is N - |CBT|.
The number of nodes at the bottom level less than the root is N - |CBT| - the number of nodes greater than the root. So, in other words, The number of nodes at the bottom level less than the root is
min(N - |CBT|, 2⌊lg(N)⌋-1)
where the second parameter is the max possible number of nodes at the bottom level less than the root.
This is because the number of leaves in a complete binary tree is equal to one more than the number of internal nodes in the tree.
So, the total number of nodes less than the root is |LS| + min(N - |CBT|, 2⌊lg(N)⌋-1).
Sorting
- Read Chapter 9 - Sorting (We will return to chapter 8 later)
Do Exercises 6, 12, 14 and 15 on pages 520-521.
- Lab 3 - Write and test the function, KSmall, which finds the kth smallest number in a list
See a further description here
- Program #2 - QuickSort and MergeSort Comparison
MergeSort Description, QuickSort Description
Review the pseudocode for
Merge,
Recursive MergeSort and Nonrecursive Mergesort
- Lab - Random Walk in one dimension
A random walk in one dimension begins at the zero point on the number line. At each iteration the walker selects the number -1 or +1 at random. This number indicates whether he will take a step in the negative or positive direction respectively. The number is added to his current position which then determines his new position. For example if the three numbers -1, -1, +1 are chosen in succession then the walker moves from 0 to -1 to -2 to -1.
At the end of the random walk of N steps the distance back to the origin is calculated. This is quite simple in the one-dimensional random walk as a positive position indicates the distance and a negative position will be negated to indicate the position.
The program specified above constitutes a simulation of a random walk. There are two questions you should be able to answer based upon the simulation.
1. After N steps how far, on average, is the walker from the origin? Ideally, you would like a formula but a chart for significant values such as 1000, 10000, 100,000 and 1,000,000 will do.
2. Determine how consistent your results are. ideally, you would like to know what the standard deviation is for, say, 10000 runs of the program.
You may display the distance data in array form if you wish. You might run your program simulation several times and display in table form the distance from the origin for each run.
- Program #3 - Random Walk in two dimensions
A random walk in two dimensions is performed on a coordinate axis and begins at the (0,0) point. At each iteration the walker selects an angle at random. For example he might select an integer between 0 and 359 degrees or a number of radians between 0 and 6.27. After selecting the angle the walker moves 1 unit along the randomly-chosen angle. If the angle is 'a' then the new position (xnew, ynew) = (xold, yold) + (cos(a), sin(a)).
At the end of the random walk of N steps the distance back to the origin is calculated. This may be done using the Pythagorean Theorem. Thus, distance = sqrt(xnew*xnew + ynew*ynew).
The program specified above constitutes a simulation of a two dimensional random walk. There are two questions you should be able to answer based upon the simulation.
1. After N steps how far, on average, is the walker from the origin? Ideally, you would like a formula but a chart for significant values such as 1000, 10000, 100,000 and 1,000,000 will do.
2. Determine how consistent your results are. ideally, you would like to know what the standard deviation is for, say, 10000 runs of the program.
You may display the distance data in array form if you wish. You might run your program simulation several times and display in table form the distance from the origin for each run.
- Read Chapter 8, Graphs
Do exercises 2, 5, 9, 14, 15, 16, 18, 24, 25, 26b, 28 (see pg 403), 31, 35, 43 on pages 465 - 469.
- Lab 4 - Read, store and write a graph
Your program, named Lab6, should do the following.
1. Ask the user for the name of a graph file
2. Read the graph data from the graph file into internal graph structures (graph, nodes, edges)
3. Ask the user for an output file name
4. Write the graph from the internal graph structures to an output graph file
For information on graph files and internal structures see GraphAlgs.htm.
Your C++ program will create classes for nodes, edges and graphs. I.e., we will be creating ADTs for nodes, graphs and edges.
The first line of a graph file contains the number of nodes in the graph and a flag that indicates whether the graph is directed. It is followed by a list of edges (one edge per line) which is placed into our internal graph representation consisting of the following:
- a graph which is an array of nodes, together with the number of nodes and a "directed" flag.
- a node which consists of 3 components; a string of information about the node, the number of nodes in the adjacency list and a pointer to a list of incident edges.
- a linked list of edges for each node. Each edge consists of the edgeweight, the destination node and a pointer to the next edge in the list.
- Lab 5 and Program #3 - Implement Dijkstra's Algorithm
Dijkstra's Algorithm is a single source shortest path algorithms for weighted graphs with non-negative edge-weights.
You may find a pseudocode solution to Dijksta's Algorithm in GraphAlgs.htm along with pseudocode for the other functions you will need to read in and store a graph. Recall that this was done in our previous lab.
Your program should ask the user for the name of a file containing an undirected graph file (as described earlier) and read in and store the graph. Then the program should ask the user for the source node index and the destination node index. Finally it should write out the length of the path from the source node to the destination node followed by the actually path as a sequence of edges.
- Lab 6 - Implement the WFI all to all shortest path algorithm
- Generate a random directed or undirected graph with edgeweights written to a file.
- Read the graph file into an NXN array (an adjacency matrix) where N is the number of nodes in the graph
The (i,j) cell of the adjacency matrix will contain the cost of the best known path from i to j together with the predecessor vertex of j in the path. Initially this will be the weight of edge i,j if the edge exists in the graph. Initialize the diagonal to 0 and all other cells to INT_MAX (∞).
Probably the easiest way to do this is to initialize the entire adj. mx. to INT_MAX, then set the diagonal to all 0s and finally read the edges into the adj. mx.
- Write out the adjacency matrix of the input graph
- Execute WFI to find all shortest paths
- Write out the number of vertices and the time required to find all shortest paths
- Write the adjacency matrix as modified by WFI indicating all shortest paths.
The pseudocode for a "proper" way to generate a random graph is located in GraphAlgs.htm. However, we will be using a simpler method, in part, because this will generate graphs with separate components.
- Lab 7 and Program #4 - Implement Prim's Algorithm
You may find a pseudocode solution to Prim's Algorithm in GraphAlgs.htm along with pseudocode for the other functions you will need to read in and store a graph. Recall that these were done in a previous lab.
- Read Chapter 10 - Hashing
Do Exercises 1 and 14 on page 560.
- Program #7 - Hash Function Testing
The program that we did in lab (lab5) is located at cs-netlab-01.lynchburg.edu/courses/DataStII/Lab5.cpp
Your program #5 assignment on hashing is located in HashAssign2.htm
First read an explanation of collision resolution using chaining before attempting this assignment.
- Program #6 - Dictionary Word Look-up Using Hashing
In this assignment we will implement a dictionary word look-up using hashing. Your main menu will read in words from a dictionary file and store them in an array. You may use redirection of input at the DOS level and cin to get started on your program but you are required to use real C++ file input functions in the final version.
This program is a contest. The object of your program is to create an efficient hash function for words in the dictionary.
You will test your program/hash function on large text files.
- Ask the user for a file spec
- Open the file
- Initialize a word counter and a lookup counter
- while more words to read
- read in a word, increment the word counter
- use your hash function/algorithm to look up the word.
- increment the lookup counter by the number of lookups required
- end while loop
- Divide the lookup counter by the word counter
- report the result to the user.
You will be reporting the average number of lookups per word.
I will test your program on certain files I have and compare your results with your classmates' results. The smallest average number of lookups wins. This means it is absolutely critical that you count the number of words and the number of lookups properly.
Useful Resources
1. Here are a couple of hash functions that Marcus Wright found for us. They use the string function library so you may want to modify them for char* instead. You cannot be the "winner" if you use one of these but using one of these will not detract from your grade on this program.
2. I have a number of dictionaries located at ftp://cs-netlab-01.lynchburg.edu/ in the Dictionary folder. I suggest using either LongList.txt or ShortList.txt located in the WordPlay subfolder. LongList has almost 237,000 words and ShortList has almost 37,000 words. ShortList will suffice for our purposes.
3. You will find examples of C++ file I/O at
http://cs-netlab-01.lynchburg.edu/courses/IntroII/Programs/CopyFile/CopyFile.cpp
Top of this page
Home page 
