Homework Seven
Search Method for an Indexed File Class

The purpose of this assignment is to implement some of the Indexed File concepts discussed in class.


You will need to copy four files from Dannelly's home directory:
     universities.dat - the data file
     index1.dat - the top level index
     index2.dat - the level 2 (1:1) index
     hw07.cpp - the C++ code

The data file contains information on 1302 US universities. It is a binary file of these records:

      struct univ_record
         {
          char name[45];    // university name
          char state[3];    // 2-letter state code
          int publc;        // 1=public  2=private
          int avg_act;      // Average ACT score
          int fulltime;     // fulltime undergrad enrollment
          int parttime;     // part-time undergrad enrollment
          int instate;      // In-State Tuition
          int outstate;     // Out-of-State Tuition
          int fac_pct;      // percentage of faculty with a terminal degree
          float fac_ratio;  // student/faculty ratio
         };
The datafile is indexed by university name. Each entry in the top level points to a block of 100 entries in level two. Level two is 1:1 to the datafile. Index1.dat and index2.dat are binary files of these records:
      struct index_record
         {
          char name[45];    // university name
          int  index;       // record number
         };


The UniversityDataClass contains the following methods:
constructor - completed
FindPartial - completed
Print - completed
PrintNames - completed
Retrieve - completed
Find - not complete - your job

The constructor reads all the top-level index entries into an array. It also determines the number of records in the datafile and the number of entries in the top-level index. Just so you can see a little bit about the top-level index, I kept my debugging cout statements that print each top-level entry.
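
As an illustration of how a record count can be derived from a binary file of fixed-size records, here is a minimal sketch. The helper name count_records is hypothetical and is not taken from hw07.cpp; the provided constructor may do this differently.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>  // std::stringstream can stand in for a file in quick tests

// Hypothetical helper (not from hw07.cpp): number of fixed-size records
// in a binary stream, computed as stream length / record size.
inline std::size_t count_records(std::istream& in, std::size_t record_size) {
    assert(record_size > 0);
    in.clear();                          // drop any earlier eof/fail state
    in.seekg(0, std::ios::end);          // jump to the end of the stream
    const std::streamoff bytes = in.tellg();
    in.seekg(0, std::ios::beg);          // rewind for later reads
    if (bytes < 0) return 0;             // tellg failed
    return static_cast<std::size_t>(bytes) / record_size;
}
```

With a std::ifstream opened in binary mode, the same call would be made once for the datafile with sizeof(univ_record) and once for index1.dat with sizeof(index_record).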

The Print method prints all the fields for a given Record. It is passed an RRN, then retrieves that record from the datafile, and prints all ten fields.

PrintNames just iterates through the entire datafile and prints the names of all universities.

The Retrieve method is passed an RRN. It gets that record from the data file and passes back all ten fields as parameters.
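
Because every record has the same size, record number RRN starts at byte RRN * sizeof(record). The sketch below shows that direct-access read for any fixed-size record type. The name read_record is hypothetical; the Retrieve method in hw07.cpp may be written differently.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>  // std::stringstream can stand in for a file in quick tests

// Index record layout copied from the assignment text.
struct index_record {
    char name[45];  // university name
    int  index;     // record number
};

// Hypothetical helper (not from hw07.cpp): read the record with the given
// RRN from a binary stream of fixed-size records. Returns false if a full
// record could not be read, for example when the RRN is past the end.
template <typename Record>
bool read_record(std::istream& in, std::size_t rrn, Record& out) {
    in.clear();  // a previous failed read must not block this one
    in.seekg(static_cast<std::streamoff>(rrn * sizeof(Record)), std::ios::beg);
    in.read(reinterpret_cast<char*>(&out), sizeof(Record));
    return in.gcount() == static_cast<std::streamsize>(sizeof(Record));
}
```

The same template works for univ_record on the datafile and for index_record on index2.dat, provided the stream was opened in binary mode.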

Find is not complete. That is your assignment. For testing purposes Find always just returns the RRN of the last record.

FindPartial is passed a portion of a university name. It finds all university names that start with those letters. For example, if FindPartial is passed "Win", the method prints:
Wingate College
Winona State University
Winston-Salem State University
Winthrop University

Note that my FindPartial is very different from your Find. FindPartial is only looking for names. Hence, FindPartial never looks in the datafile; it looks only at the keys (names) in the level two index. Also, your Find is looking for one particular record. FindPartial is looking for zero or **many** records. So, FindPartial **must** sequentially search many index entries.

For example, assume we do a partial name search for "University". The top level index indicates "University of Nebraska at Omaha" is entry number 1100 in the level two index. But, we need all the "University"s before record 1100 (such as Univ of Alabama, and Univ of Alaska), as well as many universities after Nebraska. So, my FindPartial reads 100 records before and 100 records after the location found in the top level index. Which leads to the problems of searching before record 0 and after record 1302. ... You get the idea. Some of FindPartial may be useful to you, but most of it is wrong for what you need to do.
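
The boundary problem described above (running off either end of the level two index) can be handled by clamping the window. This is only a sketch of the idea; Window and clamped_window are hypothetical names and are not how FindPartial in hw07.cpp is necessarily written.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Half-open range [first, last) of level two entry numbers.
struct Window {
    std::size_t first;
    std::size_t last;
};

// Hypothetical helper: the entries from `radius` before `center` through
// `radius` after it, clipped so the range never goes below entry 0 or
// past the final entry (total - 1).
inline Window clamped_window(std::size_t center, std::size_t radius,
                             std::size_t total) {
    assert(center < total);
    const std::size_t first = (center > radius) ? center - radius : 0;
    const std::size_t last  = std::min(total, center + radius + 1);
    return Window{first, last};
}
```

For the example in the text, clamped_window(1100, 100, 1302) covers entries 1000 through 1200, while a center of 0 or 1300 yields a shorter window instead of an out-of-range read.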

Your Find method should be more efficient than my FindPartial method.
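
To make the efficiency difference concrete, here is a rough sketch of an exact-match lookup through a two-level index. It is not the required solution: it works on in-memory vectors rather than on index2.dat, the name find_exact is hypothetical, and it assumes each top-level entry holds the first name of its block and the level two entry number where that block starts. Check those assumptions against the debugging output from the constructor.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Index record layout copied from the assignment text.
struct index_record {
    char name[45];  // university name
    int  index;     // record number
};

const std::size_t kBlockSize = 100;  // level two entries per top-level entry

// Hypothetical sketch: returns the record number stored with `key` in
// level two, or -1 if the name is not present. Both vectors are assumed
// to be sorted by name in strcmp order.
int find_exact(const std::vector<index_record>& top,
               const std::vector<index_record>& level2,
               const char* key) {
    // Step 1: pick the last top-level entry whose name is <= key.
    int block_start = -1;
    for (const index_record& t : top) {
        if (std::strcmp(t.name, key) > 0) break;
        block_start = t.index;
    }
    if (block_start < 0) return -1;  // key sorts before every indexed name

    // Step 2: scan only that one block of level two.
    const std::size_t first = static_cast<std::size_t>(block_start);
    const std::size_t last  = std::min(level2.size(), first + kBlockSize);
    for (std::size_t i = first; i < last; ++i) {
        const int cmp = std::strcmp(level2[i].name, key);
        if (cmp == 0) return level2[i].index;  // exact match
        if (cmp > 0) break;                    // passed where key would be
    }
    return -1;
}
```

Under these assumptions an exact search touches at most one block of 100 level two entries, whereas FindPartial may read about 200. In the actual assignment, step 2 would read entries from index2.dat instead of a vector.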


Submit your single C++ file as an email attachment to dannellys by 2:00pm on Tuesday April 19.