Data Identification




This section describes the data identified thus far for potential inclusion in the information system. It also describes the different types of data and the analysis tools applicable to each.

Data Relevant to Issues

Several issue areas, drawn from a variety of sources, have been identified as relevant to the El Paso/Ciudad Juarez region. Data relating to surface water and groundwater quality issues in the region are available from the Texas Natural Resource Conservation Commission, Texas Water Commission, El Paso Public Service Board, International Boundary and Water Commission, and a variety of other sources. More specifically, these data relate to relevant Mexican and United States policy and to chemical and biological threats from hazardous waste and wastewater disposal throughout the region. The data include documents, maps, and images, as well as chemical analysis results from major well fields in the region.

Data relating to general surface water quantity issues, and in particular to the Hueco Bolson and Mesilla Bolson regions, are available from the United States Bureau of Reclamation, El Paso Department of Planning, National Oceanic and Atmospheric Administration (NOAA), and other sources. Numerous documents, images, and statistics relate to United States and Mexican policy, water demands, and water resources from rivers, reservoirs, and precipitation in the region.

Data Types and Analysis

The ENFORMS will provide analysis and integration tools for a variety of data formats. Each tool is tied to the specific data format that it manipulates. The data formats identified for the ENFORMS thus far are text, image maps, and spreadsheet data.

Many of the identified documents and reports are in textual form. The ENFORMS will provide services to display these documents and to search them for key words or phrases. Many of the identified data are maps in image format. The ENFORMS will provide services to display the image maps. Various well field chemical analyses, trace element statistics, and well pumping rates are in spreadsheet format. The ENFORMS will provide access to standard spreadsheet services to analyze and integrate these data.
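The key word and phrase search described above could look roughly like the following Python sketch. It is an illustration only: the function name search_documents, the in-memory dictionary of documents, and the case-insensitive, whitespace-tolerant matching rule are all assumptions, not part of the ENFORMS design.

```python
from typing import Dict, List


def search_documents(documents: Dict[str, str], phrase: str) -> List[str]:
    """Return the names of documents whose text contains the phrase.

    Matching is case-insensitive and ignores differences in whitespace
    (an assumed rule; the design does not specify one).
    """
    # Normalize the phrase: lowercase and collapse runs of whitespace.
    needle = " ".join(phrase.lower().split())
    if not needle:
        return []
    hits = []
    for name, text in documents.items():
        haystack = " ".join(text.lower().split())
        if needle in haystack:
            hits.append(name)
    return sorted(hits)


# Hypothetical documents, used only to exercise the function.
docs = {
    "pumping.txt": "Well pumping  rates in the Hueco\nBolson region",
    "treaty.txt": "Text of a boundary treaty",
}
matches = search_documents(docs, "hueco bolson")
```

Here matches holds the names of the documents that mention the phrase, regardless of line breaks or letter case in the source text.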

Data Classification

  The ENFORMS is a highly configurable archiving system defined by three sets of metadata: a classification scheme specification, object descriptors, and object classifications. An object (also referred to as an item) is a specific piece of information that has a set of defined operations. For example, a text file has read and print operations, and a spreadsheet data file can be manipulated by a compatible spreadsheet package. Objects may also be executable programs such as a static animation utility with run and stop operations.

An object descriptor is one type of metadata that defines an object for the system. An object is made accessible through the system by adding its descriptor to the registration database, a text file containing these descriptors. Descriptors contain two types of information: system information for internal use by the ENFORMS, and user information that is presented to the user in various formats. An example registration is shown in Figure 1.

  
Figure 1: An example object descriptor.

All information is specified in the form name = value, such as KEY = intbound. Four basic pieces of information constitute a descriptor: a key, a type, a description, and a set of access descriptors. Each item in the archive must have a unique key, specified by the keyword KEY (system information). EXTTYPE specifies the object's type, and DESCRIPTION is a brief description of the object; both are user information. Access descriptors are listed in the section enclosed by a BEGIN-MANIPULATORS/END-MANIPULATORS pair in Figure 1. An access descriptor, delimited by an M-BEGIN/M-END pair, tells the system how the object can be used with a specific integrated software tool (manipulator). Each manipulator has a unique name, which is given to the ENFORMS by the keyword MANIPULATOR. Currently, three manipulators are provided by the ENFORMS: textdisp can display text files, imview can display images in a wide variety of formats, and app can launch arbitrary GUI-based applications. Each manipulator has specific parameters that it uses to perform its task.
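To make the descriptor format concrete, here is a minimal Python sketch of a parser for it. Because the figure itself is not reproduced here, the exact line layout is an assumption inferred from the keywords named above: one name = value pair per line, with the block delimiters each on their own line. The EXTTYPE and DESCRIPTION values and the FILE parameter in the sample are hypothetical.

```python
from typing import Any, Dict

# Hypothetical descriptor. Only the keywords KEY, EXTTYPE, DESCRIPTION,
# MANIPULATOR and the block delimiters come from the text; the values and
# the FILE parameter are invented for illustration.
SAMPLE = """
KEY = intbound
EXTTYPE = Image map
DESCRIPTION = Map of the international boundary
BEGIN-MANIPULATORS
M-BEGIN
MANIPULATOR = imview
FILE = maps/intbound.gif
M-END
END-MANIPULATORS
"""


def parse_descriptor(text: str) -> Dict[str, Any]:
    """Parse one object descriptor into a dictionary."""
    descriptor: Dict[str, Any] = {"manipulators": []}
    in_block = False   # inside BEGIN-MANIPULATORS/END-MANIPULATORS
    current = None     # access descriptor being read, if any
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "BEGIN-MANIPULATORS":
            in_block = True
        elif line == "END-MANIPULATORS":
            in_block = False
        elif line == "M-BEGIN":
            if not in_block:
                raise ValueError("M-BEGIN outside manipulator section")
            current = {}
        elif line == "M-END":
            if current is None:
                raise ValueError("M-END without M-BEGIN")
            descriptor["manipulators"].append(current)
            current = None
        else:
            name, sep, value = line.partition("=")
            if not sep:
                raise ValueError(f"expected name = value, got {line!r}")
            # Pairs inside M-BEGIN/M-END belong to the access descriptor.
            target = current if current is not None else descriptor
            target[name.strip()] = value.strip()
    missing = [k for k in ("KEY", "EXTTYPE", "DESCRIPTION") if k not in descriptor]
    if missing:
        raise ValueError(f"descriptor is missing {missing}")
    return descriptor


parsed = parse_descriptor(SAMPLE)
```

A registration database would hold many such descriptors, and a loader built on this sketch would also need to reject duplicate KEY values, since keys must be unique across the archive.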

The second type of metadata needed by the ENFORMS is a classification scheme that configures the system's query facility. The ENFORMS currently supports a simple hierarchical classification scheme called IPAR, whose hierarchy is limited to four tiers: Issue, Problem, Aspect (subtopic), and Refinement (hence the acronym). An instantiation of the classification scheme consists of a specific hierarchy, such as that shown in Figure 2(a). In this figure, the root node is the name of the classification scheme, and the children of the root node are the issues. A classification scheme is supplied to the ENFORMS in the form shown in Figure 2(b), where each branch in the hierarchy has an entry (since there is only one root node, it is not repeated for each classification).
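As a rough illustration of how such a scheme might be loaded, the Python sketch below reads one branch per line and builds a nested dictionary. The on-disk syntax is an assumption: tiers are taken to be separated by " > ", with the root omitted, by analogy with the classification notation used elsewhere in this section. The branch names other than Surface Water Quantity > Demands are hypothetical.

```python
from typing import Dict

TIERS = ("Issue", "Problem", "Aspect", "Refinement")  # the four IPAR tiers

# Hypothetical scheme entries, one branch per line, root omitted.
SAMPLE_SCHEME = """
Surface Water Quantity > Demands
Surface Water Quantity > Resources > Rivers
Groundwater Quality > Hazardous Waste
"""


def parse_scheme(text: str) -> Dict[str, dict]:
    """Build a nested dict: each key is a tier label, each value its children."""
    tree: Dict[str, dict] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        path = [part.strip() for part in line.split(">")]
        if len(path) > len(TIERS):
            raise ValueError(f"more than {len(TIERS)} tiers: {line!r}")
        if any(not part for part in path):
            raise ValueError(f"empty tier label in {line!r}")
        node = tree
        for label in path:
            # Create the child on first sight, then descend into it.
            node = node.setdefault(label, {})
    return tree


scheme = parse_scheme(SAMPLE_SCHEME)
issues = sorted(scheme)
```

The top-level keys of the resulting dictionary are the issues, that is, the children of the root node in the hierarchy.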

  

The third type of metadata needed is the set of object classifications, which define how the registered objects are located by queries. A query is a request for objects that "match" a given set of classifications. The object classification database is a text file that consists of entries of the form shown in Figure 3. Each classifier gives the key of the object being classified and a list of classifications. Adding a classification such as Surface Water Quantity > Demands to an object's classifier implicitly asserts that the object represents information related to the problem of Demands as it relates to the issue Surface Water Quantity. Note that the object classifications must be consistent with the system's overall classification scheme. For example, the classification Surface Water Quantity > Demands is consistent with the hierarchy defined in Figure 2(b) but Trade > Research is not.

  
Figure 3: Example of object classification.
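The consistency rule and the notion of a query can be sketched in Python as follows. Several things here are assumptions: the scheme is written inline as a nested dictionary with hypothetical branches, the object keys other than intbound are invented, and "match" is interpreted as "the object has a classification at or below the requested node", which the text does not define precisely.

```python
from typing import Dict, List, Tuple

# Hypothetical scheme as a nested dict (issue -> problem -> aspect -> ...).
SCHEME = {
    "Surface Water Quantity": {"Demands": {}, "Resources": {"Rivers": {}}},
    "Groundwater Quality": {"Hazardous Waste": {}},
}

# Hypothetical classification database: object key -> classifications.
CLASSIFIED: Dict[str, List[str]] = {
    "intbound": ["Surface Water Quantity > Resources > Rivers"],
    "demand93": ["Surface Water Quantity > Demands"],
    "wells": ["Groundwater Quality > Hazardous Waste"],
}


def split_path(classification: str) -> Tuple[str, ...]:
    """Turn 'A > B' into ('A', 'B')."""
    return tuple(part.strip() for part in classification.split(">"))


def is_consistent(scheme: dict, classification: str) -> bool:
    """True if every tier of the classification exists in the scheme."""
    node = scheme
    for label in split_path(classification):
        if label not in node:
            return False
        node = node[label]
    return True


def query(classified: Dict[str, List[str]], wanted: str) -> List[str]:
    """Keys of objects with a classification at or below `wanted`
    (an assumed reading of 'match')."""
    prefix = split_path(wanted)
    return sorted(
        key
        for key, classifications in classified.items()
        if any(split_path(c)[: len(prefix)] == prefix for c in classifications)
    )


surface_water_objects = query(CLASSIFIED, "Surface Water Quantity")
```

Under this reading, querying for an issue returns every object classified under any of its problems, aspects, or refinements, while a classifier entry such as Trade > Research would be rejected by is_consistent before it reaches the database.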






Dr. Betty Cheng
Mon May 1 15:46:33 EDT 1995