Project 5: Other Database Constructs

Description

This project will involve adding more features of the SQL language to your DBMS.

Files in Project5/

  • Test_Suite/: Folder containing test SQL files. Each file contains a number of SQL statements.
  • project.py: You need to supply this file. You should probably copy the project.py file from Project3/ or from Project3_solution/.
  • cli.py: This file imports your project.py. It takes 2 required arguments (a test_sql_file and an output_file) and one optional argument (--sqlite). When executed with ./cli.py test.sql output.txt, it loads the test.sql file, passes it to the database (your project.py file) and writes a report to output.txt. If you give it the optional argument (./cli.py --sqlite test.sql output.txt), it passes the SQL statements to sqlite instead. You can use this to see the correct output.
  • run_tests.py: Run this file ("./run_tests.py") to run every test in Test_Suite with cli.py and get your current grade.
  • README.txt: You need to create this file, and it should contain your name, MSU email, feedback, and any sources beyond this course you used.

What you need to do

All of the code you write for this project must be in "project.py", you may not modify any of the existing tests or test runners.

"cli.py" imports "project.py" and expects to find a "connect" function with one parameter. It passes a filename (specifically ":memory:"), which isn't used in this project. The "connect" function should return an object that has two methods "execute" and "close". For each SQL statement in a test file, "cli.py" will pass that SQL statement as a string to the "execute" method. The method should return an empty list, unless a select statement was executed that yielded rows. In that case, a list of tuples with each denoting a row, should be returned. The "close" method doesn't do anything, yet.

There is no need to write any database output to any files. Persistence will be covered in a later project. Each test will be done on a clean, empty database.

Your program is allowed to use any built in modules of python, with the exception of sqlite3. sqlite3 is the reference implementation for this project, your output will be compared against it for correctness. Importing the sqlite3 module in project.py will be considered academic dishonesty for these projects.

You may (and should) write additional functions and classes in "project.py" to perform the needed actions.

Test Categories

Regression

These tests should pass if you completed Project 4. They should pass automatically at the start of the project. If these fail, you've made a regression (you have broken previously functional code).

View

Views are read-only named SELECT statements. They act like a table, but if any of the tables the underlying SELECT statements it draws from changes, the results returned are changed. Example:

1: CREATE TABLE students (name TEXT, grade REAL);
1: CREATE VIEW stu_view AS SELECT * FROM students WHERE grade > 3.0 ORDER BY name;
1: SELECT name FROM stu_view ORDER BY grade;

For simplicity, none of the columns of the view's underlying tables will share names.

Parameterized Queries

Parameterized queries make it easy to reuse queries with wildcard (?) values pulled from the variables in the programming language interface. You need to implement a executemany method on your Connection class that accepts two arguments: a SQL statement with wildcard placeholders, and list of tuples with the values that should be slotted in.

The test cases look a little strange to support this type of input. Example:

1: CREATE TABLE students (name TEXT, grade REAL, class INTEGER);
Parameters: [('Josh', 3.5), ('Tyler', 2.5), ('Grant', 3.0)]
1: INSERT INTO students VALUES (?, ?, 480);
1: SELECT * FROM students ORDER BY grade;

The above gets essentially transformed into the following python code:

import project
conn = project("test.db")
conn.execute("CREATE TABLE students (name TEXT, grade REAL, class INTEGER);")
conn.executemany("INSERT INTO students VALUES (?, ?, 480);", [('Josh', 3.5), ('Tyler', 2.5), ('Grant', 3.0)])
conn.execute("SELECT * FROM students ORDER BY grade;")

Custom Collations

Custom collation functions allow the user to supply a function that specifies the way a column should be ordered. Defines a function that takes two arguments (left and right): and the function returns -1 if left is less than right, 0 if they are even and 1 otherwise. Then the create_collation method of the connection is called with two arguments (the name of the collation within SQL and the function itself). You need to write the create_collation and update your SELECT-related code to accept such collations.

Example test case:

1: CREATE TABLE students (name TEXT, grade REAL, class INTEGER);
Parameters: [('Josh', 3.5, 480), ('Tyler', 2.5, 480), ('Tosh', 4.5, 450), ('Losh', 3.2, 450), ('Grant', 3.3, 480), ('Emily', 2.25, 450), ('James', 2.25, 450)]
1: INSERT INTO students VALUES (?, ?, ?);
1: SELECT * FROM students ORDER BY class, name;
gather:def collate_ignore_first_letter(string1, string2):
gather:    string1 = string1[1:]
gather:    string2 = string2[1:]
gather:    if string1 == string2:
gather:        return 0
gather:    if string1 < string2:
gather:        return -1
gather:    else:
gather:        return 1
exec:
create_collate skip collate_ignore_first_letter 1:
1: SELECT * FROM students ORDER BY name COLLATE skip, grade;

Gets turned into essentially this:

import project
conn = project("test.db")
conn.execute("CREATE TABLE students (name TEXT, grade REAL, class INTEGER);")
conn.executemany("INSERT INTO students VALUES (?, ?, 480);", [('Josh', 3.5, 480), ('Tyler', 2.5, 480), ('Tosh', 4.5, 450), ('Losh', 3.2, 450), ('Grant', 3.3, 480), ('Emily', 2.25, 450), ('James', 2.25, 450)])
conn.execute("SELECT * FROM students ORDER BY class, name;")
def collate_ignore_first_letter(string1, string2):
    string1 = string1[1:]
    string2 = string2[1:]
    if string1 == string2:
        return 0
    if string1 < string2:
        return -1
    else:
        return 1
conn.create_collation("skip", collate_ignore_first_letter)
conn.execute("SELECT * FROM students ORDER BY name COLLATE skip, grade;")

DESC (Descending Ordering)

The DESC keyword indicates that the collation should be reversed, so that values are ordered in a descending manner instead of the default (ascending). Be aware that DESC can be used with custom collations like so:

SELECT * FROM students ORDER BY name COLLATE skip DESC, grade DESC;

Default Values

It is often useful to supply non-NULL default values for specific columns. For instance, if no grade is supplied for an insert statement, I may want to assume a grade of 0.0. The way you specify the default value for a column is in the CREATE TABLE statement:

CREATE TABLE students (name TEXT, grade REAL DEFAULT 0.0, id TEXT);

Additionally, if you want to insert a row with entirely default values, you can do so like:

INSERT INTO students DEFAULT VALUES;