Project 4: Data Definition Language

Description

This project will involve adding more Data Definition Language (DDL) constructs, mostly corresponding to dropping tables and transactions.

Files in Project4/

  • Test_Suite/: Folder containing test SQL files. Each file contains a number of SQL statements.
  • project.py: You need to supply this file. You should probably copy the project.py file from Project3/ or from Project3_solution/.
  • cli.py: This file imports your project.py. It takes 2 required arguments (a test_sql_file and an output_file) and one optional argument (--sqlite). When executed with ./cli.py test.sql output.txt, it loads the test.sql file, passes it to the database (your project.py file) and writes a report to output.txt. If you give it the optional argument (./cli.py --sqlite test.sql output.txt), it passes the SQL statements to sqlite instead. You can use this to see the correct output.
  • run_tests.py: Run this file ("./run_tests.py") to run every test in Test_Suite with cli.py and get your current grade.
  • README.txt: You need to create this file, and it should contain your name, MSU email, feedback, and any sources beyond this course you used.

What you need to do

All of the code you write for this project must be in "project.py", you may not modify any of the existing tests or test runners.

"cli.py" imports "project.py" and expects to find a "connect" function with one parameter. It passes a filename (specifically ":memory:"), which isn't used in this project. The "connect" function should return an object that has two methods "execute" and "close". For each SQL statement in a test file, "cli.py" will pass that SQL statement as a string to the "execute" method. The method should return an empty list, unless a select statement was executed that yielded rows. In that case, a list of tuples with each denoting a row, should be returned. The "close" method doesn't do anything, yet.

There is no need to write any database output to any files. Persistence will be covered in a later project. Each test will be done on a clean, empty database.

Your program is allowed to use any built in modules of python, with the exception of sqlite3. sqlite3 is the reference implementation for this project, your output will be compared against it for correctness. Importing the sqlite3 module in project.py will be considered academic dishonesty for these projects.

You may (and should) write additional functions and classes in "project.py" to perform the needed actions.

Modification to test file contents

1: CREATE TABLE students (name TEXT, grade REAL);
2: INSERT INTO students VALUES ('Josh', 2.4);
1: CREATE TABLE colors (r INT, g INT, b INT);
2: INSERT INTO colors VALUES (1, 2, 3);
1: SELECT * FROM colors ORDER BY r;
2: SELECT * FROM students ORDER BY name;

Above is an example test file. Each line is legal SQL, except for the very beginning of each line. The value before the colon is an identifier for a connection. In this case, there are two connections (named "1" and "2"). The rest of the line is executed with the associated connection.

Raising Exceptions

There are times when your code should raise an exception (see https://docs.python.org/3.4/tutorial/errors.html). The type and message of the exception don't matter (and will never matter in any project). But, not raising an exception when expected will fail the test case.

Test Categories

Regression

These tests should pass if you completed Project 3. They should pass automatically at the start of the project. If these fail, you've made a regression (you have broken previously functional code).

Connections

These tests check that your code accepts multiple connections to the same database (only one database per test in this project). The instructor's Project 3 solution already passes these tests, but your solution may not without some modification. By default, statements autocommit, so no need to handle transaction relevant code yet.

Create_Drop_Table

You need to implement the CREATE and DROP table commands. The CREATE TABLE command you have already implemented, but there's one minor addition. Your code should raise an exception if a CREATE TABLE statement attempts to make a table that already exists.

If the CREATE TABLE command includes "IF NOT EXISTS" (example: CREATE TABLE IF NOT EXISTS students ...), then nothing should happen if the table already exists. You also need to implement the DROP TABLE statement (which has an optional "IF EXISTS" clause to not raise an error if the table doesn't exist). Examples:

DROP TABLE students;
DROP TABLE IF EXISTS students;

Transactions

  • Transactions begin with a BEGIN TRANSACTION statement.
  • Transactions end with COMMIT TRANSACTION (ROLLBACK tests are in a different category).
  • The only statements the tests will happen within transactions are SELECT, INSERT, UPDATE and DELETE (the Data Manipulation Language statements).
  • If a statement is before a BEGIN statement or after a COMMIT statement, that statement is in "autocommit mode" meaning that any locks needed are acquired before the statement runs, and released when the statement is complete. This behavior is disabled after a BEGIN statement (manual transaction mode).
  • A shared lock is needed to read (SELECT). Shared locks block exclusive locks.
  • A reserved lock is needed to write (INSERT, UPDATE, and DELETE). Reserved locks block exclusive locks and other reserved locks.
  • An exclusive lock is needed to commit a write. Specifically, if the transaction has a reserved lock, it must be promoted to an exclusive lock upon commit. Exclusive locks block all other locks.
  • Locks are released upon commit.

Remember: sqlite only locks the entire database, not individual tables or rows.

Recommended Implementation

The way we recommend you handle transactions is make a copy of the database's tables when the transactions starts and do your reads and writes to the copy. Upon commit, copy the modified tables to the "real" database. If you implement the locking correctly, no other transaction should generate conflicts.

It may seem inefficient to copy the entire database for each transaction, and it is. The way sqlite actually does it is keeping private copies of all the pages of memory it writes to (and only writing to disk on commit), but that is too difficult to do from Python. So do the inefficient, but easy thing for this project, copy all the tables for each transaction.

Transaction_Modes

Transactions can occur in one of three modes (BEGIN "mode" TRANSACTION:

  • DEFERRED = locks are acquired when needed by a statement in the transaction
  • IMMEDIATE = a reserved lock is acquired at start of the transaction and an exclusive lock is acquired when needed
  • EXCLUSIVE = an exclusive lock is acquired at start of transaction

Transactions that don't specify a mode default to DEFERRED.

Isolation

Writes are only visible to other connections upon commit (before then the transactions need to keep private copies of their writes).

Rollback

The second way a transaction can end is with a rollback. All the changes made by the transaction needs to be reversed, and all locks released.

If you implement full isolation between transactions by only modifying a local copy of the database in each transaction, rollback is very simple, just don't copy your modified tables to the "real" database, just get rid of them.