ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿ ³ DB/C Newsletter ³ ³ June 1993 ³ ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ Editor's Notes This issue of the newsletter is dedicated to SQL and its impact on you. Like it or not, SQL is here to stay, and those of you who choose to ignore it will be left behind. I have used this same scare tactic when discussing Graphical User Interfaces. The respective impacts of both SQL and GUIs are major and you should be making plans to factor them into your future systems. Whether this future starts now or later, you need to be prepared for it. The first article is an overview of SQL, its history, terminology, current status and near future. The second article describes how DB/C interfaces to an SQL database. The third article compares SQL with traditional DATABUS access methods - sequential, random, ISAM and AIM. I am not an expert on SQL. I have attempted to be as factual as possible, but if you can provide additional information or find errors in these articles, please let me know. DNW SQL SQL stands for Structured Query Language. It is a term that was invented to describe a language of statements that access a relational database. The concept of a relational database was invented by E.F. Codd, an IBM researcher, in 1970. The most basic concept in a relational database is that of a table of data. The table consists of rows and columns that are not in any particular order, and that each row is different from each other row. The mathematical term to describe an unordered set of tuples (i.e. rows) is a relation. Thus the term relational database. In the 1970's, IBM created and tested an experimental relational database system called System/R. When this concept proved to be commercially viable, IBM decided to make an actual product. However, ORACLE Corporation was the first company to bring an SQL database product to the market. They introduced ORACLE in 1979. IBM and Ingres soon followed with actual products. Today there are over a dozen major software vendors selling SQL based relational database systems. The first ANSI Standard for SQL was adopted in 1986. It was incomplete and was very weak in its specifications. But by having an ANSI Standard, SQL quickly gained legitimacy, even though performance of most products was poor and portability was very poor. SQL2 was adopted as an ANSI Standard in 1992, and for the first time, portability was possible. Many of today's products conform partially with SQL2, and those that don't soon will. Another standard, SQL3, is currently under development. It contains many features that are necessary for creating truly robust, full-featured data processing systems. SQL3 will probably be adopted in 1995 or 1996. In a relational database system, data is stored in tables. Each table consists of multiple rows and columns. A table is similar to a spreadsheet like those used by Lotus 1-2-3 or Excel. In data processing terms, a table is just a file, a row is a record, and a column is a column of data fields. All rows of a table have of the same columns. In data processing terms, each record has the same layout. Thus a table in a relational database system is a very straightforward concept, and it fits well into traditional data processing system design. The operations on rows, columns and tables are what differentiate relational database systems from other database systems or traditional file management systems. Some basic operations on tables are: CREATE TABLE, ALTER TABLE, DROP TABLE, INSERT and DELETE. CREATE TABLE defines the data types for each column of a table and creates an empty table. ALTER TABLE changes the columns of a table. DROP TABLE destroys a table. INSERT adds a row into a table. DELETE deletes a row from a table. The SELECT statement allows data to be retrieved. SELECT can be simple or very complex. A simple SELECT might just return a single row of a table. A complex select might return a few columns from multiple tables that meet a specific search criteria, and they are returned in a particular order. For example: SELECT CUSTOMER.NAME, ORDER.NUMBER FROM CUSTOMER, ORDER WHERE ORDER.DATE > 01-01-93 ORDER BY ORDER.NUMBER This SELECT statement will return customer names and order numbers in order by order number for all orders with an order date greater than January 1, 1993. This SELECT statement does a join operation on the two tables. The join operation creates a table known as the product or result table. The contents of the result table are the rows that make up the result of the query. A FETCH statement is used to retrieve each row one by one. An UPDATE statement may be used to update the currently FETCHed row. Additionally, an UPDATE statement may be used independently of a SELECT statement to update one or more rows directly. The SELECT statement may be used to accomplish other queries. Here are some examples: SELECT SUM(VALUE) FROM ORDERS SELECT COUNT(NAME) FROM CUSTOMERS SELECT CUSTOMER.NAME, AVG(ORDER.VALUE) FROM CUSTOMERS, ORDERS GROUP BY CUSTOMER.NUMBER The first example returns the total value of all orders. The second example returns the number of customers in the customer table. The third example returns the average value of the orders for each customer - that is, there will be one row for each customer containing the customer name and the average order value for that customer. Another primary feature of an SQL relational database is the transaction processing feature implemented by the COMMIT WORK and ROLLBACK WORK statements. Each transaction is implicitly started by any SQL statement. A transaction is completed successfully when the COMMIT WORK statement is executed. When this happens, all changes to the database are actually made. A transaction may be aborted with the ROLLBACK WORK statement. When a transaction is aborted, any changes made since the beginning of the transaction are ignored and the database is restored to the state it was in before the transaction started. The relational database management system is responsible for making sure that when multiple users updating the database simultaneously, that all data integrity and correctness is maintained. In many respects, SQL is the state of the art for data management today. SQL IN DB/C PROGRAMS The interface between DB/C and SQL databases is accomplished with these verbs: SQLEXEC, SQLCODE and SQLMSG. The SQLEXEC statement causes all SQL statements to be executed. The SQLCODE statement returns the return code associated with the most recently executed SQLEXEC statement. The SQLMSG statement returns the English language message of the return code associated with the most recently executed SQLEXEC statement. For example: CUSTNUM INIT "12345" RETVALUE NUM 6 SQLEXEC "DELETE CUSTOMER WHERE CUSTOMER.NUMBER=:1" FROM CUSTNUM SQLCODE RETVALUE DISPLAY RETVALUE These statements cause the customer or customers with a customer number equal to 12345 to be deleted from the CUSTOMERS table. The result code will be displayed by the DISPLAY statement. Here is another example: NUM CHAR 10 SQLEXEC "SELECT NAME FROM CUSTOMER WHERE NUMBER = 123 INTO :1" INTO NUM This statement moves the customer name associated with customer number 123 into the variable NUM. Here is an example that shows retrieval of multiple records: NAME CHAR 30 SQLEXEC "SELECT USING C1 NAME FROM CUSTOMER WHERE STATE = 'IL'" LOOP SQLEXEC "FETCH USING C1" INTO NAME BREAK IF OVER DISPLAY NAME REPEAT This example displays the names of all customers whose state is Illinois. The USING C1 clause is the cursor clause and C1 is the cursor. A cursor is used to associate the FETCH statement with the right SELECT statement. The BREAK statement could be replaced by using the SQL return code value like this: NAME CHAR 30 RETVALUE NUM 5 SQLEXEC "SELECT USING C1 NAME FROM CUSTOMER WHERE STATE = 'IL'" LOOP SQLEXEC "FETCH USING C1" INTO NAME SQLCODE RETVALUE BREAK IF (RETVALUE = 100) DISPLAY NAME REPEAT Here is an example of an UPDATE: NAME CHAR 30 SQLEXEC "SELECT USING C1 NAME FROM CUSTOMER WHERE STATE = 'IL'" LOOP SQLEXEC "FETCH USING C1" INTO NAME BREAK IF OVER DISPLAY "UPDATING ", NAME SQLEXEC "UPDATE CUSTOMER SET STATUS = 'A' WHERE CURRENT OF C1" REPEAT COMMIT WORK These are typical examples of the use of DB/C and SQL. COMPARISON OF SQL AND DATABUS FILE MANAGEMENT In some respects, DB/C file I/O verbs like READ, WRITE, UPDATE, etc. are similar to their SQL counterparts. If you are attempting to use existing DB/C code to access an SQL database, or if you want to have your code work with both SQL and DB/C data files, you might be tempted to use the same approach to write the program logic. This is not a good idea. For example, lets assume we need to put all orders for a given customer on hold. We need to change the status of all orders for the customer to 'H'. Here is what the code might look like when using DB/C files: CUSTNUM CHAR 5 ORDER LIST ORDERNUM CHAR 5 ORDERCUST CHAR 5 ORDERSTAT CHAR 1 LISTEND KEY1 CHAR 5 ORDER1 IFILE KEY=5, FIX=11 . KEY IS ORDERNUM KEY2 CHAR 10 ORDER2 IFILE KEY=10, FIX=11 . KEY IS ORDERCUST, ORDERNUM OPEN ORDER2, "ORDER2" KEYIN "ENTER CUSTOMER NUMBER TO PUT ON HOLD:", *JR, CUSTNUM MOVE (CUSTNUM + " ") TO KEY2 READ ORDER2, KEY2;; LOOP READKS ORDER; ORDER.RECORD BREAK IF OVER MOVE "H" TO ORDERSTAT UPDATE ORDER; ORDER.RECORD REPEAT A straightforward conversion to SQL might look like this: CUSTNUM CHAR 5 ORDERSTAT CHAR 1 KEYIN "ENTER CUSTOMER NUMBER TO PUT ON HOLD:", *JR, CUSTNUM SQLEXEC "SELECT USING C1 ORDER.STATUS FROM ORDER WHERE ORDER.CUST = :1": FROM CUSTNUM LOOP SQLEXEC "FETCH USING C1" INTO ORDERSTAT BREAK IF OVER MOVE "H" TO ORDERSTAT SQLEXEC "UPDATE WHERE CURRENT OF C1", FROM ORDERSTAT REPEAT Unfortunately, this code is not very efficient. Here is a better way to write the same logic: CUSTNUM CHAR 5 KEYIN "ENTER CUSTOMER NUMBER TO PUT ON HOLD:", *JR, CUSTNUM SQLEXEC "UPDATE ORDER SET ORDER.STATUS = 'H' WHERE ORDER.CUST = :1": FROM CUSTNUM Other problems arise because of the transaction processing features of SQL (COMMIT/ROLLBACK). One situation is the when the program displays a scrolling window that allows access to many or all of the records in a file. With DB/C file systems, this is easily accomplished with READKS and READKP, or if AIM is used, with READKG and READGP. The problem with SQL is that when the initial SELECT selects many or all of the rows in a table, all of those rows are locked from use by other users. There have been attempts to deal with the browse issue, but implementations of solutions vary. One of the major advantages of DB/C is AIM. SQL has floating search key capabilities, but on large files, DB/C performance is typically many times faster. SQL floating key query performance is highly dependent on table size and on memory available for caching. In addition, some SQL products are much faster than others on floating key lookups. A major advantage of SQL is for summary type queries on a network. The example of counting the rows in a table is a good one. On a client-server network, SQL does not have to move any data, except for the result, across the network. DB/C has to move every record across the network. In reality, summary type queries are not heavily used in data processing systems, although they may be used frequently in an ad-hoc reporting environment. SQL is notorious for requiring huge resources. When you compare DB/C and SQL performance on a small system (e.g. 500K memory per user, 30 users on a 25 Mhz 486), DB/C will provide much faster response times. But when you provide larger amounts of memory and CPU cycles (e.g. 1.5 MB memory per user, 30 users on a 33/66 Mhz 486), SQL has the potential to outperform DB/C. The question comes down to this: If the price to performance ratio is a major consideration, don't even think about SQL. However, if politics or shared access with other software systems is important, SQL may be the way to go. In summary, SQL is the way the world is moving, but there is going to be a large cost for everybody to get there. DB/C Class Schedule The next DB/C class is scheduled for June 7-10, 1993. It is held in the Oak Brook, Illinois office of Subject, Wills & Company. For more information, contact Judi Tamkevic at (708) 572-0240. Another class is tentatively scheduled for September.