Special Cloudscape Programming
Programming VTIs
Cloudscape provides a construct for presenting external data (from a flat file, from another vendor's database, from a news feed) as a virtual table to an SQL-J SELECT statement. Access to external data presented in this way constitutes an external virtual table that can be used within the scope of an SQL-J statement in the same way as other virtual or derived tables (see ExternalVirtualTable in the Cloudscape Reference Manual).
An external virtual table is created by an instance of any class that fulfills the requirements of Cloudscape's virtual table interface (VTI).
This section discusses how to create classes of that type.
Requirements and Options for VTI Classes
To qualify as a VTI class, a Java class must fulfill the VTI requirements described in the rest of this section.
Optionally, the class can provide costing information to the Cloudscape optimizer or prevent Cloudscape from instantiating it more than once. See Providing Costing Information.
Implementing ResultSet
A VTI class implements java.sql.ResultSet. A ResultSet consists of rows and columns of data, the methods for stepping through them, and a method for getting the ResultSetMetaData (getMetaData()).
At compile time, Cloudscape needs only the metadata. Depending on how you program the class (see Providing the ResultSetMetaData at Compile Time), Cloudscape may call only a static method or the constructor at compile time, so it is typically not necessary to do all the work required for a ResultSet (stepping through all the rows) in the constructor. That work can be done in the next() method. (This may not be possible in some cases.)
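The deferral described above can be sketched as follows. This is a minimal illustration and not Cloudscape code: the class name LazyLineSource and its inline sample data are hypothetical, and the class only mimics the shape of next(), getString(), and close() instead of implementing all of java.sql.ResultSet, so that the example stays short.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.sql.SQLException;

// Hypothetical sketch: mimics the next()/getString()/close() shape of a VTI
// without implementing the full java.sql.ResultSet interface.
class LazyLineSource {
    private final String data;      // stands in for an external file or feed
    private BufferedReader reader;  // opened lazily, on the first next()
    private String currentLine;

    // The constructor only records its argument. It does no I/O, so
    // instantiating the class at compile time stays cheap.
    LazyLineSource(String data) {
        this.data = data;
    }

    boolean isOpen() {
        return reader != null;
    }

    // The expensive work (opening and reading the source) happens here.
    boolean next() throws SQLException {
        try {
            if (reader == null) {
                reader = new BufferedReader(new StringReader(data == null ? "" : data));
            }
            currentLine = reader.readLine();
            return currentLine != null;
        } catch (IOException e) {
            throw new SQLException(e.getMessage());
        }
    }

    // Single-column source: column 1 is the current line.
    String getString(int column) throws SQLException {
        if (column != 1 || currentLine == null) {
            throw new SQLException("getString: no current row or bad column " + column);
        }
        return currentLine;
    }

    void close() throws SQLException {
        try {
            if (reader != null) {
                reader.close();
            }
        } catch (IOException e) {
            throw new SQLException(e.getMessage());
        }
    }
}
```

With this shape, constructing the object touches no external resource; the source is opened only when the first row is requested.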
ResultSet and ResultSetMetaData Methods Used by Cloudscape
Both the java.sql.ResultSet and java.sql.ResultSetMetaData interfaces have many methods, but Cloudscape calls only a small subset of them within an SQL-J statement that uses ExternalVirtualTable. The methods listed below all come from the 1.2 version of JDBC; Cloudscape does not call any methods from the 2.0 version of JDBC.
NOTE: This list is subject to change.
- ResultSet
- ResultSetMetaData
  - getColumnCount()
  - isNullable(int column)
  - getColumnDisplaySize(int column)
    (This method call is meaningful only for Types that can vary in size: Types.BIT, Types.NUMERIC, Types.DECIMAL, Types.CHAR, Types.VARCHAR, Types.LONGVARCHAR, Types.BINARY, Types.VARBINARY, Types.LONGVARBINARY, and Types.OTHER.)
  - getColumnName(int column)
  - getPrecision(int column) (for NUMERIC or DECIMAL columns)
  - getScale(int column)
  - getColumnType(int column)
- getXXX method calls
The getXXX method called on a column returned by a VTI is determined by the JDBC type of the column. Table 5-1 shows which getXXX method is called for each type.
If the statement selects data from the ExternalVirtualTable in order to insert it into a table, Cloudscape does not choose the getXXX method based on the type of the target column. It calls the method that matches the VTI column's type, and Cloudscape handles the conversion.
In addition, there is nothing to prevent application code from using an entirely different getXXX call on the column when stepping through the ResultSet returned by Cloudscape.
For example, a VTI could return a column of JDBC type BIGINT:
SELECT cn FROM
NEW ExternalQuery(
'jdbc:cloudscape:toursDB',
'SELECT CONGLOMERATENUMBER FROM SYS.SYSCONGLOMERATES') AS EQ
(cn)
When evaluating this statement, Cloudscape calls getLong on the ResultSet returned by the VTI to create the ResultSet returned to the user when the query is executed.
However, the user's application could call getInt on that column instead:
// s is an open java.sql.Statement
ResultSet rs = s.executeQuery(
    "SELECT * FROM NEW ExternalQuery(" +
    "'jdbc:cloudscape:toursDB', " +
    "'SELECT CONGLOMERATENUMBER FROM SYS.SYSCONGLOMERATES') " +
    "AS EQ");
while (rs.next()) {
    // This getInt is called on the ResultSet that Cloudscape returns
    System.out.println(rs.getInt(1));
}
In this case, the getInt method is the one implemented by Cloudscape, not by your VTI class. Cloudscape only calls the getLong method on the VTI.
Table 5-1 getXXX Methods Called for JDBC Types
ExternalVirtualTable's Column JDBC Data Type | getXXX Method Called
BIGINT        | getLong
BINARY        | getBytes
BIT           | getBoolean
CHAR          | getString
DATE          | getDate
DECIMAL       | getBigDecimal
DOUBLE        | getDouble
FLOAT         | getDouble
INTEGER       | getInt
LONGVARBINARY | getBytes
LONGVARCHAR   | getString
NUMERIC       | getBigDecimal
OTHER         | getObject
REAL          | getFloat
SMALLINT      | getShort
TIME          | getTime
TIMESTAMP     | getTimestamp
TINYINT       | getByte
VARBINARY     | getBytes
VARCHAR       | getString
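The mapping in the table of getXXX methods can also be restated as code, which may help when deciding which getXXX methods a VTI has to implement for the column types it reports. The helper below is a hypothetical sketch (the class and method names are not part of Cloudscape); it uses only the standard java.sql.Types constants.

```java
import java.sql.Types;

// Hypothetical helper: restates the getXXX table as code.
// It is not part of Cloudscape.
class GetterForJdbcType {
    // Returns the name of the getXXX method listed for the given JDBC type.
    static String getterFor(int jdbcType) {
        switch (jdbcType) {
            case Types.BIGINT:        return "getLong";
            case Types.BINARY:
            case Types.VARBINARY:
            case Types.LONGVARBINARY: return "getBytes";
            case Types.BIT:           return "getBoolean";
            case Types.CHAR:
            case Types.VARCHAR:
            case Types.LONGVARCHAR:   return "getString";
            case Types.DATE:          return "getDate";
            case Types.DECIMAL:
            case Types.NUMERIC:       return "getBigDecimal";
            case Types.DOUBLE:
            case Types.FLOAT:         return "getDouble";
            case Types.INTEGER:       return "getInt";
            case Types.OTHER:         return "getObject";
            case Types.REAL:          return "getFloat";
            case Types.SMALLINT:      return "getShort";
            case Types.TIME:          return "getTime";
            case Types.TIMESTAMP:     return "getTimestamp";
            case Types.TINYINT:       return "getByte";
            default:
                throw new IllegalArgumentException("Type not listed in the table: " + jdbcType);
        }
    }

    public static void main(String[] args) {
        System.out.println(getterFor(Types.BIGINT)); // prints "getLong"
    }
}
```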
Providing the ResultSetMetaData at Compile Time
Cloudscape needs the ResultSetMetaData for the ResultSet returned by the VTI at compile time. By default, Cloudscape instantiates the class in order to call the non-static getResultSetMetaData method on the object. Instantiating the class may be expensive, so for simple VTI classes that can determine the shape of the ResultSetMetaData based on the parameters, you can provide a static version of the method which allows you to avoid instantiation at compile time. To do this you must:
- Provide the static method public static java.sql.ResultSetMetaData getResultSetMetaData(parameterList), for which the parameterList signature is the same as for the constructor for that object that appears in the SQL-J statement.
The ResultSetMetaData object returned by this method is used by Cloudscape in compiling the SQL-J statement that instantiates the class. It is up to the implementer to determine what work needs to be done in order to return the correct ResultSetMetaData. The contents of the ResultSetMetaData may need to vary with the parameters, or they may always be the same, no matter what the parameters are.
One example of when this method is a good choice is when your class always returns a ResultSet of the same shape given the same parameter signature. In this case, it is a lot less expensive for the Cloudscape compiler to get the ResultSetMetaData using the static method than to instantiate the class.
NOTE: If your VTI class provides costing information (see Providing Costing Information), Cloudscape always instantiates the class at compile time to get that costing information, so the static getResultSetMetaData is ignored.
NOTE: In a client/server environment, the constructor and static method are called from within Cloudscape, so the resources needed by these methods must be available to Cloudscape, not just the client application.
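A static getResultSetMetaData method for a class whose shape never changes might look like the sketch below. Everything specific in it is an assumption made for illustration: the class name FixedShapeVti, the single VARCHAR column called NAME, and its display size of 128. To keep the example short, the metadata object is built with java.lang.reflect.Proxy instead of a hand-written class; a real VTI would more likely extend one of the Cloudscape metadata templates.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.sql.Types;

// Hypothetical VTI skeleton. Only the static metadata method is shown; a real
// VTI class would also implement java.sql.ResultSet.
class FixedShapeVti {
    // Constructor that would appear in the SQL-J statement.
    // It is not needed to obtain the metadata.
    FixedShapeVti(String fileName) {
    }

    // Same parameter list as the constructor. The shape never depends on
    // fileName here, so a null argument at compile time is harmless.
    public static ResultSetMetaData getResultSetMetaData(String fileName) {
        InvocationHandler handler = (proxy, method, args) -> {
            switch (method.getName()) {
                case "getColumnCount":       return 1;
                case "getColumnName":        return "NAME";
                case "getColumnType":        return Types.VARCHAR;
                case "getColumnDisplaySize": return 128;
                case "isNullable":           return ResultSetMetaData.columnNullable;
                case "getPrecision":
                case "getScale":             return 0;
                default:
                    // Everything else is unsupported in this sketch.
                    throw new SQLException("Not implemented: " + method.getName());
            }
        };
        return (ResultSetMetaData) Proxy.newProxyInstance(
                FixedShapeVti.class.getClassLoader(),
                new Class<?>[] { ResultSetMetaData.class },
                handler);
    }
}
```

The handler answers only the ResultSetMetaData methods that this section lists as called by Cloudscape and rejects the rest with an SQLException that names the method.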
Compile Time vs. Execution Time
When Cloudscape needs to instantiate the VTI class at compile time to determine the shape of its ResultSet, the class is instantiated twice: once at compile time and once at runtime. You may want the VTI class to be able to distinguish between those two situations. For example, for performance reasons, at runtime you may want the class to begin some external actions such as reading files, connecting to a mainframe, and so on, when it is constructed. At compile time, you may not want it to carry out those actions.
Here's how the VTI class can determine the situation (compile time or runtime):
1. Add an object parameter (such as a String) to the constructor.
2. When invoking it within SQL-J, pass in a valid (non-null) parameter.
3. Inside the VTI class, check the value of the parameter. If the object parameter is null, it is compile time.
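The check described in these steps can be sketched as follows. The class name CompileTimeAwareVti, the connect method, and the sample argument are hypothetical, and the ResultSet methods a real VTI needs are left out so that only the constructor logic is visible.

```java
// Hypothetical sketch of the null-parameter check described above.
class CompileTimeAwareVti {
    private final boolean compileTime;
    private boolean connected;

    // source is the Object parameter added to the constructor. A null value
    // is taken to mean that the class is being instantiated at compile time.
    CompileTimeAwareVti(String source) {
        compileTime = (source == null);
        if (!compileTime) {
            connect(source);
        }
    }

    // Placeholder for expensive runtime-only work such as opening a file
    // or connecting to another system.
    private void connect(String source) {
        connected = true;
    }

    boolean isCompileTime() {
        return compileTime;
    }

    boolean isConnected() {
        return connected;
    }

    public static void main(String[] args) {
        System.out.println(new CompileTimeAwareVti(null).isConnected());       // prints "false"
        System.out.println(new CompileTimeAwareVti("feed.txt").isConnected()); // prints "true"
    }
}
```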
Rules for Parameters
Constant Parameters
The currently supported constant expressions for compilation time are:
- GETCURRENTCONNECTION()
- literals (for example, 1, 'asdf', 1.1, or DATE'1996-09-09')
- CAST of a constant expression (for example, CAST (1.1 AS REAL))
Non-Constant Parameters
SQL-J statements often include non-literals, such as dynamic parameters or column references. For example:
SELECT * FROM a, b
WHERE a.col1 = b.col1
AND a.col1 = ?
In this example, the ? is a dynamic parameter that the application fills in at runtime, and a.col1 and b.col1 are column references that Cloudscape evaluates at runtime.
Your VTI class must be able to handle such non-literal expressions. For example:
SELECT COUNT(*) FROM NEW jarvti(jar) AS jars, myjars
WHERE myjars.jar = ?

SELECT name FROM NEW jarvti(?) AS jars
As discussed elsewhere, Cloudscape may need to instantiate the VTI class at compile time. What value does it pass for these parameters at compile time (before literal values are substituted for the non-literals)? That depends on the data type of the parameter. In the above examples, the data type of the parameter is a String, which is a Java object, so Cloudscape passes in null. You must program your VTI class to handle a null value for Object parameters.
For Java primitives, Cloudscape passes in arbitrary default values. The default values are shown in Table 5-2.
Table 5-2 Default Values for Parameter Data Types
Data Type | Default Value
Object    | null
byte      | 0
short     | 0
int       | 0
long      | 0L
float     | 0.0
double    | 0.0
boolean   | false
char      | '\u0000'
A VTI must be prepared to handle a null parameter for an Object in its constructor and in the optional getResultSetMetaData method, if implemented.
NOTE: You may want to read Method Resolution and Type Correspondence for information on working with parameters that are Java primitives.
Providing Costing Information
If a VTI class implements COM.cloudscape.vti.VTICosting, Cloudscape's optimizer can use the information it provides when optimizing an SQL-J statement that references it. This may be useful in determining join order if the SQL-J statement involves a join.
For example, if it is very expensive to iterate through the rows in a VTI, Cloudscape would probably not choose it as an outer table in a join. If, on the contrary, it is very cheap to iterate through the rows in the VTI, Cloudscape would definitely choose it as the inner table in a join if it has a choice.
By default, Cloudscape assumes that it is very expensive to instantiate a VTI and to iterate through its rows. If you know more about a VTI's cost than Cloudscape does, provide that information by implementing this interface.
NEW: The ability to have a VTI class return costing information is new in Version 3.0.
This interface also provides a way for you to tell Cloudscape that the class cannot be instantiated more than once.
To implement the interface, you must provide three methods. (The methods take a parameter of type COM.cloudscape.database.VTIEnvironment, which you can ignore for the current release.)
- public double getEstimatedRowCount( COM.cloudscape.database.VTIEnvironment)
  The estimated number of rows returned by a particular instance of the VTI.
- public double getEstimatedCostPerInstantiation( COM.cloudscape.database.VTIEnvironment)
  The estimated cost of instantiating the VTI and iterating through the rows of that instantiation. Usually the cost of iterating through the rows constitutes the greatest part of the cost. (For more information, see Estimating the Cost.)
- public boolean supportsMultipleInstantiations( COM.cloudscape.database.VTIEnvironment)
  Whether the VTI can be instantiated more than once during execution. (Most VTIs do not have this limitation.) Some unusual VTIs may open read-once streams; those VTIs should be instantiated only once by Cloudscape. Otherwise, Cloudscape may instantiate the VTI more than once in evaluating the statement.
  When this method returns false, if Cloudscape chooses the VTI instantiation as the inner table of a join, it must materialize the VTI instantiation into a temporary table.
NOTE: You may want only to tell Cloudscape that the VTI class can be instantiated only once, without providing any costing information. Unfortunately, this interface is all or nothing; you have to implement the cost-related methods anyway. In that case, simply have the cost-related methods return the predefined variables defaultEstimatedRowCount and defaultEstimatedCost.
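The case in the note above, a VTI that only needs to report that it cannot be instantiated more than once, might be sketched like this. So that the example compiles without Cloudscape on the classpath, it declares stand-in VTICosting and VTIEnvironment interfaces that mirror the three method signatures listed above; the real types are COM.cloudscape.vti.VTICosting and COM.cloudscape.database.VTIEnvironment. Placing the two default variables on the interface and the numeric values given to them are assumptions, as is the class name ReadOnceStreamVti.

```java
// Stand-in for COM.cloudscape.database.VTIEnvironment (assumption: declared
// here only so the sketch is self-contained).
interface VTIEnvironment {
}

// Stand-in for COM.cloudscape.vti.VTICosting. The constant values are
// placeholders, not Cloudscape's actual defaults.
interface VTICosting {
    double defaultEstimatedRowCount = 10000d;
    double defaultEstimatedCost = 100000d;

    double getEstimatedRowCount(VTIEnvironment env);

    double getEstimatedCostPerInstantiation(VTIEnvironment env);

    boolean supportsMultipleInstantiations(VTIEnvironment env);
}

// Hypothetical VTI that wraps a read-once stream. The ResultSet methods a
// real VTI needs are omitted.
class ReadOnceStreamVti implements VTICosting {
    // No better estimate is known, so fall back to the predefined default.
    public double getEstimatedRowCount(VTIEnvironment env) {
        return VTICosting.defaultEstimatedRowCount;
    }

    // Likewise for the cost of instantiating and iterating.
    public double getEstimatedCostPerInstantiation(VTIEnvironment env) {
        return VTICosting.defaultEstimatedCost;
    }

    // The stream can be read only once, so a second instantiation is not allowed.
    public boolean supportsMultipleInstantiations(VTIEnvironment env) {
        return false;
    }
}
```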
Estimating the Cost
To estimate a VTI's cost, your best bet is to work with RunTimeStatistics. (For more information about RunTimeStatistics, see Tuning Cloudscape.) This is done most easily from Cloudview, which provides an interface for viewing RunTimeStatistics.
1. In the SQL window, check the Use Statistics box.
2. Run a query that does a table scan.
3. Look at the Statistics tab, and select the node that begins with the words Table Scan (this is the TableScanResultSet). Right-click and select Inspect to display the TableScanResultSet object.
4. Write down the following values for the TableScanResultSet, which are displayed in the left-hand window:
   - optimizer's estimated cost
   - total time spent in the result set (inspectOverall time, which is displayed in milliseconds)
5. Calculate the value of an optimizer unit for your environment:
   (optimizer's estimated cost) / (total time spent in the result set)
6. Run the query again twice, re-calculating the value of an optimizer unit for your environment each time.
7. Calculate the average of the three calculations and use this as the value of an optimizer unit. This value does not have to be precise. It is used to provide a rough estimate only.
8. Run a query that instantiates and selects from your VTI class.
9. Look at the Statistics tab for the statement.
10. Select the node that begins VTI, and right-click to inspect the VTIResultSet.
11. Look through the text and write down:
   - total time spent in the result set (inspectOverall time, which is displayed in milliseconds)
   - the number of rows seen
12. Then calculate:
   (optimizer unit) * (total time spent in the result set) / (number of rows)
   This figure is the inexact cost per row for your VTI class. If your class is able to determine or estimate the number of rows in a particular instantiation, it should be able to return an estimated cost per instantiation (cost per row * number of rows).
13. Estimate whether your VTI has an additional significant cost per instantiation over and above cost per row. For example, your VTI class may have a cost even if it returns 0 or 1 rows. If such a cost is significant, add this cost to (cost per row * number of rows) when returning an estimate of total cost. To determine whether such a cost is significant, select from an instantiation of the VTI that returns 0 or 1 rows, then inspect the statistics. If the statistics show any time at all in the inspectOverall field, the cost is significant. In such a case, the VTI class should use the following slightly different formula to estimate cost:
   ((optimizer unit) *
   ((total time spent in the result set) - (total time spent in empty result set)) /
   (number of rows)) +
   (total time spent in empty result set)
For an example of a VTI that provides costing information, see VTI class JBMSTours.vti.jdbc1_2.jarvti provided in the sample application.
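The basic arithmetic of the procedure above (optimizer unit, average over three runs, cost per row, cost per instantiation) can be worked through in a few lines. All of the measurements in this sketch are made-up numbers chosen to keep the arithmetic easy to follow; they are not taken from a real Cloudscape run, and the class and method names are hypothetical. The adjusted formula for VTIs with a significant fixed cost is not covered here.

```java
// Hypothetical worked example of the cost-per-row calculation.
class VtiCostEstimate {
    // optimizer unit = (optimizer's estimated cost) / (total time in ms)
    static double optimizerUnit(double estimatedCost, double totalTimeMs) {
        return estimatedCost / totalTimeMs;
    }

    static double average(double... values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // cost per row = (optimizer unit) * (total time in ms) / (number of rows)
    static double costPerRow(double unit, double vtiTimeMs, double rows) {
        return unit * vtiTimeMs / rows;
    }

    public static void main(String[] args) {
        // Made-up measurements from three runs of the table-scan query.
        double unit = average(
                optimizerUnit(1000.0, 50.0),
                optimizerUnit(1250.0, 50.0),
                optimizerUnit(750.0, 50.0));

        // Made-up measurement for the VTI query: 300 ms for 1500 rows.
        double perRow = costPerRow(unit, 300.0, 1500.0);
        double perInstantiation = perRow * 1500.0;

        System.out.println(unit + " " + perRow + " " + perInstantiation); // prints "20.0 4.0 6000.0"
    }
}
```

A VTI that knows or can estimate its row count could return a value computed this way from getEstimatedCostPerInstantiation.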
Templates for Creating VTIs
The main requirement for a VTI class is that it implement java.sql.ResultSet, which has quite a few methods. Often a class uses only three or four of those methods, but it must implement all of them in order to compile. Cloudscape provides the following template classes to make it easier to develop VTI classes:
- COM.cloudscape.vti.VTITemplate
  This class implements most of the methods of ResultSet, each one throwing an SQLException with the name of the method being called. A class that extends this template can simply implement the methods it needs that are not implemented in the template and override any methods it needs to implement for correct functionality. (Compiles in JDBC 1.2 only.)
- COM.cloudscape.vti20.VTITemplate
  This class implements most of the methods of ResultSet, each one throwing an SQLException with the name of the method being called. A class that extends this template can simply implement the methods it needs that are not implemented in the template and override any methods it needs to implement for correct functionality. (Compiles in JDBC 2.0.)
- COM.cloudscape.vti.VTIMetaDataTemplate
  An abstract implementation of ResultSetMetaData (JDK1.1/JDBC 1.2). This class implements most of the methods of ResultSetMetaData, each one throwing an SQLException with the name of the method. A class that extends this template can simply implement the methods not implemented here and override any methods it needs to implement for correct functionality. (Compiles in JDBC 1.2 only.)
- COM.cloudscape.vti20.VTIMetaDataTemplate
  An abstract implementation of ResultSetMetaData (JDK1.1/JDBC 1.2). This class implements most of the methods of ResultSetMetaData, each one throwing an SQLException with the name of the method. A class that extends this template can simply implement the methods not implemented here and override any methods it needs to implement for correct functionality. (Compiles in JDBC 2.0.)
Cloudscape provides the source for these classes in demo/programs/tours/JBMSTours/vti/jdbc1_2; see VTITemplate.txt and VTIMetaDataTemplate.txt.
Built-In VTIs and Example VTIs
The Cloudscape engineers have found VTI classes an extremely useful way of presenting information. Whereas another DBMS may come with clunky system functions, Cloudscape comes with a number of built-in VTIs that provide internal system information to the end-user (you) in an elegant way. For example:
The JBMSTours sample application comes with a sample VTI class, JBMSTours.vti.jarvti, if you want to look at another example VTI class.