Creating and using temporary/volatile database tables In Stata

Dimitriy V. Masterov picture Dimitriy V. Masterov · Feb 13, 2013 · Viewed 9.9k times · Source

Addendum: As of Stata 14, volatile tables work without any hacks.

Is there a way to tweak Stata to work with temporary volatile tables? These tables and the data are deleted after a user logs off the session.

Here's an example of a simple toy SQL query that I am using in Stata and Teradata:

odbc load,  exec("
    BEGIN TRANSACTION;
    CREATE VOLATILE MULTISET TABLE vol_tab AS (
        SELECT TOP 10 user_id
        FROM dw_users
    ) WITH DATA
    PRIMARY INDEX(user_id)
    ON COMMIT PRESERVE ROWS;

    SELECT * FROM vol_tab;
    END TRANSACTION;
") dsn("mozart");

This is the error message I am getting:

The ODBC driver reported the following diagnostics
[Teradata][ODBC Teradata Driver][Teradata Database] Only an ET or null statement is legal after a DDL Statement.
SQLSTATE=25000
r(682);

The Stata error code means:

error . . . . . . . . . . . . . . . . . . . . . . . . Return code 682 could not connect to odbc dsn; This typically occurs because of incorrect permissions, such as a bad User Name or Password. Use set debug on to display the actual error message generated by the ODBC driver.

As far as I can tell permission are fine since I can pull data if I just execute the "SELECT TOP 10..." query. I set debug on, but it did not produce any additional information.

Session mode is Teradata. ODBC manager is set to unixODBC. I am using Stata 13.1 on an Ubuntu server.

I believe the underlying issue may be that separate connections are established for each SQL statement, so the volatile table evaporates by the time the select is issued. I am waiting on tech support to verify this.

I tried using the odbc sqlfile command well, but this approach does not work unless I create a permanent table at the end of it. There's no load option with odbc sqlfile.

Volatile tables seem to work just fine in SAS and R. For example, this works perfectly:

library("RODBC")
db <- odbcConnect("mozart")
sqlQuery(db,"CREATE VOLATILE MULTISET TABLE vol_tab AS (
         SELECT TOP 10 user_id
         FROM dw_users
     ) WITH DATA
     PRIMARY INDEX(user_id)
     ON COMMIT PRESERVE ROWS;
")
data<- sqlQuery(db,"select * from vol_tab;",rows_at_time=1)

Perhaps this is because the connection to the DB remains open until close(db).

Answer

BellevueBob picture BellevueBob · Feb 13, 2013

I'm not familiar with Stata, but I'm guessing that your ODBC is connecting in "ANSI" mode. Try adding this between the create volatile table and the select statements:

commit work;

If that doesn't work, you may need to make two separate calls somehow.

UPDATE: Thinking a bit more about this, perhaps you can try this:

odbc load, exec("select distinct user_id from dw_users where cast(date_confirm as
date) > '2011-09-15'") clear dsn("mozart") lowercase;

In other words, just execute the query in one step; don't try to create a volatile table.