How do I change data type of column in Apache Phoenix without losing HBase data?

Matt picture Matt · Feb 12, 2015 · Viewed 7.8k times · Source

I have an HBase installation with a bunch of existing tables, with a bunch of data in them (which I would like not to be deleted). I recently came across using Apache Phoenix to be able to query HBase data with a SQL-like syntax, and so far it is brilliant. However, since I'm still not entirely familiar with the various data types and how they map to Java types that I stored in HBase, sometimes I get it wrong. If I had an HBase column that I stored as

Bytes.toBytes(Long long)

and I accidentally created my Phoenix column as a varchar, how can I now change the Phoenix column without losing all my data? If I do

alter table "mytable" add "myfamily"."mycolumn" unsigned_long

and then

alter table "mytable" drop column "mycolumn"

and then afterwards

alter table "mytable" add "myfamily"."mycolumn" varchar

all of my HBase data gets wiped from the underlying "mycolumn" column! What is the best way to get around this?

I read about setting the parameter

phoenix.schema.dropMetaData

to false, which sounds like exactly what I want, but I can't figure out where I actually set this. Is there some command I can issue in Squirrel when logged on to Phoenix to set this to false? Or is it an environment variable?

Thanks in advance for any help!

EDIT 1

I created a dummy table in Phoenix with a column of each type, and then looked it up in the system.catalog table to get the number to represent each data type. I then used the following command

upsert into system.catalog (table_name, column_name, column_family, data_type) values ('mytable','mycolumn','mycf',3)

to change the type of mycolumn to a decimal. However, when I query mytable, mycolumn is still of type unsigned_long. Maybe after updating this value I need to restart HBase?

FYI:

COLUMN_NAME DATA_TYPE
MYTINYINT   -6
MYBIGINT    -5
MYDECIMAL   3
MYINTEGER   4
MYFLOAT 6
MYDOUBLE    8
MYUINTEGER  9
MYULONG 10
MYUTINYINT  11
MYVARCHAR   12
ROWID   12
MYUFLOAT    14
MYUDOUBLE   15
MYBOOLEAN   16
MYUTIME 18
MYUDATE 19
MYUTIMESTAMP    19
MYDATE  91
MYTIME  92
MYTIMESTAMP 93

EDIT 2

It appears to be possible to get rid of a table from Phoenix without touching the underlying HBase table by deleting all the entries for the table from SYSTEM.CATALOG and SYSTEM.STATS. The table can then be recreated with the desired column types. No idea yet if this might have any horrific adverse effects on my HBase table!

Answer

Magham Ravi picture Magham Ravi · Feb 13, 2015

Phoenix stores the metadata of all the tables in another HBase table called SYSTEM_CATALOG . The parameter phoenix.schema.dropMetaData
forces the metadata to be deleted from the catalog table when an ALTER command is given.

If you set it to false, the next time you create a table with the same name, you will keep getting TableAlreadyExistsException.