In this post, we will discuss about Hive Database Commands (Create/Alter/Use/Drop Database) with some examples for each statement. All these commands and their options are from hive-0.14.0 release documentations. So, in order to use these commands with all the options described below we need at least hive-0.14.0 release.
Hive Database Commands
From Hive-0.14.0 release onwards Hive DATABASE is also called as SCHEMA. So, Both SCHEMA and DATABASE are same in Hive. All the commands discussed below will do the same work for SCHEMA and DATABASE keywords in the syntax.
Introduction to Hive Databases
As given in above note, Either SCHEMA or DATABASE in Hive is just like a Catalog of tables. With the help of database names, users can have same table name in different databases, So thus, in large organizations, teams or users are allowed create same table by creating their own separate DATABASE, to avoid table name collisions.
The default database in hive is default. We do not need to create this database. Any table created without specifying database will be created under this.
Following are the sections on operations on Hive databases.
Below is the syntax for creation of Databases in Hive.
Description of Arguments
- IF NOT EXISTS – This optional but recommended to use, so that, if a database with same name already exists, then it will not try to create it again and will not show any error message.
- COMMENT – It is also optional. It can be used for providing short description or any comment to database
- LOCATION – It is also optional. By default all the hive databases will be created under default warehouse directory (set by the property hive.metastore.warehouse.dir) as /user/hive/warehouse/database_name.db . But if we want to specify our own location then this option can be specified.
- DBPROPERTIES – Optional but used to specify any properties of database in the form of (key, value) separated pairs.
Below is the example database creation HiveQL, which covers all the above options.
Simple Syntax for creating database after omitting all the optional arguments is given below.
This uses all the default values for omitted arguments.
Lets verify the creation of these databases in Hive CLI with show databases command. It will list down the databases in hive.
By default, SHOW DATABASES lists all of the databases defined in the metastore.
- LIKE – It is optional. But it allows us to filter the database names using a regular expression.
Wild cards in the regular expression can only be ” (single quotes) for any character(s) or ‘|’ for a choice.
Examples are ’employees’, ’emp’, ‘emp*|*ees‘, (emp* or *ees), all of which will match the database named ’employees’.
Below is the sample output of show databases command after execution above two creation commands.
We can also verify the properties of databases with Describe command in hive.
We can observe the differences between Describe with Extended and without Extended argument in the below examples. Describe extended will also shows database’s properties given under DBPROPERTIES argument at the time of creation.
Describe command will also list down the directory location of the database.
Observe the locations of test_db (created with our own LOCATION in CREATE statement) and test_db2 (without location argument while creating it).
We can alter the databases with Alter command in hive but it provides very minimal alterations.
- Assign any new (key, value) pairs into DBPROPERTIES
- Set owner user or role to the Database
But below are its limitations
- We can’t unset any property using Alter command
- No other meta-data about the database can be changed, including its name and directory location.