Hive Database Commands


In this post, we will discuss about Hive Database Commands (Create/Alter/Use/Drop Database) with some examples for each statement. All these commands and their options are from hive-0.14.0 release documentations. So, in order to use these commands with all the options described below we need at least hive-0.14.0 release.

Hive Database Commands

Note

From Hive-0.14.0 release onwards Hive DATABASE is also called as SCHEMA. So, Both SCHEMA and DATABASE are same in Hive. All the commands discussed below will do the same work for SCHEMA and DATABASE keywords in the syntax.

Introduction to Hive Databases

As given in above note, Either SCHEMA or DATABASE in Hive is just like a Catalog of tables. With the help of database names, users can have same table name in different databases, So thus, in large organizations, teams or users are allowed create same table by creating their own separate DATABASE, to avoid table name collisions.

The default database in hive is default. We do not need to create this database. Any table created without specifying database will be created under this.

Following are the sections on operations on Hive databases.

Create Database

Below is the syntax for creation of Databases in Hive.

Description of Arguments
  • IF NOT EXISTS – This optional but recommended to use, so that, if a database with same name already exists, then it will not try to create it again and will not show any error message.
  • COMMENT – It is also optional. It can be used for providing short description or any comment to database
  • LOCATION – It is also optional. By default all the hive databases will be created under default warehouse directory (set by the property hive.metastore.warehouse.dir) as /user/hive/warehouse/database_name.db . But if we want to specify our own location then this option can be specified.
  • DBPROPERTIES – Optional but used to specify any properties of database in the form of (key, value) separated pairs.
Example

Below is the example database creation HiveQL, which covers all the above options.

Simple Syntax for creating database after omitting all the optional arguments is given below.

This uses all the default values for omitted arguments.

Show Databases

Lets verify the creation of these databases in Hive CLI with show databases command. It will list down the databases in hive.

Syntax

By default, SHOW DATABASES lists all of the databases defined in the metastore.

  • LIKE – It is optional. But it allows us to filter the database names using a regular expression.

Wild cards in the regular expression can only be (single quotes) for any character(s) or ‘|’ for a choice.

Examples are ’employees’, ’emp’, ‘emp*|*ees,  (emp* or *ees), all of which will match the database named ’employees’.

Examples

Below is the sample output of show databases command after execution above two creation commands.

Describe Databases

We can also verify the properties of databases with Describe command in hive.

Syntax

We can observe the differences between Describe with Extended and without Extended argument in the below examples. Describe extended will also shows database’s properties given under DBPROPERTIES argument at the time of creation.

Describe command will also list down the directory location of the database.

Examples

Observe the locations of test_db (created with our own LOCATION in CREATE statement) and test_db2 (without location argument while creating it).

Alter Databases

We can alter the databases with Alter command in hive but it provides very minimal alterations.

We can

  • Assign any new (key, value) pairs into DBPROPERTIES
  • Set owner user or role to the Database

But below are its limitations

  • We can’t unset any property using Alter command
  • No other meta-data about the database can be changed, including its name and directory location.
Syntax

Examples

Lets add new property ‘modified by’ to the above created database test_db and see the result in ‘describe extended’.

In this example, we can also observe that SCHEMA and DATABASE can be used interchangeably, As we have used SCHEMA in Alter command and DATABASE in DESCRIBE command.

Use Databases

We can set the database on which we need to work with USE command in hive. It sets the current database to be used for further hive operations.

As, by default, we enter into default database in Hive CLI, we need to change our database if we need to point to our custom database.

Note

To know the current database under which we are working we need to the below property in Hive shell. There is no command to show the current database, but this property will keep printing the current database name as suffix in the command line prompt as shown in below examples.

Examples

Updating the conf property via set hive.cli.print.current.db=true will show the current DB for current hive session only, but if we update the .hiverc file with the above property will keep showing the current db for all the sessions.

Drop Databases

When we no longer need a database, we can drop such databases with Drop command in hive.

By default, Hive doesn’t allow us to drop databases that contain at least one or more tables. In this case, we need to either drop the tables first and then drop database or we need to provide CASCADE argument to DROP command.

Syntax

  • IF EXISTS – It is optional but used to suppresses warnings if database_name doesn’t exist.
  • RESTRICT – This is optional and even if it is used, it is same as default hive behavior, i.e. it will not allow database to be dropped until all the tables inside it are dropped.
  • CASCADE – This argument allows to drop the non-empty databases with single command. DROP with CASCADE is equivalent to dropping all the tables separately and dropping the database finally in cascading manner.
Examples

Observe the hive error message ” InvalidOperationException(message:Database test_db2 is not empty. One or more tables exist.)”  when trying to delete test_db2 which has one test_table with no CASCADE argument.

Drop Database in Hive

When a database is dropped, its directory is also deleted

References

https://cwiki.apache.org/confluence/display/Hive/Home


Profile photo of Siva

About Siva

Senior Hadoop developer with 4 years of experience in designing and architecture solutions for the Big Data domain and has been involved with several complex engagements. Technical strengths include Hadoop, YARN, Mapreduce, Hive, Sqoop, Flume, Pig, HBase, Phoenix, Oozie, Falcon, Kafka, Storm, Spark, MySQL and Java.

Leave a comment

Your email address will not be published. Required fields are marked *


Review Comments
default image

I am a plsql developer. Intrested to move into bigdata.

Neetika Singh ITA Hadoop in Dec/2016 December 22, 2016

.