Impala Commands Cheat Sheet


This is quick touch on Impala commands and Functions. Impala accepts basic SQL syntax and below is the list of a few operators and commands that can be used inside Impala. This is just a quick cheat sheet.

Databases

In Impala, a database is a logical container for a group of tables. Each database defines a separate namespace. Within a database, you can refer to the tables inside it using their unqualified names. Different databases can contain tables with identical names. Creating a database is a lightweight operation.

There are no database-specific properties to configure. Therefore, there is no ALTER DATABASE.

Typically, you create a separate database for each project or application, to avoid naming conflicts between tables and to make clear which tables are related to each other.

Each database is physically represented by a directory in HDFS.

There is a special database, named default, where you begin when you connect to Impala. Tables created in default are physically located one level higher in HDFS than all the user-created databases.

Tables

Tables in impala are very similar to hive tables which will hold the actual data.

There are two types of tables

  1. Internal table:

These are managed by Impala, use directories inside the designated Impala work area.

  1. External table

They use arbitrary HDFS directories, where the data files are typically shared between different Hadoop components

To see whether a table is internal or external, and its associated HDFS location, issue the statement DESCRIBE FORMATTED table_name. The Table Type field displays MANAGED_TABLE for internal tables and EXTERNAL_TABLE for external tables. The Location field displays the path of the table directory as an HDFS URI.

Views

Views are lightweight logical constructs that act as aliases for queries. You can specify a view name in a query (a SELECT statement or the SELECT portion of an INSERT statement) where you would usually specify a table name.

Take a complicated reporting query, plug it into a CREATE VIEW statement…

and now you can produce the report with 1 line of code.select * from v1;

DDL

Create Database

To get List Databases

Use Database

Create Table

Creating table syntax in Impala is same as hive table creation syntax. Below is example table creation

We can create tables in many ways as shown below,

  • Create external table
  • Create table with Column definitions inferred from data file.

  • CREATE TABLE AS SELECT
  • Create table using PARTITIONED BY clause
  • Cloning table: To create an empty table with same columns and other properties of already existing table we LIKE

Show Tables

will show the list tables available in the particular database (Here it will show list of tables in “impala_test" database)

Load Data into table

Note: if you get below exception,

Then you have to change the permission of the file which you are loading into table

Ex:

We can copy data from one table to another table using below syntax.

Select