Hive Interview Questions and Answers for experienced Part – 4

Below are some of the important Hive interview questions and answers for experienced Hadoop developers.

Hive Interview Questions and Answers for experienced
1. What is the Hive configuration precedence order?

There is a precedence hierarchy to setting properties. In the following list, lower numbers take precedence over higher numbers:

  1. The Hive SET command
  2. The command line -hiveconf option
  3. hive-site.xml
  4. hive-default.xml
  5. hadoop-site.xml (or, equivalently, core-site.xml, hdfs-site.xml, and mapred-site.xml)
  6. hadoop-default.xml (or, equivalently, core-default.xml, hdfs-default.xml, and mapred-default.xml)
2. How do we change settings within a Hive session?

We can change settings from within a session, too, using the SET command. This is useful for changing Hive or MapReduce job settings for a particular query. For example, the following command ensures buckets are populated according to the table definition.
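A minimal example (this property applies to Hive versions before 2.x, where bucketed inserts are not enforced by default):

```sql
-- Ensure INSERTs populate buckets according to the table definition
SET hive.enforce.bucketing=true;
```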

To see the current value of any property, use SET with just the property name:

By itself, SET will list all the properties and their values set by Hive. This list will not include Hadoop defaults, unless they have been explicitly overridden in one of the ways covered in the above answer. Use SET -v to list all the properties in the system, including Hadoop defaults.
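For instance (using hive.enforce.bucketing as an example property):

```sql
SET hive.enforce.bucketing;  -- show the current value of one property
SET;                         -- list properties and values set by Hive
SET -v;                      -- list all properties, including Hadoop defaults
```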

3. How to print header on Hive query results?

We need to use the following SET command before our query to show column headers in STDOUT.
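The property in question:

```sql
-- Print column headers in query output
SET hive.cli.print.header=true;
```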

4. How to get detailed description of a table in Hive?

Use the below Hive command to get a detailed description of a Hive table.
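For example (tablename is a placeholder):

```sql
DESCRIBE EXTENDED tablename;
-- or, for a more readable layout:
DESCRIBE FORMATTED tablename;
```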

5. How to access sub directories recursively in Hive queries?

To process directories recursively in Hive, we need to set the below two properties in our Hive session. These two parameters work in conjunction.
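The two properties:

```sql
SET mapred.input.dir.recursive=true;
SET hive.mapred.supports.subdirectories=true;
```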

Now Hive tables can be pointed to the higher-level directory. This is suitable for a scenario where the directory structure is as follows: /data/country/state/city

6. How to skip header rows from a table in Hive?

Suppose that while processing some log files, we find header records at the top of each file. A file may have, say, 3 lines of headers that we do not want to include in our Hive query results. To skip header lines from our tables in Hive, we can set a table property that tells Hive how many leading lines to ignore.
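A sketch, with a hypothetical weblogs table and columns:

```sql
CREATE TABLE weblogs (ts STRING, msg STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
TBLPROPERTIES ("skip.header.line.count"="3");  -- skip the first 3 lines of each file
```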

7. Is it possible to create multiple tables in Hive for the same data?

Yes. Hive applies a schema on top of an existing data file, so one data file can have multiple schemas. Each schema is saved in Hive's metastore; the data is not parsed or serialized to disk in any particular schema, and the schema is applied only when the data is read. For example, if the data file has 5 columns (name, job, dob, id, salary), we can define multiple tables over it by choosing any subset of those columns (a table with 3 columns, 5 columns, and so on).

However, if a table declares a column that does not exist in the data file, querying that column will return NULL values.
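A sketch using hypothetical table names, where two external tables share one data location:

```sql
-- Both tables read the same files; only the metastore schemas differ.
CREATE EXTERNAL TABLE emp_all (name STRING, job STRING, dob STRING, id INT, salary DOUBLE)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/employees';

CREATE EXTERNAL TABLE emp_basic (name STRING, job STRING, dob STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/data/employees';
```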

8. What is the maximum size of string data type supported by Hive?

Maximum size is 2 GB.

9. What are the Binary Storage formats supported in Hive?

By default Hive uses the text file format; however, Hive also supports the binary formats below.

Sequence Files, Avro Data files, RCFiles, ORC files, Parquet files

Sequence files: a general-purpose binary format that is splittable, compressible, and row oriented. A typical use case: if we have lots of small files, we can use a sequence file as a container, where the file name is the key and the file content is the value. Sequence files support compression, which can yield a large performance gain.

Avro data files: like sequence files, these are splittable, compressible, and row oriented, but they additionally support schema evolution and bindings for multiple languages.

RCFiles: Record Columnar Files, a column-oriented storage format. An RCFile breaks a table into row splits; within each split, all the values of the first column are stored together, followed by all the values of the second column, and so on.

ORC files: Optimized Row Columnar files, a more efficient successor to RCFile.
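A storage format is chosen with the STORED AS clause; a sketch with hypothetical tables:

```sql
CREATE TABLE sales_seq     (id INT, amount DOUBLE) STORED AS SEQUENCEFILE;
CREATE TABLE sales_orc     (id INT, amount DOUBLE) STORED AS ORC;
CREATE TABLE sales_parquet (id INT, amount DOUBLE) STORED AS PARQUET;
```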

10. Is HQL case sensitive?

HQL is not case sensitive.

11. Describe CONCAT function in Hive with Example?

CONCAT function will concatenate the input strings. We can specify any number of strings separated by commas.

Example: CONCAT('Hive','-','is','-','a','-','data warehouse','-','in Hadoop');
Output: Hive-is-a-data warehouse-in Hadoop

Notice that we had to delimit every string with '-'. When the delimiter is common to all the strings, Hive provides another function, CONCAT_WS, where the delimiter is specified first.

Syntax: CONCAT_WS('-','Hive','is','a','data warehouse','in Hadoop');
Output: Hive-is-a-data warehouse-in Hadoop

12. Describe REPEAT function in Hive with example?

REPEAT function will repeat the input string n times, as specified in the call.

Example: REPEAT('Hive',3);
Output: HiveHiveHive

13. Describe REVERSE function in Hive with example?

REVERSE function will reverse the characters in a string.

Example: REVERSE('Hive');
Output: eviH

14. Describe TRIM function in Hive with example?

TRIM function removes the leading and trailing spaces from a string.

Example: TRIM(' Hadoop ');
Output: Hadoop

If we want to remove only leading or trailing spaces, we can use the below functions respectively.

LTRIM(' Hadoop');
RTRIM('Hadoop ');

15. Describe RLIKE in Hive with an example?

RLIKE is Hive's regular-expression matching operator: A RLIKE B evaluates to true if any substring of A matches the Java regular expression B. Users don't need to put a % symbol for a simple substring match with RLIKE.

Examples: 'Express' RLIKE 'Exp' --> True
'Express' RLIKE '^E.*' --> True (regular expression)

Moreover, RLIKE comes in handy when the string has extra spaces. Without using the TRIM function, RLIKE satisfies the required scenario. Suppose A has the value 'Express  ' (with two extra trailing spaces) and B has the value 'Express'. In this situation, RLIKE works without TRIM.

'Express  ' RLIKE 'Express' --> True

Note: RLIKE evaluates to NULL if A or B is NULL.

Hive Interview Questions and Answers – Part 3

In this post, we will discuss a few more Hadoop Hive interview questions and answers for freshers and experienced developers.

Hive Interview Questions and Answers
1. What are the types of tables in Hive?

There are two types of tables.

  • Managed tables
  • External tables

The two types differ mainly in what happens when a table is dropped: dropping a managed table deletes both the metadata and the data, while dropping an external table deletes only the metadata and leaves the data in place. Otherwise, the two types of tables are very similar.
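A sketch with hypothetical names:

```sql
-- Managed table: Hive owns the data; DROP TABLE deletes data and metadata.
CREATE TABLE managed_logs (line STRING);

-- External table: DROP TABLE removes only the metadata; the files stay in place.
CREATE EXTERNAL TABLE external_logs (line STRING)
LOCATION '/data/logs';
```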

2. What kind of data warehouse application is suitable for Hive?

Hive is not a full database; the design constraints and limitations of Hadoop and HDFS impose limits on what Hive can do. Hive is most suited for data warehouse applications, where

  • Relatively static data is analyzed,
  • Fast response times are not required, and
  • The data is not changing rapidly.
3. Does Hive provide OLTP or OLAP?

Hive doesn't provide crucial features required for OLTP (Online Transaction Processing). It is closer to being an OLAP (Online Analytical Processing) tool. So, Hive is best suited for data warehouse applications, where a large data set is maintained and mined for insights, reports, etc.

4. Does Hive support record level Insert, delete or update?

No. Hive does not provide record-level update, insert, or delete. Consequently, Hive does not provide transactions either. However, users can use CASE statements and Hive's built-in functions to achieve the effect of these DML operations; thus, a complex update query in an RDBMS may need many lines of code in Hive.
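A sketch of the CASE workaround, with a hypothetical employee table: an "update" is simulated by rewriting the whole table.

```sql
-- Simulate: UPDATE employee SET salary = 50000 WHERE empid = 101;
INSERT OVERWRITE TABLE employee
SELECT empid, name,
       CASE WHEN empid = 101 THEN 50000 ELSE salary END AS salary
FROM employee;
```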

5. How can we change a column data type in Hive?

We can use the ALTER TABLE ... CHANGE command to alter the data type of a column in Hive.

Example: suppose we want to change the data type of the empid column from INT to BIGINT in a table called employee.
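The command (the column name appears twice because CHANGE takes the old name, the new name, and the new type):

```sql
ALTER TABLE employee CHANGE empid empid BIGINT;
```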

6. How can we copy the columns of a hive table into a file?

By using the awk command in a shell, the output of the HiveQL DESCRIBE command can be written to a file.
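A sketch (database, table, and output file names are placeholders):

```shell
# -S runs Hive in silent mode so only query output is printed;
# awk keeps the first field of each line (the column names).
hive -S -e "DESCRIBE dbname.tablename;" | awk '{print $1}' > table_columns.txt
```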

7. How to rename a table in Hive?

Using the ALTER TABLE command with RENAME TO, we can rename a table in Hive.
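For example, renaming employee to employee_new (hypothetical names):

```sql
ALTER TABLE employee RENAME TO employee_new;
```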