In this post we will discuss about basic introduction to Qlikview BI tool and Qlikview Integration with hadoop hive. In this post we will use Cloudera Hive and its jdbc drivers/connectors to connect with Qlikview and we will see sample table retrieval from cloudera hadoop hive database.
Table of Contents
- QlikView Overview
- QlikView Installation & Configuration With Hadoop
- Modes of ingesting data
- Qlikview JDBC Connector Configuration
- Start Hive Services on Hadoop Cluster
- Provide Connection String in QlikView
- Select tables and columns
- Execute SQL Script
- Build Graphs
What is QlikView?
QlikView is one of the famous business intelligence and visualization software/tool build by Qlik (previously known as QlikTech) company for turning data into knowledge.
Qlikview is more or less similar to Tableau BI visualization tool.
- It provides connectivity with various data sources like SQL Server, Oracle, SAP, Excel, XML or text files
- QlikView offers a wide variety of graphs, charts and tables in different formats to
present our data
- Supports views, zoom, grouping and animation.
- Powered by Next generation In-Memory Technology
- QlikView has a unique drill feature Automatic Associative Search which provides great business insights by understanding how data is associated.
- When the user clicks on a data point in a field, no queries are fired. Instead, all the other fields instantaneously filter themselves based on the selection the user made.
- It is Fast to deploy and IT efficient
- As Qlikview is based on In-memory technology, QlikView Applications are constrained by RAM size. So, scalability and performance is a limitation.
- No real time analysis and reporting
- Lacks richness in OLAP style analysis
QlikView Product Suite
Qlikview product suite offers various products.
- QlikView Desktop – With QlikView Personal Edition we can make full use of the QlikView functionality, but it is not possible to open documents created by other users. To do this, we need a QlikView license.
- Qlikview Server – For centralized repository of applications, for production usage.
- Qlikview Publisher
For more information on QlikView see www.qlikview.com.
QlikView Installation & Configuration With Hadoop
Modes of ingesting data
QlikView supports two modes of data ingesting data.
Data loaded to QlikView
Data will be loaded into QlikView In-Memory Associative Data Store
- Sub Second (Quick) Response Time
- Data will be Compressed while loading into Memory
- Only accesses Data from Hadoop at the time of load only. So there will be less Load on Hadoop
- As data is loaded into In-memory, Data Size will be limited
- Data Duplication (Because data is present in both hadoop and qlikview in-memory)
QlikView Direct Discovery on top of Hadoop
It is also know as QlikView Hybrid Solution, In this data will reside on hadoop only but processing will be pushed down to hadoop from QlikView.
- Ability to analyze Big Data.
- Data is left in Hadoop
- Workload is pushed down to Hadoop
- Response time is higher as processing will be pushed down to hadoop instead of in qlikview.
- We need to use Tez on Hive to improve response time on large datasets
This post will cover loading data into Qlikview In-Memory Associative Data Store; with simple modifications to the load script, the application can be configured to run the Qlikview Hybrid Solution where data resides inside Hadoop.
Qlikview desktop (personal edition) version currently supports installation on windows operating system only. There is no Linux flavor available as of now. So we need to download and install Qlikview on windows operating system only.
- Download Qlikview personal edition from Qlikview downloads page and install it (file QlikViewDesktop_x64Setup.exe) on windows machine. It is free to download and use for personal use with full functionality of qlikview but we can not load reports of other users or we can’t share our reports with others. For sharing with other users we need license.
- Now, we need to download the QlikView JDBC Connector , it is provided by TIQ solutions, which provides the opportunity to connect JDBC data sources directly from QlikView.
- With the help of Qlikview JDBC connector, qlikview can be integrated with any of the below big data hadoop tools.
- Hadoop HDFS, Hive and HiveServer2 (Cloudera, MapR, Amazon EMR)
- Cloudera Impala
- Apache HBase (via Phoenix)
- Apache Cassandra CQL
Demo version of it is available for download at Qlikview JDBC connector download page.
Extract the zip file (Probably file name will be JDBCConnector_Setup.zip) Install it (JDBCConnector Setup.exe) on windows machine.
Now we can see the JDBCConnector.dll option enabled in Databases drop down box as shown in the below screen. (Open Qlikview –> File –> New File —> Edit Script –> Data)
- We can connect Qlikview with Cloudera hadoop hive either by using ODBC drivers or by JDBC drivers. In this post, we will discuss about using JDBC driver connectors for Cloudera hadoop hive.
Cloudera Hadoop Hive JDBC drivers are available for free download at Cloudera connectors download page . Download the latest version of Hive JDBC drivers based on the version of our operating system.
Extract the downloaded file if it is zipped file (Cloudera_HiveJDBC_22.214.171.1246.zip) and keep the below jar files available for Qlikview JDBC connector. The Cloudera_HiveJDBC4_<version>.zip archive contains the following files and folder structure:
Qlikview JDBC Connector Configuration
Building JDBC Library
As shown in the below screen shot, we need to include all the above jars in JDBC connector configuration and need to provide the JDBC driver name as in the following screens.
From Edit Script –> Data –> (JDBCConnector) Configuration –> JDBC drivers –> Add Library –> Browse the jar files and add them to library.
Defining JDBC Driver Class
It is very important to give the correct class name in the JDBC driver classname otherwise Qlikview will through Specified JDBC drivers are not found error message. So, if we are using Cloudera Hive Server 1 or Hive Server 2 then driver class name should be one of the below two classes respectively.
By default some times this would be org.apache.hive.jdbc.HiveDriver so, be careful to change this.
Start Hive Services on Hadoop Cluster
Now, before trying to pull the tables from Hadoop Hive, we need to make sure Hive Server and Metastore are running properly on hadoop cluster. Hive service runs on default port 10000. Use the below commands to start hive services.
$ hive --service metastore &
$ hive --service hiveserver &
Provide Connection String in QlikView
Now connect to the Hive Server in QlikView and select your table for retrieving records. In building connection URI or string, we need to provide the hostname and port number and default hive database. Also we need to provide user credentials to hadoop cluster machine to access the tables. Custom connection string will be as shown in the below screen.
It will generate below script:
CUSTOM CONNECT TO "Provider=JDBCConnector_x64.dll;jdbc:hive://192.168.113.139:10000/default;XUserId=KVPKRRR
Select tables and columns
Successful connection will show the tables of default Hive database as shown below.
We can also select the columns in the tables as shown in below, preceding load will add additional Load statements of columns in the script. Usually it preceding load is preferable that we can change the column names later in the script itself instead of loading them again as shown in the above screen with other column names.
This will create SQL Select statements as shown in the script tab in the below table.
Execute SQL Script
Now we can click OK and execute the script. If you get any error message saying that database is incorrect then put the hive database name in double quotes as Hive.”default”.tablename as shown in below screen instead of Hive.default.tablename
Click Okay to exit Script Editor or Reload to Load Data immediately into QlikView (Reload will force user to save the QlikView document with a new name).
Now Qlikview will execute the script and pull the records into main dashboard.
In the above script only two records are pulled from the table into qlikview in-memory.
Now Right click in main screen and choose “select fields” and select
Click Add All to add all fields to current QlikView Sheet and. Select Items inside QlikView to see the Associative nature of QlikView. Below is the sample bar chart view of our test table records from hive database.
Thus we can connect to Cloudera Hive with Qlikview and perform analytics. For playing around Qlikview tool please refer the training at Qlikview free training
References: Qlikview Community