QlikView Integration with Hadoop 2


In this post we will give a basic introduction to the QlikView BI tool and cover QlikView integration with Hadoop Hive. We will use Cloudera Hive and its JDBC drivers/connectors to connect to QlikView, and we will retrieve a sample table from the Cloudera Hadoop Hive database.

QlikView Overview

What is QlikView?

QlikView is one of the well-known business intelligence and visualization tools, built by Qlik (previously known as QlikTech) for turning data into knowledge.

QlikView is broadly similar to the Tableau BI visualization tool.

QlikView Strengths
  • It provides connectivity to various data sources such as SQL Server, Oracle, SAP, Excel, XML, and text files
  • QlikView offers a wide variety of graphs, charts, and tables in different formats to present our data
  • Supports views, zoom, grouping, and animation
  • Powered by next-generation in-memory technology
  • QlikView has a unique drill feature, Automatic Associative Search, which provides great business insights by understanding how data is associated
  • When the user clicks on a data point in a field, no queries are fired. Instead, all the other fields instantaneously filter themselves based on the user's selection
  • Fast to deploy and IT-efficient
Qlikview Limitations
  • As QlikView is based on in-memory technology, QlikView applications are constrained by RAM size, so scalability and performance are limitations
  • No real-time analysis and reporting
  • Lacks richness in OLAP-style analysis
QlikView Product Suite

Qlikview product suite offers various products.

  • QlikView Desktop – With QlikView Personal Edition we can make full use of QlikView's functionality, but it is not possible to open documents created by other users. To do this, we need a QlikView license.
  • QlikView Server – a centralized repository of applications, for production use.
  • QlikView Publisher

For more information on QlikView see www.qlikview.com.

QlikView Installation & Configuration With Hadoop

Modes of ingesting data

QlikView supports two modes of ingesting data.

  • Data loaded to QlikView

Data will be loaded into QlikView In-Memory Associative Data Store

    • Advantages
      • Sub-second (quick) response time
      • Data is compressed while loading into memory
      • Data is accessed from Hadoop only at load time, so there is less load on Hadoop
    • Disadvantages
      • As data is loaded in-memory, data size is limited
      • Data duplication (the data is present both in Hadoop and in QlikView's in-memory store)
  • QlikView Direct Discovery on top of Hadoop

It is also known as the QlikView Hybrid Solution. In this mode the data resides in Hadoop only, but processing is pushed down to Hadoop from QlikView.

    • Advantages
      • Ability to analyze Big Data.
      • Data is left in Hadoop
      • Workload is pushed down to Hadoop
    • Disadvantages
      • Response time is higher, as processing is pushed down to Hadoop instead of happening in QlikView
      • We need to use Tez as Hive's execution engine to improve response time on large datasets
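Switching Hive's execution engine to Tez is a session-level setting (this sketch assumes Tez is already installed on the cluster):

```sql
-- Run in the Hive session before issuing queries (requires Tez on the cluster)
set hive.execution.engine=tez;
```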

This post will cover loading data into Qlikview In-Memory Associative Data Store; with simple modifications to the load script, the application can be configured to run the Qlikview Hybrid Solution where data resides inside Hadoop.
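As a sketch of that modification (table and column names here are hypothetical), the Direct Discovery variant replaces the plain LOAD / SQL SELECT with a DIRECT QUERY statement so the detail data stays in Hadoop:

```
// Hypothetical Direct Discovery load script: only the DIMENSION fields
// are held in memory; MEASURE values are queried from Hadoop on demand.
DIRECT QUERY
    DIMENSION order_id, customer_name
    MEASURE order_amount
    FROM Hive."default".orders;
```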

QlikView Desktop (Personal Edition) currently supports installation on the Windows operating system only; there is no Linux version available as of now, so we need a Windows machine to download and install QlikView.

  • Download QlikView Personal Edition from the QlikView downloads page and install it (QlikViewDesktop_x64Setup.exe) on a Windows machine. It is free to download and use for personal purposes with the full functionality of QlikView, but we cannot open reports from other users or share our reports with others; sharing with other users requires a license.
  • Next, we need to download the QlikView JDBC Connector, provided by TIQ Solutions, which makes it possible to connect to JDBC data sources directly from QlikView.
  • With the help of the QlikView JDBC Connector, QlikView can be integrated with any of the big data Hadoop tools below.
    • Hadoop HDFS, Hive and HiveServer2 (Cloudera, MapR, Amazon EMR)
    • Cloudera Impala
    • Apache HBase (via Phoenix)
    • Apache Cassandra CQL

A demo version is available on the QlikView JDBC Connector download page.

Extract the zip file (the file name will probably be JDBCConnector_Setup.zip) and install it (JDBCConnector_Setup.exe) on the Windows machine.

Now we can see the JDBCConnector.dll option enabled in the Database drop-down box, as shown in the screen below (open QlikView –> File –> New File –> Edit Script –> Data).

JDBC Connector

  • We can connect QlikView to Cloudera Hadoop Hive using either ODBC or JDBC drivers. In this post, we will use the JDBC driver connectors for Cloudera Hadoop Hive.

Cloudera Hadoop Hive JDBC drivers are available for free from the Cloudera connectors download page. Download the latest version of the Hive JDBC drivers matching our operating system version.

Extract the downloaded file if it is a zip file (Cloudera_HiveJDBC_2.5.4.1006.zip) and keep the jar files below available for the QlikView JDBC Connector. The Cloudera_HiveJDBC4_<version>.zip archive contains the following files:

HiveJDBC4.jar
hive_metastore.jar
hive_service.jar
libfb303-0.9.0.jar
libthrift-0.9.0.jar
log4j-1.2.14.jar
ql.jar
slf4j-api-1.5.8.jar
slf4j-log4j12-1.5.8.jar
TCLIServiceClient.jar

Qlikview JDBC Connector Configuration
Building JDBC Library

As shown in the screenshot below, we need to include all of the above jars in the JDBC Connector configuration and provide the JDBC driver class name, as in the following screens.

From Edit Script –> Data –> (JDBCConnector) Configuration –> JDBC Drivers –> Add Library –> browse to the jar files and add them to the library.

Adding jar files to JDBC drivers

Defining JDBC Driver Class

It is very important to give the correct class name in the JDBC driver classname field; otherwise QlikView will throw a "Specified JDBC drivers are not found" error message. If we are using Cloudera Hive Server 1 or Hive Server 2, the driver class name should be one of the two classes below, respectively.

  • com.cloudera.hive.jdbc4.HS1Driver
  • com.cloudera.hive.jdbc4.HS2Driver

By default this is sometimes set to org.apache.hive.jdbc.HiveDriver, so be careful to change it.

JDBC driver classname

Start Hive Services on Hadoop Cluster

Now, before trying to pull tables from Hadoop Hive, we need to make sure the Hive server and metastore are running properly on the Hadoop cluster. The Hive service runs on port 10000 by default. Use the commands below to start the Hive services.
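The exact commands depend on the distribution and how Hive was installed, but on a typical cluster with the Hive binaries on the PATH they would look like:

```shell
# Start the Hive metastore service in the background
hive --service metastore &

# Start HiveServer2, which listens on port 10000 by default
hive --service hiveserver2 &
```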

Provide Connection String in QlikView

Now connect to the Hive server in QlikView and select a table to retrieve records. When building the connection URI/string, we need to provide the hostname, the port number, and the default Hive database. We also need to provide user credentials for the Hadoop cluster machine in order to access the tables. The custom connection string is shown in the screen below.

JDBC connection uri

It will generate the below script:
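The generated script was not preserved here, but with the TIQ JDBC Connector it follows this general shape (the hostname, credentials, and exact property names are placeholders and vary by connector version):

```
// Hypothetical connection script generated by the JDBC Connector wizard
CUSTOM CONNECT TO "Provider=JDBCConnector.dll;
    driverClass=com.cloudera.hive.jdbc4.HS2Driver;
    url=jdbc:hive2://hadoop-master:10000/default;
    username=hadoopuser;XPassword=encrypted_password;";
```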

Select tables and columns

Successful connection will show the tables of default Hive database as shown below.

Hive default database tables

We can also select the columns of the tables as shown below; Preceding Load adds an additional LOAD statement listing the columns in the script. Preceding Load is usually preferable because we can rename the columns later in the script itself instead of loading them again, as shown in the screen above with other column names.

This will create SQL SELECT statements, as shown in the script tab below.

Creating Select Statements
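A sketch of what the resulting script looks like with a preceding LOAD (table and column names here are hypothetical): the LOAD renames the columns in QlikView, while the SQL SELECT is what actually gets sent to Hive.

```
// Preceding LOAD: rename columns without re-reading the source
LOAD id   AS employee_id,
     name AS employee_name;
SQL SELECT id, name
FROM Hive."default".employees;
```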

Execute SQL Script

Now we can click OK and execute the script. If you get an error message saying that the database is incorrect, put the Hive database name in double quotes, i.e. Hive."default".tablename instead of Hive.default.tablename, as shown in the screen below.

Edit database name

Click OK to exit the Script Editor, or Reload to load data immediately into QlikView (Reload will force you to save the QlikView document with a new name).

Now QlikView will execute the script and pull the records into the main dashboard.

Executing script

In the script above, only two records are pulled from the table into QlikView's in-memory store.

Build Graphs

Now right-click on the main screen and choose "Select Fields" to pick the fields to display.

Qlikview main screen

Click Add All to add all fields to the current QlikView sheet. Select items inside QlikView to see its associative nature. Below is a sample bar chart view of our test table records from the Hive database.

Bar charts in qlikview

Thus we can connect to Cloudera Hive from QlikView and perform analytics. To play around with the QlikView tool, please refer to the free QlikView training.

References: Qlikview Community


About Siva

Senior Hadoop developer with 4 years of experience in designing and architecture solutions for the Big Data domain and has been involved with several complex engagements. Technical strengths include Hadoop, YARN, Mapreduce, Hive, Sqoop, Flume, Pig, HBase, Phoenix, Oozie, Falcon, Kafka, Storm, Spark, MySQL and Java.


