Tracing Logs Through YARN Web UI


Hadoop lets us browse the logs of user applications through the YARN Web UI. System logs (syslog), standard error (stderr), and standard output (stdout) messages can be accessed from Tools –> Local Logs on the YARN Web UI.

Tracing Logs for Failed/Killed Jobs:

In the screen below, we run the aggregatewordcount MapReduce program, which builds an aggregate count of the words in an input text file. This example is bundled in hadoop-mapreduce-examples-2.3.0.jar under the share/hadoop/mapreduce directory.

The aggregatewordcount program actually expects its input in sequence file format rather than plain text, but we deliberately supply a text input file to make the job fail.
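As a rough sketch, the job can be submitted as shown below. The HDFS paths here are placeholders, and (per the example's usage) an optional argument after the reducer count selects the input format; when it is omitted, the job defaults to sequence file input, which is why a plain text file makes it fail:

    # Submit the aggregatewordcount example (HDFS paths are placeholders).
    # With no input-format argument the job reads sequence files by default.
    hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.3.0.jar \
        aggregatewordcount /user/hduser/input /user/hduser/output

    # Assumption: passing "textinputformat" after the reducer count
    # switches the job to plain text input.
    hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.3.0.jar \
        aggregatewordcount /user/hduser/input /user/hduser/output 1 textinputformat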

[Screenshot: terminal output of the failed job]

The advantage of accessing logs through the Web UI is that error messages are available on the terminal only as long as it stays open, whereas the same logs can still be browsed through the Web UI even after the terminal is closed. To check the logs of the above failed job through the Web UI:

1. Open Local Logs from the Tools menu on the front page of the YARN cluster.

[Screenshot: Local Logs link under the Tools menu]

2. Locate the userlogs/ directory under the /logs/ directory and open it.

[Screenshot: userlogs directory listing]

The above directory contains logs for all the applications run by the user. To check the logs of our failed job with ID application_*_0009, open its corresponding log directory, then open any container directory to browse the actual syslogs.

[Screenshot: log directory of job application_*_0009 and its containers]
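For orientation, the layout under the local logs directory typically looks like the sketch below; the application and container IDs are made up for illustration:

    # Each application gets a directory, with one subdirectory per container.
    $ ls logs/userlogs/
    application_1421144918938_0009

    $ ls logs/userlogs/application_1421144918938_0009/container_1421144918938_0009_01_000001/
    stderr  stdout  syslog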

3. Open the syslog file for detailed log information.

[Screenshot: syslog file contents]

i) Job ID initialization and the status transition from NEW to INITED can be seen as shown below.

[Screenshot: syslog showing job status transitions]

ii) Later, the job status changes from INITED to SETUP and from SETUP to RUNNING, as shown in the above screen.

iii) Once the job starts running, its map and reduce tasks are initiated, and their status changes from NEW to SCHEDULED.

[Screenshot: syslog showing task status transitions]

iv) Finally, the exception messages are listed, as shown below.

[Screenshot: exception messages in the syslog]

From the above messages we can see that the aggregatewordcount program expects a sequence file as input rather than a plain text file.
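For reference, the kinds of syslog lines described above look roughly like the excerpt below; the timestamps, IDs, and paths are illustrative, and the exact wording can differ between Hadoop versions:

    JobImpl:  job_1421144918938_0009 Job Transitioned from NEW to INITED
    JobImpl:  job_1421144918938_0009 Job Transitioned from INITED to SETUP
    JobImpl:  job_1421144918938_0009 Job Transitioned from SETUP to RUNNING
    TaskImpl: task_1421144918938_0009_m_000000 Task Transitioned from NEW to SCHEDULED
    ...
    java.io.IOException: hdfs://localhost:9000/user/hduser/input/words.txt not a SequenceFile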

Below is a snapshot of the final status of the job:

[Screenshot: final job status]

Thus, by tracing the syslogs of applications, we can analyze the status of map/reduce tasks, job status transitions, Java exception messages, and any informational or warning messages.
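As a side note, when log aggregation is enabled the same logs can also be fetched from the command line with the yarn logs tool; the application ID below is a placeholder:

    # Fetch the aggregated logs of a finished application (ID is a placeholder).
    yarn logs -applicationId application_1421144918938_0009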


About Siva

Senior Hadoop developer with 4 years of experience in designing and architecting solutions for the Big Data domain, involved in several complex engagements. Technical strengths include Hadoop, YARN, MapReduce, Hive, Sqoop, Flume, Pig, HBase, Phoenix, Oozie, Falcon, Kafka, Storm, Spark, MySQL and Java.



2 thoughts on “Tracing Logs Through YARN Web UI”

  • Ajay

    Hi Siva,
    Hope you are doing well.
    We are working on an application to implement a data warehouse on HDFS.
    The flow is like below:
    Oracle –> ETL –> Edge Node –> Hive tables –> Final Tables
    We are using Spark for moving files to HDFS and for transformations on tables in Hive. For the transformations we are using Spark SQL, but they involve various cross joins and non-equi joins, which cause many performance issues while running Spark: it always fails saying physical memory exceeded, killing container.

    Could you please help me understand, from the YARN logs, what is happening internally?

    • Siva Post author

      You need to find the application master, something like application_387453498303_0002, and look at its logs, or find the corresponding Spark job UI (which will be at something like http://master:4040/), see at which action it is failing, and try to find an alternative big-data-style approach instead of typical traditional RDBMS-style joins.
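      For illustration, a minimal sketch of those two steps: pulling the failed application's aggregated logs, and raising Spark's executor memory and YARN overhead. The application ID, sizes, and job file are made-up placeholders, and spark.yarn.executor.memoryOverhead is the property name on older Spark-on-YARN releases:

          # Pull the application's aggregated logs (ID is a placeholder).
          yarn logs -applicationId application_387453498303_0002 | less

          # Illustrative memory settings; tune the sizes to your cluster.
          spark-submit --master yarn \
              --executor-memory 4g \
              --conf spark.yarn.executor.memoryOverhead=1024 \
              your_job.py   # placeholder job file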

