Generating reports from Unimus job failures
How to build failure reports from Unimus device jobs. Review large scale job reports for better understanding of why jobs fail on your devices.
“Why do our backups fail, Bruce? So that we can learn to fix them.”
Intro
With hundreds or even many thousands of jobs daily for some Unimus users, there are bound to be a few failed ones. Jobs such as Discovery
or Backup
fail for various reasons related to connection errors, refused credentials or unsupported devices. The results of failed jobs are all captured and error logs are viewable in the web GUI. From error logs one can learn the details of why a discovery or a backup for a device failed and use the information to remedy the situation.
In this brief article we will go over generating job report exports. These export files can then be processed externally by a monitoring system, reporting system, by an external parser, or by hand. But first, the error log data needs to be retrieved. The data is stored in the database
, so to access it we need the credentials to the database and a few well constructed queries. We will be showing how to generate reports of failed jobs and review them by hand in a spreadsheets editor (Excel) for its advanced data processing functions.
Getting to the data
Unimus supports MySQL
, PostgreSQL
and MSSQL
relational databases in addition to file based HSQL
. The former ones store data in tables with columns and rows. Databases use structured query language (SQL) for data manipulation and querying. We'll use such queries to extract the data about failed jobs and export it to a text file.
Connecting to the DB
In our test scenario an arbitrary Unimus server is using MariaDB, a database system, which is a fork of MySQL. For accessing MariaDB we are going to simply run the mariadb client from shell. Need to specify the host IP, database name and credentials:
mariadb --host=127.0.0.1 --database=unimusmdb --user=will -password
Constructing the query
Let's say we are interested in failed Discovery jobs. So for each device we want to get some info about the device itself and error logs for last job that was a failed Discovery
.
We will be using data from tables device
, device_history_job
and zone
. The device
table contains useful columns like id, address, description, model, type and vendor. The device_history_job
table is populated by useful data in the create_time, error_log, info, job_type, device_id and successful columns and the zone
table is used to describe device zone membership.
We SELECT
columns we want displayed FROM
the tables and LEFT JOIN
table device_history_job ON
id of 'last device job that was a Discovery' via a subquery and table zone ON
id of the zone. Then we filter the results with WHERE
by 'failed jobs'. And let's say we want to limit the results to recent ones, e.g. ones that took place in the last week. Our query then might look something like this:
We have used REPLACE
to output the error log on a single line for a more comprehensible way to display results.
Let's name the columns properly and put the query in quotation marks. Now we can feed the query into the mariadb command via --execute
option and write the output into a local file:
For failed Backup
jobs we would just change the job_type
in the WHERE section of the query to 'BACKUP'
. Also, since a successful Discovery is a prerequisite for a Backup job, we can select additional columns, like vendor, type and model, to describe the discovered device in more detail. Queries and shell commands for failed backups, along with Postgres and MSSQL versions can be found on our GitHub.
Reviewing the data
We are using OnlyOffice Excel in our scenario because it is free and gets the job done. To view the data simply open the file in your favorite spreadsheets tool and choose 'Tab' as the delimiter:
Alternatively you can import the data using the Data Import feature:
Now select the data and format it as a table for advanced filtering capabilities. You can then filter the results by devices, zones or specific keywords in the error log messages. Here is an example of how to filter the logs based on multiple criteria:
Conclusion
And that's basically it! This short guide should provide a starting point for generating and viewing failed job reports from Unimus. There is also a thread going on our Forum for any feedback, questions or possible improvements.