Allari - Providing IT as a Service

Incident Brief: EnterpriseOne Job Scheduler Outage

Executive Summary

Users at a global producer of cable management products, which runs its accounting, manufacturing and distribution operations on JD Edwards EnterpriseOne version 9.2, reported an outage of the application's scheduler function. Between the first report and service restoration, hundreds of jobs across the business failed to process.

Key Failure Points 

Troubleshooting identified the approximate time of the scheduler's failure, but no specific cause could be determined beyond a possible isolated network issue.

Corrective Action Taken

The scheduler was restarted manually and the failed jobs were resubmitted. In addition, a process was set up to submit a test job automatically every 60 minutes on a continuous basis, with an alert triggered if any test job fails to complete without error within five minutes of submission.
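The test-and-alert process described above can be sketched roughly as follows. This is a minimal illustration, not the process that was actually deployed: the brief does not say how jobs are submitted or monitored, so `submit_job`, `job_status` and `send_alert` are hypothetical callables the reader would supply, and the status strings "DONE" and "ERROR" are assumptions. Only the 60-minute interval and the five-minute completion window come from the brief.

```python
import time
from typing import Callable

SUBMIT_INTERVAL_S = 60 * 60      # from the brief: one test job every 60 minutes
COMPLETION_TIMEOUT_S = 5 * 60    # from the brief: alert after five minutes


def check_scheduler(
    submit_job: Callable[[], str],        # hypothetical: submits a test job, returns its id
    job_status: Callable[[str], str],     # hypothetical: returns "DONE", "ERROR" or anything else
    send_alert: Callable[[str], None],    # hypothetical: pages or emails the support team
    timeout_s: float = COMPLETION_TIMEOUT_S,
    poll_s: float = 15.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Submit one test job and return True only if it completes cleanly in time."""
    try:
        job_id = submit_job()
    except Exception as exc:
        send_alert(f"test job could not be submitted: {exc}")
        return False

    deadline = clock() + timeout_s
    while clock() < deadline:
        status = job_status(job_id)
        if status == "DONE":
            return True
        if status == "ERROR":
            send_alert(f"test job {job_id} ended in error")
            return False
        sleep(poll_s)  # still queued or running: wait and poll again

    send_alert(f"test job {job_id} did not complete within {timeout_s:.0f}s")
    return False


def monitor_forever(submit_job, job_status, send_alert) -> None:
    """Run one check per interval, keeping the 60-minute cadence steady."""
    while True:
        started = time.monotonic()
        check_scheduler(submit_job, job_status, send_alert)
        elapsed = time.monotonic() - started
        time.sleep(max(0.0, SUBMIT_INTERVAL_S - elapsed))
```

Injecting the clock and sleep functions keeps the check testable without waiting five real minutes. In practice the loop in `monitor_forever` could just as well be replaced by an external timer, provided that timer does not depend on the scheduler being monitored.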

Lessons Learned

The failure occurred despite a morning systems check that showed the scheduler was running and operating properly. Running an automated test and alert process on a repeating basis enables a quicker response to future scheduler failures and narrows the window of time in which to isolate the root cause of an incident.

Challenge

UNPLANNED DOWNTIME

Definition
Downtime describes any period when a service is unavailable to its intended recipients. Although downtime can be planned months in advance, it typically is not, and it often comes as a surprise.

Most downtime events are unplanned. They are either caused by a failure or triggered on short notice by an attempt to fix a service that is not performing at its optimal level.


Signs & Symptoms
Downtime is the number one cause of financial harm, yet most IT leaders do not recognize the signs and symptoms of an environment that experiences too much unplanned downtime.

Sure, it's easy to surmise that systems are offline more than they should be, especially when management is enraged, but there are legitimate signs and symptoms that, once recognized, allow you to reduce the frequency and impact of unplanned outages:

  • Unauthorized Changes
  • High amounts of Unplanned Work
  • Low Throughput of Effective Change
  • Server to Administrator Ratios < 100:1
  • Lack of Indicator Measurements
  • SLA Commitment Breaches

Top 3 Ways To Prevent Downtime

1. Implement Preventive Maintenance Schedules

2. Execute Pre-Business System Checks

3. Automate Measurements & Indicators