Our client used an advanced batch scheduling system to automate its end-of-day (EOD) processes. Most of the batch jobs were designed and built by the client’s internal IT team, with our team’s involvement ranging from none to moderate.
Our role was to monitor, support, and continuously improve our client’s EOD system. To provide the level of service our firm holds itself accountable for, we put automated monitoring controls in place, such as success/failure messages that are emailed out after each execution. We also maintain extensive documentation detailing every aspect of the client’s workflows from both a business (the reason) and a functional (the solution) perspective. When a breakdown occurs, it is this level of understanding that allows us to implement the appropriate resolution swiftly.
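As an illustration of the success/failure messages described above, here is a minimal sketch of how such a notification could be assembled. It is not the client’s actual scheduler configuration: the function name, the subject format, and the addresses are hypothetical, and the sketch only builds the message and leaves delivery to whatever mail relay is in use.

```python
from email.message import EmailMessage
from typing import Sequence


def build_status_email(
    job_name: str,
    succeeded: bool,
    detail: str,
    sender: str,
    recipients: Sequence[str],
) -> EmailMessage:
    """Build a success/failure notification for one batch job run.

    The subject layout is an assumption made for this sketch, chosen so
    that failures are easy to filter in a mailbox.
    """
    status = "SUCCESS" if succeeded else "FAILURE"
    msg = EmailMessage()
    msg["Subject"] = f"[EOD batch] {job_name}: {status}"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content(detail)
    return msg
```

The returned message can be handed to `smtplib.SMTP.send_message` once a mail server is available.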
Batch job creation and monitoring
Schedule setup and enhancement
Increase job run efficiency
Deep understanding of systems
While 100% uptime is always the goal, system breakdowns are inevitable. Our client had a failure in a process that pulls data from a trade management system to create and distribute reports for end users. This process generates multiple PDF reports, merges them into a single report, and emails that report to a company-wide distribution list.
When the failure occurred, our firm took immediate steps to resolve it. Through our deep understanding of our client’s batch scheduling system, we used its logging feature to identify that the breakdown had occurred while multiple reports were being merged into a single report.
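The kind of log review described above can be sketched as a small script that reports the first step to log an error. The log line layout shown here is an assumption made for illustration, not the scheduler’s real format, and the function and pattern names are hypothetical.

```python
import re
from typing import Optional, Tuple

# Hypothetical log line layout (an assumption for this sketch):
#   "<date> <time> <LEVEL> [<step name>] <message>"
LOG_LINE = re.compile(
    r"^(?P<ts>\S+ \S+) (?P<level>[A-Z]+) \[(?P<step>[^\]]+)\] (?P<msg>.*)$"
)


def first_failed_step(log_text: str) -> Optional[Tuple[str, str]]:
    """Return (step, message) for the first ERROR or FATAL line, or None."""
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") in ("ERROR", "FATAL"):
            return match.group("step"), match.group("msg")
    return None
```

As in the incident described here, the step that logs the error is not always the root cause, so the result is a starting point for tracing rather than a diagnosis.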
Our team then located the directory where the reports are stored. By referencing the documentation we had assembled, we quickly recognized that one of the reports was missing and that the error had actually occurred during report generation. Next, we located the stored procedure used to compile the reports. After reviewing it, we found that it was calling an SSIS package managed by one of the client’s internal teams.
We communicated the system failure to the client and worked with them to fix the case logic that had prevented the report from being generated properly. Our team was able to recognize that a failure had occurred, locate it, trace the breakdown back to its source, and work with our client to rectify the issue in a matter of minutes.
For our firm, however, the work doesn’t stop there. We then took a step back and considered how we could improve the system and process. One improvement was to modify this process so that a failure during report generation is recognized before the reports are merged. We accomplished this by building an additional task that verifies all reports exist before the merge is attempted. Once this long-term improvement to the report-building process was in place, we implemented similar checks for other comparable processes.
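The pre-merge check can be sketched roughly as follows. This is a simplified illustration and not the task as built in the client’s scheduler: the function names and file names are hypothetical, and treating an empty file as missing is an added assumption.

```python
from pathlib import Path
from typing import Iterable, List


def find_missing_reports(report_dir: Path, expected_names: Iterable[str]) -> List[str]:
    """Return the expected report files that are absent or empty in report_dir."""
    missing = []
    for name in expected_names:
        path = report_dir / name
        # Assumption: a zero-byte file counts as missing, since a failed
        # generation step could leave an empty file behind.
        if not path.is_file() or path.stat().st_size == 0:
            missing.append(name)
    return missing


def check_reports_before_merge(report_dir: Path, expected_names: Iterable[str]) -> None:
    """Raise before the merge step if any expected report is missing."""
    missing = find_missing_reports(report_dir, expected_names)
    if missing:
        raise FileNotFoundError(
            "Report generation incomplete; missing: " + ", ".join(missing)
        )
```

Running a check like this as its own task between generation and merge means a failure is reported against the step that actually caused it, with the names of the missing reports included in the message.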