Top 10 tips for Priority Issue Handling in IT Operation Support


At the time of writing this blog I have 17 years of experience in IT operations support, and I have been at the center of, and conquered, many firefighting situations.

Over the years, while troubleshooting those priority issues, I came across very interesting human mindsets and flaws in infrastructure and processes. It built my confidence to the extent that I started enjoying it 😊

Every priority issue adds so much to your instincts that whoever comprehends it dives in and survives.

Although the following tips are baselined around database technology, they are not limited to it and can be used by anyone.

A priority issue means that something in a business-critical application is not working as expected, causing unavailability and revenue loss.

Assuming you are a technical support engineer or manager whose primary responsibility is to resolve the situation for your customer, let's go.

1. Get All Questions Answered

You are lucky if you are the first one to sense that an issue has happened, but it is fine even if you are not.

As soon as you enter the battleground, which could be a conference call, a group chat, or a set of impatient minds around your workstation, keep in mind that in the first few minutes your questions to the audience will be answered with great respect and a quick response time.




Questions should cover the exact time the issue was triggered; recent changes to the system by anyone in the arena (application developers, system engineers, cloud admins, DBAs, storage and network teams, etc.); and errors, slowness, or any other observations from the owners of the respective domains.

Half of the battle is won if you get near-accurate answers to your questions from all angles, which is rarely the case 😄


2. Fact Check to Relate the Cause

By now you should have at least a 50% understanding of the issue, but it is too early to jump to the exact cause.

You have heard enough from others; now it's time to check the facts by mining through your own system.

This is when you will realize whether the issue lies in your area or not, but never disclose it too soon, as it may come back at you like a neighboring enemy.

If the issue most likely stems from something going wrong in your system, be ready with your weapons of information.
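
One way to line up the facts is to compare the reported changes against the time the issue started. Below is a minimal sketch in Python, assuming you can collect the recent changes as simple records with a timestamp. The field names, the sample data, and the 24-hour window are all hypothetical and only there for illustration.

```python
from datetime import datetime, timedelta


def changes_before_incident(changes, incident_start, window_hours=24):
    """Return changes made in the window before the incident, newest first."""
    window_start = incident_start - timedelta(hours=window_hours)
    recent = [c for c in changes if window_start <= c["time"] <= incident_start]
    return sorted(recent, key=lambda c: c["time"], reverse=True)


# Hypothetical sample data; in practice this comes from your change records
# and from the answers you collected in tip 1.
incident_start = datetime(2024, 1, 10, 14, 30)
changes = [
    {"time": datetime(2024, 1, 10, 13, 45), "team": "dba", "what": "index rebuild"},
    {"time": datetime(2024, 1, 8, 9, 0), "team": "network", "what": "firewall rule update"},
    {"time": datetime(2024, 1, 9, 22, 0), "team": "storage", "what": "volume resize"},
    {"time": datetime(2024, 1, 10, 15, 0), "team": "app", "what": "hotfix deploy"},
]

suspects = changes_before_incident(changes, incident_start)
for c in suspects:
    print(f'{c["time"]:%Y-%m-%d %H:%M}  {c["team"]:<8} {c["what"]}')
```

Changes made after the issue started, or long before it, drop out of the list, which leaves a short set of suspects to verify against your own logs.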


3. Known Issue Scan

Each one of us has a great computing and storage mind, and nothing escapes it if scanned wisely.

Try correlating the symptoms with issues you handled in the past. There should be a KEDB (Known Error Database) in the problem management world, but finding where it is kept may take more brainwork.

If the issue is known (or close to a known one), you are close to finding its possible causes.


4. Bring All Resources on Table



This is the time to take a small step aside and think outside your system: how will you find the resolution? What resources will you need, such as more pairs of eyes, vendor support, more logs or statistics from elsewhere, a change process, downtime, etc.?

List them all and put them in front of the stakeholders to keep them engaged as helping hands.

5. Multi-Directional Research for Common Causes

Start engaging your resources in parallel.

If the root cause has not been identified by now, research should begin at all possible levels.

A quick turnaround is seen if each domain involved puts 100% effort into finding the issue and generously owns the situation.


6. Handling Uncertainty & Queries

The outcome of the previous step will give rise to many questions, but handling them sportingly keeps everyone motivated to solve the problem collaboratively.

Some may feel that the troubleshooting done so far has been a waste of time and has not resulted in anything, but that is not true.

Consider it a checklist that you have ticked off, narrowing things down toward the actual resolution.


7. Keeping to Timelines

Pressure will be building, and every suggested task will be given a timeline by incident management (as they have already started thinking about the post-mortem).

The trick to avoiding over-expectation in this situation is to commit less and deliver more.


8. Transparent Approach & Action Planning

When you are close to finding a resolution but the fixes are complex and can expose gaps in process, human error, or system bugs, it is important to be transparent with the audience and to present a proper action plan with the associated risks.

The more you tolerate during the issue, the more you can relax in the aftermath.


9. Timely Updates on Progress

By now the resolution actions have begun. They can sometimes bring unpredictable outcomes, so a supplementary plan for them is required.

You will notice that in this phase every activity is communicated to a very high level as business communication, which adds to its importance.

The implemented solution should be verified by all-round testing before concluding.


10. Documentation & Learnings or Takeaways for the Future

Documentation should consist of the sequence of actions, the time taken for every task, the people involved, external support, and the process followed.
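
As a rough illustration of capturing the action sequence and the time taken per task, here is a small Python sketch. It assumes each action was noted with an owner, a start time, and an end time; the log entries and field layout are hypothetical, not a prescribed format.

```python
from datetime import datetime


def task_durations(log):
    """Return (task, owner, minutes) for each (task, owner, start, end) entry."""
    return [
        (task, owner, int((end - start).total_seconds() // 60))
        for task, owner, start, end in log
    ]


def at(text):
    """Parse a 'YYYY-MM-DD HH:MM' timestamp (format chosen for this example)."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M")


# Hypothetical incident log, in the order the actions were taken.
log = [
    ("Collect alert logs", "dba", at("2024-01-10 14:35"), at("2024-01-10 14:50")),
    ("Vendor call", "manager", at("2024-01-10 14:50"), at("2024-01-10 15:30")),
    ("Apply fix and test", "dba", at("2024-01-10 15:30"), at("2024-01-10 15:40")),
]

timeline = task_durations(log)
total_minutes = sum(minutes for _, _, minutes in timeline)

for task, owner, minutes in timeline:
    print(f"{task:<20} {owner:<8} {minutes:>3} min")
print(f"Total: {total_minutes} min")
```

Even a simple table like this makes it easier to see where the time went when the issue is reviewed later.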



Every such issue brings new learning to the engineer's or manager's approach for the future; it should be analyzed deeply and shared widely to evolve your best practices.


Best of luck!


