How to effectively troubleshoot product configuration errors, crashes, and bugs?
In a recent podcast, I interviewed Luigi Silipo, the global support manager here at Boyum IT. In this post, I share the highlights of his recommendations for troubleshooting a support case effectively.
Ineffective troubleshooting often involves many questions, many tests, and requests for information that might not even be relevant to fixing the problem. And of course, it costs a lot of time and effort for a customer who wants a quick fix.
So, how can support and implementation consultants troubleshoot a problem in less time, make the process more effective, and ask the right questions?
Luigi, please share a bit about what you do as a support manager.
Well, we handle support requests from around the world for all our products, which are different software solutions.
As you know, we have vertical solutions, which address the logistics and the manufacturing markets. Then we have horizontal solutions, which are mainly used in SAP Business One for general usability improvements.
We usually answer support requests and support tickets on these kinds of products. They range from general usability questions to more complex configurations and problem-solving, like errors, bugs, or unhandled cases that throw something onto the user's screen.
Luigi, you've been in support for long enough, and you've seen many, many tickets. So what would you say is the range of tickets that you receive?
The most common type of ticket we get is for complex configurations. Both WMS and Beas Manufacturing are highly flexible, highly configurable software solutions.
You might be trying to achieve a specific result. You might be expecting the result to be shown in a particular way. And the software is not performing as expected.
In general, our horizontal solutions are very well established on the market in terms of technology foundation. And the time they have been around in the SAP Business One world makes them stable and reliable.
When someone excels in troubleshooting, they know what questions to ask. They don't waste time collecting information they don't need. Besides being a better experience for the customer, it's also far more efficient for the customer support and consultant teams.
The question is how best to figure out the root cause of the problem before contacting Boyum support, and how to make the process more efficient to shorten the resolution time. What would you say is the best troubleshooting process for those consultants and support to go through?
One of the most critical steps when running a live environment is to set up a replica (test system) of that live environment and try to have the most recent copy of customer data in that test environment. So, if anything happens in the live environment, you can quickly test it in your test environment and try to reproduce the issue.
Another thing that is extremely important for more complex scenarios is to provide a detailed document (a Word document). The document must include the exact steps that a person performed to get to the point where you see an error or don't get the expected result.
Why am I stressing these points?
Boyum vertical products are highly flexible and configurable software solutions. Therefore, we must try to remove anything that is not relevant to the business scenario.
That means any third-party add-ons that might be running in your customer environment, and any customization or custom programming that has been added.
This is one of the most challenging parts to get through with our partners because, of course, one of the objections is, 'but I need that custom code. So why should I be turning it off?'
We are not asking you to run your environment without it permanently. It's just to get to a faster and more effective root cause identification.
Last, has something changed in the system that you're using? I mean, did you install something? Have you done something differently from the way you were working before? That's important. We often get cases where we read, 'it was up and working until yesterday, and now it's not working'. And that's exactly where these kinds of questions come into play.
What changed in the software between the last time it was working and now that it's not working anymore? Is there anything new happening?
It might even be third-party software that is not even related to SAP Business One. It could be, for example, an antivirus that has suddenly started to do something different, or new software that has been installed on your server. That could be unpredictable and conflict with one of our solutions.
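The "what changed?" question can be made mechanical if you keep a record of the system state while things are working. The following is a minimal sketch, assuming you can export settings and installed software as simple name/value pairs. The function name `diff_settings` and every key in the sample snapshots are made up for illustration and are not real Boyum or SAP Business One settings.

```python
def diff_settings(before, after):
    """Return entries that were added, removed, or changed between two snapshots."""
    changes = {}
    for key in sorted(set(before) | set(after)):
        old = before.get(key, "<missing>")
        new = after.get(key, "<missing>")
        if old != new:
            changes[key] = (old, new)
    return changes


# Hypothetical snapshots: one taken while the system worked, one after the issue appeared.
working = {
    "antivirus_realtime_scan": "off",
    "addon_x_version": "1.4",
    "custom_code_enabled": "yes",
}
broken = {
    "antivirus_realtime_scan": "on",
    "addon_x_version": "1.4",
    "custom_code_enabled": "yes",
    "new_server_tool": "installed",
}

for name, (old, new) in diff_settings(working, broken).items():
    print(f"{name}: {old} -> {new}")
```

Each line it prints is a candidate to mention in the ticket or to revert first in the test system.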
So, these are the four leading suggestions:
- Create a test system of your live environment
- Please provide us with a detailed document including steps to reproduce the issue
- Remove anything that is not relevant to the business scenario
- What was changed in the system?
What happens when sometimes you cannot reproduce the issue? What should a person do at that stage?
Well, that's where the trickiest conversations honestly start. As you can understand, we can't report an issue to our developers if it is not reproducible. It's tough to ask somebody to fix something when they don't know how it is happening. So, my suggestion, as I said, is to have a replica of your live environment. If you cannot reproduce the issue even there, what we can do as Boyum support consultants is join a session with our partners. We try to replicate the business scenario along with the person who is reporting the error, and try to understand what is making the software not perform in the right way. But if something is not reproducible, it's complicated for us at the support desk to take on the issue and report it to our developers.
How important is the process of elimination of the problem?
It is the most important thing you can do to troubleshoot an issue. Sometimes you might believe that a setting you just touched in the software is entirely irrelevant, when in fact it is affecting a completely different area.
And then you end up realizing that it was just that switch you touched that is creating the problem in a completely different area of the software. So, the elimination process is absolutely one of the most important things you have to do. Track backward, step by step, for as long as you can remember what you've been doing in the software in the last few hours or the previous few days. Try to revert every step you have taken. You can revert that setting or change the configuration back to what it was before the issue started, to identify the potential root cause.
If you have just introduced new software, switch it off and test the process again. Is the process still throwing the error?
No? Okay. Then most likely the new software that you activated is creating the problem. That's sometimes a time-consuming process that requires a lot of patience, attention, and open-minded, critical thinking.
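The elimination process described above can be sketched as a loop: switch off one thing at a time, rerun the failing scenario in the test system, and note which switch makes the error disappear. This is only an illustration under stated assumptions. `find_culprits` and `issue_reproduces` are hypothetical names, the component list is invented, and in practice the "test" is a person repeating the documented steps, not a function call.

```python
def find_culprits(components, issue_reproduces):
    """Switch off one component at a time and retest.

    `issue_reproduces(active)` is a hypothetical callback: it stands for
    running the failing business scenario in the test system with only the
    `active` components enabled, returning True if the error still appears.
    """
    culprits = []
    for candidate in components:
        active = [c for c in components if c != candidate]
        if not issue_reproduces(active):
            # The error vanished once `candidate` was off, so it is a suspect.
            culprits.append(candidate)
    return culprits


# Simulated test run for illustration: the error appears whenever the
# made-up "new_addon" component is active.
def simulated_test(active):
    return "new_addon" in active


print(find_culprits(["custom_code", "antivirus", "new_addon"], simulated_test))
```

Testing one component per run is slow but keeps each result easy to interpret, which matches the step-by-step approach described in the interview.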
We often get support tickets where the problem is described as a bug or as an error. Still, it ends up being a configuration issue. What can you say to help partners tell a configuration issue from an actual bug in the software? What would you say is the easiest way to find out?
We sometimes get questions that are just general configuration issues, or reports of what the requester believes to be a bug. Our job in support is also to provide advice to the best of our understanding of the problem. As support consultants, we usually try to stop at the basic settings. We do not offer implementation support. We stick to basic advice: switch this on, or switch that off, to get the results you want. What we always try to do is refer our requesters to either online documentation or YouTube videos.
We strongly advise that they follow our e-Learning programs and get certified on our vertical and horizontal solutions, as we have plenty of material that covers even complex business scenarios and trickier configurations.
Our best suggestion is to follow the e-Learning and enablement programs, to get out of the way all those cases that come down to fundamental knowledge gaps.
When facing an issue that brings the company down, for example, not being able to execute sales orders, what would you say is the first step in analyzing a ticket like that?
It is generally the same process of going backward to the last time you changed something and then removing anything that might have been creating the issue.
It's easier to find the root cause in such cases because business-critical functionality like that is typically used by all users every day, so most of the time it is working. So, you should be able to identify why it has suddenly stopped working.
If you cannot solve the issue, open a support ticket in an urgent status. Urgent status means your system is down, and there is no reasonable workaround in a business-critical functionality.
The first thing we do is hold a remote session to understand the cause of the problem and then try to get you up and running again. If the problem cannot be solved with the support agent, we have an SLA policy to try to get you back up and running within two hours of receiving the request, and we try to involve our senior developers.
Before a partner logs a ticket with support, what would you say is the most important information to fill in through the website? Is there any information that is often missing that must be on a ticket?
One of my goals is to make some improvements to that webform. Today, all the information that the partners are inserting there is primarily manual.
My suggestion is to try to follow the guidelines. First, set the ticket's priority correctly. Second, attach a Word document to the ticket with the steps to reproduce the problem.
By steps, I mean the screens, the clicks, and the individual options the person selected to get from point A to point B. That's extremely important.
Another essential piece of information that we need on the ticket is the installation number. So again, please try to follow the format we are requesting because that also speeds up the ticket processing time.
Once we get the correct installation number, we can immediately check who the partner is, who the customer is, what the maintenance status is, and what version they're running. So, everything goes smoother and faster.
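The three items mentioned here (priority, steps to reproduce, installation number) can serve as a simple pre-submission checklist. The sketch below is an assumption-laden illustration and not the actual web form: the `Ticket` class and its field names are invented, and the installation number is only checked for presence because the required format is not given in this post.

```python
from dataclasses import dataclass, field


@dataclass
class Ticket:
    # Hypothetical field names; the real web form may differ.
    priority: str = ""
    installation_number: str = ""
    reproduction_steps: list = field(default_factory=list)


def missing_items(ticket):
    """List the checklist items that are still empty on the ticket."""
    missing = []
    if not ticket.priority.strip():
        missing.append("priority")
    if not ticket.installation_number.strip():
        missing.append("installation number")
    if not ticket.reproduction_steps:
        missing.append("steps to reproduce")
    return missing


draft = Ticket(priority="urgent", reproduction_steps=["Open the sales order screen"])
print(missing_items(draft))
```

A partner could run through the same checklist by hand; the point is only that an empty list means the ticket carries the minimum information described in the interview.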
It would be great to improve the web form. It has been a step forward from email, but making more improvements would be great.
You're touching on an excellent point. We have seen an outstanding improvement in terms of the content of the support requests we're receiving. It's been one year since we moved from email to web form. And I think we can safely say that there has been a significant improvement for our partners and our support agents.
Yeah.
We've been doing text-based support for a while, sometimes going back and forth with partners for a long time. Should we do a screen-share for most tickets coming from our vertical solutions, to speed up the resolution time by seeing the problem with our own eyes, because it's sometimes tricky to document it?
As a general rule of thumb, once support agents have exchanged the second or third mail on a ticket, they go into a session to avoid long chains of messages. Of course, a remote session helps in communicating and usually also helps in understanding. But I must tell you that at the same time, our partners are being very, very precise.
Most of the time, the more experienced partners in particular, the ones with certified consultants who know the software and know the processes, submit the documents and give us the steps to reproduce. So, in such a case, a session is not needed because the issue reported is clear. The support agent can pick it up, reproduce it immediately in one of our environments, and see whether the error needs a fix from our developers.
You've been managing support for the past two years or more. So, what's coming up for Boyum and the support department?
We are working heavily to improve our answer bot. This is an automated system that pulls information from the knowledge base to provide answers to the requester while they submit the ticket.
We are also making sure agents propose articles or troubleshooting guides when they receive a ticket.
We are also working in close collaboration with development to increase and improve the quality of the content that we have in the knowledge base.
We identify areas where the software could behave entirely differently depending on your settings, and we are starting to provide business scenario configuration advice. For example, how should you set up the system if you want to achieve specific results?
That's to avoid misinterpretation of some functionality, which leads to errors or long support tickets. So that's something we're currently doing.
Before we conclude, do you have anything else to say to our partners that are maybe struggling day-to-day on their support tickets?
Keep up the excellent work. Thanks very much for the cooperation you're putting into this.
I know that sometimes it's challenging. We sometimes ask you to upgrade, but not because we want to deny your case or turn it down. It's just because we know that there are fixes and improvements in the latest software versions, better technology, more stability, or better alignment with the latest releases of SAP. So sometimes we ask you to make the additional effort.