Yeah, it’s another post on that hottest of topics – maintenance! I’m still waiting for my conference keynote…
Over the last few weeks I’ve watched a familiar story play out in one of my teams, and it’s worth capturing at a high level because I think it lays bare one of the big fallacies I see in the commercial versus open source space: namely, that support contracts are a reliable safety net.
The story goes like this: a user reported a problem to us. The specific nature of the problem isn’t very important, but the key thing is that there are two different systems in play and the problem is somewhere in between them. One is a commercial piece of software and one is open source.
My team did what they always do. They verified the error was real, captured some information about the nature of it, and then reported it to the commercial supplier, as we were pretty certain the bug was there. As always, the immediate response was that the error lies in the other system. The onus is now on us to prove that it doesn’t before we can access any substantial further assistance.
My team and some other colleagues knuckled down to days of work to prove that the error is where we think it is. In this case, that meant doing the following:
- Because the other system is open source, we put some additional debug in place on that side to get more information about what was happening.
- We created test scenarios and used automated testing tools (Selenium) to load test the error condition since it was intermittent.
- We systematically tested modified configurations within the commercial system to verify that it wasn’t a misconfiguration of the integration.
- We contacted other users of the commercial system to see if any of them had experienced the same issue.
- We put a workaround in place on the other system (it’s not elegant but prevents the worst user experiences from happening by gracefully failing).
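For what it’s worth, the idea behind load testing an intermittent error can be sketched in a few lines of Python. This is a rough illustration, not what we actually ran: `estimate_failure_rate` and `flaky_scenario` are made-up names, and the stand-in scenario fakes an intermittent fault with a random number rather than driving a browser through Selenium.

```python
import random
from typing import Callable, Tuple


def estimate_failure_rate(scenario: Callable[[], bool], runs: int) -> Tuple[int, float]:
    """Run `scenario` repeatedly and return (failure count, failure rate).

    A scenario counts as failed if it returns False or raises an exception.
    """
    if runs <= 0:
        raise ValueError("runs must be a positive integer")
    failures = 0
    for _ in range(runs):
        try:
            ok = scenario()
        except Exception:
            # An unexpected exception is treated as one more failed run.
            ok = False
        if not ok:
            failures += 1
    return failures, failures / runs


def flaky_scenario(rng: random.Random, failure_probability: float = 0.05) -> bool:
    """Hypothetical stand-in for a real test step (for example a scripted login).

    Returns True on success, and False roughly `failure_probability` of the time.
    """
    return rng.random() >= failure_probability


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so repeated runs are comparable
    failures, rate = estimate_failure_rate(lambda: flaky_scenario(rng), runs=500)
    print(f"{failures} failures in 500 runs ({rate:.1%})")
```

In a real setup the scenario function would wrap whatever browser automation reproduces the problem, and the same harness can be re-run against each modified configuration to see whether the failure rate changes.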
In short, we’ve done virtually everything you might do if you were running an open source product and solving the issue in conjunction with your community. The only thing we won’t be able to do is actually effect a fix in our own timescales once the issue is found…
In fairness, after a chunk of the actions above, we have been able to get help from our commercial support to put debug in place on their side too, and to look through the log files. In general though, we are still working on the basis that we have to put substantial effort into proving that the error exists in that system before we can access any assistance towards resolving it.
I do understand that we need to play an active role in finding these kinds of issues when they occur. However, it is our experience with many (but not all) suppliers that where we have support contracts in place, we aren’t working in partnership to find problems when they occur, but rather are doing a heck of a lot of work up front to prove that the error exists in order to get traction.
Thankfully I have colleagues with the skills required to tackle some of the kinds of investigation described above. What it’s like for others who don’t have this kind of support is a worrying thought. It’s yet another argument for not using outsourcing of your edtech to gut your IT capacity and skills. Support contracts are not worth the paper they are written on most of the time. And don’t get me started on trying to manage service level agreements…