Do not Assume. Discuss and Add Test Cases

We all sometimes assume that things will work in a certain way. The assumption may hold today, but it can come back and bite us any day.

Take a simple client-server use case. There is usually some documentation describing the service and its basic configuration. But some minor details are taken for granted by every developer, so the client-side and server-side developers never discuss them and simply Assume.

Let us take an example of one such incident.

Scenario

A piece of client-server communication code had been working smoothly for years. One fine day, it stopped. Multiple clients were using the same service, and it started failing for only one of them.

There were no intentional specification changes on the service or the client, but we had upgraded the Spring Framework version from 4.2.9 to 4.3.17.

The Change

The Spring version upgrade changed the service's response as follows. Before the upgrade, the service sent its response with

Content-Type : application/xml;

but, after the upgrade, it got changed to

Content-Type : application/xml;charset=ISO-8859-1

Notice that a charset parameter has been added to the Content-Type. This broke one of our clients.

The Root Cause

AbstractHttpMessageConverter in spring-web changed between Spring 4.2.9 and 4.3.17 to add a charset to the Content-Type when one is not present (see the Spring documentation).

Why did it go wrong for only one client?

The service was not doing anything specific to handle the charset.

@RequestMapping(value = "<URL>",
    method = RequestMethod.POST, consumes = MediaType.APPLICATION_FORM_URLENCODED_VALUE,
    produces = MediaType.APPLICATION_XML_VALUE)
@ResponseBody
public String processRequest(HttpServletRequest request) {
    // do something 
}

Here, the service relied on MediaType.APPLICATION_XML_VALUE from Spring to take care of the Content-Type.

Now, the one client that failed had ASSUMED that 'Content-Type' would be exactly application/xml;, while the others, which did not fail, had ASSUMED that 'Content-Type' would contain application/xml;.

After the change in the default behavior of AbstractHttpMessageConverter, the first assumption was no longer true, and that client failed.
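The two assumptions can be sketched in plain Java. This is only an illustration: the class and method names are hypothetical and not taken from the real clients, and the header values are simplified to the bare media type and the media type plus charset. The third method shows one possible tolerant check that compares only the media type and ignores parameters.

```java
import java.util.Locale;

// Hypothetical illustration; none of these names come from the original clients.
final class ContentTypeAssumptions {

    static final String EXPECTED = "application/xml";

    // Assumption of the client that broke: the header equals the media type exactly.
    static boolean strictClientAccepts(String contentType) {
        return EXPECTED.equals(contentType);
    }

    // Assumption of the clients that kept working: the header contains the media type.
    static boolean lenientClientAccepts(String contentType) {
        return contentType != null && contentType.contains(EXPECTED);
    }

    // A more tolerant check: compare only the media type, ignoring parameters such as charset.
    static boolean mediaTypeMatches(String contentType) {
        if (contentType == null) {
            return false;
        }
        String mediaType = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return EXPECTED.equals(mediaType);
    }

    public static void main(String[] args) {
        String before = "application/xml";
        String after = "application/xml;charset=ISO-8859-1";
        System.out.println(strictClientAccepts(before));  // true
        System.out.println(strictClientAccepts(after));   // false
        System.out.println(lenientClientAccepts(after));  // true
        System.out.println(mediaTypeMatches(after));      // true
    }
}
```

The strict check is the only one that changes its answer once the charset parameter appears, which matches what we saw: one client failed and the rest carried on.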

What do we do now?

The Spring upgrade was necessary and could not be reversed. The client code was a remote application, so it was not easy to manage, and updating it was not in our control. We had to handle the change on the server side and limit the Content-Type to just application/xml;.

What did the Assumption cost?

Every mistake comes at a cost.

It took us a few days to identify the root cause, and a whole week of effort from five people for impact analysis, identification of regression areas, and validation across multiple clients and multiple client versions.

When touching legacy code, one has to be extra cautious.

  • The team may not have full functional knowledge.
  • The SME for the legacy code may no longer be available.
  • Identifying the impact and designing a solution with minimal side effects takes effort.

Takeaways from the incident

  • Never Assume. Discuss and document.
  • Never leave crucial things up to third-party libraries. Define them yourself.
  • Add a unit test case to validate the defined configuration for the service.
  • Add a contract test for all cross-application calls.
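As a rough sketch of the last two points, the agreed header value can live in one place and be checked by a test. Everything here is an assumption for illustration: the ContentTypeContract class, its violation method, and the choice of the bare media type as the agreed value are hypothetical. In a real build, the test would call the service, read the Content-Type header from the response, and pass it to a check like this one.

```java
// Hypothetical sketch; ContentTypeContract and violation(...) are illustrative names.
final class ContentTypeContract {

    // The value both teams agreed on and documented (assumed here to be the bare media type).
    static final String AGREED_CONTENT_TYPE = "application/xml";

    // Returns null when the header honours the contract,
    // otherwise a short description of the violation.
    static String violation(String actualHeader) {
        if (actualHeader == null) {
            return "Content-Type header is missing";
        }
        if (!AGREED_CONTENT_TYPE.equals(actualHeader.trim())) {
            return "expected '" + AGREED_CONTENT_TYPE + "' but got '" + actualHeader + "'";
        }
        return null;
    }
}
```

With a check like this in the service's test suite, a library upgrade that silently appends a charset would fail the build instead of failing a remote client.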

When we plan a contract test, many such minor items get discussed and answered beforehand. Adding one might not seem like a big deal at the time, but if we skip it, it becomes a big deal a few years down the line.

Happy Coding.