Optimising Legacy Systems for Performance and Reliability
Imagine a team worked very hard to assemble Lego bricks into a train. Two years later, you are brought onto the team and asked to disassemble it and rebuild a faster, more reliable train using the same Lego parts.
Here is what I went through when I was tasked with a similar mission recently.
The most important KPIs I measured against the non-functional requirements were (a timing sketch follows this list):
- CPU and memory usage, and network latency
- Query performance (against the database in a monolith architecture, or against each micro-service)
- Request Rate
- Throughput
- Response time
- Amount of data processed on the server vs. client
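A cheap way to capture request rate and response time consistently is at the pipeline level. Below is a minimal sketch, assuming an ASP.NET Core application; the middleware name is illustrative and not part of the original system:

```csharp
// Minimal request-timing middleware sketch (ASP.NET Core assumed).
// RequestTimingMiddleware is a hypothetical name used for illustration.
using System.Diagnostics;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Http;
using Microsoft.Extensions.Logging;

public class RequestTimingMiddleware
{
    private readonly RequestDelegate _next;
    private readonly ILogger<RequestTimingMiddleware> _logger;

    public RequestTimingMiddleware(RequestDelegate next, ILogger<RequestTimingMiddleware> logger)
    {
        _next = next;
        _logger = logger;
    }

    public async Task InvokeAsync(HttpContext context)
    {
        var stopwatch = Stopwatch.StartNew();
        await _next(context);   // run the rest of the pipeline
        stopwatch.Stop();

        // These figures feed the response-time and throughput KPIs above.
        _logger.LogInformation("{Method} {Path} responded {StatusCode} in {Elapsed} ms",
            context.Request.Method, context.Request.Path,
            context.Response.StatusCode, stopwatch.ElapsedMilliseconds);
    }
}

// Registered once in Program.cs: app.UseMiddleware<RequestTimingMiddleware>();
```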
I also looked at the following smoking guns:
- Over-engineered parts
- External libraries (I reevaluated the need for each of them)
- Missing or misused in-memory caching for dynamic data and CDNs for static data
- Single points of failure
- Incorrect use of asynchronous calls and concurrency (see the sketch after this list)
- Unreliable tests on like-for-like environments (in my experience, this was due to performance and stress tests running against APIM with a developer license on the dev environment; that license can only handle up to 1,000 calls). This gave me a false indication of how production performs, but it went unnoticed until investigated
- Frontend logic vs. backend (i.e. using JavaScript heavily on the frontend to invoke APIs written in .NET, vs. calls invoked from MVC clients)
- Quick wins: code minification, image compression, CSS optimisation, etc.
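To make the asynchronous-misuse point concrete, here is a minimal sketch with hypothetical service and endpoint names: blocking on `.Result` mid-pipeline versus awaiting end-to-end and running independent calls concurrently.

```csharp
// Sketch of a common async misuse and its fix. OrderService and the URLs
// are illustrative, not taken from the original system.
using System.Net.Http;
using System.Threading.Tasks;

public class OrderService
{
    private static readonly HttpClient Http = new HttpClient();

    // Anti-pattern: blocking on async work. This ties up a thread-pool thread
    // and can deadlock under older ASP.NET synchronisation contexts.
    public string GetPricesBlocking()
    {
        return Http.GetStringAsync("https://example.com/prices").Result;
    }

    // Fix: await end-to-end, and let independent calls run concurrently.
    public async Task<(string Prices, string Stock)> GetPricesAndStockAsync()
    {
        var pricesTask = Http.GetStringAsync("https://example.com/prices");
        var stockTask  = Http.GetStringAsync("https://example.com/stock");

        await Task.WhenAll(pricesTask, stockTask); // both requests in flight at once

        return (pricesTask.Result, stockTask.Result); // safe: both tasks already completed
    }
}
```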
I created a report containing all of the above before making any attempt to disassemble a working system, albeit one that end-users found sluggish and unreliable. Bear in mind that compromises are inevitable, so I recorded every decision along the way using ADRs (Architecture Decision Records) throughout the re-engineering process; a minimal example follows.
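For reference, each ADR is a short, versioned note capturing one decision. Here is a minimal sketch in the widely used Nygard format; the number and wording are illustrative, based on the ORM result reported in the table below:

```markdown
# ADR 007: Keep Entity Framework rather than migrating to Dapper

## Status
Accepted

## Context
Replacing EF with Dapper was trialled for query performance, but measurements
showed a net loss of milliseconds (see the results table).

## Decision
Retain Entity Framework and optimise the queries it generates instead.

## Consequences
No hand-written SQL to maintain; query generation stays under EF's control.
```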
The top trade-offs I encountered were:
- Monetary (budgeting and costs)
- Cloud limitations and cloud patterns (for mostly hybrid applications: on-premise + cloud, or multi-cloud ones)
- Performance vs. reliability, despite both being the main goals of this post
- Security
- Integration contracts (commonly Swagger/OpenAPI) with third-party systems and with the other micro-services that live under the umbrella of the project in question
- Application versions and dependency versioning (backward compatibility)
- Constant changes to business requirements while refactoring the current implementation of signed-off requirements
- Managing the non-technical side, such as teams' resistance to change and team dynamics
The following table shows, in numbers, the performance gains I managed to win with each change:
| Technique | Gain | Comments |
|---|---|---|
| Replacing the ORM | Negative gain 🙁 | Milliseconds were lost because Entity Framework generates the SQL for you, whereas Dapper delegates that activity to developers. |
| Introducing full-text search and indexing the relevant columns | 1 second gained per request 🙂 | Full-text search optimises string matching. Massive win. |
| DNS caching | Milliseconds (estimated) | Hard to measure, but I gained reliability more than performance. |
| Compressing calls | Up to 500 milliseconds | Aggregate where possible. Build queries as IQueryable and only iterate over them as IEnumerable at the end; don't confuse the two, as this helps EF generate optimised queries (see the sketch after this table). |
| Removing external libraries | Reliability and consistency gain | Only use libraries that you trust and that are consistent with the application's logic. I removed the library that created an in-memory database for unit tests and replaced it with Moq, which relieved the server's memory. |
| Removing static assets that are no longer in use, minifying the code, and serving it from a CDN | Milliseconds | All these static assets can be safely cached. |
| Database structure, connection pool, and locking mechanism | Up to 1 second | Depends on how many calls exceed the APIM license limit and are left waiting to be served. |
| Overall | Up to 2 seconds, with a maximum of 7 seconds gained | Great for applications with limited bandwidth and budget, and for applications running on low-end cloud licenses. |
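Since the IQueryable/IEnumerable distinction in the table trips many people up, here is a minimal sketch, assuming EF Core; the Order entity and AppDbContext are hypothetical:

```csharp
// IQueryable vs IEnumerable: where the filtering actually happens.
// Order and AppDbContext are illustrative EF Core types.
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

public class Order
{
    public int Id { get; set; }
    public string Status { get; set; } = "";
    public DateTime CreatedAt { get; set; }
}

public class AppDbContext : DbContext
{
    public DbSet<Order> Orders => Set<Order>();
}

public static class OrderQueries
{
    // IQueryable keeps the filter as an expression tree, so EF translates the
    // whole chain into a single SQL statement with a WHERE clause.
    public static IQueryable<Order> PendingServerSide(AppDbContext db) =>
        db.Orders.Where(o => o.Status == "Pending");

    // AsEnumerable() switches to LINQ-to-Objects: EF materialises EVERY order
    // first (SELECT * FROM Orders) and the filter then runs in app memory.
    public static IEnumerable<Order> PendingClientSide(AppDbContext db) =>
        db.Orders.AsEnumerable().Where(o => o.Status == "Pending");
}
```

The first shape lets EF generate one optimised SQL statement; the second pulls the whole table into memory before filtering.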
P.S. All my posts are drawn from experience, not from AI tools like ChatGPT.