
Optimising Legacy Systems for Performance and Reliability

September 16, 2024 · Azure, Software Engineering, Tools

Imagine a team worked very hard to assemble Lego bricks into a train. Two years later, you are brought into the team and asked to disassemble it and rebuild a faster, more reliable train using the same Lego parts.

Here is what I went through when I was recently tasked with a similar mission.

The most important KPIs I measured against the non-functional requirements were (a measurement sketch follows the list):

  • CPU and memory usage, and network latency
  • Query performance against the database in a monolith, or against each microservice
  • Request rate
  • Throughput
  • Response time
  • Amount of data processed on the server vs. the client
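
To make these KPIs measurable, here is a minimal sketch, assuming an ASP.NET Core minimal API (the endpoint, log format, and names are illustrative, not the project's actual code), that records response time per request:

```csharp
// Program.cs: minimal ASP.NET Core app (illustrative sketch only).
var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// Inline middleware: time every request so response time (and, aggregated
// over a window, request rate and throughput) can be tracked per endpoint.
app.Use(async (context, next) =>
{
    var sw = System.Diagnostics.Stopwatch.StartNew();
    await next();
    sw.Stop();
    app.Logger.LogInformation("{Method} {Path} -> {Status} in {Ms} ms",
        context.Request.Method, context.Request.Path,
        context.Response.StatusCode, sw.ElapsedMilliseconds);
});

app.MapGet("/health", () => Results.Ok("up"));
app.Run();
```

Aggregating these log entries over a time window also yields the request-rate and throughput figures.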

I also looked for the following smoking guns:

  • Over-engineered parts
  • External libraries (I reevaluated the need for each of them)
  • Caching strategy: in-memory caching for dynamic data and CDNs for static data
  • Single points of failure
  • Incorrect use of asynchronous calls and concurrency (see the sketch after this list)
  • Unreliable tests on environments that were not like-for-like (in my experience, performance tests ran stress tests against APIM using a developer license on the dev environment, which handles up to 1,000 calls; this gave a false indication of how production performs and went unnoticed until investigated)
  • Frontend logic vs. backend logic (e.g. heavy JavaScript on the frontend invoking APIs written in .NET, vs. calls invoked from MVC clients)
  • Quick wins: code minification, image compression, CSS optimisation, etc.
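
To illustrate the asynchronous-call misuse above, here is a hedged sketch (hypothetical HttpClient code, not taken from the project) showing the sync-over-async anti-pattern and its fix:

```csharp
using System.Net.Http;
using System.Threading.Tasks;

public class OrderClient
{
    // Anti-pattern: sync-over-async. Calling .Result blocks a thread-pool
    // thread for the whole duration of the I/O and can deadlock wherever a
    // synchronisation context is present.
    public string GetOrderBlocking(HttpClient client) =>
        client.GetStringAsync("https://example.com/orders/1").Result;

    // Fix: stay asynchronous end-to-end so the thread is released while the
    // request is in flight.
    public async Task<string> GetOrderAsync(HttpClient client) =>
        await client.GetStringAsync("https://example.com/orders/1");
}
```

Blocking with .Result ties up threads under load, which surfaces as exactly the kind of throughput and reliability problems listed above.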

I compiled the above into a report before making any attempt to disassemble a working system, albeit one that was sluggish and unreliable according to end users. Bear in mind that compromises are inevitable; therefore, I recorded every decision along the way using ADRs (Architecture Decision Records) throughout the re-engineering process (a skeleton follows).
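
An ADR needs no special tooling; a lightweight skeleton like the following (my own illustrative format, not a formal standard) is enough:

```text
ADR-NNN: <short decision title>

Status:       Proposed | Accepted | Superseded
Date:         YYYY-MM-DD
Context:      The forces at play (performance budget, license limits, deadlines, ...)
Decision:     What was chosen and why.
Consequences: What becomes easier or harder as a result.
```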

The top trade-offs I encountered were:

  • Monetary (budgeting and costs)
  • Cloud limitations and cloud patterns (for mostly hybrid applications: on-premise + cloud, or multi-cloud ones)
  • Performance and reliability themselves, despite being the main goals of this exercise
  • Security
  • Integration contracts (commonly Swagger/OpenAPI) with third-party systems and with other microservices living under the umbrella of the project in question
  • Application versions and dependency versioning (backward compatibility)
  • Constant changes to business requirements while refactoring the current implementation of signed-off requirements
  • Management of the non-technical: teams’ resistance to change and team dynamics

The following table quantifies the performance gained from each change:

| Technique | Gain | Comments |
|---|---|---|
| Replacement of the ORM | Negative gain 🙁 | Milliseconds were lost: Entity Framework generates the SQL, whereas Dapper delegates that work to developers. |
| Introducing full-text search and indexing the relevant columns | 1 second per request 🙂 | Full-text search optimises string matching. Massive win. |
| DNS caching | Milliseconds (estimated) | Hard to measure, but I gained more reliability than performance. |
| Compressing calls | Up to 500 milliseconds | Aggregate where possible. Return IQueryable objects and iterate over IEnumerable, but don't confuse the two; this helps EF generate optimised queries (see the sketch below the table). |
| Removing external libraries | Reliability and consistency gain | Only use libraries you trust and that are consistent with the application's logic. I removed the library that created an in-memory database for unit tests and replaced it with Moq, which relieved the server's memory. |
| Removing unused static assets, minifying the code, and serving it from a CDN | Milliseconds | All of these static assets can be safely cached. |
| Database structure, connection pool, and locking mechanism | Up to 1 second | Depends on how many calls exceed the APIM license limit and wait to be served. |
| Overall | Up to 2 seconds, with a maximum of 7 seconds gained | Great for applications with limited bandwidth and budget, and for those running on low-end cloud licenses. |
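
To illustrate the IQueryable/IEnumerable point from the table, here is a minimal EF Core sketch (the Order entity and ShopContext are hypothetical, not the project's real model):

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

// Hypothetical entity and context, purely to illustrate the point.
public class Order { public int Id { get; set; } public decimal Total { get; set; } }

public class ShopContext : DbContext
{
    public DbSet<Order> Orders => Set<Order>();
}

public static class OrderQueries
{
    // IQueryable: the filter is translated into SQL and executed by the
    // database, so only matching rows cross the network.
    public static IQueryable<Order> BigOrders(ShopContext db) =>
        db.Orders.Where(o => o.Total > 1000m);

    // IEnumerable via AsEnumerable(): every row is materialised first and the
    // filter then runs in application memory, which is the mistake the table
    // row warns about.
    public static IEnumerable<Order> BigOrdersInMemory(ShopContext db) =>
        db.Orders.AsEnumerable().Where(o => o.Total > 1000m);
}
```

Composing further Where/Select clauses onto BigOrders(db) lets EF fold everything into one optimised SQL query, while BigOrdersInMemory(db) drags the whole table across the wire before filtering.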

P.S.: all my posts are drawn from experience, not AI tools like ChatGPT.
