Learn how to build a scalable software architecture that grows with your web application. Discover practical strategies, cloud tips, tech stack insights, and real-world examples from JavaScript and .NET projects.
A QUICK SUMMARY – FOR THE BUSY ONES
Before implementing changes, analyze metrics to identify real bottlenecks. Avoid premature optimization by starting with high-level data and narrowing down to granular insights. This ensures your fixes are targeted and impactful.
Cloud platforms offer cost-effective, flexible scaling with pay-as-you-go models and auto-scaling features. However, legal and data privacy constraints may require on-premise setups in rare cases.
If you’re not ready for microservices, modular monoliths deliver the best of both worlds – simplicity of monoliths with scalable modularity – without the full overhead of distributed systems.
Many tech leaders can relate to this story – you choose a tech stack that seems like the perfect fit for your early-stage application – it’s cost-effective, fast to deploy, and easy to manage. However, as the user base expands, the platform starts showing signs of strain. App load times are increasing and your users are experiencing more instances of downtime. Scaling horizontally is also becoming a major challenge.
Finding the best way to move forward is difficult. The good news is, we’re here to help.
In this article, we share our take on how to create a scalable software architecture, based on our experience building JavaScript- and .NET-based web application projects.
From our experience, before implementing any strategy or solution to tackle a problem, you should dive into the data behind it, starting with a high-level overview and then drilling down to a more granular view. This helps you avoid premature optimization and focus on real, observed problems.
Once, we faced a situation where we had a complex process in the system that was, essentially, the backbone of the entire application. It caused many issues and was highly inefficient, which – as you can imagine – affected the flow of tasks. Instead of attracting users, it discouraged them from using the platform.
We knew that things had to change but before making any changes, we defined a set of metrics to figure out what to focus on. We started monitoring things like how long specific parts of the process took. By doing this, we broke down a complex process into measurable numbers that we could work with.
This approach was valuable because it helped us spot the biggest bottlenecks. Additionally, as we implemented changes or fixes, we could verify if we were moving in the right direction.
Regardless of which strategy you choose, you must validate it in practice to ensure it delivers real value. That’s why having metrics in place is crucial before making any changes.
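To make this concrete, here is a minimal TypeScript sketch of how a complex process can be broken down into measurable step durations. The step names are illustrative, and console output stands in for a real metrics backend:

```typescript
import { performance } from "node:perf_hooks";

// Hypothetical step implementations – stand-ins for the real stages of the process.
const fetchSourceData = async () => [{ id: 1 }];
const parseAndValidate = async (rows: { id: number }[]) => rows;
const persistResults = async (_rows: { id: number }[]) => {};

// Wrap a step, measure how long it took, and report the duration as a metric.
async function timed<T>(step: string, fn: () => Promise<T>): Promise<T> {
  const start = performance.now();
  try {
    return await fn();
  } finally {
    const durationMs = performance.now() - start;
    // In a real system this would go to your metrics backend instead of the console.
    console.log(`step_duration_ms{step="${step}"} ${durationMs.toFixed(1)}`);
  }
}

// The complex process, broken down into steps we can compare against each other.
async function runImportProcess(): Promise<void> {
  const raw = await timed("fetch_source_data", () => fetchSourceData());
  const parsed = await timed("parse_and_validate", () => parseAndValidate(raw));
  await timed("persist_results", () => persistResults(parsed));
}

runImportProcess();
```

With numbers like these per step, the biggest bottleneck becomes obvious, and re-running the same measurements after a change shows whether it actually helped.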
From a purely technical standpoint, scaling tends to be easier in the cloud. Cloud services offer a more efficient and flexible approach to scaling than on-premise solutions, as they follow a pay-as-you-go model.
With a scalable software architecture, you can quickly adjust your infrastructure as your needs change, without the upfront costs and maintenance overhead of physical hardware. This means you can scale up during traffic spikes (think Black Friday for e-commerce platforms) and scale down during quieter periods, optimizing your resource usage and costs.
That said, not all companies will be able to maintain their infrastructure on the cloud due to legal constraints, such as data privacy and security policies. In those rare cases, it might be easier, safer, and more affordable to rely on on-premise infrastructure. If that’s not your business case, we strongly encourage you to go with the cloud.
It’s important to understand how the system works and look into its specifications. Performance metrics are certainly important, but it's also necessary to recognize that some parts of an application are used infrequently. For example, you might only have to generate financial reports once every six months or once a year.
In such a case, keeping this functionality in the core part of the application doesn’t make sense. Instead, we could separate it into a standalone service or a different part of the application.
This approach allows us to better understand when and how we will use certain functionalities. Not every application is used daily in a straightforward way. Sometimes, you need to trigger a process that runs for several hours to get the required results. If this was part of the main application, it could use up resources that other functionalities need, causing problems not because it's slow but because it takes away resources from other tasks.
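As a hedged illustration of the idea, a rarely-used report generator can live as its own small worker process that a scheduler triggers on demand, so the heavy work never competes with everyday traffic. The names, dates, and the placeholder “work” below are assumptions, not code from a real project:

```typescript
import { setTimeout as sleep } from "node:timers/promises";

// Hypothetical standalone worker for the twice-a-year financial report.
async function generateFinancialReport(periodStart: Date, periodEnd: Date): Promise<void> {
  // Placeholder for hours of number crunching against a reporting replica.
  await sleep(1_000);
  console.log(`Report ready for ${periodStart.toISOString()} – ${periodEnd.toISOString()}`);
}

// Entry point of the standalone service: a scheduler (cron, a cloud job, a manual trigger)
// decides when to run it; the core application never pays this cost.
const [start, end] = [new Date("2024-01-01"), new Date("2024-06-30")];
generateFinancialReport(start, end).catch((err) => {
  console.error("Report generation failed", err);
  process.exit(1);
});
```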
You’ll likely see situations where there’s a sudden surge in demand for your system – be it the above-mentioned Black Friday example if you run an e-commerce business, or users accessing their bank accounts on payday. In any case, you’ll need mechanisms that help you maintain performance during traffic spikes while optimizing costs for ‘quieter’ periods.
Without a cloud-based approach, we would literally have to purchase additional servers, only to shut them down afterward. While Black Friday is an extreme case, scalable software architecture enables adaptability to different business contexts and strategies, ensuring seamless scalability as demands fluctuate.
Cloud solutions significantly support this because, instead of going through the hassle of ordering hardware, bringing in an administrator and dealing with the entire setup overhead, we can simply scale up in a matter of minutes. The difference in ease and efficiency is incomparable.
Auto-scaling automatically adjusts your web apps’ resources based on current demand. It monitors key metrics like CPU usage, memory consumption, or request rates, and adds or removes instances as needed.
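Under the hood, an auto-scaler repeatedly compares current metrics against thresholds and adjusts the instance count. The sketch below is a deliberately simplified illustration of that decision; the thresholds and the minimum instance count are assumptions, and in practice a managed service (AWS Auto Scaling, a Kubernetes autoscaler, etc.) makes this call for you:

```typescript
// What the auto-scaler sees in a single evaluation cycle.
interface MetricsSnapshot {
  avgCpuPercent: number;
  requestsPerSecond: number;
}

function desiredInstanceCount(current: number, metrics: MetricsSnapshot): number {
  if (metrics.avgCpuPercent > 70 || metrics.requestsPerSecond > 1_000 * current) {
    return current + 1; // scale out under load
  }
  if (metrics.avgCpuPercent < 30 && current > 2) {
    return current - 1; // scale in, but keep at least two instances for redundancy
  }
  return current;
}

console.log(desiredInstanceCount(3, { avgCpuPercent: 82, requestsPerSecond: 2_500 })); // 4
```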
Choosing the right tech stack is crucial for scalability, but using microservices or building APIs in a technology like Go, for example, isn't always the best choice. In our opinion, popular solutions like single-threaded Node.js or monolithic architectures are often sufficient for scalable software architecture, especially if the team has experience in that area.
The key is to match the technology with the team's skills and the project’s actual needs, ensuring clear, well-structured, and secure code. It’s important to remember that each strategy comes with additional costs – often significant ones like maintenance, future development, or team training.
You shouldn't fall into the trap of using certain strategies just because they’re trendy. Trends always come at a price. Even if a strategy aligns with your metrics and solves a problem, it also introduces new challenges. You can’t ‘just’ implement it and expect it to work without considering potential consequences.
In the case of niche programming languages, these could take the form of high costs due to a more complex implementation process and lower market popularity (which, for example, makes finding Go developers challenging).
Microservices, meanwhile, can bring challenges with deployment, communication between services, infrastructure, and maintenance. They can also come with tradeoffs in local development, latency, and data consistency.
Prisma, a well-known library in the JavaScript and TypeScript ecosystem, can act as a cautionary tale about taking on a technology that isn’t an ideal fit. A while ago, the founders decided to go with Rust (which was getting a lot of hype at the time) and rewrote their engine in the language. However, they recently reversed course, going back to TypeScript.
It turned out that while Rust is incredibly fast and efficient, it also introduced significant challenges – primarily in terms of maintainability. It’s our guess that either no one knew how to manage it properly, or the supposed performance gains weren’t as impactful in practice.
When a company already has an established team of specialists, choosing a language or technology that the team knows often makes more sense. For example, at Brainhub, when a client approaches us about building an application, we usually aim to educate them on why it’s better to use something that both their team and the market are already familiar with.
After all, what’s the point of using a language or an ecosystem that’s theoretically 10% more efficient if it’s much harder to find qualified specialists? Starting to write in an ultra-efficient language, or adopting an ecosystem we don’t fully understand, doesn’t mean we’ll actually achieve high efficiency.
Experience and expertise matter far more than raw theoretical optimization. An A-player, even in a relatively slow language or technology, will typically be a lot more productive than someone struggling with a new, unfamiliar stack.
Sharding involves splitting large datasets into smaller, more manageable chunks, called shards, each stored on different database servers. This improves database performance by distributing read and write operations across multiple servers.
For example, you might shard a user database by region, ensuring that queries for one region don’t overload a single server. This enables horizontal scaling and reduces the strain on a single database instance.
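A minimal sketch of that region-based routing might look like this; the regions and connection strings are placeholders:

```typescript
type Region = "eu" | "us" | "apac";

// Each region's users live on their own database server (shard).
const shardConnections: Record<Region, string> = {
  eu: "postgres://eu-db.internal/users",
  us: "postgres://us-db.internal/users",
  apac: "postgres://apac-db.internal/users",
};

function shardFor(region: Region): string {
  // A spike in one region only loads that region's shard, not the others.
  return shardConnections[region];
}

console.log(shardFor("eu")); // postgres://eu-db.internal/users
```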
Failover systems ensure that if one component (e.g., a server or database) fails, another component automatically takes over with minimal disruption. Redundancy involves maintaining duplicate systems or servers that mirror the primary system.
Some examples of this are multiple data centers or cloud regions, which ensure that if one goes down, the other can handle the load. This increases uptime and prevents service disruptions due to failures.
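For illustration, here is a hedged sketch of application-level failover between two mirrored endpoints. In practice this job usually belongs to a load balancer, DNS failover, or the database driver, and the URLs and timeout below are assumptions:

```typescript
// Primary endpoint first, redundant mirror second.
const endpoints = [
  "https://api.eu-central.example.com/critical-call",
  "https://api.eu-west.example.com/critical-call",
];

async function callWithFailover(): Promise<Response> {
  let lastError: unknown;
  for (const url of endpoints) {
    try {
      const res = await fetch(url, { signal: AbortSignal.timeout(2_000) });
      if (res.ok) return res;
      lastError = new Error(`Unexpected status ${res.status} from ${url}`);
    } catch (err) {
      lastError = err; // this endpoint is down or slow – try the next mirror
    }
  }
  throw lastError;
}
```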
When a monolithic architecture is no longer sufficient to keep up with the app’s scalability, leaders often face the dilemma of whether they should go all-in and adopt a microservices approach. At the same time though, they worry that they won’t be able to onboard it quickly enough and will struggle with managing it.
This is a multi-tier problem. On the one hand, it’s true that a monolithic legacy system becomes difficult to maintain or scale, and embracing modularity is the right move. On the other hand, going with a large-scale move to microservices (especially, if you decide to do so due to popularity, not verified project-fit) can lead to unnecessary complexity and costs.
A modular monolith can be a viable alternative. It offers the straightforward deployment of a monolith while providing the flexibility and scalability of microservices. This makes it an ideal choice for businesses seeking to modernize legacy systems without the added operational complexity of a full microservices architecture.
Modular monoliths protect you from scenarios where you build microservices from the start, divide the system in a certain way, and only later realize that some components actually belong together as part of the same concept.
If we make these architectural decisions too soon, merging services later becomes significantly more difficult than splitting them at the right time.
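To show what those boundaries can look like in code, here is a small TypeScript sketch of two modules living in one deployable that only talk through explicit interfaces. The module names are illustrative; the point is that a module can later be extracted into a service without touching its consumers:

```typescript
// Public contract of the billing module – the only thing other modules may depend on.
interface BillingApi {
  chargeCustomer(customerId: string, amountCents: number): Promise<void>;
}

// The billing module exposes its API; its internals stay private to the module.
function createBillingModule(): BillingApi {
  return {
    async chargeCustomer(customerId, amountCents) {
      console.log(`Charging ${customerId}: ${amountCents} cents`);
    },
  };
}

// The orders module depends on the interface, not on billing internals,
// so billing could become a separate service later without changing orders.
function createOrdersModule(billing: BillingApi) {
  return {
    async placeOrder(customerId: string, totalCents: number) {
      await billing.chargeCustomer(customerId, totalCents);
    },
  };
}

const orders = createOrdersModule(createBillingModule());
await orders.placeOrder("customer-42", 1_999);
```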
We recommend giving our dedicated piece on modular monolith architecture a read to learn more about the approach.
The truth is that cloud bills can skyrocket fast if you don’t manage your spending properly. Most cloud providers offer a free tier that lets you use a limited set of services, like hosting one app or a database, without paying anything. This is a great option for getting started – if your app isn’t quite production-ready yet, it’s definitely worth considering.
Another option for startups is AWS Startup Credits. AWS invests in startups by giving them virtual credits – sometimes as much as $100,000. Startups can use these credits before they start paying for services. However, this strategy also encourages vendor lock-in, as companies get comfortable using AWS services for free. From the provider’s perspective, this tactic is genius but it means that startups have to use the credits wisely, otherwise the potential switch cost will be massive.
Our advice to you – track your resource usage and check if you truly need everything you’ve provisioned. It’s like choosing a car wash package – you can spend a little or a lot depending on your needs. Just avoid overpaying for capacity you don’t use, like hosting an infrastructure for a million users when you only have 10,000.
Metrics and monitoring are important, but analyzing the results and using them to apply improvements is equally crucial. We once faced a situation with a core data analysis process that was meant to engage users, but it took so long that it led to frustration instead.
We started by tracking metrics and identifying the bottleneck. The obvious solution was to separate the process into another service or scale it on the cloud, but these would have been expensive fixes. Sometimes the most obvious solutions are not necessarily the best ones.
Instead, we took a simpler approach that proved effective and cost-efficient. Originally, the user clicked "Analyze" and saw a loading screen until the process finished, which took a long time. As we looked into the issue, we found that the length of the process wasn’t the problem per se – it was the lack of communication that the request would take a while to complete. So we decided to address this by informing users that the analysis was running in the background, and that they would receive a notification when it was ready. This allowed them to continue using the app without feeling stuck.
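A minimal sketch of that pattern, written as an Express-style endpoint: the route, field names, and the fire-and-forget background call are assumptions for illustration (a job queue would make the background work durable), not the actual implementation from that project:

```typescript
import express from "express";
import { randomUUID } from "node:crypto";

const app = express();
app.use(express.json());

app.post("/analyses", (req, res) => {
  const analysisId = randomUUID();

  // Kick off the heavy work without blocking the response.
  runAnalysis(analysisId, req.body)
    .then(() => notifyUser(req.body.userId, analysisId))
    .catch((err) => console.error(`Analysis ${analysisId} failed`, err));

  // 202 Accepted: "we got it, keep using the app, we'll let you know when it's done".
  res.status(202).json({ analysisId, status: "processing" });
});

async function runAnalysis(_id: string, _payload: unknown): Promise<void> {
  /* long-running analysis */
}
async function notifyUser(_userId: string, _analysisId: string): Promise<void> {
  /* push / e-mail / in-app notification */
}

app.listen(3000);
```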
This change improved user experience without costly architectural changes.
What can we learn from that situation? Sometimes the most obvious solutions are not necessarily the best ones. It’s not only important to spot bottlenecks but also to understand what truly affects users. Often optimizing UX and adjusting workflows can solve the problem just as effectively as technical changes.
Load testing helps uncover potential weaknesses and scalability bottlenecks under heavy usage before real-world traffic hits. For starters, it’s important to define what "heavy usage" actually means for your company. This involves setting clear benchmarks.
For example, in a project we worked on, we started off with a user base of 10,000. We knew that the company planned to acquire six new clients this year, which would bring in around 50,000 additional users. With such an estimate, we were able to test against this target number – and even push it further. We also checked what would happen at 100,000 to prepare for unexpected growth.
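As a rough illustration of testing against such targets, a load-test script can translate a user count into concurrent connections. The endpoint, the one-active-user-in-300 ratio, and the test duration below are assumptions to adapt to your own benchmarks; this sketch uses the autocannon npm package:

```typescript
import autocannon from "autocannon";

// Rough assumption for the example: about 1 in 300 users is active at the same time.
const CONCURRENCY_RATIO = 300;

async function loadTest(targetUsers: number): Promise<void> {
  const result = await autocannon({
    url: "https://staging.example.com/api/dashboard", // placeholder endpoint
    connections: Math.ceil(targetUsers / CONCURRENCY_RATIO),
    duration: 60, // seconds
  });

  console.log(
    `${targetUsers} users: avg ${result.requests.average} req/s, ` +
    `avg latency ${result.latency.average} ms`
  );
}

// Expected growth target (~60k users) plus an "unexpected growth" scenario.
await loadTest(60_000);
await loadTest(100_000);
```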
Another crucial aspect is analyzing the application's usage patterns, which links to the previously-mentioned auto-scaling. In Poland, for example, banks scale up automatically on the 10th of each month because they know people will log in en masse to check if their salaries have arrived or to pay bills. You could also adjust resources based on the time of day or region.
By studying user behavior, we can anticipate peak usage times and optimize resource allocation accordingly.
Efficiency in database design starts with the basic question of choosing the right database type for your unique needs. For example, if your website relies heavily on search bars, you could bet on NoSQL, or implement read models/materialized views to serve “ready-made” answers instead of crunching the numbers for each query. We recommend using complex joins or heavy read/write operations sparingly, as they may degrade performance as data grows.
Complex joins come into play when we need to quickly answer questions like, “How many orders has a given user placed?”. However, if this query is being executed repeatedly – potentially, thousands or even millions of times a day – it becomes inefficient to compute it from scratch every time.
Instead, we can precompute and store the value, rather than performing complex calculations and data retrieval on demand.
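Here is a hedged sketch of that idea as a read model kept up to date on the write path. The in-memory store stands in for a real column, read-model table, or materialized view, and the names are illustrative:

```typescript
interface OrderCountStore {
  increment(userId: string): Promise<void>;
  get(userId: string): Promise<number>;
}

// In-memory stand-in for a persistent read model.
function createInMemoryStore(): OrderCountStore {
  const counts = new Map<string, number>();
  return {
    async increment(userId) {
      counts.set(userId, (counts.get(userId) ?? 0) + 1);
    },
    async get(userId) {
      return counts.get(userId) ?? 0;
    },
  };
}

const orderCounts = createInMemoryStore();

// Write path: update the precomputed count when an order is placed.
async function placeOrder(userId: string): Promise<void> {
  // ...persist the order itself...
  await orderCounts.increment(userId);
}

// Read path: "how many orders has this user placed?" becomes a single lookup, no joins.
await placeOrder("user-1");
console.log(await orderCounts.get("user-1")); // 1
```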
Here’s where caching works well for handling such repetitive requests. A good example is the price of petrol on a gas station’s website – it may only need to refresh once a day to reflect the new market price. Instead of hitting the database for every user’s request, you serve the same cached value, sparing your database the burden.
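A minimal sketch of such a time-based cache, assuming a 24-hour refresh and a placeholder database query:

```typescript
const ONE_DAY_MS = 24 * 60 * 60 * 1_000;

let cachedPrice: { value: number; fetchedAt: number } | undefined;

// Stand-in for the real database query.
async function fetchPriceFromDatabase(): Promise<number> {
  return 1.67;
}

async function getPetrolPrice(): Promise<number> {
  const now = Date.now();
  if (!cachedPrice || now - cachedPrice.fetchedAt > ONE_DAY_MS) {
    cachedPrice = { value: await fetchPriceFromDatabase(), fetchedAt: now };
  }
  // Every visitor within the same day gets the cached value – the database is hit once.
  return cachedPrice.value;
}

console.log(await getPetrolPrice());
```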
If you think that a modular monolith is the right choice to support your scalable software architecture, then check our article, which discusses the use of microservices and monoliths (or a blend of both approaches) in detail.