Chapter 1
Scalable Development
As systems evolve, they tend to become more difficult to develop, as existing requirements and code may work against you. This problem has been amplified by the introduction of so-called agentic coding, where computer agents leveraging large language models (LLMs) essentially do the coding work for us. While the abstraction level has shifted, arguably the fundamentals of software engineering haven't changed. We are still in a position where we have to figure out different types of requirements and map them to our applications.
In this chapter, I will bring you up to speed with agentic development while considering its impact on software engineering, and on web development in particular. Note that the space is moving fast at the moment, so some of the content may already be outdated by the time you read it; make sure to do your own research on top!
1.1 How are agents changing web development?
For a long time, web applications were written manually using a suitable editor. Visual editors able to generate HTML have of course existed since the 90s, but for any serious development code usually had to be written by hand unless a specific platform abstracted it away. Since the introduction of ChatGPT in late 2022, the way we program has begun to shift, first with so-called copilots that complete sections of code as you write it, and then with agents that can develop what you want based on prompting. On top of this, we have whole systems that can orchestrate how multiple agents develop a system, even independently. Even then, the challenge becomes how to ensure the system is producing the right outputs.
In the context of this book, I'll focus primarily on agentic development at the level of a single agent, since the same ideas carry over to scaling agents, where you'll hit the same problems at a more pronounced rate. The ideas behind effective agentic development are good development practices in general, and I'll provide a good list of those for you in this chapter.
Since agentic development allows you to produce vast amounts of code fast through prompting, it confronts you with the same old problems developers have encountered for decades, just at an elevated rate. The term vibe coding, coined by Andrej Karpathy in February 2025, captures well the dark side of this kind of development, and for this reason it is occasionally used as a pejorative against development practices that focus on visual results while forgetting what is underneath. This doesn't mean vibe coding cannot be useful as a prototyping technique, but it can be dangerous to forget to design your codebase so that it follows good practices, including security practices, to avoid leaking client data or causing unnecessary cost or reputational damage to the developer.
It turns out that to get the most out of agentic development, you need a certain amount of discipline. That is the main reason why it is important to understand technical details, such as what good architecture should look like and how to develop software sustainably. In Shift-Up: A Framework for Software Engineering Guardrails in AI-native Software Development - Initial Findings, Lipsanen et al. consider this issue and propose structured development as an alternative to unstructured vibe coding. On top of structured development, the authors propose the Shift-Up approach, which trades development speed for human control. Practices are still emerging, and I have no doubt we'll find better and worse ways to work with machines. The question is what values we want to emphasize.
In short, agentic development has changed web development by shifting the abstraction level. Instead of writing code, you massage the codebase into your preferred shape. The big question is how to do this effectively. I'll consider this question next based on my own experience gained through the development of my own projects, especially Slideotter, a large web application designed around slide authoring.
1.2 How to develop web applications using agents sustainably?
As mentioned, giving in to the vibes and prompting your application can be seductive in the sense that it allows you to achieve visible results fast. For a simple project that is easy to specify, sometimes even a single prompt can be enough to produce something usable with the current tools. For example, the layout of AI meets SDLC was produced in one shot using a carefully crafted prompt that provided the model enough context and constraints. Both context and constraints are at the core of using these tools. Constraints are also commonly called guardrails, and for a good reason. A big part of agentic engineering, and of working with AI systems in general, is exactly about figuring out boundaries for a system and constraining how it can work.
Roughly put, when you develop with agents you follow a loop where you first constrain, then generate, and finally validate the results before starting again. The overall efficiency of your development efforts is defined by the efficiency of this loop, and I believe mastering it is also the key to enabling effective agentic development at scale. Even in a personal context, it is worth considering the loop and how you contribute to its operation, so let's consider each part in isolation.
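The loop above can be sketched as code. This is a minimal sketch: the `generate` and `validate` callables stand in for whatever agent tooling and checks you actually use, and the toy implementations at the bottom exist only so the sketch runs end to end.

```python
from typing import Callable, List

def agentic_loop(
    generate: Callable[[str], str],
    validate: Callable[[str], List[str]],
    task: str,
    max_rounds: int = 5,
) -> str:
    """Run the constrain-generate-validate loop until validation passes."""
    prompt = task  # the task, shaped by whatever constraints you impose
    for _ in range(max_rounds):
        output = generate(prompt)     # the agent produces a candidate
        problems = validate(output)   # e.g. linters, tests, custom rules
        if not problems:
            return output             # validation passed: accept the result
        # Feed the failures back so the next round can correct them.
        prompt = task + "\nFix these problems: " + "; ".join(problems)
    raise RuntimeError("no acceptable result within the round budget")

# Toy stand-ins: a "model" that only succeeds once told about the problem.
def toy_generate(prompt: str) -> str:
    return "valid output" if "Fix" in prompt else "draft"

def toy_validate(output: str) -> List[str]:
    return [] if output == "valid output" else ["output is only a draft"]

result = agentic_loop(toy_generate, toy_validate, "write a function")
```

The key design point is that validation failures are fed back into the next prompt, which is what keeps the loop converging instead of repeating the same mistake.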
By constraining, I mean defining the constraints of your system, or even of an individual prompt. Constraints are useful in the sense that they narrow what is allowed and effectively remove options. A good example of a constraint is defining what kind of development stack to use when starting a project. In addition, constraints can be stored in the project itself, and agentic projects commonly include a lot of documentation that describes the meta-level of the project so it is easy for an agent to understand the context in which it operates. As you look at projects developed by agents, you'll notice standard files, such as AGENTS.md, that are exactly about constraining how an agent can work within a project.
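To make this concrete, a project-level constraint file might look like the following. This is a hypothetical example, not taken from any specific project; the stack, paths, and commands are placeholders for your own.

```markdown
# AGENTS.md (illustrative example)

## Stack
- TypeScript, React, PostgreSQL. Do not add new runtime dependencies
  without asking first.

## Conventions
- All database access goes through src/db/; UI components never query
  the database directly.

## Validation
- Run `npm run lint` and `npm test` after every change; a change is not
  done until both pass.
```

Note how each rule removes options rather than adding them: the agent no longer has to guess the stack, the module boundaries, or what "done" means.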
The second part of the loop, generation, is where the agent does its work based on your prompt. Although it sounds trivial, it is an important part, since it is easy to get wrong or to end up with higher expenses than you would like. You should consider which model you use and why: so-called reasoning models are better for more complex tasks, while smaller models can be useful for simple implementation-level tasks. For this reason, people often use different models for planning and implementation. You should also consider what kind of context you provide to your model, how you provide it, and in what kind of environment your agent is operating. As you work with models, you will hit the concepts of the context window and token consumption, and both are worth optimizing for, since models have limited capabilities and can suffer "memory loss" while working, which in turn wastes resources and makes you wait longer. For this reason, people have developed tools, such as rtk, that allow you to optimize token consumption. Many other solutions exist as well, and they are worth researching.
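As a rough illustration of why context windows matter, you can sanity-check a budget before stuffing files into a prompt. The four-characters-per-token figure below is a common rule of thumb for English text, not an exact measure; real tokenizers vary by model, so treat this as an estimate only.

```python
def estimate_tokens(text: str) -> int:
    """Very rough token estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_context(files: dict[str, str], context_window: int,
                 reserve: int = 2000) -> bool:
    """Check whether the files fit the window, leaving room for the reply."""
    total = sum(estimate_tokens(body) for body in files.values())
    return total + reserve <= context_window

# Example: two files measured against a hypothetical 8k-token window.
docs = {"README.md": "x" * 4000, "main.py": "y" * 8000}
print(fits_context(docs, context_window=8_000))  # True: ~5k of 8k used
```

Even a crude check like this helps you notice when you are about to feed the agent far more context than it can actually attend to.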
The final part of the loop, validation, is perhaps the most important one, since it defines the overall performance of your loop: with good validation you can get decent results even from poor models, as validation helps keep them on track. Interestingly enough, you can have your agents develop a big chunk, or even all, of the validation logic for you. The idea is to give your agents tools that let them validate their work. The most obvious thing to do is to set up strict validation rules for your projects, but the idea scales beyond development. For Slideotter, one of the smartest decisions I made was to define layout rules for my LLMs early on to enable the generation of sensible visual layouts that don't overlap or otherwise look weird. A big part of working with agents is getting better at figuring out how to set up effective validation rules, since this can define whether your project stays on the rails or not. If you aren't careful, you can end up with a lot of working code that is not well structured and will in turn be slow and expensive for agents to maintain and develop further. Technical debt can accrue faster with these tools than you can possibly imagine.
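A layout rule of the kind mentioned above can be surprisingly small. The sketch below rejects slide layouts with overlapping elements; it is an illustrative example of such a validation rule, not Slideotter's actual implementation, and the element names are made up.

```python
from dataclasses import dataclass

@dataclass
class Box:
    """An axis-aligned slide element: position (x, y) plus width and height."""
    name: str
    x: float
    y: float
    w: float
    h: float

def overlaps(a: Box, b: Box) -> bool:
    # Two boxes overlap unless one lies entirely to the side of,
    # or entirely above/below, the other.
    return not (a.x + a.w <= b.x or b.x + b.w <= a.x or
                a.y + a.h <= b.y or b.y + b.h <= a.y)

def validate_layout(boxes: list[Box]) -> list[str]:
    """Return one error message per overlapping pair of elements."""
    errors = []
    for i, a in enumerate(boxes):
        for b in boxes[i + 1:]:
            if overlaps(a, b):
                errors.append(f"{a.name} overlaps {b.name}")
    return errors

layout = [Box("title", 0, 0, 10, 2), Box("body", 0, 3, 10, 5),
          Box("image", 5, 4, 4, 3)]
print(validate_layout(layout))  # the body and image boxes overlap
```

Because the rule returns machine-readable error messages, an agent can run it after each generation and use the failures to correct its own output, which is exactly the feedback the loop needs.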
While agents can allow you to develop entire web applications fast, they also require responsible usage, as otherwise you might end up with what some people derogatorily call slop: something that was artificially generated and looks like it. When used responsibly and with good taste, I argue that AI tools can help you elevate your own skills by an order of magnitude. Therefore, your task is to become a strong engineer who understands both the domain and engineering, since it is this combination that enables productivity with AI. Most importantly, you have to be able to judge what the tools emit and where to take your project. A big chunk of this comes with experience, but to avoid some of the pain and to speed up your personal development, it is a good idea to pick up at least some basic ideas so you don't have to find all the lessons on your own. That is the main reason why we build a big project in three stages in this book, and I'll describe the first part of the project next. While people complete the project in groups during the course related to the book, you can also complete it alone, for example by cutting the scope. Although you can learn a lot by reading, it is hard to beat learning by doing, as then you can experience the problems related to this type of development concretely.
1.3 Book project - Event management platform
TODO
1.4 Themes
- Codebase structure and module boundaries
- Composition of features, services, user interfaces, and infrastructure
- Local development environments and reproducible setup
- Testing strategy and confidence when changing existing behavior
- Continuous integration and review practices
- Dependency management and upgrade pressure
- Documentation as support for future change
- AI-assisted development and the need to review, constrain, and maintain generated code
1.5 Learning goals
After this chapter, students should be able to:
- Explain why development scalability is different from runtime performance scalability.
- Identify codebase and workflow choices that make future changes easier or harder.
- Use composition to manage complexity without hiding important system behavior.
- Evaluate AI-assisted code as maintainable software rather than as a one-time output.
- Connect development practices to the later project phases: construct, reconstruct, and deconstruct.