<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Hypha]]></title><description><![CDATA[Hypha]]></description><link>https://hypha.pub</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1769387944316/89221c93-02fc-4d25-b28a-42e3bf9e1eb3.png</url><title>Hypha</title><link>https://hypha.pub</link></image><generator>RSS for Node</generator><lastBuildDate>Fri, 10 Apr 2026 19:36:33 GMT</lastBuildDate><atom:link href="https://hypha.pub/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Back to FreeBSD: Part 2 — Jails]]></title><description><![CDATA[Before we explore FreeBSD jails, it is worth refreshing our understanding of how Linux solved the same problem with LXC (Linux Containers). Clearly inspired by jails, they are conceptually all about t]]></description><link>https://hypha.pub/back-to-freebsd-part-2</link><guid isPermaLink="true">https://hypha.pub/back-to-freebsd-part-2</guid><category><![CDATA[Docker]]></category><category><![CDATA[FreeBSD]]></category><category><![CDATA[Linux]]></category><category><![CDATA[Jails]]></category><dc:creator><![CDATA[Roman Zaiev]]></dc:creator><pubDate>Thu, 26 Mar 2026 18:34:47 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/68bca2462d48bd639b98e819/bdec27ee-4225-4087-832a-d9f5b777bbf3.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Before we explore FreeBSD jails, it is worth refreshing our understanding of how Linux solved the same problem with LXC (Linux Containers). Clearly inspired by jails, they are conceptually all about the same thing in essence. But the implementation difference is striking.</p>
<p>Linux containers are not a single kernel feature. They are a combination of several independent primitives added to the kernel over a number of years. In a nutshell:</p>
<ul>
<li><p><strong>namespaces</strong> — isolate what a process can see: its own PID space, network interfaces, mount points, hostname, users</p>
</li>
<li><p><strong>cgroups</strong> — limit and account for what a process can consume: CPU, I/O, memory, network bandwidth</p>
</li>
<li><p><strong>seccomp</strong> — restrict which system calls a process is allowed to make</p>
</li>
</ul>
<p>None of these were originally designed together as a container system. They were added by different people at different times for different reasons. You would be amazed how many steps it takes to get anything working with them directly.</p>
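<p>To get a feel for how raw those primitives are, here is a minimal sketch of driving two of them by hand: namespaces via <code>unshare</code> and a memory limit via cgroup v2. It is only a sketch, with no root filesystem image, no networking and no seccomp profile, and <code>$target_pid</code> is a placeholder for whatever process you want to confine.</p>
<pre><code class="language-bash"># a shell in fresh PID, mount, UTS and network namespaces;
# inside it, ps shows only this shell and ip link shows a bare loopback
sudo unshare --fork --pid --mount-proc --uts --net /bin/sh

# a cgroup (v2 layout) with a memory cap; processes join it by PID
sudo mkdir /sys/fs/cgroup/demo
echo 256M | sudo tee /sys/fs/cgroup/demo/memory.max
echo "$target_pid" | sudo tee /sys/fs/cgroup/demo/cgroup.procs
</code></pre>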
<p>LXC was the first serious attempt to glue them together and make them usable. Released in 2008, it gave you a relatively straightforward way to define and launch a system container using those kernel primitives. Early Docker was quite literally built on top of LXC — until 2014, when Docker replaced it with their own runtime called <code>libcontainer</code> and cut the dependency.</p>
<p>Ironically, by introducing <code>libcontainer</code>, Docker effectively did to LXC what LXC did to the raw kernel primitives — added another abstraction layer on top. And then the OCI came along and standardised that layer, and now you have runc, containerd, and so on.</p>
<h2>LXC container from scratch</h2>
<p>I will use Fedora 40 in my examples because that is what is installed on my old MacBook Pro 2015 right now (yeah, second life).</p>
<p>So let's do it — create the container, wire up networking so it can actually reach the internet, and verify it works.</p>
<pre><code class="language-bash"># install lxc and the networking helper
sudo dnf install lxc lxc-templates

# lxc-net manages a private bridge (lxcbr0) with dnsmasq for DHCP
# and NAT rules so containers can reach the outside world
sudo systemctl enable --now lxc-net
</code></pre>
<p>Now tell LXC to use the bridge. Edit <code>/etc/lxc/default.conf</code>:</p>
<pre><code class="language-plaintext">lxc.net.0.type = veth
lxc.net.0.link = lxcbr0
lxc.net.0.flags = up
</code></pre>
<p>The <code>lxcbr0</code> bridge was created by <code>lxc-net</code>. It sits on <code>10.0.3.0/24</code> by default, runs a dnsmasq instance to hand out DHCP leases, and has iptables masquerade rules so container traffic exits through the host's physical interface.</p>
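<p>All of that plumbing is inspectable from the host. Assuming the default <code>lxc-net</code> configuration, something like this should show each piece:</p>
<pre><code class="language-bash"># the bridge and its 10.0.3.1 gateway address
ip addr show lxcbr0

# the dnsmasq instance serving DHCP on the bridge
ps aux | grep [d]nsmasq

# the masquerade rule for the 10.0.3.0/24 subnet
sudo iptables -t nat -S POSTROUTING | grep 10.0.3
</code></pre>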
<pre><code class="language-bash"># create a Fedora 40 container
sudo lxc-create -n mycontainer -t download -- -d fedora -r 40 -a amd64

# start it
sudo lxc-start -n mycontainer

# get a shell inside
sudo lxc-attach -n mycontainer
</code></pre>
<p>From inside the container:</p>
<pre><code class="language-sh"># should have a 10.0.3.x address from DHCP
ip addr show eth0

# reaches the internet through NAT on the host
ping -c 3 1.1.1.1

curl -s https://jail.run | head -5
</code></pre>
<p>It works! Notice what had to happen to get here: a bridge interface, a DHCP server, iptables NAT rules, a <code>veth</code> pair connecting the container to the bridge. This is the shape of the Linux approach throughout: composable primitives, flexibility, and a noticeable amount of glue.</p>
<pre><code class="language-bash"># when done
sudo lxc-stop -n mycontainer
sudo lxc-destroy -n mycontainer
</code></pre>
<h2>FreeBSD jail from scratch</h2>
<p>A FreeBSD jail is not built from composable primitives. It is a first-class kernel concept and a single syscall that says "run this subtree as an isolated environment". Let's build one.</p>
<p>A jail needs a root filesystem first of all. The cleanest way to get one is to fetch and extract the FreeBSD base distribution:</p>
<pre><code class="language-bash">mkdir -p /jails/myjail

fetch https://download.freebsd.org/releases/amd64/15.0-RELEASE/base.txz -o /tmp/base.txz

tar -xf /tmp/base.txz -C /jails/myjail
</code></pre>
<p>This gives you a minimal but complete FreeBSD userland — libc, basic utilities, everything needed to run processes inside the jail. Copy the host's DNS configuration so the jail can resolve names:</p>
<pre><code class="language-bash">cp /etc/resolv.conf /jails/myjail/etc/resolv.conf
</code></pre>
<p>Let's keep it simple — no bridge, no DHCP server, no NAT rules. We just add an IP alias to the host's existing network interface, and the jail gets its own address on the same network the host is already on.</p>
<pre><code class="language-bash"># add an alias to the host's external interface
ifconfig em0 10.0.0.10 alias

# make it persist across reboots
sysrc ifconfig_em0_alias0="inet 10.0.0.10"
</code></pre>
<p>Replace <code>em0</code> with your actual interface — <code>vtnet0</code>, <code>igb0</code>, whatever <code>ifconfig</code> shows. The jail will use this address, and outbound traffic routes through the host's existing gateway.</p>
<p>With this setup the jail sits directly on the same network as the host, visible on the LAN like any other machine. There is no network isolation here. If you need the jail on its own private subnet, unreachable from outside without explicit port forwarding, that is what VNET is for — a fully virtualised network stack per jail, with its own bridge and NAT through PF. For now, shared IP is enough to show how jails work.</p>
<p>Jails in FreeBSD are configured in <code>/etc/jail.conf</code>. Notice <code>exec.start</code>, <code>exec.stop</code>, <code>exec.prestart</code>, <code>exec.poststop</code> — they give you clear lifecycle control and handy automation hooks.</p>
<pre><code class="language-plaintext">myjail {
    host.hostname = "myjail.local";
    path = "/jails/myjail";
    ip4.addr = 10.0.0.10;
    interface = em0;
    exec.start = "/bin/sh /etc/rc";
    exec.stop = "/bin/sh /etc/rc.shutdown jail";
    mount.devfs;
    allow.raw_sockets;  # so ping works from inside the jail
}
</code></pre>
<p>One command registers the jail with the kernel and it is ready:</p>
<pre><code class="language-bash"># start it
jail -c myjail

# list running jails
jls

# get a shell inside
jexec myjail /bin/sh
</code></pre>
<p>From inside:</p>
<pre><code class="language-sh">ifconfig             # shows 10.0.0.10
ping -c 3 1.1.1.1    # reaches the internet through the host's gateway
fetch -o - https://jail.run | head -5
</code></pre>
<p>The host is invisible — the jail sees only what you gave it. And it started instantly, because there is nothing to boot.</p>
<pre><code class="language-bash"># de-register the jail
jail -r myjail

# remove it entirely
rm -rf /jails/myjail
</code></pre>
<h2>Jail managers</h2>
<p>The manual process above is straightforward, but it is verbose. Fetching a base system, extracting it, managing IP aliases, writing <code>jail.conf</code> stanzas — you see the repeatable routine, and it is all automatable. So the community has built several tools to do exactly that. <a href="https://github.com/cbsd/cbsd?tab=readme-ov-file#the-freebsd-ecosystem-today">Dozens</a> of them, actually.</p>
<p>The reason there are multiple tools rather than one is partly history, partly differing opinions about what "managing a jail" should mean, and partly the fact that FreeBSD doesn't ship an official one. The base OS gives you <code>jail(8)</code> and leaves the rest to you.</p>
<p>Here are three worth your attention today.</p>
<h3>Bastille</h3>
<p>The current community favourite. <a href="https://bastillebsd.org/">Bastille</a> handles the full lifecycle — bootstrapping release archives, creating jails, managing PF rules, templating — with sensible defaults for the most common case. Can work without ZFS, though it benefits from it. Clean CLI, active development, good documentation.</p>
<p>Templates live in a <strong>Bastillefile</strong>: a list of instructions describing what gets installed and configured inside a jail.</p>
<h3>AppJail</h3>
<p><a href="https://appjail.readthedocs.io/en/latest/">AppJail</a> introduces a compositional model — jails are assembled from stages and instructions defined in a <strong>Makejail</strong>. It has thought carefully about how jails should be reused and composed into larger systems, and handles complex multi-jail setups well.</p>
<h3>Pot</h3>
<p>Where Bastille and AppJail are primarily FreeBSD-native tools, <a href="https://pot.pizzamig.dev/">pot</a> has first-class integration with <a href="https://developer.hashicorp.com/nomad">HashiCorp Nomad</a>, aiming at modern Kubernetes-style orchestration built around pods. If your infrastructure uses Nomad for scheduling, pot fits naturally. Templates live in a <strong>Potfile</strong>.</p>
<hr />
<p>All three have independently converged on a Dockerfile-inspired template format. Bastillefile, Makejail, Potfile — different names, same shape, same goals. A sequence of imperative instructions in a small specialised DSL: install this package, copy this file, run this command.</p>
<p>In the case of Bastille, for example:</p>
<pre><code class="language-plaintext">CP usr /
PKG ca_root_nss unbound
SYSRC unbound_enable=YES
CMD chown unbound:wheel /usr/local/etc/unbound
CMD /usr/local/sbin/unbound-control-setup
CMD /usr/local/sbin/unbound-checkconf &amp;&amp; echo "nameserver 127.0.0.1" &gt; /etc/resolv.conf
SERVICE unbound restart
CMD host bastillebsd.org
</code></pre>
<p>The intent is obvious: lower the barrier for people coming from Docker. Familiar syntax, familiar mental model. A reasonable goal. Does it make any sense?</p>
<p>One of the core reasons the Dockerfile format exists in the form it does is that Docker had to solve the layering problem on top of filesystems that had no native concept of it. So Docker invented its own layer model, and the Dockerfile format maps directly onto it — each instruction is potentially a new layer.</p>
<p>FreeBSD with ZFS does not have this problem. Layering is a first-class filesystem primitive on ZFS. Snapshots and clones have been there since 2005. You do not need a container runtime to reinvent layering on top of ZFS — it is already there, at the right level of abstraction, with better semantics.</p>
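<p>For a taste of what that means in practice (a sketch, assuming your jails live on a ZFS dataset such as <code>zroot/jails</code> with a prepared base jail):</p>
<pre><code class="language-bash"># snapshot the prepared base...
zfs snapshot zroot/jails/base@clean

# ...and clone it into a new jail: instant, and nearly free in space
zfs clone zroot/jails/base@clean zroot/jails/myjail
</code></pre>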
<p>Copying the Dockerfile format onto jails means importing a solution to a problem you do not have, while ignoring the tools that actually solve it better. Layering belongs in the filesystem, not in a template DSL that cosplays as a Dockerfile.</p>
<p>So modern jail managers end up reinventing a Docker-like workflow just to feel familiar to Linux users — with no clear advantage, since the offered DSLs are still not actual Dockerfiles, and that difference remains friction for anyone escaping to FreeBSD.</p>
<p>We will get into this properly in the next parts. For now, let's use Bastille for what it is genuinely good at.</p>
<hr />
<h2>Lab with Bastille</h2>
<p>Bastille reduces the manual process from earlier to a handful of commands.</p>
<pre><code class="language-bash">pkg install bastille       # install bastille itself
sysrc bastille_enable=YES  # enable it as a service
</code></pre>
<p>Bastille ships with a <code>bastille setup</code> command that handles the initial host configuration in one shot:</p>
<pre><code class="language-bash">bastille setup
</code></pre>
<p>After setup, start PF manually:</p>
<pre><code class="language-bash">service pf start
</code></pre>
<p>Bastille manages release archives centrally — download once, use for as many jails as you need. Think of it as your base layer for everything built on top:</p>
<pre><code class="language-bash">bastille bootstrap 15.0-RELEASE
</code></pre>
<p>Now you have everything to create jails:</p>
<pre><code class="language-bash"># pick any free IP within the default 10.17.89.0/24 range
bastille create myjail 15.0-RELEASE 10.17.89.10
bastille start myjail
</code></pre>
<p>Behind the scenes, Bastille handled the directory structure, the <code>jail.conf</code> entry, the IP alias, and the PF table registration. Your jail is up and running on <code>10.17.89.10</code>.</p>
<pre><code class="language-bash"># jump into the jail
bastille console myjail
</code></pre>
<p>From inside:</p>
<pre><code class="language-sh">ifconfig
fetch -o - https://jail.run | head -5
</code></pre>
<p>Isolated, reachable, with internet access through NAT on the host. Exactly what we wired up manually, in just three commands.</p>
<hr />
<p>Jails give you real OS-level isolation with near-zero overhead. The kernel primitive is simple — a directory, a configuration file, a handful of hooks. The manual process takes about twenty minutes to understand end to end. A manager like Bastille reduces that to three commands and a bootstrap step that only runs once.</p>
<p>In the next part we will add ZFS into the picture — snapshots, clones, and you will see why the jail story gets considerably more interesting when the filesystem can do natively what Docker has been simulating all along.</p>
<hr />
<p><a href="https://freebsdfoundation.org/freebsd-project/resources/introduction-to-freebsd-jails/">Introduction to FreeBSD Jails</a></p>
<p><a href="https://youtu.be/vVpd34bWCJA?si=B6ADiE8_yV_eFAp9">Build Secure FreeBSD Containers in 5 Minutes</a></p>
<p><a href="https://youtu.be/hQmOc0egcl4?si=qhjCV4BSlp9AFYgX">FreeBSD Fridays: Introduction to Jails</a></p>
<p><a href="https://youtu.be/S3u8OtjfGFE?si=2BWadr3dwrEZNmzG">20 Years of FreeBSD Jails (2019)</a></p>
]]></content:encoded></item><item><title><![CDATA[Back to FreeBSD: Part 1 — Intro]]></title><description><![CDATA[A few decades ago, the only well-known way to deliver something to a server, to make it accessible over the internet, was moving files via FTP in Total Commander, FileZilla or FAR Manager, manually co]]></description><link>https://hypha.pub/back-to-freebsd-part-1</link><guid isPermaLink="true">https://hypha.pub/back-to-freebsd-part-1</guid><category><![CDATA[Docker]]></category><category><![CDATA[FreeBSD]]></category><category><![CDATA[Linux]]></category><category><![CDATA[Jails]]></category><dc:creator><![CDATA[Roman Zaiev]]></dc:creator><pubDate>Fri, 20 Feb 2026 18:33:48 GMT</pubDate><enclosure url="https://cloudmate-test.s3.us-east-1.amazonaws.com/uploads/covers/68bca2462d48bd639b98e819/a7e33aa1-1702-4eeb-ad86-28846f851640.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A few decades ago, the only well-known way to deliver something to a server, to make it accessible over the internet, was moving files via FTP in Total Commander, FileZilla or FAR Manager, manually copying files and folders from the left pane to the right one. The more advanced among us preferred standard UNIX tools like scp or rsync instead, but the process was essentially the same.</p>
<p>Not rocket science (which is the best part), and it worked! The only obvious problem was that inevitable "oops" moment we've all had — something misclicked, accidentally deleted, edited in the wrong place. No big deal when you're a solo dev on a solo project. A real disaster when you're responsible for dozens of client projects.</p>
<p>A very common backend setup involved multiple websites served by the same Apache web server instance, all sharing the same lifecycle. If Apache went down, everyone went down. If a system dependency broke, everything crashed.</p>
<p>It’s worth mentioning that a more subtle problem appeared when traffic spiked. While your superstar website consumed all available resources, every other site on the same server was quietly suffocating.</p>
<p>Sysadmins scrambled to automate routine tasks, sharing shell scripts full of clever tricks and procedural logic. There was no standard way to do anything yet, including versioning for auditing or rolling back when things went wrong. Most of us used conventions like appending an incremented number or a timestamp to the project folder name. In most cases, manual file tossing worked pretty well. Until it didn’t.</p>
<p>There were at least two clear problems that needed solving:</p>
<p><strong>Deployment.</strong> How do you deliver reliably? How do you avoid common human fat-finger mistakes? How do you implement versioning and rollbacks? And how do you make the solution generic enough to cover all business cases?</p>
<p><strong>Process isolation.</strong> How do you protect the app from the system and the system from the app? How do you handle situations where one app’s requirements silently break another’s, or when customers need slightly different versions of the same thing? How do you resolve dependencies?</p>
<p>Attempts to solve the deployment problem gave us a whole new universe of tools and approaches, eventually evolving into modern CI/CD pipelines, packaging standards, and version control. The isolation story, however, is far less well known.</p>
<p>In 1979, Version 7 UNIX from Bell Labs introduced <code>chroot</code> — a way to give a process an isolated view of the filesystem, restricting it to a subtree so it couldn't touch anything above it. It was a primitive but genuinely useful idea. The limitation was that <code>chroot</code> only isolated the filesystem. The process could still interfere with the network, with other processes, with system resources. It was a partial solution, and a determined application could escape it.</p>
<p>The first serious enterprise answer was virtual machines. VMware brought VMs into mainstream use in the late 1990s, giving each application its own fully isolated OS environment. The problem was cost. Every VM carried a complete OS with significant overhead, and startup times measured in minutes. It was inefficient and expensive, though still cheaper than buying more physical servers.</p>
<p>The quiet revolution happened in 2000. Not on Windows Server, and not yet on Linux, but on <strong>FreeBSD</strong>, a UNIX-based operating system that was the default choice for IT professionals long before Linux dominated the space.</p>
<p>FreeBSD is worth a brief aside here, because it differs from Linux in a fundamental way. Linux is a kernel. What most people call "Linux" is actually that kernel combined with a GNU userland, a package ecosystem, and a set of choices that vary from distro to distro — Ubuntu, Fedora, and Arch are all running the same kernel but are meaningfully different systems underneath.</p>
<img src="https://cloudmate-test.s3.us-east-1.amazonaws.com/uploads/covers/68bca2462d48bd639b98e819/d2926e62-bfd6-44ec-93db-7158b23805bb.png" alt="" />

<p><strong>FreeBSD ships as a complete, coherent OS</strong> — kernel, userland, base tools, and libraries all developed together, versioned together, and tested together as a single unit. That coherence matters. It's part of why FreeBSD solutions tend to be cleaner and why the base system behaves consistently across installations.</p>
<p>The solution FreeBSD built on top of that coherent foundation was called <strong>jails</strong>. Announced by Poul-Henning Kamp and Robert Watson and shipped as a native kernel feature in <a href="https://docs.freebsd.org/en/books/handbook/jails/">FreeBSD 4.0 in March 2000</a>, jails took the <code>chroot</code> idea and completed it, adding full network isolation, process isolation, and proper security boundaries.</p>
<p><a href="https://papers.freebsd.org/2000/phk-jails.files/sane2000-jail.pdf"><img src="https://cloudmate-test.s3.us-east-1.amazonaws.com/uploads/covers/68bca2462d48bd639b98e819/7f6612cd-b1ee-4c65-84b8-ef80b1fd6a9e.png" alt="" /></a></p>
<p>Each jail gets its own filesystem view, its own network stack, its own process space. The host system is invisible to it. And crucially, it shares the host kernel, meaning near-zero overhead and near-instant startup time.</p>
<pre><code class="language-plaintext">Your application
↑
Optional jail managers: cbsd, bastille, pot, appjail, etc.
↑
Jails (2000) — native OS-level containers
↑
Filesystem + userspace
↑
BSD kernel
</code></pre>
<p><strong>FreeBSD pioneered the practical implementation of what we now call containers. Not conceptually, but in production, years before the rest of the industry caught up.</strong></p>
<p>Sun Microsystems followed with Solaris Zones in 2004, adapting the jails concept for their enterprise customers, and gave back ZFS — the most advanced filesystem ever built — open sourced in 2005 and ported to FreeBSD in 2007. ZFS complemented jails with instant snapshots and efficient layering.</p>
<p>The actual timeline of the isolation problem looks like this:</p>
<pre><code class="language-plaintext">1. Shared servers with no isolation
↓
2. Virtual machines (heavy but isolated)
↓
3. Containers (lightweight and isolated)
</code></pre>
<p>FreeBSD reached that third stage in 2000. Linux wouldn't get there until 2008 with LXC. Docker — the tool most developers think of as the origin of containers — didn't appear until 2013. When Docker was being celebrated as revolutionary, FreeBSD jails were already thirteen years old, mature and battle-tested.</p>
<p>So why does nobody talk about it?</p>
<p>Technical superiority doesn't win ecosystem wars. Linux won through a combination of fast decisions, the viral GPL licence, and strong enterprise backing from Red Hat and IBM. Then Google, Facebook, and Amazon happened — hungry for datacenters, developing tools to manage growing infrastructure at scale. They set the direction for the entire industry.</p>
<p>Linux rapidly went from "the free OS for people who can't afford commercial licences" to "the only acceptable OS for servers".</p>
<p>To solve the distribution and isolation problem, Linux engineers built a set of kernel primitives (namespaces, cgroups, seccomp) and then, in a very Linux fashion, built an entire ecosystem of abstractions on top to “simplify” things:</p>
<pre><code class="language-plaintext">Your application
↑
Docker Hub — commercial third-party distribution
↑
Docker / Podman (2013 / 2018) — image builds, distribution, lifecycle, UX
↑
OCI / runc (2015) — standardised container execution
↑
LXC (2008) — system containers
↑
Namespaces + cgroups + seccomp (2006–2013) — kernel isolation primitives
↑
Linux kernel
</code></pre>
<p>Somehow we ended up with an overengineered mess of leaky abstractions for cloud-based, vendor-locked infrastructure.</p>
<p>And this complexity has quietly reshaped how the industry thinks about deploying software. Today, if you want to run an application in a larger system, the implicit assumption is that you containerise it with Docker and orchestrate it with Kubernetes. It's not presented as one option among several — it's presented as the obvious default, the thing you'd be naive or reckless to skip.</p>
<p>What Docker actually solved well was the shipping problem: a universal standard for packing an application with all its dependencies, distributing it through a registry, and running it identically on any machine anywhere. That was genuinely useful, and the OCI image format became a real industry standard.</p>
<p>Jails solve the isolation problem beautifully, but they don't have a native answer to shipping. That gap is real, and it's one of the main reasons the ecosystem around jails feels underdeveloped compared to Docker's world.</p>
<p>The community is aware of it. Some tools attempt to close the gap by mimicking what the modern container ecosystem offers, with moderate success. But there are other approaches too, utilising native FreeBSD primitives that have been quietly sitting there for many years.</p>
<hr />
<p>In the next parts, you will see how simple and elegant FreeBSD-based infrastructure can look, how jails work from the ground up, how jail managers help reduce the boilerplate, how you can use Ansible to provision and deploy, why ZFS snapshots are a killer feature worth your attention, and how we put it all together to build robust and scalable infrastructure for Hypha.</p>
<hr />
<p><a href="https://vermaden.wordpress.com/2025/04/08/are-freebsd-jails-containers/">Are FreeBSD Jails a Containers?</a> and a <a href="https://lobste.rs/s/f6wcbv/are_freebsd_jails_containers">follow-up discussion.</a></p>
<p><a href="https://papers.freebsd.org/2000/phk-jails.files/sane2000-jail.pdf">Jails: Confining the omnipotent root.</a></p>
<p>Cover image: wallpaper art by <a href="https://www.reddit.com/u/atlas-ark">atlas-ark</a>.</p>
]]></content:encoded></item><item><title><![CDATA[Postgres is Your Friend. ORM is Not.]]></title><description><![CDATA[Postgres is amazing. It’s powerful, efficient, rock-solid, and genuinely one of the finest pieces of software engineering ever created. And it’s also much more than a traditional relational database. ]]></description><link>https://hypha.pub/postgres-is-your-friend-orm-is-not</link><guid isPermaLink="true">https://hypha.pub/postgres-is-your-friend-orm-is-not</guid><category><![CDATA[PostgreSQL]]></category><category><![CDATA[ORM (Object-Relational Mapping)]]></category><category><![CDATA[SQL]]></category><dc:creator><![CDATA[Roman Zaiev]]></dc:creator><pubDate>Wed, 26 Nov 2025 12:41:00 GMT</pubDate><enclosure url="https://cloudmate-test.s3.us-east-1.amazonaws.com/uploads/covers/68bca2462d48bd639b98e819/90989c92-95b1-4f7d-b976-5dd5c12a85fc.jpg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Postgres is amazing. It’s powerful, efficient, rock-solid, and genuinely one of the finest pieces of software engineering ever created. And it’s also much more than a traditional relational database. It’s a complete, rich ecosystem packed into a single engine.</p>
<p>Beyond the classic CRUD operations everyone expects, Postgres lets you do far more. You can store and query JSONB fields as if they were structured tables, and you can even build efficient indexes on top of them. It blurs the line between relational and semi-structured data in a way that’s both elegant and practical.</p>
<pre><code class="language-sql">CREATE TABLE IF NOT EXISTS videos (
    id           UUID NOT NULL PRIMARY KEY,
    profile_id   UUID NOT NULL REFERENCES profiles (id) ON DELETE CASCADE,
    title        TEXT,
    state        VARCHAR(32) NOT NULL,
    metadata     JSONB NOT NULL,
    created_at   TIMESTAMPTZ DEFAULT now() NOT NULL,
    updated_at   TIMESTAMPTZ DEFAULT now() NOT NULL
);

CREATE INDEX idx_videos_tags_gin
ON videos USING GIN ((metadata-&gt;'tags') jsonb_path_ops);
</code></pre>
<pre><code class="language-sql">SELECT * FROM videos WHERE metadata-&gt;'tags' @&gt; '["personal"]';
</code></pre>
<p>You can easily implement a simple yet efficient full-text search using the <code>pg_trgm</code> extension. So you probably don’t need that Elasticsearch instance for your mid-sized project.</p>
<pre><code class="language-sql">CREATE EXTENSION IF NOT EXISTS pg_trgm;

CREATE INDEX IF NOT EXISTS idx_videos_title_trgm
ON videos USING GIN (lower(title) gin_trgm_ops)
WHERE state = 'LIVE' AND title IS NOT NULL;
</code></pre>
<pre><code class="language-sql">SELECT * FROM videos
WHERE
    title ILIKE '%' || $1 || '%'
    OR similarity(title, $1) &gt; 0.12;
</code></pre>
<p>And you have <code>pg_partman</code> for automated <a href="https://www.postgresql.org/docs/current/ddl-partitioning.html">partitioning</a>!</p>
<pre><code class="language-sql">CREATE EXTENSION pg_partman;

CREATE TABLE events (
    id          UUID NOT NULL PRIMARY KEY,
    created_at  TIMESTAMPTZ DEFAULT now() NOT NULL,
    payload     JSONB
);

SELECT partman.create_parent(
    p_parent_table := 'public.events',
    p_control      := 'created_at',
    p_type         := 'native',
    p_interval     := 'monthly'
);

UPDATE partman.part_config
   SET premake = 3,
       retention = '12 months'
 WHERE parent_table = 'public.events';

SELECT partman.run_maintenance();
</code></pre>
<p>There are many more features worth your attention: views and materialised views, window functions for efficient sliding aggregations, CTEs for clearer complex queries, etc.</p>
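<p>As a quick, illustrative taste of the last two together, here is a CTE feeding a window function that computes a rolling upload count over the last seven daily buckets per profile, using the <code>videos</code> table from earlier:</p>
<pre><code class="language-sql">WITH daily AS (
    SELECT
        profile_id,
        date_trunc('day', created_at) AS day,
        count(*) AS uploads
    FROM videos
    GROUP BY profile_id, day
)
SELECT
    profile_id,
    day,
    uploads,
    sum(uploads) OVER (
        PARTITION BY profile_id
        ORDER BY day
        ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
    ) AS rolling_uploads
FROM daily
ORDER BY profile_id, day;
</code></pre>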
<p>One of the most underrated features in Postgres is the <code>FOR UPDATE SKIP LOCKED</code> mechanism, combined with Postgres’ native <strong>LISTEN/NOTIFY</strong> pub/sub. Using these two together, you can build robust concurrent processing that scales horizontally, coordinates work safely, and never loses a record.</p>
<p>Here’s how it looks on the Postgres side — it simply emits a notification on each insert to a dedicated channel. Your workers can subscribe to it and react instantly.</p>
<pre><code class="language-sql">CREATE TABLE IF NOT EXISTS tasks (
    id            UUID NOT NULL PRIMARY KEY,
    task_name     VARCHAR(128) NOT NULL,
    payload       JSONB,
    state         VARCHAR(32) NOT NULL,
    created_at    TIMESTAMPTZ DEFAULT now() NOT NULL,
    updated_at    TIMESTAMPTZ DEFAULT now() NOT NULL
);

CREATE OR REPLACE FUNCTION notify_task()
RETURNS TRIGGER AS $$
DECLARE
    payload JSON;
BEGIN
    IF NEW.state = 'PENDING' THEN
        payload = json_build_object(
            'id', NEW.id,
            'task_name', NEW.task_name,
            'created_at', NEW.created_at
        );
        PERFORM pg_notify('tasks', payload::text);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_task_notify
AFTER INSERT ON tasks
FOR EACH ROW EXECUTE FUNCTION notify_task();
</code></pre>
<p>It fires native <code>pg_notify()</code> on every <code>INSERT</code> of a pending task for you, keeping your repository implementation clean and single-purpose.</p>
<pre><code class="language-python">class TaskRepositoryAdapter:

    async def add(self, task: Task, connection: Connection | None = None) -&gt; None:
        query = """
            INSERT INTO tasks (
                id,
                task_name,
                payload,
                state,
                created_at,
                updated_at
            ) VALUES ($1, $2, $3, $4, $5, $6)
        """
        await self.db.execute(
            query=query,
            args=[
                task.id,
                task.task_name,
                dumps(task.payload) if task.payload else None,
                task.state,
                task.created_at,
                task.updated_at,
            ],
            connection=connection,
        )
</code></pre>
<p>Here’s a simplified consumer that listens for notifications and runs the corresponding task. In real life there’s more logic, of course (retry strategy, logging, health checks, etc.) but the essence is this:</p>
<ul>
<li><p>long-lived connection stays open to listen on the notification channel</p>
</li>
<li><p>another short-lived connection is used to lock the task row, update its state, and then run the actual work right after</p>
</li>
</ul>
<pre><code class="language-python">class TaskConsumerAdapter:

    async def listen(self) -&gt; None:
        self._notification_conn = await self.db.pool.acquire()
        await self._notification_conn.add_listener("tasks", self._on_notification)

    async def _on_notification(self, connection: Connection, pid: int, channel: str, payload: str) -&gt; None:
        task_data = loads(payload)
        task_id = UUID(task_data["id"])
        asyncio.create_task(self._take_task(task_id))

    async def _take_task(self, task_id: UUID) -&gt; None:
        async with self.db.pool.acquire() as conn, conn.transaction():
            # locks selected row
            task = await self.task_repo.get_pending(task_id=task_id, connection=conn)
            if task is None:
                # another worker already grabbed this row (SKIP LOCKED)
                return
            # marks status while locked
            await self.task_repo.update(
                task.change_state(TaskState.PROCESSING),
                connection=conn
            )

        consumer = self.consumers[task.task_name]
        await self._process_task(consumer, task)

    async def _process_task(self, consumer: TaskConsumer, task: Task) -&gt; None:
        next_state = TaskState.FAILED  # let's start pessimistically
        try:
            await asyncio.wait_for(consumer.process(task=task), timeout=1800)
            next_state = TaskState.DONE
        except asyncio.TimeoutError:
            next_state = TaskState.PENDING  # mark to retry
        finally:
            await self.task_repo.update(task.change_state(next_state))
</code></pre>
<p>The entire distribution logic is controlled by just one line:</p>
<pre><code class="language-python">class TaskRepositoryAdapter:

    async def get_pending(self, task_id: UUID, connection: Connection | None = None) -&gt; Task | None:
        query = """
            SELECT * from tasks
            WHERE id = $1
            AND state = $2
            FOR UPDATE SKIP LOCKED
        """
        rec = await self.db.fetchone(query=query, args=[task_id, TaskState.PENDING], connection=connection)

        return TaskFactories.task_from_record(rec)
</code></pre>
<p>Which literally means:</p>
<ul>
<li><p><strong>FOR UPDATE</strong> locks the selected rows</p>
</li>
<li><p><strong>SKIP LOCKED</strong> makes others skip already-locked rows</p>
</li>
</ul>
<p>This way it’s safe to add more consumers later: they’ll simply compete for the next available pending task, naturally distributing work among themselves.</p>
<p>We use this approach in Hypha to register and process video-transcoding tasks. It’s simple and reliable.</p>
<p>Postgres can handle around 10k messages per second, which is more than enough in most cases.</p>
<p>If you need higher throughput or stronger delivery guarantees, that’s when you bring in tools like Kafka, RabbitMQ, or Redis Streams. But for the vast majority of applications, a properly designed App + Postgres combo is more than enough. Using this alone already helps reduce the number of moving parts and the overall infrastructure complexity.</p>
<p>Know Postgres. Use Postgres. Postgres is your friend.</p>
<h2>ORM is not your friend</h2>
<p>TL;DR: avoid ORMs, you don't need an ORM.</p>
<p>Surprisingly many people don’t see the difference between a <strong>Query Builder</strong> and an <strong>ORM</strong>, so let’s clarify.</p>
<p>An ORM is an Object–Relational Mapper. It maps database tables to classes and rows to objects. It lets you use a language-specific DSL instead of SQL, “hiding” SQL “complexity” behind its own abstractions.</p>
<p>When you fetch a row with <code>asyncpg</code>, you get back a <code>Record</code> — a Python object representing a Postgres row. Is that an ORM? No. An ORM goes much further.</p>
<ul>
<li><p>it performs <strong>bidirectional mapping</strong> between relational data and domain objects</p>
</li>
<li><p>it tracks object identity</p>
</li>
<li><p>it synchronises state changes</p>
</li>
<li><p>it flushes updates automatically</p>
</li>
<li><p>it maintains internal caches and a unit of work</p>
</li>
</ul>
<p>A lot of complex work happens here. But do we really need all of it? Why? How is it actually helpful?</p>
<p>In practice, all these “features for free” are more like “shooting yourself in the foot for free”. You get N+1 queries, accidental writes on flush, hidden internal caches of loaded entities, and a lot of invisible behaviour you never asked for.</p>
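<p>The N+1 problem is the canonical example. The models below are hypothetical, mirroring the <code>videos</code> schema above, but the shape will be familiar to anyone who has profiled a Django view:</p>
<pre><code class="language-python"># looks like one query, actually runs 1 + 20:
# one for the videos, then one more each time the lazy FK is touched
for video in Video.objects.filter(state="PUBLISHED")[:20]:
    print(video.profile.name)

# the fix is explicit, and that is exactly the point:
# you have to know what the ORM does behind your back in order to stop it
for video in Video.objects.filter(state="PUBLISHED").select_related("profile")[:20]:
    print(video.profile.name)
</code></pre>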
<p>Speaking of hidden SQL complexity, let’s compare. Here is a raw SQL query to fetch a list of videos matching a condition:</p>
<pre><code class="language-sql">SELECT id, title
FROM videos
WHERE state = 'PUBLISHED'
  AND metadata-&gt;'tags' ? 'python'
  AND created_at &gt;= NOW() - INTERVAL '30 days'
ORDER BY created_at DESC
LIMIT 20 OFFSET 0;
</code></pre>
<p>The same query using SQLAlchemy:</p>
<pre><code class="language-sql">tag_filter = func.jsonb_exists(Video.metadata["tags"], "python")

query = (
    session.query(Video.id, Video.title)
    .filter(Video.state == "PUBLISHED")
    .filter(tag_filter)
    .filter(Video.created_at &gt;= func.now() - text("INTERVAL '30 days'"))
    .order_by(Video.created_at.desc())
    .limit(20)
    .offset(0)
)

results = query.all()
</code></pre>
<p>Roughly the same query in Django ORM:</p>
<pre><code class="language-python">now_30_days_ago = timezone.now() - timedelta(days=30)

qs = (
    Video.objects
    .filter(state="PUBLISHED")
    .filter(metadata__tags__contains=["python"])
    .filter(created_at__gte=now_30_days_ago)
    .order_by("-created_at")
    [:20]
)
</code></pre>
<p>“Roughly” because Django ORM doesn’t support the JSONB <code>?</code> operator. And if you need real SQL intervals, Django pushes you towards raw expressions or <code>Func()</code> wrappers. Of course, you have to know the tool really well to pull off tricks like this:</p>
<pre><code class="language-python">videos = Video.objects.annotate(
    seven_days_ago=Func(
        Value('7 days'),
        function='NOW() - INTERVAL',
    )
).filter(created_at__gt=F('seven_days_ago'))
</code></pre>
<p>But that’s more about expressiveness and implementation limitations. The deeper issue lies elsewhere, so let’s take a step back.</p>
<p>What we actually need is simple — persist aggregates and restore them later, right?</p>
<p>That responsibility belongs inside the aggregate’s repository. And yes, that can (and should) be done directly in plain SQL.</p>
<blockquote>
<p>SQL is already an excellent DSL for relational data</p>
</blockquote>
<p>It might look like this:</p>
<pre><code class="language-python">@dataclass(kw_only=True)
class Video:
    id: UUID = field(default_factory=uuid7)
    owner_id: HyphaID

    title: str | None = None

    playlist: VideoPlaylist | None = None
    video_tracks: list[VideoPlaylist] = field(default_factory=list)
    audio_tracks: list[AudioPlaylist] = field(default_factory=list)

    state: VideoState

    created_at: DateTime = field(default_factory=now)
    updated_at: DateTime = field(default_factory=now)


class VideoRepositoryAdapter:

    async def add(self, video: Video, connection: Connection | None = None) -&gt; None:
        # handy pattern to implement a 'unit of work'
        if connection:
            await self._add(video, connection=connection)
        else:
            async with self.db.pool.acquire() as conn, conn.transaction():
                await self._add(video, connection=conn)

    async def _add(self, video: Video, connection: Connection) -&gt; None:
        query = """
            INSERT INTO videos (
                id,
                profile_id,
                title,
                state,
                created_at,
                updated_at
            )
            VALUES (
                $1,
                $2,
                $3,
                $4,
                $5,
                $6
            );
        """
        await self.db.execute(
            query=query,
            args=[
                video.id,
                video.profile_id,
                video.title,
                video.state,
                video.created_at,
                video.updated_at,
            ],
            connection=connection,
        )

        # some aggregate attributes are stored in a separate table,
        # but all inserts still happen within the same transaction

        traits_query = """
            INSERT INTO video_traits (
                id,
                video_id,
                trait_type,
                trait_data,
                created_at
            ) VALUES ($1, $2, $3, $4, $5)
        """

        if video.playlist:
            playlist_trait = video.playlist
            await self.db.execute(
                query=traits_query,
                args=[
                    playlist_trait.id,
                    video.id,
                    VideoTraitType.MASTER_PLAYLIST,
                    playlist_trait.serialize(),
                    video.created_at,
                ],
                connection=connection,
            )

        for video_track_trait in video.video_tracks:
            await self.db.execute(
                query=traits_query,
                args=[
                    video_track_trait.id,
                    video.id,
                    VideoTraitType.STAGING_VIDEO,
                    video_track_trait.serialize(),
                    video.created_at,
                ],
                connection=connection,
            )

        for audio_track_trait in video.audio_tracks:
            await self.db.execute(
                query=traits_query,
                args=[
                    audio_track_trait.id,
                    video.id,
                    VideoTraitType.STAGING_AUDIO,
                    audio_track_trait.serialize(),
                    video.created_at,
                ],
                connection=connection,
            )

    async def get_multi(
        self,
        *,
        owner_id: HyphaID,
        states: list[VideoState] | None = None,
        limit: int = 10,
        offset: int = 0,
        connection: Connection | None = None,
    ) -&gt; list[Video]:
        select_query = """
            SELECT
                v.id,
                p.public_id as owner_id,
                v.title,
                v.state,
                v.created_at,
                v.updated_at
            FROM videos v
        """
        query_args = [owner_id, limit, offset]
        conditions = [f"p.public_id = $1"]

        if states:
            query_args.append(states)
            conditions.append(f"v.state = ANY(${len(query_args)})")

        # all parts together
        select_query = f"""
            {select_query}
            JOIN profiles AS p ON v.profile_id = p.id
            WHERE {" AND ".join(conditions)}
            ORDER BY v.updated_at DESC
            LIMIT $2 OFFSET $3;
        """

        recs = await self.db.fetchmany(
            query=select_query,
            args=query_args,
            connection=connection,
        )
        if not recs:
            return []

        # the most practical way to fetch all traits we need,
        # only one extra query for N videos
        traits_query = """
            SELECT
                video_id,
                jsonb_agg(
                    jsonb_build_object(
                        'id', id,
                        'trait_type', trait_type,
                        'trait_data', trait_data
                    )
                ) AS traits
            FROM video_traits
            WHERE video_id = ANY($1)
            GROUP BY video_id;
        """

        trait_recs = await self.db.fetchmany(
            query=traits_query,
            args=[[rec["id"] for rec in recs]],
            connection=connection,
        )

        return list(VideoFactories.videos_from_records(recs, trait_recs))


class VideoFactories:
    @classmethod
    def videos_from_records(cls, recs: list[Record], trait_recs: list[Record]) -&gt; Iterator[Video]:
        for rec in recs:
            video_attrs = {
                "id": UUID(str(rec["id"])),
                "owner_id": HyphaID(rec["owner_id"]),
                "title": rec["title"],
                "state": VideoState(rec["state"]),
                "created_at": pendulum.instance(rec["created_at"]),
                "updated_at": pendulum.instance(rec["updated_at"]),
            }
            # enrich the aggregate with data from traits
            ...
            yield Video(**video_attrs)
</code></pre>
<p>Quite a few very important details here.</p>
<ol>
<li><strong>The aggregate can be persisted in more than 1 DB table</strong></li>
</ol>
<p>It’s fairly common to use several tables to persist an aggregate for the sake of flexibility and query efficiency. You may notice the <code>video_traits</code> table, which stores “schemaless” objects (parts of the aggregate), in addition to the main <code>videos</code> table, which stores the more surface-level values.</p>
<ol start="2">
<li><strong>SQL tables don’t necessarily map 1-to-1 to the domain model</strong></li>
</ol>
<p>In Hypha, we rarely have a one-to-one match between aggregates and tables. Using an ORM would mean translating domain models to ORM models and back again just to reassemble the final shape. If you follow the snippets carefully, you’ll notice this applies to both the underlying tables and the naming conventions. The database schema uses a generic <code>profile_id</code> column name (closer to the actual table name) while the Video aggregate uses <code>owner_id</code>, a more precise term for the specific bounded context.</p>
<ol start="3">
<li><strong>Raw SQL is not a sin</strong></li>
</ol>
<p>Might be shocking to some of you, but that’s exactly what you need. <strong>Your repository is your aggregate-scoped query builder</strong>. There’s nothing wrong with having SQL strings there — it doesn’t break Separation of Concerns and it follows Locality of Behaviour perfectly. It’s safe because we use driver-native parameter binding for arguments.</p>
<p>Yes, you must be careful here with all the commas and quotes. Yes, you must follow a certain self-discipline. But you write it once, test it well, and reuse it everywhere afterward. In return, you get:</p>
<ul>
<li><p>full control over the resulting SQL, allowing you to use the full power of SQL as a relational data DSL</p>
</li>
<li><p>transparency and flexibility in how you build queries</p>
</li>
<li><p>scoped, highly-efficient, purpose-built queries</p>
</li>
</ul>
<p>There are two clear concerns experienced developers might raise here, though: how to deal with really big, complex analytical queries, and what to do about composability and reusability?</p>
<p>Surprisingly, the “workarounds” are just as simple and straightforward as the overall solution itself.</p>
<p>When a query starts growing far beyond the physical screen and makes the assembly logic harder to follow, you can template it. Literally move the query out of the method and into a Jinja template. The template can be as rich as you need, with parent CTEs as a base and conditional parameterisation. It’s a practical, pragmatic way to manage extra-long, heavy analytical queries.</p>
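<p>A minimal sketch of that idea follows. The template name, the <code>queries/</code> directory and the <code>db</code> helper are placeholders mirroring the adapters above; only the query <em>structure</em> is templated, while values still travel as <code>$n</code> parameters:</p>
<pre><code class="language-python">from uuid import UUID

from asyncpg import Record
from jinja2 import Environment, FileSystemLoader

# queries/videos_by_state.sql.j2 could look like:
#   SELECT state, count(*) AS total
#   FROM videos
#   {% if owner_only %}WHERE profile_id = $1{% endif %}
#   GROUP BY state;

env = Environment(loader=FileSystemLoader("queries"))


class ReportRepositoryAdapter:

    async def videos_by_state(self, profile_id: UUID | None = None) -&gt; list[Record]:
        # rendering decides which blocks are present; no values are interpolated
        query = env.get_template("videos_by_state.sql.j2").render(
            owner_only=profile_id is not None,
        )
        args = [profile_id] if profile_id is not None else []
        return await self.db.fetchmany(query=query, args=args)
</code></pre>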
<p>Here comes the second one — how to reuse parts of the query? ORMs shine here because of method chaining, letting you extract common parts as a basis for other queries, right?</p>
<p>Even though this is possible (and often encouraged), blindly following the DRY principle can easily lead you to violate another, more important one:</p>
<blockquote>
<p>Prefer Locality of Behaviour over DRY</p>
</blockquote>
<p>By splitting the query chain into parts, you end up spreading the query composition across the repository (in the best case) or across several modules (in the worst), undermining the main advantage of the Locality of Behaviour principle — keeping behaviour scoped and contained.</p>
<p>If you start noticing repetitive parts of your SQL across several methods and feel the natural urge to refactor them into something reusable — stop right there and don’t. The optimisation might look more concise on the screen, but it won’t be in your head, ultimately making it harder to follow and maintain.</p>
<p>It’s perfectly fine to have repetitions across queries. It may take a bit longer to review and update the affected repositories and services when underlying schemas or base conditions change, but that process is infrequent and very straightforward. Your tests should reveal any gaps and guide you.</p>
<p>In return: each query remains isolated and atomic, with nothing external able to break it or implicitly change its behaviour. You can always review it visually without extra hopping and rebuild the mental context immediately. And that’s what truly matters.</p>
<p>No Alchemy needed when you have real Chemistry.</p>
]]></content:encoded></item><item><title><![CDATA[Rethinking How We Build Python Applications]]></title><description><![CDATA[Intro
Let’s start with an axiom, well-known to experienced engineers and somehow well-ignored by beginners.

Your job is not to write code. Your job is to solve problems through software, and coding i]]></description><link>https://hypha.pub/rethinking-how-we-build-python-applications</link><guid isPermaLink="true">https://hypha.pub/rethinking-how-we-build-python-applications</guid><category><![CDATA[DDD]]></category><category><![CDATA[Python]]></category><category><![CDATA[Hexagonal Architecture]]></category><category><![CDATA[Ports and Adapters]]></category><dc:creator><![CDATA[Roman Zaiev]]></dc:creator><pubDate>Wed, 19 Nov 2025 15:03:24 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1770301707090/ebe60300-2880-45d6-9b88-0845888043e0.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Intro</h2>
<p>Let’s start with an axiom, well-known to experienced engineers and somehow well-ignored by beginners.</p>
<blockquote>
<p>Your job is not to write code. Your job is to solve problems through software, and coding is just one aspect of it.</p>
</blockquote>
<p>When you start building a brand-new project, chances are your time and budget are fixed. If that’s the case, the only practical approach to get things done is the well-known <a href="https://basecamp.com/gettingreal/02.4-fix-time-and-budget-flex-scope">FFF methodology</a>: Fix time, Fix budget, Flex scope, originally formulated by 37signals in their bestseller <a href="https://basecamp.com/gettingreal"><strong>Getting Real</strong></a>.</p>
<p>A well-coordinated team where each member knows their own limits and strengths (a rare thing in real life, but still possible) naturally strives toward the actual boundaries of FFF, where the first two Fs are constant basic requirements and the third one, the scope, is the <em>best effort F</em>.</p>
<p>How can we fit the most into this FFF trade-off from a technical perspective? By using a well-known set of efficient tools, combined with real experience and intuition, right?</p>
<p>Something that will help us reach the main goal: building a product on time, with fewer developers, less effort, lower costs, and as few errors as possible.</p>
<p>Programming is a bit like fashion — a constantly changing mix of overhyped technologies. How do we actually decide what’s worth using?</p>
<p>You either trust yourself, relying on your own professional experience, or you trust someone else who uses or promotes it. Maybe you even have your own favourite evangelist or influencer?</p>
<p>And, of course, last but not least, sometimes you’re simply forced to. There’s a good chance your company or team already has a standardised tech stack with no real reason to review it, and in most cases, you just use it as is. That’s not necessarily bad and there’s always a chance that the guy on the square-wheeled bicycle is actually you, not your stubborn teammate.</p>
<img src="https://cloudmate-test.s3.us-east-1.amazonaws.com/uploads/covers/68bca2462d48bd639b98e819/0e3b5082-1d50-417b-a1cf-f1777aa99c8e.jpg" alt="" />

<p>At the same time, we all need to periodically revisit the tools and approaches we use on a daily basis. We should be sceptical of all of them, because every tool was created by ordinary people solving their own problems, under specific conditions, in a particular historical context.</p>
<p>Maybe those people were simply wrong, selling us an idea that became widely adopted and still shapes our thinking years later. What if it was a popularised mistake? We must doubt.</p>
<p>That’s why good engineers constantly try out different new ideas.</p>
<p>That’s why open knowledge sharing is so important! It sparks innovation and supports personal growth.</p>
<p>That’s also why good team leaders encourage experimentation and the rotation of fresh ideas, while staying sceptical of all of them by default.</p>
<p>And you never know when the different fragments in your mind will suddenly begin to resonate, when the moment arrives and synthesis happens, and all the pieces of your interdisciplinary knowledge connect, creating a new mindset — your new Method.</p>
<h2>Retrospective</h2>
<p>I started my professional career as a Django web developer about 14 years ago, earning my first coding money by building websites in a tiny web studio located in a basement with no windows.</p>
<p>It was a time when Django and Python were nowhere near as popular as they are today. Most developers were into Ruby on Rails — the go-to web framework for startups back then. Ruby was sexy. Python was not.</p>
<p>Python had only a small share of the web development world compared to Ruby, PHP, and JavaScript at that time. But after trying a bit of everything, I became confident that Python was a more universal tool with greater potential.</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763560543408/85849e2e-477b-413e-b04d-0e34756a86ec.png" alt="" />

<p>14 years later, I still use Python as my primary language, even though it’s definitely not the best fit for everything. Once you learn more languages and paradigms, that becomes very clear.</p>
<p>Python is good enough, though. It allows you to move fast and its trade-offs are well balanced.</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763560564113/9b4a981a-471b-4a6c-bdee-bcb688160e77.png" alt="" />

<p>Back to Django. At the time, it seemed like a good idea to blindly follow the MVC pattern to build layered apps — or MVT, as Django prefers to call it. It felt natural.</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763554132008/57c98ae9-65fe-4d51-96d6-6d1c11661200.jpeg" alt="" />

<p>Django itself encourages the MVT approach. And its fantastic ORM, full of dark magic. And signals. And fat models (or shall we call them abstract-positive)?</p>
<p>The thing is, Django was still quite new back then, and we didn’t have well-established practices for building complex systems that we could actually maintain later. And of course, we made plenty of design mistakes — spending days debugging here and there, sending emails directly from model classes, and doing many other barely legal things.</p>
<p>I recall a project where we literally trapped ourselves. Everything was so ugly under the hood, and so slow on the surface, that we ended up using aggressive Redis caching almost everywhere, with signal-based invalidation of course and constantly dying Celery workers rebuilding our heavy search indexes.</p>
<p>You may think that we simply didn’t have enough expertise to build anything more complex than blogs — and you would be right. Of course we didn’t.</p>
<p>Years later, now that I have more practical experience and have learned better practices, it’s very clear what went wrong and when. MVT, first of all. It doesn’t scale well. And...</p>
<pre><code class="language-python">class ListBaseView(
    QueryMixin,
    PaginationMixin,
    TemplateNameMixin,
    JsonResponseMixin,
    BaseView,
):
    """
    Generic list view with way too many responsibilities.
    """
    def get(self, request, *args, **kwargs):
        qs = self.get_queryset()
        context = self.get_context_data(objects=qs)
        return self.render_to_response(context)
</code></pre>
<p>Django introduced class-based views, or CBVs, in version 1.3 — and everyone happily started to overuse them in their projects. That whole idea of multi-inheritance mixins seemed brilliant at first, but it turned out to be one of the worst decisions ever made in web development history. It’s still one of my favourite examples of a long-running mistake with awful consequences.</p>
<blockquote>
<p>The most treacherous metaphors are the ones that seem to work for a time, because they can keep more powerful insights from bubbling up.</p>
<p>— Alan Kay.</p>
</blockquote>
<p>We abused inheritance over composition, and our app became as “flexible” as Python itself. We had clearly sunk into a bog of leaking abstractions, and we spent our days debugging where all that crap was coming from.</p>
<p>At some point, you’re fighting the tool instead of leveraging its strengths.</p>
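<p>To make the contrast concrete, here is a minimal sketch (not code from that project) of the same kind of list view built with composition instead of mixin inheritance: every collaborator is passed in explicitly, so behaviour is visible at the call site and trivially replaceable.</p>
<pre><code class="language-python">from collections.abc import Callable, Sequence
from dataclasses import dataclass
from typing import Any


@dataclass
class ListView:
    """A list view assembled from explicit collaborators instead of mixins."""

    fetch: Callable[[], Sequence[Any]]                       # where the data comes from
    paginate: Callable[[Sequence[Any], int], Sequence[Any]]  # how it is sliced
    render: Callable[[Sequence[Any]], str]                   # how it is presented

    def get(self, page: int = 1) -&gt; str:
        items = self.fetch()
        page_items = self.paginate(items, page)
        return self.render(page_items)


# No MRO spelunking: swap any piece without touching the others.
view = ListView(
    fetch=lambda: ["a", "b", "c", "d"],
    paginate=lambda items, page: items[(page - 1) * 2 : page * 2],
    render=lambda items: ", ".join(items),
)
print(view.get(page=2))  # "c, d"
</code></pre>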
<p>Then Pyramid happened to me, the "wire everything yourself" web-framework — one step closer to micro-frameworks and real modular app design.</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763554813096/1e800951-0dd6-469b-b144-a30b276fb97f.jpeg" alt="" />

<p>Pyramid definitely deserved more attention, but it was too flexible and too verbose. Sitting somewhere between Django and Flask, it was always a hard sell.</p>
<p>Somehow it happened that a marketplace project came my way — a really big one, built with Pyramid. And I was lucky enough to see what a truly large and complex Python web application can look like.</p>
<p>Teams used every trick in the book to keep the whole thing afloat. It was phenomenal! I’m still impressed that it actually worked. From an architecture perspective, it was, let’s call it, an “opinionated variation” of MVC. To add a new feature, you had to dive into all the nuances of every single layer, spreading it across the codebase — jumping from package to package like a grasshopper.</p>
<p>And there were Mako templates. A lot of them. Many hundreds. If you’ve ever worked with Mako, you know they’re not bad, they’re just too permissive. You can call almost any Python code from a template, which can easily lead to accidental API calls or extra database queries that are very hard to track down.</p>
<pre><code class="language-xml">&lt;%! from myapp.services import get_discount_for_product %&gt;

% for p in products:
    &lt;div class="product-card"&gt;
        &lt;h2&gt;${p.name}&lt;/h2&gt;
        ## more queries and API calls, why not?
        &lt;% discount = get_discount_for_product(p.id) %&gt;
        % if discount:
            &lt;p&gt;&lt;strong&gt;${discount}% off!&lt;/strong&gt;&lt;/p&gt;
        % endif
    &lt;/div&gt;
% endfor
</code></pre>
<p>Of course, that’s a huge antipattern. Everyone agreed it was bad, and we had endless discussions about logicless templates and better practices.</p>
<p>But the “broken windows theory” works well here. As a developer, you open a template, see all the mess and piles of logic inside, and you just need to add one tiny thing. To do it properly, you’d have to refactor the entire template inheritance tree and the underlying controllers — but the deadline was yesterday, as usual. So you close your eyes… and just do it.</p>
<p>I started to suspect that application architecture might actually be important. Maybe we should invest more time and effort in learning better ways to organise code?</p>
<p>You can easily google different well-known approaches for that, like Clean Architecture, Onion Architecture, and Hexagonal Architecture. They are essentially all the same: keep your core logic pure and push all the messy, changing stuff (databases, frameworks, APIs) to the edges. That’s it.</p>
<p>And there are also communication-centric architectures: Event-Driven Architecture, Service-Oriented Architecture, Actor Model, etc. These define interaction patterns between systems or services.</p>
<p>Perfectly splendid. But the suggested abstractions and tools were, well, too abstract to understand and barely seemed applicable to my day-to-day problems. And the metaphors — too heavy and never really clicked for me, even today.</p>
<p>I mean, look at that! What the heck is this supposed to be? A nuclear reactor blueprint?</p>
<img src="https://herbertograca.com/wp-content/uploads/2018/11/070-explicit-architecture-svg.png?w=1100" alt="070 - Explicit Architecture.svg" />

<p>The thing is, I believe Python and its ecosystem were never really seen as something worthy of enterprise attention. Not mature enough. Not for “serious business”. It was viewed more as a “startup” or “prototype” language. Most practical examples you could find back then were written with Java or .NET in mind, never in Python.</p>
<p>Naturally, you thought: That’s not how we do things here. It’s not Pythonic — too verbose, too complex, too many abstractions, it just doesn’t feel right. Good “Pythonic” examples were out of reach. If they even existed.</p>
<p>Then Tornado happened to me. I found myself in a brand-new world of asynchronous programming and a pretty niche framework at that time. It was fresh, it was cool, it was fast. And it was one of the first true micro-frameworks, with no predefined layout or boundaries at all.</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763555692680/13b13a89-f367-4f9e-9740-e43a48fe9840.jpeg" alt="" />

<p>That was the time of the boldest experiments in my dev career. I can definitely say I learned a lot. I was actively researching, trying to find that elegant, simple, scalable, modular, well-testable, <strong>Pythonic way</strong> to build apps. Web apps first of all, but any apps essentially.</p>
<p>It reminds me of the story of chemistry as a young science — full of preconceptions. While alchemists were searching for the philosopher’s stone and failing, of course, they were also discovering the useful properties of substances and how to work with them, gradually crystallising real methodologies and facts about nature.</p>
<h2>Nothing personal. Just business.</h2>
<blockquote>
<p>Always start with business requirements!</p>
</blockquote>
<p>It's an almost self-evident idea. Before any coding rush, we need to understand the business problem correctly.</p>
<p>Let’s imagine our requirement is to build a <a href="https://hypha.tv">video-on-demand platform for professional creators</a>, where they can easily monetise their work. Creators upload their content, set prices and preferences, and share it directly with their audience. Viewers pay for what they watch, supporting their favourite creators. Simple as that.</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763556129763/69c84fdb-87a8-4d81-a88e-dced597e994c.png" alt="" />

<p>It looks like we have two basic user roles here: a <strong>viewer</strong> and a <strong>creator</strong>. We debit the viewer when they watch a video and credit the creator when the video is viewed.</p>
<p>But running an actual card transaction for every single playback would be inefficient for many reasons, so we need some kind of internal credit system that viewers can use to redeem content.</p>
<p>In <a href="https://hypha.tv">Hypha</a>, these are called Points — they come in Packs that users can purchase to top up their accounts. Naturally, we need <strong>ledgers</strong> to track balances, reconciliation with our payment vendor, and <strong>payouts</strong> for creators. So yes, we definitely need <strong>billing</strong> here.</p>
<p>As a stakeholder, I’d like to see a <strong>rental-based and tips-based creator economy</strong>, offering day, month, or lifetime access, with optional tips — implemented as a smooth, single-click video unlock experience.</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763556288063/939143d6-308b-4d1c-a65b-5b865f0e47ed.png" alt="" />

<p>And so on, and so on. We keep collecting business requirements.</p>
<p>To translate these business needs into something we can actually discuss with the team and build, we need a shared, ubiquitous language so everyone talks about the same things in the same way.</p>
<p>This is what Domain-Driven Design (DDD) is all about.</p>
<h2>DDD 101</h2>
<p>In short, DDD ensures the code speaks the same language as the business — shared terms, clear boundaries, and models that reflect how things actually work.</p>
<p>Your first instinct might be to think about your system in terms of SQL tables and the relationships between them.</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763556407723/2a32af7b-1e87-4c28-9dd0-4ff70d9b594e.png" alt="" />

<p>Don’t do that. How you store and persist your data should never dictate your application design or your domain model!</p>
<p>The same goes for OOP. As a developer, you might immediately start creating classes and factories in your head, translating business requirements directly into a tree of classes.</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763556424677/a24d4fc1-ce3d-4c4b-ab77-9ede5dcc4129.png" alt="" />

<p>Then something like an <code>AbstractBaseVideo</code> class appears. And that already means nothing — it’s just an implementation detail, and probably not even a good one. Don’t do that.</p>
<p>The core building blocks of DDD are the <strong>Domain and the Bounded Context</strong>.</p>
<p><strong>The domain is the area of knowledge or activity your software is built to serve</strong> — the business problem you’re solving. For our video platform, the domain includes things like: Videos, Users, Analytics, Billing, etc.</p>
<p><strong>The domain also defines the rules</strong> — how these parts interact, what’s allowed and what isn’t. For example: who can upload or access a video, what its lifecycle looks like, how payouts are calculated. These rules are the core of the system’s logic — that’s what defines the product.</p>
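<p>To give one tiny example of such a rule in code: a video lifecycle can be expressed as an explicit transition table. The states and transitions below are invented for illustration; Hypha’s actual <code>VideoState</code> and <code>TRANSITION_MAP</code> are richer than this.</p>
<pre><code class="language-python">from enum import Enum


class VideoState(str, Enum):
    # Illustrative states only; not Hypha's real state machine.
    INIT = "init"
    CONVERTING = "converting"
    READY = "ready"
    PUBLISHED = "published"
    BLOCKED = "blocked"


# A domain rule expressed as data: which lifecycle transitions are legal.
TRANSITION_MAP: dict[VideoState, set[VideoState]] = {
    VideoState.INIT: {VideoState.CONVERTING},
    VideoState.CONVERTING: {VideoState.READY, VideoState.BLOCKED},
    VideoState.READY: {VideoState.PUBLISHED, VideoState.BLOCKED},
    VideoState.PUBLISHED: {VideoState.BLOCKED},
    VideoState.BLOCKED: set(),
}


def can_transition(current: VideoState, target: VideoState) -&gt; bool:
    return target in TRANSITION_MAP[current]


assert can_transition(VideoState.READY, VideoState.PUBLISHED)
assert not can_transition(VideoState.BLOCKED, VideoState.PUBLISHED)
</code></pre>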
<p>Each major part of that world can be its own bounded context, with its own rules and data model. Why do bounded contexts matter?</p>
<p>As your system grows, words like <em>Video</em>, <em>User</em>, or <em>Payment</em> start to mean different things in different parts of the system.</p>
<ul>
<li><p>In <strong>Content Management</strong>, a <em>Video</em> is a media asset with files and metadata.</p>
</li>
<li><p>In <strong>Analytics</strong>, a <em>Video</em> is a source of metrics — views, retention, engagement.</p>
</li>
<li><p>In <strong>Billing</strong>, a <em>Video</em> is something that generates revenue.</p>
</li>
</ul>
<p>If you try to use one single model for all of those meanings, you end up with an entangled codebase, endless conditionals, and special cases.</p>
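<p>As a rough illustration (the class and field names here are made up, not Hypha’s real models), each bounded context keeps its own deliberately narrow model of a video, living in its own package:</p>
<pre><code class="language-python">from dataclasses import dataclass
from decimal import Decimal


# content/models.py: Content Management sees a media asset with files and metadata.
@dataclass
class Video:
    id: str
    title: str
    duration_seconds: int
    source_path: str


# billing/models.py: Billing only sees something that generates revenue.
# (Shown side by side here; in a real codebase each lives in its own module.)
@dataclass
class Video:
    id: str
    rental_price: Decimal
    lifetime_price: Decimal
</code></pre>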
<p>Then come <strong>Entities</strong> and <strong>Value Objects</strong>. These are the building blocks of your <strong>domain model</strong>.</p>
<ul>
<li><p><strong>Entity</strong> – an object with a unique identity that persists over time (e.g. User, Video).</p>
</li>
<li><p><strong>Value Object</strong> – a small, immutable object that represents a concept by its value, not by identity. In the case of a <em>Video</em>, it can be a video track, audio track, subtitle track, or image sprite, each with its own metadata (see the sketch right after this list).</p>
</li>
</ul>
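<p>In code, a value object is typically a small frozen dataclass, compared purely by its contents. The fields below are illustrative; the <em>Video</em> entity that owns such objects appears in the aggregate example further down.</p>
<pre><code class="language-python">from dataclasses import dataclass


@dataclass(frozen=True)
class SubtitleTrack:
    """Value Object: immutable, has no identity, compared by value."""

    language: str
    codec: str
    url: str


# Two value objects carrying the same data are fully interchangeable.
assert SubtitleTrack("en", "webvtt", "subs/en.vtt") == SubtitleTrack("en", "webvtt", "subs/en.vtt")
</code></pre>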
<p>In DDD, you group related entities and value objects into <strong>Aggregates</strong> — clusters of objects that change together and <strong>are treated as a single unit</strong> for data updates and consistency.</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763556680718/b3a11d15-f974-41da-8702-2d80f0d9518d.png" alt="" />

<p>Here you can see a simplified example of a real Video aggregate, where Video is the root entity — also known as the <strong>Aggregate Root</strong> — and attributes like SourceMeta, Playlists, CoverImage, and a few others are Value Objects, assembled together into one cohesive unit.</p>
<pre><code class="language-python">@dataclass(kw_only=True)
class Video:
    id: UUID = field(default_factory=uuid7)
    public_id: NanoID = field(default_factory=generate_nano_id)
    owner_id: HyphaID

    title: str | None = None
    description: str | None = None
    src_meta: SourceMeta | None = None

    playlist: VideoPlaylist | None = None
    video_tracks: list[VideoPlaylist] = field(default_factory=list)
    audio_tracks: list[AudioPlaylist] = field(default_factory=list)
    subtitle_tracks: list[VideoSubtitle] = field(default_factory=list)

    cover_image: CoverImage | None = None
    fallback_cover_image: CoverImage | None = None

    state: VideoState

    created_at: DateTime = field(default_factory=now)
    updated_at: DateTime = field(default_factory=now)

    @classmethod
    def build(cls, owner_id: HyphaID, state: VideoState = VideoState.INIT) -&gt; Self:
        fsm = FSM(transition_map=TRANSITION_MAP, initial_state=state)
        return cls(owner_id=owner_id, state=VideoState(fsm.current_state))
</code></pre>
<p>How do we interact with an aggregate? Through a <strong>Repository</strong>, of course.</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763556743244/ef34016b-1897-4421-9bdf-751060a222ea.png" alt="" />

<p>Repositories mediate between the domain and the data-mapping layers, providing a stable, predictable interface for accessing and persisting aggregates.</p>
<p>In coding terms, define a repository as a Protocol that specifies a contract. The actual implementation comes later.</p>
<pre><code class="language-python">class VideoRepository(Protocol):
    db: Database

    async def add(self, video: Video, connection: Connection | None = None) -&gt; None: ...

    async def update(self, video: Video, connection: Connection | None = None) -&gt; None: ...

    async def get_by_id(
        self,
        *,
        video_id: NanoID,
        states: list[VideoState] | None = None,
        owner_id: HyphaID | None = None,
        connection: Connection | None = None,
    ) -&gt; Video | None: ...
</code></pre>
<p>A valid repository has a few clear signs:</p>
<ul>
<li><p>Operates on aggregates, not on arbitrary entities.</p>
</li>
<li><p>There is one repository per aggregate root.</p>
</li>
<li><p>Returns fully built aggregates, not partial data structures.</p>
</li>
</ul>
<p>Next come the <strong>Services</strong>. And there are two types.</p>
<p>A <strong>Domain Service</strong>:</p>
<ul>
<li><p>Encapsulates domain logic that spans multiple aggregates.</p>
</li>
<li><p>Operates purely within the domain layer (no I/O, databases, or API calls).</p>
</li>
<li><p>It is stateless.</p>
</li>
</ul>
<p>While Domain Services define what the business does, <strong>Application Services</strong> define <em>when and how it happens</em>. They handle:</p>
<ul>
<li><p>I/O (calling repositories, sending events, triggering workflows)</p>
</li>
<li><p>Transactions and orchestration</p>
</li>
<li><p>Integration with external systems</p>
</li>
</ul>
<blockquote>
<p>Rule of thumb: if it can be unit-tested, it’s a domain service. If not, it’s an application service.</p>
</blockquote>
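<p>To make that rule of thumb concrete, here is a hedged sketch of a domain service. The payout numbers and names are illustrative, not Hypha’s actual billing rules: the service is pure, stateless, spans more than one concept, and needs no mocks to test.</p>
<pre><code class="language-python">from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class ViewCharge:
    """Outcome of pricing a single playback, expressed in Points."""

    viewer_debit: Decimal
    creator_credit: Decimal
    platform_fee: Decimal


class PayoutPolicy:
    """Domain service: pure rules for splitting a rental price, no I/O anywhere."""

    def __init__(self, platform_fee_rate: Decimal = Decimal("0.20")):
        # Configuration, not mutable state: the service stays stateless.
        self.platform_fee_rate = platform_fee_rate

    def charge_for_view(self, rental_price: Decimal) -&gt; ViewCharge:
        fee = (rental_price * self.platform_fee_rate).quantize(Decimal("0.01"))
        return ViewCharge(
            viewer_debit=rental_price,
            creator_credit=rental_price - fee,
            platform_fee=fee,
        )


# A plain unit test: no database, no framework, no mocks.
assert PayoutPolicy().charge_for_view(Decimal("10.00")).creator_credit == Decimal("8.00")
</code></pre>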
<p>Just like with repositories, you start shaping your service with a protocol first.</p>
<pre><code class="language-python">class VideoPreferenceService(Protocol):

    async def get_favourites(
        self,
        *,
        owner_id: HyphaID,
        limit: int = 12,
        offset: int = 0,
        connection: Connection | None = None,
    ) -&gt; list[Video]: ...

    async def get_picks(
        self,
        *,
        owner_id: HyphaID,
        limit: int = 12,
        offset: int = 0,
        connection: Connection | None = None,
    ) -&gt; list[Video]: ...
</code></pre>
<h2>Keep I/O at edges to reduce side effects</h2>
<p>There’s input and there’s output.</p>
<img src="https://cloudmate-test.s3.us-east-1.amazonaws.com/uploads/covers/68bca2462d48bd639b98e819/516d1695-9673-4653-b93e-e130ba6c6a95.png" alt="" />

<p>Your domain in between should remain clean — pure business logic with no side effects.</p>
<p>All data is validated at the edges, and everything inside should be as immutable as possible.</p>
<p>Use <strong>DTOs (Data Transfer Objects)</strong> on the boundaries to serialise and deserialise data between layers or systems. A DTO is just a simple data container, it’s not supposed to contain any business logic, just validated, serialisable data.</p>
<p>Sounds like Pydantic, right? Many developers abuse Pydantic models, using them for everything, including domain modelling. Don’t do that. Keep your Pydantic schemas at the edges for requests and responses, but never inside your domain.</p>
<p>Inside, <strong>prefer plain dataclasses</strong> — simple and framework-agnostic.</p>
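<p>A small sketch of that boundary (the schema and field names are hypothetical): a Pydantic model validates the incoming request at the edge and is converted into a plain dataclass before anything in the domain touches it.</p>
<pre><code class="language-python">from dataclasses import dataclass

from pydantic import BaseModel, Field


class UploadVideoRequest(BaseModel):
    """Edge DTO: validated and serialisable, knows nothing about business rules."""

    title: str = Field(min_length=1, max_length=200)
    description: str | None = None


@dataclass(frozen=True)
class UploadVideoCommand:
    """Domain-side input: a plain, framework-agnostic dataclass."""

    title: str
    description: str | None


def to_command(dto: UploadVideoRequest) -&gt; UploadVideoCommand:
    # Translation happens exactly once, at the boundary; Pydantic stops here.
    return UploadVideoCommand(title=dto.title, description=dto.description)
</code></pre>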
<h2>Data flow must be one-way</h2>
<p>No layer should depend on or call a layer that depends on it. Data and control must flow in one direction — forward, from input to output.</p>
<p>Every deterministic application, no matter how complex, follows the same basic pattern:</p>
<p>$$I \rightarrow \text{System} \rightarrow O$$</p>
<p>It's a data-transforming <strong>Pipeline</strong>.</p>
<p>In Hypha, pipelines control the entire video lifecycle, including all transcoding processes.</p>
<pre><code class="language-python">pipeline = (
    ConvertingInitial(inputs={"video_id": video_id})
    &gt;&gt; ConvertSubtitles()
    &gt;&gt; ConvertBaseAudio()
    &gt;&gt; ConvertVideoTo480p()
    &gt;&gt; RenderFinalPlaylist()
    &gt;&gt; ImportKeys()
    &gt;&gt; PublishToIPFS()
    &gt;&gt; ConvertingTerminal()
)
result = await executor.run(pipeline)
</code></pre>
<p>A Pipeline consists of logical Units that are designed to be composed from left to right. We don’t need to go through all the implementation details now. What’s important is that this approach lets you express even the most complex logic in a <strong>linear, readable flow</strong> you can hold in your head.</p>
<p>Technically, everything can be a pipeline. But the recommended convention is: if Aggregates assemble Entities and Value Objects, and Repositories sit on top of Aggregates, and Services sit on top of the Domain and I/O, then Pipelines sit above all of that.</p>
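<p>The full executor is out of scope for this post, but to show that the left-to-right composition itself needs very little machinery, here is a simplified sketch. The <code>Unit</code> and <code>run</code> semantics below only approximate how Hypha’s real pipelines work.</p>
<pre><code class="language-python">from __future__ import annotations

from typing import Any


class Unit:
    """A single pipeline step; concrete units override process()."""

    def __init__(self, inputs: dict[str, Any] | None = None):
        self.inputs = inputs or {}
        self.next: Unit | None = None

    def __rshift__(self, other: Unit) -&gt; Unit:
        # `a &gt;&gt; b` appends b to the end of a's chain and returns the head,
        # so `A() &gt;&gt; B() &gt;&gt; C()` keeps reading left to right.
        tail = self
        while tail.next is not None:
            tail = tail.next
        tail.next = other
        return self

    async def process(self, context: dict[str, Any]) -&gt; dict[str, Any]:
        return context


class Executor:
    """Walks the chain, passing each unit the context produced by the previous one."""

    async def run(self, head: Unit) -&gt; dict[str, Any]:
        context: dict[str, Any] = dict(head.inputs)
        unit: Unit | None = head
        while unit is not None:
            context = await unit.process(context)
            unit = unit.next
        return context
</code></pre>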
<h2>SoC + LoB = ❤️</h2>
<p>Everyone has heard about <strong>Separation of Concerns</strong>. There’s also a lesser-known concept called <a href="https://htmx.org/essays/locality-of-behaviour/">Locality of Behaviour</a> — and it’s important to note that these two are <strong>not mutually exclusive</strong>.</p>
<p>Separation of Concerns means dividing a system into distinct parts, each responsible for a single purpose. Each layer or component should focus on one thing.</p>
<blockquote>
<p>SoC isn’t about technology boundaries — it’s about responsibility boundaries.</p>
</blockquote>
<p>Locality of Behaviour means keeping all code for one behaviour in one place. You should be able to remove it, disable it, or replace it easily. It reduces context switching and is one of the key principles to keep in mind.</p>
<p>In short: a feature’s behaviour should be local. You shouldn’t have to jump across the whole codebase to understand or change it.</p>
<p>Practically speaking, instead of:</p>
<pre><code class="language-python">/models/order.py
/endpoints/order.py
/templates/order_detail.html
/services/payment.py
</code></pre>
<p>You should prefer this:</p>
<pre><code class="language-python">/orders/
    models.py
    service.py
    endpoints.py
    templates/detail.html
</code></pre>
<h2>Identify moving parts</h2>
<p>Your domain layer should never depend directly on moving parts, because that makes it fragile. When a moving part changes, you don’t want your entire domain to break.</p>
<p>The <strong>database is a moving part</strong>. It can change — its schema, technology, connection details.</p>
<p>Your HTTP client in a service is a moving part. Your standard logger is a moving part — you can replace it with something like Loguru, for example. Even your web framework is a moving part. Ideally, your entire business domain layer should stay untouched when you change frameworks.</p>
<p>Because a <strong>web framework is I/O — and I/O is always a moving part</strong>.</p>
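<p>A hedged illustration of how to stay insulated from such moving parts: the domain depends on a tiny protocol it owns, and whichever concrete logger (or HTTP client, or framework adapter) you pick satisfies it from the outside. The protocol below is illustrative, not a real Hypha interface.</p>
<pre><code class="language-python">import logging
from typing import Protocol


class Logger(Protocol):
    """The only logging surface the domain layer ever sees."""

    def info(self, msg: str, *args: object) -&gt; None: ...
    def error(self, msg: str, *args: object) -&gt; None: ...


# The stdlib logger already satisfies this protocol structurally, and so does
# Loguru's logger, so swapping one for the other never touches the domain.
app_logger: Logger = logging.getLogger("hypha")
app_logger.info("domain code only knows about the Logger protocol")
</code></pre>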
<h2>Connecting dots</h2>
<p>Okay, finally we have all the puzzle pieces. Now we need something to wire them together. That’s where a so-called <strong>Injector</strong> comes in.</p>
<p>Instead of our objects constructing their dependencies, we construct them from the outside and pass prepared arguments into their constructors.</p>
<p>Dependency Injection isn’t a standard approach in Python, and many developers don’t even realise they’re using it — but they actually do.</p>
<p><strong>pytest</strong> has its own form of DI for fixtures and tests.</p>
<pre><code class="language-python">@pytest.fixture
async def test_client(test_app: FastAPI) -&gt; AsyncGenerator[AsyncClient]:
    async with AsyncClient(
        transport=ASGITransport(app=test_app),
        base_url="http://testserver",
    ) as client:
        yield client

# the test_client fixture can be injected into a test now

@pytest.mark.asyncio
async def test_landing_page_anon(test_client: AsyncClient) -&gt; None:
    response = await test_client.get("/")
    assert response.status_code == 200
</code></pre>
<p><strong>FastAPI</strong>, the most popular Python web framework today, has its own DI system. In fact, FastAPI has done a great job of popularising dependency injection in the Python community!</p>
<pre><code class="language-python">@router.get("/", response_class=HTMLResponse, dependencies=[Depends(require_anon_user)])
async def index(
    request: Request,
    jinja: Annotated[Jinja2Templates, Depends(get_jinja_env)],
    video_repo: Annotated[VideoRepository, Depends(get_video_repo)],
    settings: Annotated[Settings, Depends(get_settings)],
) -&gt; HTMLResponse:
    ...
    return jinja.TemplateResponse(
        request=request,
        name="pages/welcome.html",
        context=context,
    )
</code></pre>
<p>However, <strong>FastAPI’s injector is scoped only to the request–response lifecycle</strong>, and it’s tightly bound to the framework — a moving part. That means you can’t reuse it inside your Domain anyway.</p>
<p>Lucky us, there’s a simple, lightweight, framework-agnostic solution called <a href="https://injector.readthedocs.io/en/latest/">Injector</a>.</p>
<p><strong>An Injector is a wiring tool that connects Ports (your protocols) with Adapters (their concrete implementations)</strong>, forming a central registry of application dependencies.</p>
<pre><code class="language-python">class CoreModule(Module):
    @singleton
    @provider
    def provide_config(self) -&gt; Settings:
        return Settings()

    @singleton
    @provider
    def provide_db(self, settings: Settings, logger: Logger) -&gt; Database:
        return Database(db_uri=str(settings.postgres.pg_dsn), logger=logger)


class ContentModule(Module):
    @provider
    def provide_video_repo(self, db: Database) -&gt; VideoRepository:  # protocol
        return VideoRepositoryAdapter(db=db)  # a protocol-compliant implementation

    @provider
    def provide_email_service(
        self,
        settings: Settings,
        logger: Logger,
    ) -&gt; EmailService:
        if settings.environment == Environment.TEST:
            return EmailServiceStub(settings=settings, logger=logger)
        else:
            return EmailServiceAdapter(settings=settings, logger=logger)


injector = Injector(
    modules=[
        CoreModule,
        ContentModule,
    ]
)
</code></pre>
<p>Dependencies can also depend on each other. For example, <code>VideoRepository</code> might require a <code>db</code> object, which is just another dependency defined in the same injector.</p>
<p>In practice, you can have dozens of repositories and application services registered and configured this way.</p>
<pre><code class="language-python">class VideoRepository(Protocol):
    db: Database

    async def add(self, video: Video, connection: Connection | None = None) -&gt; None: ...


@inject
class VideoRepositoryAdapter:
    def __init__(self, db: Database):
        self.db = db

    async def add(self, video: Video, connection: Connection | None = None) -&gt; None:
        if connection:
            await self._add(video, connection=connection)
        else:
            async with self.db.pool.acquire() as conn, conn.transaction():
                await self._add(video, connection=conn)
</code></pre>
<p>Notice the <code>@inject</code> decorator here — it marks a class so that the <strong>injector will automatically provide its dependencies</strong> when it’s created or called. Now, whenever you need a video repository, you can simply call <code>video_repo = injector.get(VideoRepository)</code>, where <code>VideoRepository</code> is the protocol and <code>video_repo</code> is the concrete instance provided by the injector.</p>
<p>A useful side effect worth mentioning: the <code>PublishTileConsumer</code> class here isn’t a dependency and isn’t part of the injector configuration. But since all of its constructor parameters are already known to the injector, the injection just works automagically. Handy!</p>
<pre><code class="language-python">@inject
class PublishTileConsumer(SSEConsumer):

    def __init__(
        self,
        db: Database,
        profile_repo: ProfileRepository,
        video_repo: VideoRepository,
        settings: Settings,
        logger: Logger,
    ):
        self.video_repo = video_repo
        self.profile_repo = profile_repo
        self.settings = settings
        self.logger = logger
        self.db = db

    async def process(self, event: StreamVideoEventPayloadSchema) -&gt; list[SSEEvent]:
        ...
</code></pre>
<p>DI is a great pattern, but we end up with three different DI systems! And if you want to reach the app’s dependencies inside FastAPI, you basically have to wrap one injector inside another.</p>
<pre><code class="language-python">@asynccontextmanager
async def lifespan(app: FastAPI) -&gt; AsyncGenerator[None]:
    injector: Injector = app.state.injector
    db = injector.get(Database)
    async with db:
        yield
        
app = FastAPI(lifespan=lifespan)

app.state.injector = injector  # our app-wide injector

def get_injector(request: Request) -&gt; Injector:
    return cast(Injector, request.app.state.injector)  # YEP!

def get_video_repo(
    injector: Annotated[Injector, Depends(get_injector)],
) -&gt; VideoRepository:
    return injector.get(VideoRepository)

@router.get("/me")
async def dashboard_page(
    request: Request,
    video_repo: Annotated[VideoRepository, Depends(get_video_repo)],
) -&gt; HTMLResponse:
    ...
</code></pre>
<p>Maybe one day we’ll see a proposal for a standard injector in Python, similar to Java’s JSR-330 dependency injection specification. Who knows?</p>
<h2>And that’s it!</h2>
<p>That’s Hexagonal Architecture (aka Ports &amp; Adapters) with Dependency Injection, implemented in Python.</p>
<p>By following these simple, straightforward principles, you can gradually scale the complexity of your application without degrading code quality, maintaining a clean design and well-testable components.</p>
<p>Speaking of tests. Here’s a bonus observation.</p>
<blockquote>
<p>If you have to use mocks in your tests — your code smells.</p>
</blockquote>
<p>Literally. If you find yourself relying on mocks, magic mocks, or anything similar, your composition is wrong. With proper Ports &amp; Adapters, you never need mocks. Just implement simple stubs that follow the protocol and register them as substitute implementations for your test environment via the injector.</p>
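<p>A brief sketch of what that can look like (the protocol, stub, and test fixture below are simplified for illustration; the injector example earlier wires a fuller <code>EmailServiceStub</code> with settings and a logger):</p>
<pre><code class="language-python">from typing import Protocol

import pytest


class EmailService(Protocol):
    async def send(self, *, to: str, subject: str) -&gt; None: ...


class EmailServiceStub:
    """Protocol-compliant stand-in: it just records what would have been sent."""

    def __init__(self) -&gt; None:
        self.sent: list[tuple[str, str]] = []

    async def send(self, *, to: str, subject: str) -&gt; None:
        # No SMTP, no network, no MagicMock: remember the call for assertions.
        self.sent.append((to, subject))


# In the test environment the injector provides the stub for EmailService,
# so the code under test runs unmodified (here the stub arrives as a fixture).
@pytest.mark.asyncio
async def test_welcome_email(email_service: EmailServiceStub) -&gt; None:
    await email_service.send(to="creator@example.com", subject="Welcome to Hypha")
    assert ("creator@example.com", "Welcome to Hypha") in email_service.sent
</code></pre>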
<p>Speaking of FastAPI — it’s just great. The author, the community, the documentation, and its close collaboration with the Pydantic team are all excellent. FastAPI is King for a reason and it deserves that title.</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763560644767/e7492e3d-74e1-4b2e-aaa4-a5cdb1019d50.png" alt="" />

<p>And of course, use the modern Python toolset:</p>
<ul>
<li><p><a href="https://docs.astral.sh/uv/guides/install-python/"><strong>uv</strong></a> as your default package/runtime tool</p>
</li>
<li><p><a href="https://docs.astral.sh/ruff/"><strong>ruff</strong></a> for linting</p>
</li>
<li><p><a href="https://mypy-lang.org/"><strong>mypy</strong></a> for type checking</p>
</li>
<li><p><a href="https://www.python-httpx.org/"><strong>httpx</strong></a> as your default async HTTP client</p>
</li>
</ul>
<p>Solid defaults for your <code>pyproject.toml</code> in 2026.</p>
<h2><strong>Worth reading, viewing, and bookmarking</strong></h2>
<ul>
<li><p><a href="https://basecamp.com/gettingreal">Getting Real</a> and <a href="https://basecamp.com/shapeup">Shape Up</a> books by Basecamp</p>
</li>
<li><p><a href="https://www.cosmicpython.com/">Cosmic Python: Architecture Patterns with Python</a></p>
</li>
<li><p><a href="https://fsharpforfunandprofit.com/fppatterns/">Functional Programming Design Patterns</a></p>
</li>
<li><p><a href="https://fsharpforfunandprofit.com/rop/">Railway Oriented Programming</a></p>
</li>
<li><p><a href="https://fsharpforfunandprofit.com/posts/serializing-your-domain-model/">Serializing your domain model</a></p>
</li>
<li><p><a href="https://kristogodari.com/software-architecture/hexagonal-onion-clean-architecture/">Hexagonal, Onion, and Clean Architecture: A Guide to Maintainable Software Design</a></p>
</li>
<li><p><a href="https://herbertograca.com/2017/11/16/explicit-architecture-01-ddd-hexagonal-onion-clean-cqrs-how-i-put-it-all-together/">DDD, Hexagonal, Onion, Clean, CQRS, … How I put it all together</a></p>
</li>
<li><p><a href="https://www.youtube.com/watch?v=SxdOUGdseq4">Simple Made Easy – Rich Hickey</a></p>
</li>
<li><p><a href="https://www.youtube.com/watch?v=wo84LFzx5nI">Casey Muratori – The Big OOPs: Anatomy of a Thirty-five-year Mistake</a></p>
</li>
</ul>
]]></content:encoded></item></channel></rss>