Building a Resilient Background Worker in Elixir (Without Overthinking It)
Background jobs sound simple — until they aren’t.
Retries pile up.
One failure takes down the whole worker pool.
Jobs disappear silently or run twice.
What I like about Elixir is that it encourages you to design for failure early, instead of patching it later.
This post walks through a simple but production-friendly background worker using GenServer and supervision — no frameworks, no magic.

The Mental Model: Small Processes, Clear Responsibility
In Elixir, the goal isn’t to create one “smart” worker.
It’s to create many small, replaceable processes.
Each process should:
- Do one thing
- Fail loudly if it can’t
- Be restarted automatically
This is the foundation of fault tolerance on the BEAM.
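
As a rough illustration of the first two points, the sketch below uses only plain processes, with no GenServer or supervisor yet. One process raises, a monitor reports the crash, and an unrelated process still finishes its work. The variable names are made up for this example.

```elixir
parent = self()

# This process fails loudly. spawn_monitor/1 does not link it to us,
# so its crash is reported as a message instead of propagating.
{crashing_pid, ref} = spawn_monitor(fn -> raise "boom" end)

# This process does its one thing and reports back.
spawn(fn -> send(parent, {:done, 1 + 1}) end)

# The monitor delivers the crash reason as {exception, stacktrace}.
crash_reason =
  receive do
    {:DOWN, ^ref, :process, ^crashing_pid, {exception, _stacktrace}} -> exception
  after
    1_000 -> :timeout
  end

# The second process was unaffected by the crash.
result =
  receive do
    {:done, value} -> value
  after
    1_000 -> :timeout
  end

IO.inspect(crash_reason, label: "crash reason")
IO.inspect(result, label: "other process result")
```

Automatic restarts are the supervisor's job, which is what the next steps add.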

Step 1: Define the Worker (GenServer)
Let’s start with a worker that processes a single job and exits.

    defmodule MyApp.Worker do
      # :temporary tells the supervisor never to restart this worker. The default
      # (:permanent) would restart it even after a successful, :normal exit.
      use GenServer, restart: :temporary
      require Logger

      ## Public API

      def start_link(job) do
        GenServer.start_link(__MODULE__, job)
      end

      ## Callbacks

      @impl true
      def init(job) do
        # Defer the work so init/1 returns quickly and start_link/1 does not block.
        send(self(), :process)
        {:ok, job}
      end

      @impl true
      def handle_info(:process, job) do
        case perform(job) do
          :ok ->
            Logger.info("Job completed successfully")
            {:stop, :normal, job}

          {:error, reason} ->
            Logger.error("Job failed: #{inspect(reason)}")
            # A non-:normal stop reason is reported to the supervisor as a failure.
            {:stop, reason, job}
        end
      end

      # Placeholder for real work: succeeds roughly 30% of the time.
      defp perform(_job) do
        if :rand.uniform() > 0.7 do
          :ok
        else
          {:error, :random_failure}
        end
      end
    end

Key Ideas
- The worker does its job and exits
- Success and failure are explicit
- No retry logic inside the worker
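
To see what "explicit" means in practice, here is a minimal sketch that starts a one-shot worker and captures its exit reason. `Sketch.OneShot` is a hypothetical stand-in for the worker in this post: its job is a zero-arity function, so the outcome is deterministic instead of random. The calling process traps exits so the linked worker's exit arrives as a message.

```elixir
defmodule Sketch.OneShot do
  # Hypothetical worker: runs the given function once, then stops.
  use GenServer

  @impl true
  def init(job) do
    send(self(), :process)
    {:ok, job}
  end

  @impl true
  def handle_info(:process, job) do
    case job.() do
      :ok -> {:stop, :normal, job}
      {:error, reason} -> {:stop, reason, job}
    end
  end
end

# Trap exits so a linked worker's exit becomes an {:EXIT, pid, reason}
# message instead of taking this process down with it.
Process.flag(:trap_exit, true)

exit_reason = fn job ->
  {:ok, pid} = GenServer.start_link(Sketch.OneShot, job)

  receive do
    {:EXIT, ^pid, reason} -> reason
  after
    1_000 -> :timeout
  end
end

ok_reason = exit_reason.(fn -> :ok end)
error_reason = exit_reason.(fn -> {:error, :upstream_down} end)

IO.inspect(ok_reason, label: "successful job")
IO.inspect(error_reason, label: "failed job")
```

A successful job exits with `:normal`, and a failed job exits with the error reason itself. A supervisor reads that same signal when deciding what to do next.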

Step 2: Supervise the Worker
Now we introduce a supervisor to manage worker lifecycles.

    defmodule MyApp.WorkerSupervisor do
      use DynamicSupervisor

      def start_link(_) do
        # Registered under the module name so start_job/1 can find it.
        DynamicSupervisor.start_link(__MODULE__, :ok, name: __MODULE__)
      end

      @impl true
      def init(:ok) do
        DynamicSupervisor.init(strategy: :one_for_one)
      end

      # Starts one supervised worker process for the given job.
      def start_job(job) do
        spec = {MyApp.Worker, job}
        DynamicSupervisor.start_child(__MODULE__, spec)
      end
    end

Why This Works Well
- Each job runs in isolation
- One crash doesn’t affect others
- Restart behavior is consistent and observable
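
What Happens When a Job Fails?
When perform/1 returns {:error, reason}:
- The worker logs the error and stops with that reason, which the BEAM treats as an abnormal exit
- The supervisor receives the exit signal and, depending on the child's restart option, either removes the worker or starts a fresh one
- The logs show the failure reason and the worker's last state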
No silent retries.
No hidden state.
No cascading failures.
This is one of the biggest advantages of building background work directly on the BEAM.
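
If you do want the supervisor to re-run failed jobs, the child's restart option is the knob. The sketch below is one way to try it, under assumptions: `Sketch.Flaky` is a hypothetical worker declared with `restart: :transient`, and an Agent counts attempts, because worker state is lost on every crash. The job fails twice and succeeds on the third attempt.

```elixir
defmodule Sketch.Flaky do
  # Hypothetical worker. :transient asks the supervisor to restart it only
  # after an abnormal exit, never after a :normal one.
  use GenServer, restart: :transient

  def start_link(fun), do: GenServer.start_link(__MODULE__, fun)

  @impl true
  def init(fun) do
    send(self(), :process)
    {:ok, fun}
  end

  @impl true
  def handle_info(:process, fun) do
    case fun.() do
      :ok -> {:stop, :normal, fun}
      {:error, reason} -> {:stop, reason, fun}
    end
  end
end

# The attempt counter lives outside the worker so it survives restarts.
{:ok, attempts} = Agent.start_link(fn -> 0 end)
parent = self()

job = fn ->
  attempt = Agent.get_and_update(attempts, fn n -> {n + 1, n + 1} end)

  if attempt < 3 do
    {:error, {:attempt_failed, attempt}}
  else
    send(parent, :job_succeeded)
    :ok
  end
end

# Allow up to 5 restarts within 10 seconds before the supervisor gives up.
{:ok, sup} =
  DynamicSupervisor.start_link(strategy: :one_for_one, max_restarts: 5, max_seconds: 10)

{:ok, _pid} = DynamicSupervisor.start_child(sup, {Sketch.Flaky, job})

outcome =
  receive do
    :job_succeeded -> :ok
  after
    2_000 -> :timeout
  end

total_attempts = Agent.get(attempts, & &1)
IO.inspect(total_attempts, label: "attempts")
```

Two caveats apply. Supervisor restarts are immediate, with no backoff. And once failures exceed `max_restarts` within `max_seconds`, the supervisor itself shuts down along with every job it holds. For jobs that fail often, `:temporary` workers plus an explicit retry decision elsewhere may be the safer choice.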

When I Use This Pattern
This approach works well for:
- API-triggered async work
- Data enrichment
- External service calls
- Internal tooling and pipelines
I wouldn’t use it for:
- Massive job queues
- Exactly-once delivery
- Retries that must survive a node restart (jobs here live only in process memory)
Elixir gives you primitives — not opinions.
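
One primitive worth knowing about before you hit the "massive queue" limit is DynamicSupervisor's `max_children` option. The sketch below assumes a cap of 2 and uses never-ending Tasks as stand-ins for long-running jobs. It shows the supervisor refusing work rather than growing without bound.

```elixir
{:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one, max_children: 2)

# Stand-in for a long-running job. Task's default child spec uses
# restart: :temporary, so these are never restarted.
slow_job = {Task, fn -> Process.sleep(:infinity) end}

# The third start attempt exceeds the cap and is rejected.
results = for _ <- 1..3, do: DynamicSupervisor.start_child(sup, slow_job)

IO.inspect(List.last(results), label: "third job")
```

The caller gets `{:error, :max_children}` back and has to decide what to do with the rejected job. That decision (drop, queue, persist) is exactly the part this pattern leaves to you.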

Final Thoughts
Elixir didn’t just give me better concurrency.
It changed how I think about failure as a design input.
Small processes.
Clear supervision.
Predictable recovery.
When production gets boring, you’re doing it right.