Elixir’s “let it crash” philosophy is one of the best ideas in modern software design. Supervisors restart failed processes, the system self-heals, and life goes on. It’s like having a really good immune system.
The problem is that a really good immune system can also hide chronic conditions. A GenServer crashing and restarting is working as designed. A GenServer crashing and restarting 500 times an hour is a five-alarm fire that nobody called in because, well, the restarts kept succeeding. Web frameworks are another layer of this. Phoenix controllers can rescue exceptions and return perfectly formatted 500 pages. The system stays “up”, just not “working.”
We built error monitoring into the Scout APM Elixir agent so you can keep the resilience of “let it crash” while actually knowing what’s crashing, how often, and why.
Built In, Turned On
Error monitoring ships inside the scout_apm package itself as of v2.0.0. There’s no separate error tracking package to install. It’s enabled by default:
config :scout_apm, errors_enabled: true
For a Phoenix app, that’s it for configuration, unless you want to customize behavior. If you’re already running scout_apm, errors will start flowing as soon as you attach the telemetry handler.
Automatic Capture via Telemetry
The primary integration point is Phoenix’s telemetry system. Add one line to your Application.start/2:
def start(_type, _args) do
  ScoutApm.Instruments.PhoenixErrorTelemetry.attach()
  children = [
    # ... your supervision tree
  ]
  opts = [strategy: :one_for_one, name: MyApp.Supervisor]
  Supervisor.start_link(children, opts)
end
This attaches handlers to two telemetry events:
[:phoenix, :router_dispatch, :exception] fires when an exception bubbles up through Phoenix’s router.
[:phoenix, :error_rendered] fires when Phoenix renders an error response (filtered to 500+ status codes, so you won’t get noise from 404s).
When either event fires, we extract the exception class, message, stacktrace, request path, parameters, session data, and controller/action. The conn struct gives us all of this context automatically, so the error arrives in Scout with enough information for you to actually diagnose it.
The handler also calls TrackedRequest.mark_error(), which links the error to the performance trace for that request. More on why that matters in a moment.
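To make the event handling concrete, here is a minimal sketch of a handler for those two events. It is not the agent's actual code: the module name, the handler ID, and the `report/1` stand-in are all hypothetical, and the metadata keys (`:status`, `:kind`, `:reason`, `:stacktrace`) are the ones Phoenix documents for these events.

```elixir
defmodule MyApp.ErrorTelemetrySketch do
  @moduledoc "Hypothetical handler for illustration; not ScoutApm's implementation."

  @events [
    [:phoenix, :router_dispatch, :exception],
    [:phoenix, :error_rendered]
  ]

  # Attach one handler to both events. Needs the :telemetry package,
  # which Phoenix already depends on.
  def attach do
    :telemetry.attach_many("my-app-error-sketch", @events, &__MODULE__.handle_event/4, nil)
  end

  # Exceptions that bubble up through the router are always reported.
  def handle_event([:phoenix, :router_dispatch, :exception], _measurements, meta, _config) do
    report(meta)
  end

  # Rendered errors are reported only for 5xx responses, so 404s are skipped.
  def handle_event([:phoenix, :error_rendered], _measurements, %{status: status} = meta, _config)
      when is_integer(status) and status >= 500 do
    report(meta)
  end

  # Anything else (including 4xx renders) is ignored.
  def handle_event(_event, _measurements, _meta, _config), do: :ignored

  # Stand-in for a real reporting call; returns what would be sent.
  defp report(meta) do
    {:reported, Map.take(meta, [:kind, :reason, :stacktrace])}
  end
end
```

Telemetry discards a handler's return value; the sketch returns one only so the filtering is easy to check in IEx.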
Manual Capture
Automatic capture handles the exceptions that escape your code. But what about the ones you catch deliberately? If you rescue an error, log it, and return a graceful fallback, the telemetry handler never sees it.
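For those cases, use ScoutApm.Error.capture/2. The example below captures the error and then re-raises it; if you return a fallback instead, drop the reraise line: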
try do
  process_payment(order)
rescue
  e ->
    ScoutApm.Error.capture(e, stacktrace: __STACKTRACE__)
    reraise e, __STACKTRACE__
end
The __STACKTRACE__ special form gives you the actual stacktrace from the rescue block. If you don’t pass it, we’ll grab the current process stacktrace, but that includes our internal frames and is less useful.
You can attach additional context to help with debugging:
ScoutApm.Error.capture(e,
  stacktrace: __STACKTRACE__,
  context: %{user_id: user.id, order_id: order.id},
  request_path: "/api/orders",
  request_params: %{action: "create"}
)
The context map is freeform. Put whatever you need in there: user IDs, feature flags, queue names, the phase of the moon. It all shows up in the Scout UI attached to that specific error occurrence.
LiveView Errors Too
If you’re using ScoutApm.Instruments.LiveViewTelemetry, you get error capture for free. We listen for :exception events on mount, handle_event, and handle_params:
def start(_type, _args) do
  ScoutApm.Instruments.PhoenixErrorTelemetry.attach()
  ScoutApm.Instruments.LiveViewTelemetry.attach()
  # ...
end
When a LiveView callback raises, we capture the full error with the view module name and callback as the controller context. So an exception in MyAppWeb.DashboardLive.handle_event shows up attributed to that specific view and event, not as a generic unhandled error.
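As a rough illustration of that attribution, the sketch below builds a label from a LiveView exception event. The event names follow Phoenix LiveView's documented telemetry (`[:phoenix, :live_view, callback, :exception]`), but the module name and the label format are assumptions made for this example, not what the agent actually emits.

```elixir
defmodule MyApp.LiveViewErrorLabelSketch do
  @moduledoc "Hypothetical labelling helper; the output format is an assumption."

  @callbacks [:mount, :handle_event, :handle_params]

  # Builds "ViewModule.callback" from the telemetry event and its metadata.
  # For handle_event, the LiveView event name is appended in parentheses.
  def label([:phoenix, :live_view, callback, :exception], %{socket: %{view: view}} = meta)
      when callback in @callbacks do
    base = "#{inspect(view)}.#{callback}"

    case meta do
      %{event: event} when is_binary(event) -> base <> " (" <> event <> ")"
      _ -> base
    end
  end
end
```

Called with the metadata from a failing "refresh" event in MyAppWeb.DashboardLive, this returns a label naming both the view and the event, which is the kind of attribution described above.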
Filtering and Configuration
Not every exception deserves attention. Bots hitting nonexistent routes generate Phoenix.Router.NoRouteError at a steady clip, and you probably don’t need alerts about those. You can ignore specific exception types:
config :scout_apm,
  errors_enabled: true,
  errors_ignored_exceptions: [Phoenix.Router.NoRouteError],
  errors_filter_parameters: ["password", "credit_card", "cvv"]
The errors_filter_parameters list controls which request parameter keys get scrubbed before leaving your server. We ship with a sensible default list that covers common sensitive fields (passwords, tokens, API keys, SSNs, etc.), and adding your own list extends those defaults rather than replacing them.
Sensitive parameter values are replaced with [FILTERED] before the error payload is built. The filtering runs recursively through nested maps, so a password buried three levels deep in your params still gets caught.
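Recursive scrubbing of this kind can be sketched in a few lines. This is an illustration only: the module name is hypothetical, and the exact-match rule on key names is an assumption (the agent may match keys differently).

```elixir
defmodule MyApp.ParamFilterSketch do
  @moduledoc "Hypothetical recursive parameter scrubber; not ScoutApm's implementation."

  @filtered "[FILTERED]"

  # Structs are left untouched; they are not plain parameter maps.
  def filter(%_{} = struct, _keys), do: struct

  # Replace the value of any key named in `keys`, and descend into other values.
  def filter(params, keys) when is_map(params) do
    Map.new(params, fn {key, value} ->
      if to_string(key) in keys do
        {key, @filtered}
      else
        {key, filter(value, keys)}
      end
    end)
  end

  # Lists may contain nested maps, so walk them as well.
  def filter(params, keys) when is_list(params), do: Enum.map(params, &filter(&1, keys))

  # Scalars pass through unchanged.
  def filter(other, _keys), do: other
end
```

Because every nested map goes back through the same function, a "password" key is replaced no matter how deep it sits, while sibling keys such as "name" are preserved.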
Errors In Context
Scout doesn’t just collect errors in isolation. Our performance metrics and endpoint traces are running right alongside. That context turns “we got a RuntimeError in OrderController#create” into “we got a RuntimeError in OrderController#create after a 3-second database query timed out on the inventory check.” One of those is a stack trace. The other is a diagnosis.
Errors in Scout are grouped by exception class and message pattern, so 200 occurrences of the same Ecto.NoResultsError show up as one group with a count, not 200 separate entries. Groups can be prioritized, assigned to team members, and resolved. You can read more about the full error monitoring feature set in our error monitoring documentation.
Under the Hood
The error service runs as a GenServer that batches errors (5 per batch by default) and sends them asynchronously. There’s a max queue size of 500 to prevent memory issues if errors spike. If you’re generating errors faster than we can ship them, we drop the oldest ones rather than letting the queue grow unbounded. This is the kind of boring reliability engineering that matters when something is going very wrong in your application and you really don’t want your monitoring to make it worse.
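For the curious, the drop-oldest behavior can be sketched as a small GenServer. This is a simplified illustration, not the agent's source: the module name, option names, and the `take_batch/1` helper are hypothetical, and the actual network send is left out. A positive `max_queue` is assumed.

```elixir
defmodule MyApp.ErrorQueueSketch do
  @moduledoc "Hypothetical bounded error queue; not ScoutApm's implementation."
  use GenServer

  @batch_size 5
  @max_queue 500

  def start_link(opts \\ []), do: GenServer.start_link(__MODULE__, opts)

  # Asynchronous: the caller never waits on the queue.
  def push(pid, error), do: GenServer.cast(pid, {:push, error})

  # Removes and returns up to one batch from the front of the queue.
  # A real sender would call something like this before shipping the batch.
  def take_batch(pid), do: GenServer.call(pid, :take_batch)

  @impl true
  def init(opts) do
    {:ok,
     %{
       queue: :queue.new(),
       size: 0,
       max: Keyword.get(opts, :max_queue, @max_queue),
       batch: Keyword.get(opts, :batch_size, @batch_size)
     }}
  end

  @impl true
  def handle_cast({:push, error}, %{size: size, max: max} = state) when size >= max do
    # Queue is full: drop the oldest entry, then append the new one.
    {_dropped, rest} = :queue.out(state.queue)
    {:noreply, %{state | queue: :queue.in(error, rest)}}
  end

  def handle_cast({:push, error}, state) do
    {:noreply, %{state | queue: :queue.in(error, state.queue), size: state.size + 1}}
  end

  @impl true
  def handle_call(:take_batch, _from, state) do
    n = min(state.batch, state.size)
    {batch, rest} = :queue.split(n, state.queue)
    {:reply, :queue.to_list(batch), %{state | queue: rest, size: state.size - n}}
  end
end
```

Tracking the size in the state avoids calling `:queue.len/1`, which walks the whole queue, on every push. That matters most in exactly the situation this design is for: an error spike.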
Getting Started
If you’re already using scout_apm, update to the latest version and attach the telemetry handler:
# mix.exs
{:scout_apm, "~> 2.0"}
# application.ex
def start(_type, _args) do
  ScoutApm.Instruments.PhoenixErrorTelemetry.attach()
  # ...
end
Errors will start appearing in your Scout dashboard within minutes. If you want to try Scout APM with your Elixir application, you can start a free trial and have full error monitoring and performance tracing running in about five minutes.