Practical tips for debugging Flask under remote uWSGI with rpdb

This is a short post describing how to debug Flask apps with the ever-useful rpdb, along with a few gotchas to be careful of.

Our workhorse web backend is Flask+uWSGI, running on standalone EC2 instances. At the same time we rely on Twisted for several backend services. On occasion a Thinkster might need to debug one of these services on one of the EC2 instances. Due to our instance isolation strategy, it’s tricky to get fancy remote debugging, such as VS Code’s Remote Debugging, running. Flask has a built-in debugger which may work for you, but we ban it at the web server to ensure it’s never accessible. Instead, we often reach for rpdb. The requirements for this approach are minimal:

Remote access to the instance (e.g. via SSH or AWS’s SSM)
Ability to make a change to code on the instance
Ability to restart the uWSGI process

Why rpdb?

Rpdb is an extremely thin wrapper that makes Python’s default debugger (pdb) network accessible. It’s not featureful, and the CLI has neither history nor readline-style editing. It hasn’t been updated in 8 years. So why do we like rpdb? It’s simple: the package is tiny, has no additional dependencies, and does exactly what we need. We don’t reach for it when doing heavy development; we use it mainly to inspect internal program state while single-stepping through a section of code. For example, we seldom even set breakpoints when using rpdb.

If we’re heavily debugging code, it’ll be on local setups. For quick checks on, say, staging infrastructure, fancier debuggers like web-pdb require more network access, which we want to avoid so that we never accidentally expose a debugger to the world. Rpdb is tiny, simple, and hard to expose (its default listening address is localhost).

Toy example

Let’s say you’re trying to debug the following Flask endpoint which isn’t setting its Content-Type header correctly:

from flask import Flask, make_response, request

app = Flask(__name__)

@app.route('/scream-into-the-void')
def scream_into_the_void():
    response = make_response(f"{'A'*111}{'H'*95} **deep breath**")
    return response

We want to debug after the response object has been created but before it’s been returned.

Setting up rpdb

Rpdb exists in PyPI, so install with pip or similar:

$ pip install rpdb

Even though it’s dated, it happily runs on Python 3.

Tip: don’t forget to install rpdb in the same Python environment your uWSGI app is running in. Typically this is set in the uwsgi.ini configuration file by any of these options: home, virtualenv, venv, pyhome:

$ grep "home" /path/to/uwsgi.ini
home = /path/to/virtualenv
$ /path/to/virtualenv/bin/pip install rpdb
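A quick way to confirm the install landed in the right place is to ask that environment's interpreter directly. The sketch below is illustrative only: the helper names module_available and report are ours, not part of rpdb or uWSGI. Save it to a file and run it with the virtualenv's own python binary (for example /path/to/virtualenv/bin/python).

```python
import importlib.util
import sys


def module_available(name):
    """Return True if `name` can be imported by the interpreter running this code."""
    return importlib.util.find_spec(name) is not None


def report(name="rpdb"):
    """Describe which environment was checked and whether the module was found."""
    status = "importable" if module_available(name) else "NOT importable"
    # sys.prefix points at the virtualenv when run with the virtualenv's python.
    return f"{name} is {status} from {sys.prefix}"
```

Calling print(report()) under the virtualenv's interpreter should name the same path as the home setting in uwsgi.ini. If it names a different prefix, rpdb was probably installed somewhere uWSGI won't look.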

Adding rpdb to your code

Edit your Flask application: find the line where you’d like the debugger to kick in, and add the rpdb import and function call there:

@app.route('/scream-into-the-void')
def scream_into_the_void():
    response = make_response(f"{'A'*111}{'H'*95} **deep breath**")
    # Add the line below
    import rpdb; rpdb.set_trace()
    return response

Then restart the uWSGI service that runs your Flask application (assuming you don’t have uWSGI’s autoreload configured):

$ systemctl restart canary-uwsgi-application

Triggering the debugger

Browse to the URL handler in which the rpdb statement was added (e.g. https://canary.tools/scream-into-the-void). This will trigger the debugger, and you’ll notice that the request in the browser hangs. That’s expected; the debugger has paused processing the request, and is awaiting your command.

By default, access to the debugger is only possible from the same host that the debugger is running on. Connect with either telnet or netcat (it’s possible to get rpdb to listen on a non-localhost address, but that’s much more exposed and we won’t stand for such nonsense):

# Run this on the same host where the uWSGI process is running
$ nc localhost 4444
> /path/to/flask/flask.py(1005)scream_into_the_void()
-> return response
(Pdb) print(request)
<Request 'https://canary.tools/scream-into-the-void' [GET]>

We can modify the response:

(Pdb) response.headers['Content-Type'] = 'text/plain'

Then continue execution to see if the response correctly renders as plain text:

(Pdb) continue

With this access, you can inspect the request, and the application state. Use all the ‘pdb’ commands you know and love:

step
next
print
up
down
bt
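If neither telnet nor netcat is installed on the instance, a few lines of Python can stand in for them. This is a minimal sketch rather than a polished client: the helper names read_until_prompt and run_pdb_command are ours, and it assumes the debugger ends each reply with the standard "(Pdb) " prompt.

```python
import socket

PROMPT = b"(Pdb) "  # pdb's default prompt; assumed unchanged


def read_until_prompt(sock, prompt=PROMPT):
    """Collect output from the debugger until it prints its prompt or disconnects."""
    buffer = b""
    while not buffer.endswith(prompt):
        chunk = sock.recv(4096)
        if not chunk:  # the debugger side closed the connection
            break
        buffer += chunk
    return buffer.decode(errors="replace")


def run_pdb_command(sock, command):
    """Send one pdb command and return everything printed up to the next prompt."""
    sock.sendall(command.encode() + b"\n")
    return read_until_prompt(sock)


def example_session(host="127.0.0.1", port=4444):
    """Connect to a waiting rpdb session, inspect the request, then continue."""
    with socket.create_connection((host, port)) as sock:
        print(read_until_prompt(sock), end="")
        print(run_pdb_command(sock, "print(request)"))
        sock.sendall(b"continue\n")  # no prompt follows a continue
```

Nothing here runs until example_session() is called, and that should happen on the same host as the uWSGI process, after a request has tripped rpdb.set_trace().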

Sharp edges

There are a handful of things that might surprise you when working with rpdb. Being simple, rpdb won’t handle every single situation gracefully. Fortunately, none of them will hold you back from successfully debugging. (And, as always, there’s nothing to stop us improving rpdb ourselves to soften these edges.)

Sharp edge 1: Closing the debugger session

When you continue inside an rpdb session, program execution will proceed until the next breakpoint set via the break command. However, if you didn’t set any breakpoints inside the controlling session (i.e. the netcat connection), then netcat will just sit there because execution has continued and the application is still running. You might think the same netcat session will be reused if rpdb.set_trace() is encountered again, but that’s not what happens. When the rpdb.set_trace() line is hit again (e.g. you requested the URL a second time), a new debugger session is created with a new listener. You need to kill netcat manually with Ctrl-C, and reconnect.

The debugging flow for an endpoint typically looks something like:

1. Add rpdb.set_trace()
2. Trigger the breakpoint with a web request
3. Connect to rpdb with nc localhost 4444
4. When done debugging, enter continue to finish the debugger session
5. Enter Ctrl-C to exit the netcat session
6. Repeat steps 2–5 until done

As an aside, typing quit inside the debugger doesn’t close the netcat session either; you still need Ctrl-C to kill netcat. Typing quit also causes the debugger to throw an exception in the code it interrupted. Unless you want that exception, default to using continue and Ctrl-C.

Sharp edge 2: uWSGI and web server timeouts

When Flask is run inside uWSGI, it’s subject to uWSGI’s worker timeouts. If a uWSGI worker (i.e. the Python process handling the Flask request) takes longer than the configured value to return a response, the process is killed by uWSGI. Similarly, if uWSGI is fronted by a web server like nginx, the request is also subject to the web server’s timeouts (e.g. uwsgi_read_timeout).

These timeouts are visible in four ways when using rpdb:

The netcat session suddenly dies as you’re busy with it. Something has killed the Python process.
The uWSGI logs contain lines with the term “HARAKIRI”, which means uWSGI killed the Python process due to it taking too long to return a response.
Your web server logs show an error. If using nginx, you’ll see something like “upstream prematurely closed connection while reading response header from upstream”, which indicates that the uWSGI timeout was hit, or “upstream timed out (110: Unknown error) while reading response header from upstream”, which indicates that nginx’s own timeout was hit.
Your browser shows an error after some period of time. A Gateway Timeout (504) indicates that the web server’s timeout was hit first, while a Bad Gateway (502) typically indicates the uWSGI timeout was hit.
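The log symptoms above can be turned into a rough triage helper. The sketch below only matches the substrings quoted in this post; real log formats vary, so treat the matching rules and the names classify_timeout_line and summarise as our own assumptions rather than anything uWSGI or nginx provides.

```python
def classify_timeout_line(line):
    """Map a log line to the timeout that most likely produced it, or None."""
    if "HARAKIRI" in line:
        return "uwsgi"  # uWSGI killed the worker itself
    if "upstream prematurely closed connection" in line:
        return "uwsgi"  # nginx saw the worker die mid-request
    if "upstream timed out" in line:
        return "web server"  # nginx gave up waiting first
    return None


def summarise(lines):
    """Count how many lines point at each timeout source."""
    counts = {"uwsgi": 0, "web server": 0}
    for line in lines:
        source = classify_timeout_line(line)
        if source is not None:
            counts[source] += 1
    return counts
```

Feeding it the lines of the uWSGI and nginx logs (for example via open(path) on each file) gives a quick hint as to which timeout to raise first.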

The solution is simple:

If the timeout is due to uWSGI killing the debugger, configure the “harakiri” value in your application’s uwsgi.ini file to be high (e.g. 3600), and restart the uWSGI process.
If the timeout is due to the web server killing the debugger, configure the web server’s read timeout to be high (e.g. 3600), and restart the web server process.

Now you can hang about in the rpdb session without fear of getting kicked out.
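Before settling in for a long session, it can help to check what the configuration file actually says. This is a rough sketch using Python's configparser, assuming a conventional uwsgi.ini with a [uwsgi] section; the helper names are hypothetical, and options passed on the uWSGI command line or in other config files won't be seen.

```python
import configparser


def uwsgi_option(ini_text, name, section="uwsgi"):
    """Return the raw string value of an option in uwsgi.ini text, or None."""
    parser = configparser.ConfigParser(
        strict=False,          # tolerate repeated keys
        interpolation=None,    # leave uWSGI's own % placeholders untouched
        allow_no_value=True,   # tolerate bare flags with no "= value"
    )
    parser.read_string(ini_text)
    return parser.get(section, name, fallback=None)


def harakiri_allows(ini_text, seconds_needed):
    """Return True if the configured harakiri leaves at least `seconds_needed`."""
    value = uwsgi_option(ini_text, "harakiri")
    if value is None:
        # Assumption: no harakiri option in this file means no limit set here.
        return True
    return int(value) >= seconds_needed
```

For example, harakiri_allows(open("/path/to/uwsgi.ini").read(), 3600) tells you whether an hour-long debugging session would survive the value in that file.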

Sharp edge 3: uWSGI and multiple workers

The final point to consider is uWSGI’s execution model. If you rely on workers, uWSGI will spin up one or more Python processes. During the course of regular application request processing, these Python processes will all eventually be killed and new processes started to take their place.

When rpdb.set_trace() is added to a point in the application and the endpoint is requested, uWSGI will randomly assign one of the worker processes to handle the request. The debugger is triggered in that process. If the request is repeated, it’s possible a different worker process is assigned the request, so the next debugger session may take place in a completely different process from the first.

Usually this is fine, but if the code is doing things like memoization or in-memory caching (and that’s what you’re trying to debug), then your results will be very unexpected if you don’t consider that you may not be debugging the same process as your last debug session.

If this is an issue, you can configure the “workers” parameter in uwsgi.ini to 1, to ensure only a single worker is ever present. But this still doesn’t prevent uWSGI from killing the single worker and replacing it with a new process.

Similarly, uWSGI supports threads, and the same hazard applies: you don’t control which thread the debugger is tripped in. You can also tune uWSGI to not use threads.
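One cheap way to notice that you’ve landed in a different worker or thread is to label each debugging session. The helper below is a small sketch using only the standard library; the name worker_identity is ours.

```python
import os
import threading


def worker_identity():
    """Return a short label for the process and thread executing this call."""
    return f"pid={os.getpid()} thread={threading.current_thread().name}"
```

Assign worker = worker_identity() just before the rpdb.set_trace() line and print it at the (Pdb) prompt. If the pid differs from your previous session, any in-memory cache you inspected last time lives in another process.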

Summary

Rpdb is a tiny wrapper around Python’s pdb debugger. It’s dead simple to install and use, and works in constrained environments. We like it, and in this post explored how to deploy it while being mindful of several sharp edges.
