I've been thinking quite a bit about architecture lately. Not building architecture, but application architecture. The inspiration for this has been many blog posts, talks, slide decks and books, and I'd like to take the time to share my take on what I've been learning. And to do that, I sat down with one of my projects and looked for the most boring, mundane parts of it: logging in.
Right, because who wants to write log in forms and code again? Anyways, this is what I started with:
@app.route('/login', methods=['POST', 'GET']) def login(): if current_user.is_authenticated(): return redirect(url_for('homepage')) form = LoginForm() if form.validate_on_submit(): user = User.query.filter(db.or_(User.username == form.login.data, User.email == form.login.data)).first() if user and check_password_hash(user.password, form.password.data): login_user(user) flash("thank you for logging in", 'success') return redirect(url_for('user.profile', username=user.username)) else: flash("Bad username/password combination") return render_template("login.html", form=form)
That's actually pretty standard across web apps I've seen. You don't let logged in users log back in (though, I have seen apps that occasionally need a user to reauthenticate), there's some webform that we'll pull data off of. Next, we'll try pulling user data for either the username or email provided and then check if the password typed in matches the user's password hash. If it's all good, we redirect off somewhere otherwise there's some HTML rendered.
Big whoop right. However, this is actual logic for the application. But it's all shoved into an endpoint which is essentially UI logic. And imagine testing this, we need to contend with:
- A thread local (current_user)
- Form data
- Werkzeug's password checker
- Message flashing
- Flask's url routing -- both url_for and redirect
- Creating the form
- Multiple exit branches
- A nested if statement
- Actually logging in the user
- The difference between GET and POST requests
That's too many things too many to deal with at once. "But Alec, there's unittest.mock!" If we're going to go that route, we need to have a pile of mock.patch decorators and context managers and there's the real chance we just end up creating some meaningless assertion. Like this...
@mock.patch('myapp.auth.views.current_user') @mock.patch('myapp.auth.views.LoginForm') @mock.patch('myapp.auth.views.User') @mock.patch('myapp.auth.views.flash') @mock.patch('myapp.auth.views.url_for') @mock.patch('myapp.auth.views.redirect') @mock.patch('myapp.auth.views.render') @mock.patch('myapp.auth.views.login_user') def test_login_user_endpoint(login_user, render, redirect, url_for, flash, UserModel, form, user): #TODO: Write test assert True
How do we even begin munging and manipulating those mocks into any sort of meaningful test? If you have an endpoint like this that's tested and you didn't end writing tests that basically assert that mock.patch works, I applaud you.
An acceptance test might be a good fit here to assert the general behavior. Insert a user in the database and then run the function and assert the user is actually logged in at the end of it, insert some user in the database and submit a bad password as part of the form and assert the template is rendered with the flashed message, etc.
And I'd probably recommend that (I didn't do it in this particular instance though) to make sure the functionality of the endpoint is preserved without asserting any fine grained details. With an acceptance test we want to see a forest, not trees. But there's actual reusable components that are obscured by simply shoving everything into the end point. So, let's find them and make them citizens in the application.
Mise en place
Adrienne, a friend of mine, has an excellent blog post comparing cooking preperations with coding preperations. And I think it applies to more than just preparing to code, when we're refactoring, especially with the goal of unjamming an endpoint, it's important to lay out some goals and examine what we're dealing with.
And in this case, we already did that already. We know what the goal is and we know all the things our endpoint touches to do its work.
Let's start with the last concern I listed out as it's the most obvious seam: we handle GET and POST request separately. If it's a GET request, only render a template. If it's a POST request, run the actual logic and then either render or redirect.
Currently, we silently treat the idea of GET and POST separately with the use of validate_on_submit. However, Flask (and other frameworks) provide a way of explicitly declaring the difference without a series of if-statements. In Flask, we call it MethodView, and we can simply define separate methods for each HTTP Verb.
from flask.views import MethodView class LoginController(MethodView): def get(self): return render_template('login.html', form=LoginForm()) def post(self): form = LoginForm() if form.validate_on_submit(): user = User.query.filter(db.or_(User.username == form.login.data, User.email == form.login.data)).first() if user and check_password_hash(user.password, form.password.data): login_user(user) flash("thank you for logging in", "success") return redirect(url_for('user.profile', username=user.username)) else: flash("Bad username/password combination") return render_template('login.html', form=form)
Right now, we've broken one of our imaginary acceptance tests: "Authenticated users cannot re-log in". However, there's an easy fix, MethodView (and it's parent View) allow us to provide decorators that are used at run time when the actual view function is generated. So we can take that little if-check and turn it into a function...
def disallow_authenticated(f): @wraps(f) def checker(*args, **kwargs): if current_user.is_authenticated(): return redirect(url_for('homepage')) else: return f(*args, **kwargs) return checker
And then stuff it in as a class attribute...
class LoginController(MethodView): decorators = [disallow_authenticated]
Already this is slightly more testable. Not much, but we can write a test against disallow_authenticated that doesn't involve LoginUser and vice-versa -- these decorators are only applied when the as_view factory method is used. There's a little bit of refactoring we can do as well, we instantiate the actual form in two different spots and we do the rendering in two different spots as well:
class LoginController(MethodView): def __init__(self): self.form = LoginForm() def get(self): return self.render() def post(self): if self.form.validate_on_submit(): ... return self.render() def render(self): return render_template('login.html', form=self.form)
However, there's still lots of logic inside the endpoint: we authenticate and login the user. Let's first extract that out into its own thing so we can examine it closer:
def log_in_user(login, password): user = User.query.filter(db.or_(User.username == login, User.email == login)).first() if user and check_password_hash(user.password, password): login_user(user)
We're missing the failure case, but we can address that. There's two ways to handle this:
- Error codes
I like exceptions because they're really loud, unlike error code which can silently pass. Exceptions must be dealt with or the application crashes. However, there's solid arguments for both styles, pick the one the fits your brain best.
class AuthenticationError(Exception): pass def log_in_user(login, password): ... else: raise AuthenticationError("Bad login credentials")
And whatever calls this will deal with the exception. We'll get back to that later though. So the actual story for this function looks like this:
- Locate a user with either the provided username or email
- If we find the user, check that the provided password matches the user's stored password hash
- If those match, login in the user
- If a user can't be found or the passwords do not match, raise an error
Notice that we don't say anything about how to find the user, just that we need to. Realizing this sent a shiver down my spine: I don't actually care about the database, I care about the data it stores. Honestly, this data could come from anywhere: database, Redis, flatfile, a hash table, remote API, etc. And the next step was a leap of faith for me: stick an interface there.
def locate_by_username_or_email(login): return User.query.filter(db.or_(User.username == login, User.email == login)).first()
And then, somehow, inject that into the log_in_user function. Why we want to inject it is to remove the dependency from the database and instead depend on the contract: either return a user object or a None. We should do the same with the password hash validator so we can depend on that interface rather than Werkzeug. Since we don't want to have to do it every time, we could use a closure:
def log_in_factory(locator, password_checker): def log_in_user(login, password): user = locator(login) if user and password_checker(user.password, password): login_user(user) else: raise AuthenticationError("Bad login credentials") return log_in_user
This is a lot more testable, we only have to contend with one thing outside of our control: login_user rather than twelve things outside our control. But then I thought about it some more. This is concerned with authenticating a user. And there's a separate concern about actually logging in the user. There's also a feature request on the project about creating a JSON API, which will need to also authenticate the user but probably not using password auth. Instead of using a closure, what if I used the closest thing Python has to the interface keyword?
from abc import ABC, abstractmethod class Authenticator(ABC): @abstractmethod def authenticate(self, *args, **kwargs): pass
That's when the whole idea of software architecture really began falling into place for me. Let's rejigger the current implementation to fit this instead:
class PasswordAuthenticator(Authenticator): def __init__(self, locator, checker): self._locator = locator self._checker = checker def authenticate(self, login, password, *args, **kwargs): user = self._locator(login) if user and self._checker(user.password, password): return user raise AuthenticationError("Bad credentials")
Instead of having the side effect -- actually logging in the user -- right there, the authenticator defines a contract that it either returns a user or raises an AuthenticationError. We can also define an implementation that will actually login the user as well:
class LoginAuthenticator(Authenticator): def __init__(self, login, authenticator): self._login = login self._authenticator = authenticator def authenticate(self, *args, **kwargs): user = self._authenticator.authenticate(*args, **kwargs) self._login(user) return user
If we find that we're doing many things after authenticating a user (updating a last_seen field or keeping an audit log), we can further generalize this to an AfterAuthentication class that we simple give the callable that causes the side effect.
Back to the Endpoint
Because of all this changing, we've broken every test on the controller since it isn't equipped to deal with any of this. So we should revisit it and make changes.
class LoginController(MethodView): def __init__(self, authenticator): self._authenticator = authenticator self._form = LoginForm() def post(self): if self._form.validate_on_submit(): try: user = self._authenticator.authenticate(**self.form.data) except AuthenticationError as e: flash(e.args) return self.render() else: return redirect(url_for('user.profile', username=user.username)) else: return self.render()
However, that post method is really concerned with a lot of stuff and is indented quite a bit. There's two talks that influenced the next changes. The first was Sandi Metz's All the Little Things, where she proposes a major tool in refactoring: the squint test.
You squit your eyes, and you lean back and you look at the code. You're looking for changes in shape and changes in color.
She goes on to say that changes in shape mean you have nested conditions and that color changes means the code is at different levels of abstraction. Which there definitely is here.
Discounting the class and method level indentations (because we have to have those), the actual logic is indented two levels deep. Pretending we have some Flask-specific code highlighting, we'd see that color bumped against our code and the strings.
The first time around this, I ended up creating an authenticate method on the controller, following her advice to make smaller things. The whole thing looked like this:
class LoginUser(MethodView): decorators = [disallow_authenticated] def __init__(self, authenticator): self._authenticator = authenticator self._form = LoginForm() def get(self): return self.render() def post(self): if self.form.validate_on_submit(): return self._authenticate() else: return self.render() def _authenticate(self): try: user = self._authenticator.authenticate(**self.form.data) except AuthenticationError as e: self._handle_errors(e) return self.render() else: return self._redirect() def _handle_errors(self, *errors): flash(e.args) def _redirect(self): return redirect(url_for('user.profile', username=self.form.username.data)) def render(self): return render_template('login.html', form=self.form)
Which is slightly better, but the logic is still at different levels of abstraction. And it's also not really a "small class". It weighs in at only 34 lines, which is actually quite small, but it has a very broad surface area.
So I pondered some more and watched Matt Wynne's Hexagonal Rails, I had a different insight to Sandi's talk. Instead of trying to handle the authentication and the error at the same level, why not provide an interface on the controller that the authenticator can call into rather than the controller asking for something to be done and acting on the results.
class LoginUser(MethodView): def post(self): if self._form.validate_on_submit(): return self._authenticator.authenticate(**self.form.data) else: return self.render() def authentication_succeeded(self, user): return redirect(url_for('user.profile', username=user.username)) def authentication_failed(self, error): flash(error.msg) return self.render() def _flash_errors(self, error):
Which makes the controller act more high level. There's still some differing levels of abstraction, but they only cross one level. It kicks off some data through the authenticator interface and receives a message back through its interface. We can place other things in the middle by implementing either the Authenticator interface or this interface...actually let's make this an interface itself:
class AuthenticationBoundary(ABC): @abstractmethod def authentication_succeeded(self, user, *args, **kwargs): pass @abstractmethod def authentication_failed(self, error, *args, **kwargs): pass
But our authenticator has no way of talking back. And I think implementing those details directly on an authenticator is a bad idea, that gives it another reason to change. You might think of this as the authenticator crossing multiple abstraction levels at once.
However, if we insert a bridge between them, it's easy to do. If you've seen any variation on Uncle Bob's Architecture: The Lost Years talk, this is something like his Interactor or Dr. Ivar Jacobson's Controller object:
class AuthenticateUser: def __init__(self, authenticator, listener): self._authenticator = authenticator self._listener = listener def authenticate(self, *args, **kwargs): try: user = self._authenticator.authenticate(*args, **kwargs) except AuthenticationError as e: return self._listener.authentication_failed(e, *args, **kwargs) else: return self._listener.authentication_succeeded(user, *args, **kwargs)
This is pretty close to Sandi's idea. Jack Diederich might have some negative things to say about it, but it's also easily representable as a closure as well.
def authenticate_user(authenticator, listener): def authenticate(*args, **kwargs): try: user = authenticate.authenticate(*args, **kwargs) except AuthenticationError as e: return listener.authenticate_failed(e, *args, **kwargs) else: return listener.authentication_succeeded(user, *args, **kwargs) return authenticate
We could go one step in the opposite direction and have this implement the AuthenticationBoundary:
class AuthenticateUser(AuthenticationBoundary): def __init__(self, authenticator, listener): self._authenticator = authenticator self._listener = listener def authenticate(self, *args, **kwargs): try: user = self._authenticator.authenticate(*args, **kwargs) except AuthenticationError as e: return self.authentication_failed(e, *args, **kwargs) else: return self.authentication_succeeded(user, *args, **kwargs) def authentication_failed(self, errors, *args, **kwargs): return listener.authenticate_failed(errors, *args, **kwargs) def authentication_succeeded(self, user, *args, **kwargs): return listener.authentication_succeeded(user, *args, **kwargs)
But I feel this is a bit cluttered and makes it more of a middle man than an adapter.
You might be wondering why I pass the *args and **kwargs back up through the listeners. And the reason is there might be information contained in there, such as a "remember me" or "stay logged in" flag, that doesn't apply to the authenticator itself, but could be interesting to another listener.
Our LoginAuthenticator can implement the Boundary interface rather than the authenticator interface, since it's more a listener than an authenticator -- it doesn't do any authentication, but responds to a successful authentication. And since we pass the extra data back up, we can pluck a remember flag out and pass it to flask_login.login_user:
class LoginUser(AuthenticationBoundary): def __init__(self, listener): self._listener = listener def authentication_succeeded(self, user, *args, **kwargs): login_user(user, remember=kwargs.get('remember_me', False)) return self._listener.authentication_succeeded(user) def authentication_failed(self, error): return self._listener.authentication_failed(error)
Finally, we need to make one last change in the controller. It has to add itself to the authenticator, since we need to call back into it.
class LoginController(MethodView): def __init__(self, authenticator): self._form = LoginForm() self._authenticator = authenticator(self)
Putting the pieces together
Gluing everything together is a little difficult, but it's manageable.
authenticator = lambda listener: AuthenticateUser(PasswordAuth(...), Login(l)) controller = LoginController(authenticator)
By simply providing a little factory function, we don't even need to wring our hands about how it gets in there. I've found this to be the easiest, most efficient way to deal with this. There's other things we can do, for example instead of hardcoding form, template and redirect names, we can pass those in as well, allowing this controller to be reused for other areas (I mentioned re-authenticating before).
class AuthenticationController(MethodView, AuthenticationBoundary): def __init__(self, form, authenticator, template, redirect): self._form = form() # concession to avoid the dreaded "Working outside application context" self._authenticator = authenticator self._template = template self._redirect = redirect def get(self): return self.render() def post(self): if self._form.validate_on_submit() return self._authenticator.authenticate(**self.form.data) else: return self.render() def authentication_succeeded(self, user): return redirect(url_for(self._redirect, username=user.username)) def authentication_failed(self, error): flash(error) return self.render()
A nice thing about this, by passing in a callable that returns a form, we can instead pass a function that determines which form to use. For example, you might have an option to enable ReCaptcha depending on some application configuration and this form callable could look at that and return the correct one. Or you need to allow the user to choose a default language but the list of languages can only be decided at runtime.
Note that since we're potentially reusing this, I've stripped out the decorators line. disallow_authenticated would be applied every time we created a view from this and if we want to re-authenticate a user, they're probably already authenticated. Instead, we'll need to apply it by hand since it's now a situational decorator.
Uncle Bob does recommend one thing more: using a presenter that transforms the response into a view model. In Python and Flask the view model could be a dictionary or a object with arbitrary attributes (or maybe a dotteddict, I'm fond of those). However, I've decided against that for two reasons:
- It is another abstraction layer, which I don't feel is needed here.
- I've not found an easy way to reconcile updating a form's errors through this third party object.
I suppose as one last thing, I should show how I've hooked this into an actual Flask app.
from flask import Blueprint from .controllers import AuthenticateController from .forms import LoginForm, ReauthForm from .listeners import LoginUser, ReauthUser from .interactors import AuthenticateUser from ..dependencies import password_auth as BasePasswordAuth authenticate = Blueprint('auth', __name__) authenticate.add_url_rule('/login', endpoint='login', view_func=disallow_authenticated( AuthenticateController.as_view( name='login', template='auth/login.html', redirect='main.index', form=LoginForm, authenticator=lambda l: AuthenticateUser(BasePasswordAuth, LoginUser(l)) ) ) ) authenticate.add_url_rule('/reauth', endpoint='reauth', view_func=AuthenticateController.as_view( name='reauth', template='auth/reauth.html', redirect='main.index', form=ReauthForm, authenticator=lambda l: AuthenticateUser(BasePasswordAuth, ReauthUser(l)) ) )
Yes, it's way more lines of code, and certainly a fair number of classes; however Ross Tuck calls this "The Inverse Biggie Law":
Mo' decoupling and reduced overall design issues
Since we don't couple the high level policy -- see PasswordAuthenticator -- to the low level details of how we get that information, each piece is independently testable. Want to see if the controller behaves correctly? Give a fake form and a fake authenticator. And since that's web aware logic, I'm okay with having to run Flask to test it. You could also use mock.patch.object and dummy out the render and redirect methods on the controller to test if they're called.
Or if you're testing if your AuthenticateUser use case calls back correctly, give it a fake listener and a fake authenticator. Have the fake authenticator throw an error in one test and return some fake user in another -- does the fake listener get the right message?
But the important part is we can take small pieces and make a big structure out of them. We can change how the big structure works by handing it a different small piece.
Is this too abstracted? Is enterprisey? It somewhat feels that way. And logging in a user isn't the most compelling example, I'm more than happy to concede that. But what if you were trying to implement something more complicated? Say, users sending messages to each other. How messy is that endpoint if we just jam everything in there? How hard is that to test?
Of course, at some point this structure might begin to wobble and topple just like the original one, but it will carry you much further. And when it does begin to waver you might want to consider things like a command bus or domain events (or both! they're really cool).
If you're interested in see how this is being implemented and used in a real application and not just some toy, thought piece I'm leading the rearchitecture of FlaskBB's code using this style. Check out the services branch on my repo.
I spilled my brains, spill some of yours.