Using OpenID

From strattonbrazil.com
Latest revision as of 00:18, 17 August 2014

The talks at OSCON 2013 were hit and miss (something I've heard is fairly normal for tech conferences in general), but I definitely came away with a favorite. "Reducing Identity Pain" by Tim Bray was a forty-minute session on how to uniquely identify your users without requiring a username and password.

Why Username/Password Authentication Is Bad

The talk examines some behaviors among users who are forced to make a username and password. First, making a username and password for every site is laborious. If a user comes to your site and sees they need to create a profile with a username and password, they often just leave. Why? It's a hassle to create new credentials for a site that may be a one-off use. Some sites make it even harder by defining password policies so strict that the user is all but guaranteed to forget their password. In fact, some users will just fill the password field with random garbage and rely on cookies once they're logged in. Once the cookies expire they just reset their password with more garbage.

[Image: Fry username password.jpg]

Even worse is when users don't use unique passwords. They implicitly trust you with their username/password, which they may be using elsewhere. Sure, it's their fault for not using unique passwords, but it's also your fault for storing them in the first place. Getting username/password combos hacked is horrible PR. When you accept them you then have to spend time and resources making sure they're securely stored. What are you really getting from them that makes them worth the trouble?

A Painless Alternative

So how can we consistently identify a user without going through this identity pain? OpenID is one alternative. While the protocol for OpenID may seem relatively complex, the idea is pretty simple. First, a user gets an account with an OpenID provider. Most users already have at least a Facebook account or an email account from a large provider (Google, Yahoo, Microsoft), so that's usually taken care of. When the user comes to your site, they can choose from a list of providers (actually just a list of URLs designated by the OpenID providers) or manually input a URL (see step 1 in the diagram below).

[Image: Openid workflow.jpg]

I was initially confused about where these provider-selection widgets come from, as they're not provided by the OpenID specification. Usually they come from the individual libraries that implement OpenID.

[Image: Openid providers.png]

Once the user sends their provider URL back to your server, your server forwards them to that provider (see step 3 in the OpenID workflow diagram above). This HTTP request can include specific attributes the server wants to know about the user; in general, email is included in the response. The OpenID provider usually shows the user a nice list of the information being shared. Again, if the person is already logged in to their OpenID provider, this is a one-click step!
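To make that redirect a little more concrete, here is a rough sketch of the kind of URL an OpenID library might build for step 3. The parameter names come from the OpenID 2.0 and Attribute Exchange specifications; the function name, the provider endpoint and the return URL are placeholders I made up for illustration, and a real library also handles discovery, associations and signature checks that are skipped here.

```python
from urllib.parse import parse_qs, urlencode, urlparse

OPENID_NS = "http://specs.openid.net/auth/2.0"
IDENTIFIER_SELECT = "http://specs.openid.net/auth/2.0/identifier_select"
AX_NS = "http://openid.net/srv/ax/1.0"
AX_EMAIL = "http://axschema.org/contact/email"


def build_auth_redirect(provider_endpoint, return_to, realm):
    """Hypothetical helper: build the URL the user is redirected to."""
    params = {
        "openid.ns": OPENID_NS,
        "openid.mode": "checkid_setup",          # interactive authentication
        "openid.claimed_id": IDENTIFIER_SELECT,  # let the provider pick the identity
        "openid.identity": IDENTIFIER_SELECT,
        "openid.return_to": return_to,           # where the provider sends the user back
        "openid.realm": realm,
        # Attribute Exchange: ask the provider to share the user's email.
        "openid.ns.ax": AX_NS,
        "openid.ax.mode": "fetch_request",
        "openid.ax.type.email": AX_EMAIL,
        "openid.ax.required": "email",
    }
    separator = "&" if "?" in provider_endpoint else "?"
    return provider_endpoint + separator + urlencode(params)


# Placeholder URLs, not real endpoints.
url = build_auth_redirect(
    "https://provider.example.com/openid",
    "https://mysite.example.com/login",
    "https://mysite.example.com/",
)
query = parse_qs(urlparse(url).query)
print(query["openid.return_to"][0])
```

The "openid.return_to" value is what brings the user back to your handler after they accept or decline.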

[Image: Openid permission.png]

When the user clicks accept, the OpenID provider redirects them back to your site using the "return_to" field you gave the OpenID library (see step 5 in the OpenID workflow diagram above). At this point you have identified the user by their email. What does this mean exactly? Well, anyone can set up their own OpenID provider, so this doesn't provide any email verification. Someone claiming to be "president_barack_obama@whitehouse.gov" through OpenID provider X doesn't necessarily own that email, but you can uniquely identify them as the person claiming that email from that OpenID provider.
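One way to act on that distinction is to key your user records on the provider and the claimed email together rather than on the email alone. The sketch below is only an illustration under that assumption: the in-memory dictionary, the function name and the "email_verified" flag are all hypothetical, and a real site would use its own database.

```python
# Hypothetical in-memory user store keyed by (provider, claimed email).
users = {}


def find_or_create_user(provider_url, claimed_email):
    """Return the record for this provider/email pair, creating it if needed."""
    key = (provider_url, claimed_email)
    if key not in users:
        users[key] = {
            "id": len(users) + 1,
            "provider": provider_url,
            "email": claimed_email,
            # OpenID alone does not prove ownership of the address.
            "email_verified": False,
        }
    return users[key]


# The same claimed email from two different providers is two different users.
a = find_or_create_user("https://provider-x.example.com", "someone@example.com")
b = find_or_create_user("https://provider-y.example.com", "someone@example.com")
print(a["id"], b["id"])
```

Keeping the provider in the key means a made-up provider can't impersonate a user who originally signed in through a different one.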

Show Me the Code

Sounds great, right? What does this actually involve on a developer's part? While the protocol is relatively complex, most web frameworks already have one or possibly several OpenID libraries you can use. Take Tornado (http://www.tornadoweb.org/en/stable/), an asynchronous web framework written in Python, as an example. Its OpenID implementation is actually built into the framework (framework developers take note: this is awesome)!

Tornado provides a variety of mixins for authentication and includes one specifically for Google (with the hard-coded OpenID URL described earlier) called GoogleMixin (http://www.tornadoweb.org/en/branch2.3/auth.html#google). Let's take a look at their example handler to see how it works. I won't go into the nitty-gritty of how Tornado works.

import tornado.auth
import tornado.web

class MainHandler(tornado.web.RequestHandler, tornado.auth.GoogleMixin):

    @tornado.web.asynchronous
    def get(self):
        # A request carrying "openid.mode" is the provider sending the user back.
        if self.get_argument("openid.mode", None):
            self.get_authenticated_user(self.async_callback(self._on_auth))
            return
        # First visit: send the user to Google to approve the request.
        self.authenticate_redirect()

    def _on_auth(self, user):
        if not user:
            raise tornado.web.HTTPError(500, "Google auth failed")
        # Save the user with, e.g., set_secure_cookie()

What is this code doing? Well, first in the get function it checks whether there's an "openid.mode" argument in the request. If there is, that means this request is coming back from the OpenID provider. On the first visit there isn't, so let's go down to the function authenticate_redirect. That will actually redirect the user to Google's provider URL. If you look at the code for this function, it actually includes a "next" parameter which tells Google to send the user back to this URL once he/she authorizes or denies the request.

The OpenID provider should redirect the user back to this handler, and this time the "openid.mode" argument should be set. get_authenticated_user parses the data in the response and passes the resulting user data to a callback function (_on_auth in this case). _on_auth checks whether "user" is not null. If it isn't, the user accepted the OpenID request and you now know some information about them (like their email).
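The handler above only checks whether "openid.mode" is present and leaves the rest to Tornado. As a rough, framework-free illustration of the same branching, the sketch below looks at the value too: "id_res" and "cancel" are the modes the OpenID 2.0 specification defines for a positive response and a declined request. The function name and the returned labels are mine, and the sketch does not verify anything, which is the part get_authenticated_user takes care of.

```python
def classify_openid_callback(args):
    """Hypothetical helper: decide what a request to the login handler means.

    `args` is a plain dict of query arguments.
    """
    mode = args.get("openid.mode")
    if mode is None:
        return "start"    # first visit: redirect the user to the provider
    if mode == "id_res":
        return "verify"   # provider says yes; the response still needs verifying
    if mode == "cancel":
        return "denied"   # user declined the request
    return "error"        # anything else is unexpected here


print(classify_openid_callback({}))
print(classify_openid_callback({"openid.mode": "id_res"}))
print(classify_openid_callback({"openid.mode": "cancel"}))
```

Treating "verify" as a separate step is deliberate: a positive response only counts once the library has checked it with the provider.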

In this case we've hard-coded Google as the OpenID provider. What's the advantage of doing this? Well, if a user jumps to a page that requires identification, you can redirect them immediately to the Google OpenID provider and with one click they're back on your site. How painless is that?

Well, what if you don't want to force Google as the OpenID provider? Tim explains that some people have added logic to see which OpenID providers the user has available and choose one of those. Again, one click for the user and you get your juicy identification. No one gets hurt.

Conclusion

Should you always use OpenID? OpenID is perfectly fine for many sites, like forums or "toy" sites, where verifying the person's actual identity isn't necessary. You'd still have to do things like email verification if that's important to you. There are, of course, some cases where a username and password are required, and it's fine to use them instead. Just remember that if you go that route you're causing your users pain and costing yourself time and resources to keep their credentials secure. Make sure it's worth it.