I have been speaking and writing often these days about how single sign on (SSO) technologies are probably one of the most important components of health IT data integration. To help figure out how to integrate multiple systems using standards-based SSO approaches I reached out to Shahid Qadri, a Data Scientist and Software Developer for Applied informatics Inc. Qadri works on health data integration and semantic web and when I heard that he created a solution (which won second place) for an ONC single sign on challenge I thought he’d be the perfect engineer to help the rest of us. Here’s what Qadri had to say about WebID:
The Simple Sign-on challenge sponsored by the ONC through the Health 2.0 challenge was an exciting opportunity for us to learn about a sophisticated technology protocol and then being able to hack several open source system to implement a single sign on solution based on the protocol. This was a challenge that was truly a “challenge” for me, but an exciting and rewarding one (our solution was the second place winner!).
The challenge involved using the W3C WebID protocol to enable a single sign on across different systems used by the HealthData.gov Platform (HDP). The eventual goal of the HDP is to allow various administrators, contributors and even machines to be authenticated and authorized to access different open source systems. In a nutshell, our solution, OneLogin creates a “wrapper” over each of the systems (Drupal, Ontowiki, Virtuoso, Tomcat) that programmatically creates users within these systems with a given role and associates a given WebID with these users. Each tool’s wrapper is independent of the system and can be configured across different machines. The source code for OneLogin is available on GitHub.
In rest of the blog post, I will describe the background and technical details of the solution.
WebID is a W3C open standard for identity and password-less login on the Web. WebID is designed to alleviate the difficulty (and pain) of remembering different logins, passwords and settings for different websites. WebID in itself is essentially a URL pointing to a description of yourself in FOAF format. FOAF stands for Friend Of A Friend and is essentially an RDF vocabulary which allows you to describe your social web) combined with a self-signed X.509 certificate. X.509 certificates (X-men like sounding terms) are the certifications used to verify the identity of web servers via the SSL protocol. It is a secure authentication protocol utilizing FOAF profile information as well as the SSL security layer available in virtually all modern web browsers. Operationally, once you have a WebID with private key stored in your browser, logging into a website is as simple as selecting your WebID and clicking “log in”. Additionally, there are other benefits of creating a WebID. Other people can to reference you and declare social relations on the web (such as that you are their friend, colleague, parent, etc.) even when their profile is hosted on a different web server than yours. Thus the WebID can be a trusted and verified way to enable the Social Web, i.e., social networks between individuals, citizens, companies, universities, governments, while allowing each player to remain in control of their data they publish.
How Does WebID Work?
As mentioned before, the WebID is a URL, which points to a FOAF file. Now if you want to log into some site you simply provide the WebID which means to select a certificate from a list in your web browser. The server will then fetch the FOAF, extract the
certificate’s public key from it, and then ask you to prove your identity. Since you are the only one having the private key of the certificate that is easily done. And that’s it. From a high level point of view it is very simple, but getting to the nuts and bolts of it can be a challenge.
(Source: http://www.w3.org/wiki/WebID)
Is WebID Really Secure?
In order to secure and protect your identity two things need to be made ensured:
1. The FOAF file that your WebID URL points to should be under your control or that of a trustworthy entity, and
2. Make sure nobody steals your private key! Though, if you do lose your private key, disabling the WebID is as easy as removing the public key from your FOAF profile. Importantly, and this is a useful mechanism of decoupling the public key from the WebID url, is that replacing your public key certificate with a new one will never invalidate your WebID since it stays as a permanent identifier for yourself in the semantic web, independent of the certificate.
Building the Single Sign-on OneLogin Application
First we started with Drupal and added a module in Drupal for providing WebID authentication. But configuring that got a bit tricky as we got few errors while using the module. For example, after digging into the module code we found there was a bug in the import statements. After resolving this we wrote code to automatically create a user in Drupal with the specified role who can use the his/her specified WebID to log in to Drupal.
WebID Provider
Next step involved setting up our own WebID provider (the service that generates a WebID). To do this, we chose to use the open source system, Virtuoso (an enterprise grade RDF store) that we installed on one of our servers over SSL/https. It generated WebID correctly, however, the WebIDs generated from it did not pass verification test. After close inspection we found that although the certificate was generated correctly, it was not getting the FOAF data in RDF format. So we installed an RDF mapper module to fix this and then we were able to generate WebIDs correctly.
Next we worked within Virtuoso to associate its internal users with a WebID, as it had a built-in support for WebIDs. We needed to write a wrapper that can programmatically create an account in Virtuoso and associate a WebID. So we developed a wrapper in PHP and ISQL for the same (the code can be viewed here)
After configuring Virtuoso, we installed OntoWiki. We found that this platform also supports WebID authentication though it had been disabled in the current version in favor of OpenID. So we first got it enabled. Next we wrote the code to create a user and associate a WebID to it. The WebID user creation depends on the external plugin Erfurt in OntoWiki. After solving dependencies and other things we got our code working.
Next and last was the associating WebID with Tomcat/Solr. After basic search we found that there is a library for providing WebID authentication to Tomcat/Solr. This library was not easy to use as we needed lot changes in installation and as well as configuring this library with Tomcat/Solr, but after rigorous efforts we were able to configure it correctly (more technical details are on our blog).
In order to build the code to programmatically create users and associating a WebID we had to tweak the tomcat-users.rdf file. This is where the user list and WebID are stored as nodes. We used a PHP XML parser to append users to the RDF file (view code) .
Finally, the application was build that created users and associated a role in each system and used WebID to login to each system. Included below is a screen shot of the Application with a link to the GitHub repository of the application. A demo video is also available here.
Screenshot of the OneLogin Single Sign on System