ISV Guest Post Series: Softlibrary and Kern4Cloud on Windows Azure

Editor’s Note: Today’s post, written by Miguel Parejo, CTO at Softlibrary, describes how the company uses Windows Azure and the Windows Azure Marketplace to run and sell its multi-tenant information management service.

Softlibrary is a company founded in 1988 in Barcelona, Spain. Since then, it has always been involved in information management, providing cutting-edge custom solutions to its customers. For that purpose, the company adopted Microsoft platforms and architectures from the very beginning.

Kern4Cloud is a multi-tenant service focused on information management, whether the information is corporate in nature or not. It can handle the entire information lifecycle, providing a set of tools for publication, categorization and classification, lexical-semantic and thesaurus systems, version control, multilingual content, live translation, and workflow processes.

We chose Windows Azure because it resides in certified data centers where information and services are kept reliable and secure. Windows Azure resources can also be scaled out on demand to deliver high-performance solutions.

Every single piece of information within the system is saved in XML format, so we can also look at Kern4Cloud as a black box that transforms heterogeneous sources into standard, internationalized ones.

Let’s see how we can accomplish a typical flow with Kern4Cloud. Your company likely has a Privacy Policy statement, and it will probably change over time. Once you have imported the first version, you can create new versions, duplicate existing ones, and even translate them on the fly using the main translation engines available as part of the solution. The system can also convert your files to pure XML so you can later edit them with its own editor, X.Edit, a WYSIWYM editor. All of this is done in the web component called K4C.Workplace. Your company is probably structured so that some departments must give their consent before publication. Given this, you can create a workflow that requires exactly the departments involved in the publishing process to read and revise the statement, review it for legal compliance, correct translations, and finally give their consent, at which point the document is published and ready to be consumed.

The Challenge

When we first looked at Windows Azure, we realized that the architecture and design stages should now include cost-efficiency strategies. There are billing drivers you must consider when migrating your solutions to Windows Azure or creating them from scratch. Fortunately, Microsoft has provided extra features and capabilities to make this process much easier.

Beyond Microsoft’s own tooling, you can also take a look at cost-efficiency strategies published elsewhere. So that was the main challenge: to design, migrate, adapt, and write code in a way our developers had never done before. Every stakeholder in the project must now take one new parameter into consideration: cost. We don’t mean Windows Azure is expensive; quite the opposite.

Once we had redesigned the core components, we faced another challenge: how could we authenticate users in a multi-tenant service? Windows Azure comes with the Access Control Service (ACS), which lets us deal with identities in a transparent way and focus on the authorization process.
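To give an idea of what this looks like in practice, here is an illustrative sketch (not our production code) of a service identity requesting a token from ACS over the OAuth WRAP protocol that ACS v2 exposes; the namespace, identity name, key, and scope below are all placeholders:

```csharp
using System;
using System.Collections.Specialized;
using System.Linq;
using System.Net;
using System.Text;
using System.Web;

class AcsTokenSample
{
    static string GetAcsToken()
    {
        using (var client = new WebClient())
        {
            // Credentials of a service identity configured in the ACS
            // namespace (placeholder values).
            var fields = new NameValueCollection
            {
                { "wrap_name", "k4c-service-identity" },
                { "wrap_password", "SERVICE_IDENTITY_KEY" },
                // The realm of the relying party being called.
                { "wrap_scope", "http://localhost/k4c.backoffice" }
            };

            // ACS v2 exposes a WRAP token endpoint on the namespace.
            byte[] response = client.UploadValues(
                "https://YOUR-NAMESPACE.accesscontrol.windows.net/WRAPv0.9/",
                fields);

            // The response is form-encoded:
            // wrap_access_token=...&wrap_access_token_expires_in=...
            string body = Encoding.UTF8.GetString(response);
            return HttpUtility.UrlDecode(body
                .Split('&')
                .Single(p => p.StartsWith("wrap_access_token=", StringComparison.Ordinal))
                .Substring("wrap_access_token=".Length));
        }
    }
}
```

The returned token then travels with each call to the protected service in the Authorization header, in the form `WRAP access_token="<token>"`.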

The Architecture

Now that we know what Kern4Cloud does, we’ll show which components do it, and later we’ll map them onto Windows Azure services.

Here’s a list of the main components:

  • K4C.Workplace: The core of the UI. Users can version, edit, publish, delete, batch-operate, sort, search, and filter data easily from a single window. Everything is organized in a grid to increase visibility and, hence, usability. Refer to the first picture for an idea.
  • K4C.Admin: The place where administrators manage all the properties, X.Edit styles and mappings, user groups, user permissions, workflows, and so on.
  • Repositories: There are three repositories: one for binary documents (Office, images, videos, etc.), one for the XML files (the ones that contain extracted information and metadata), and finally one for all the information handled by K4C.Admin plus the indexed data used for fast searching, which is stored in two databases.
  • Workflows: This component deals with all the workflow processes defined by users.

This is a bird’s-eye view of the architecture. Let’s see how it maps to Windows Azure components.

Before getting into specifics, one thing must be explained: the Kern4Cloud service is offered to two main audiences.

  • Individual users: They have a limited disk quota, have some features disabled, and share storage resources.
  • Business users: When a company subscribes to this model, it is automatically provisioned with a set of private resources. All users inside the company access the common repository through K4C.Workplace. However, K4C.Admin helps administrators isolate information within the organization.

Below are four figures showing the most important Windows Azure components and services in our architecture. Some interactions between them have been omitted for readability.

  • Figure 1: Shows the Web Roles K4C.Workplace and K4C.Admin. Both are deployed on the same hosted service. They are the entry point of the system and the only components with a UI. They retrieve up-to-date information from SQL Azure because our indexing engine and the K4C.Workplace component ensure it is there as the user interacts with and modifies data. Since full-text indexing is not currently available in SQL Azure, our indexing engine has a little extra work to do. Recently, we’ve started looking into Hadoop because we believe it could be a good alternative.
  • Figure 2: For the moment, all the following roles are deployed on the same hosted service (but a different one from Figure 1).
    • Workflows and Index: These are Worker Roles. The first pops messages from the Workflows queues and processes them. The Index worker role is a service that indexes all the information and keeps it in a coherent state; it also detects document changes and pushes messages into the Workflows queues. Both are multithreaded in order to minimize bottlenecks. However, the resources of a single instance are limited, so scaling is needed. For the moment, this component is scaled manually, but we are preparing a new version that includes the Windows Azure Autoscaling Application Block (WASABi), a component of the Enterprise Library 5.0 Integration Pack for Windows Azure. We plan to autoscale this and other K4C components based on CPU and memory usage and network load.
    • K4C.FileProperties: This is a web service that processes binary files to convert them into XML. Following the Privacy Policy example, imagine the policies are Word files in your environment with styles used for headings, footers, etc., which can be mapped to XML tags with K4C.Admin. Once you have saved a new version, if the mapping is properly set, the indexing role will send the file to this service and the result will be an XML file that can be used for further editing. That is how you can show your information on many platforms and devices.
    • K4C.Backoffice: This is the middle tier between a website and the K4C system. If you want to show your Privacy Policy statement on your website, you’ll have to request it from this service. The service is shared by all customers, but it relies on the Access Control Service to ensure data isolation, which is why you first need to authenticate against ACS before making any request.
      Each role in our system has its own scaling parameters. Some should scale based on CPU usage only, others on network usage, and so on. We deploy a role into a specific hosted service if its scaling parameters are similar to those of the roles already deployed there (though this is not the only rule we follow). K4C.Backoffice is intended to be consumed by third-party users in the future, so we presume its scaling parameters will differ considerably. Hence, we are planning to deploy a new version of this component to its own hosted service.
  • Figure 3: The SQL Azure server holds data for every single customer. Individual users share two databases, while businesses have their own pair. All sensitive information is properly encrypted in our data layer.
  • Figure 4: We have two storage accounts: one for customers and the other for deployment, diagnostics, and backup. XML data is stored in Tables using a partition key for each customer identity, so data will be served from different servers. We’ve subclassed the TableServiceContext class and intercepted the WritingEntity and ReadingEntity events, so we encrypt and compress XML data before writing it to tables and do the reverse after reading it back.
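The interception mentioned for Figure 4 can be sketched as follows. This is a simplified illustration assuming the v1.x StorageClient library (where TableServiceContext derives from DataServiceContext); XmlDocumentEntity and the Compress/Encrypt helpers are hypothetical names standing in for our actual types:

```csharp
// Simplified sketch; XmlDocumentEntity and the helper methods are
// placeholder names, not the real K4C types.
public class K4CTableContext : TableServiceContext
{
    public K4CTableContext(string baseAddress, StorageCredentials credentials)
        : base(baseAddress, credentials)
    {
        // Raised just before an entity's payload is serialized to the wire.
        this.WritingEntity += OnWritingEntity;
        // Raised just after an entity is materialized from a response.
        this.ReadingEntity += OnReadingEntity;
    }

    private void OnWritingEntity(object sender, ReadingWritingEntityEventArgs e)
    {
        if (!(e.Entity is XmlDocumentEntity)) return;

        // Replace the Xml property value inside the outgoing ATOM payload
        // with its compressed-then-encrypted form.
        XElement xmlProperty = e.Data.Descendants()
            .First(el => el.Name.LocalName == "Xml");
        xmlProperty.Value = Encrypt(Compress(xmlProperty.Value));
    }

    private void OnReadingEntity(object sender, ReadingWritingEntityEventArgs e)
    {
        var entity = e.Entity as XmlDocumentEntity;
        if (entity == null) return;

        // Undo the transformation on the materialized entity.
        entity.Xml = Decompress(Decrypt(entity.Xml));
    }

    // Application helpers, e.g. AES via System.Security.Cryptography and
    // GZipStream via System.IO.Compression; bodies omitted in this sketch.
    private static string Encrypt(string s) { return s; }
    private static string Decrypt(string s) { return s; }
    private static string Compress(string s) { return s; }
    private static string Decompress(string s) { return s; }
}
```

The advantage of hooking these events is that encryption and compression stay invisible to the rest of the data layer: queries and saves go through the normal context API and the transformation happens at the serialization boundary.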

Windows Azure Marketplace Integration

Finally, we decided to empower Kern4Cloud by applying to be published on the Windows Azure Marketplace. We found it to be a perfect destination for a cloud-based solution: it’s a platform where our customers can securely subscribe to our services, and it makes billing a worry of the past. When customers want to subscribe to our service, they only need to visit the Windows Azure Marketplace and search for Kern4Cloud. After they choose the most suitable offering, the Windows Azure Marketplace asks them to log in with a Windows Live ID account and provide some billing information (such as a credit card number). In the Windows Azure Marketplace, subscriptions are billed monthly, so the first month is charged immediately. Customers can trust the billing process because it’s provided by Microsoft, and ISVs only have to follow some rules and provide their offering prices to the Windows Azure Marketplace. The whole process is fast, secure, and reliable for customers and providers.

Furthermore, integrating a solution with the Marketplace is quite easy. Basically, you should follow the steps shown in the corresponding sample in the Windows Azure Training Kit (WATK). Microsoft has done a great job collecting good samples and best practices for building Windows Azure-based solutions. You can find the Windows Azure Marketplace integration project separately, but I recommend downloading the whole kit.

Let’s see how this works with a sample:

  1. A customer finds your solution in the Windows Azure Marketplace and decides to subscribe. Once the billing and subscription information has been provided, the Marketplace redirects the customer to the AzureMarketplaceOAuthHandler.ashx handler on your website, indicating that a new customer has subscribed.
  2. Your website must confirm that the request really comes from the Windows Azure Marketplace. This task is handled by a project called AzureMarketplace.OAuthUtility, which you’ll find in the WATK. You can either attach the project or reference the DLL. It contains the handler mentioned above, so you also need to add the following lines to your web.config:

<handlers>
  <add name="AzureMarketplaceOAuthHandler" verb="*"
       path="AzureMarketplaceOAuthHandler.ashx"
       type="Microsoft.AzureMarketplace.OAuthUtility.AuthorizationResponseHandler, Microsoft.AzureMarketplace.OAuthUtility"/>
</handlers>

This project also relies on the Microsoft.IdentityModel and Microsoft.IdentityModel.Protocols.OAuth libraries.

  3. When the request is confirmed to come from the Windows Azure Marketplace, your solution is notified that a customer is subscribing and can then ask for additional information if needed. To do this, you must add five classes to your website. One of them reads the information that identifies your website as a Windows Azure Marketplace client. This information should be stored in your web.config like this:

<azureMarketplaceConfiguration
    appSpecificAzureMarketplaceOAuthClientId="YOUR_CLIENT_ID"
    appSpecificAzureMarketplaceOAuthClientSecret="CLIENT_SECRET_KEY"
    appSpecificPostConsentRedirectUrl="http://127.0.0.1:81/AzureMarketplaceOAuthHandler.ashx"
    appSpecificWellKnownPostConsentUseUri="http://127.0.0.1:81/Subscription/New"/>

This information must match what is defined in the Marketplace (whether in the playground or in production).

Of course, a section element must be added, as well:

<section name="azureMarketplaceConfiguration"
        type="YOURNAMESPACE.AzureMarketplaceConfiguration, YOURASSEMBLY"
        requirePermission="false"/>

Note that the testing URLs should be replaced in the production environment.

The other important class is SubscriptionUtils.cs. This will be the final destination of the Marketplace requests. Here you will create the CreateSubscription and Unsubscribe methods to process all the requests. You only need to add this line to the Application_Start method of your Global.asax:

        AzureMarketplaceProvider.ConfigureOAuth(new SubscriptionUtils());

And you’re done. Your website is now able to accept subscription requests coming exclusively from the Marketplace and process them according to your needs.
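For completeness, the shape of the SubscriptionUtils class looks roughly like this; the exact base type or interface and the method signatures come from the WATK sample and vary between kit versions, so treat this as an illustrative sketch rather than the definitive contract:

```csharp
// Illustrative only: the signatures follow the spirit of the WATK sample
// and may not match your kit version exactly.
public class SubscriptionUtils
{
    // Called when the Marketplace confirms a new subscription.
    public void CreateSubscription(string subscriptionId, string offerId)
    {
        // Provision the tenant: create its database pair, storage
        // containers, and default K4C.Admin configuration.
    }

    // Called when the customer cancels through the Marketplace.
    public void Unsubscribe(string subscriptionId)
    {
        // Archive or tear down the tenant's resources.
    }
}
```

Keeping provisioning logic in this single class makes it easy to reuse the same tenant setup code outside the Marketplace flow, for instance when onboarding a trial customer manually.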

In a Nutshell

The migration from our on-premises system has not been an easy task, but it has been a thrilling one. Windows Azure provides all the services we needed to accomplish our mission. We only had to take a few considerations into account first. One of them concerns an ongoing debate on the web about whether the cloud means the death of IT. Our experience suggests that IT departments won’t disappear; they’ll just have to get used to these platforms. Furthermore, we believe the debate will wind down as hybrid clouds reach the mainstream.

We’d like to thank Microsoft for the opportunity to be the first Spanish company to write for this blog series. If you would like more information about any of the aspects shown here, feel free to contact us.