24 May 2009

Portal Versus Content Management Systems (CMS) Versus Web Pages

Introduction

Opinions differ widely on what a portal is.  This comes down to several common but conflicting definitions of portals, the belief that portals and CMSs are the same thing, and the many different ways to actually build a web page.

This post also covers closely related technology and terms, along with some practical best practice recommendations.

History

In the mid to late 90s, the first phase, web applications were the same thing as websites.  Portal and CMS concerns blended together.  Website developers recognized the need for a control panel to manage the customer-facing part of their site, to which they added some content controls.  They controlled the whole delivery stack.

By the late 90s to maybe 2004, a number of CMSs became available.  This was the second phase of website construction.  You “had to have” a CMS.  These CMSs blended together portal and CMS concerns.  The CMS was the website – pages, content on pages, how pages linked together.  Great for content sites, but building a transactional, functionality oriented site was painful and expensive as the CMS had to be extended.

Also during this time some very clever people working on Java standards recognized the difference between portal, portlet, and content services and wrote JSR-168 (later 286) and 170.  They got it, but practice was well behind theory.

Starting as early as 2001, but really more by 2005, AJAX and web services were catching hold.  The idea that you didn’t need to control the whole delivery stack became clearer.  The first mash-ups started to appear – pulling in services from one website (service) into another.

Also at the same time, an older concept finally caught on: the separation of content and business logic.  This was the old MVC (Model-View-Controller) approach, but people finally rallied around it.

This led to the third phase, which we’ve been in for maybe 3-4 years now.  People began to realize that a website is really a container that can draw services from anywhere.  They also refined the MVC approach to separate content-with-format into content and presentation of that content.  They also took a more granular view of web page construction, effectively creating an MVC for each component on a page rather than treating whole pages at a time.

As a result, a portal became a "container of containers" that pulls in and visually organizes content and functionality from wherever it can, but isn’t otherwise primarily concerned with content itself.

Today (mid 2009)


Which takes us to today.  In a nutshell, if a site is very content oriented (not function/transaction oriented) and the content is self-contained, one of those big CMSs may still be appropriate. 

If a site is quite mixed (content and functionality) but has small volumes and little revenue, you still just hack everything together as you would have in the 90s.  The same goes for those who haven’t made the jump to understanding the separate concerns of portal and CMS, who don’t require multi-service integration, or who “control everything”.

For everyone else, content and portal are being treated as separate areas of concern.

Making things more complex, the terminology and definitions are still evolving, vary widely, and are often misunderstood.

Portal


A portal is not a CMS!

A portal brings one or more service providers together into a unified user interface for a consumer of information and services.

Examples of service providers and services used by a portal:

  • Customer handling system – create and manage customers
  • Payment system – debit and credit a customer’s credit card
  • Mash-up classics: flickr photos, google maps and news, youtube video
  • Content Management System (CMS) – manage content

The service providers may be based on different technology stacks.  One way or another (there are various mechanisms), they need to export (make visible) the services they offer.

A service provider, particularly a legacy system or one that is difficult to change, may not use web services or may not be able to easily change the web services it has.  In this case it is common to create some middleware to form a bridge between the legacy system and portal.  The middleware speaks “legacy interface” on one side, and provides a set of modern web services on the other.
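
As a rough illustration of that bridge, here is a minimal sketch in Python.  Everything in it is made up for the example: the fixed-width record layout, the class names, and the JSON request shape are assumptions, not any real legacy product's interface.  The point is only that the portal sees JSON in and JSON out, and never touches the legacy format.

```python
import json


class LegacyCustomerSystem:
    """Stand-in for a legacy system with a fixed-width record interface (hypothetical)."""

    def __init__(self):
        self._records = {}

    def put_record(self, record: str) -> None:
        # Assumed layout: 6-character customer id, then a 20-character padded name.
        self._records[record[:6]] = record

    def get_record(self, customer_id: str) -> str:
        return self._records[customer_id.rjust(6, "0")]


class CustomerMiddleware:
    """Speaks "legacy interface" on one side and JSON on the other."""

    def __init__(self, legacy: LegacyCustomerSystem):
        self._legacy = legacy

    def create_customer(self, request_json: str) -> str:
        request = json.loads(request_json)
        customer_id = str(request["id"]).rjust(6, "0")
        name = request["name"][:20].ljust(20)
        self._legacy.put_record(customer_id + name)
        return json.dumps({"id": customer_id, "status": "created"})

    def get_customer(self, customer_id: str) -> str:
        record = self._legacy.get_record(customer_id)
        return json.dumps({"id": record[:6], "name": record[6:26].rstrip()})
```

In a real deployment the two middleware methods would sit behind web service endpoints; that wiring is left out here to keep the sketch self-contained.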

The service providers do not need to know about or interact with each other to provide their services (not coupled).  The connection between the portal and the service providers is flexible (loosely coupled).

Assuming the service provider interface remains the same or is backwards compatible, it may evolve independently of the portal consuming that service and other service providers in use by the portal.

Web based solutions are typically a website as displayed in a browser or a rich Internet client (e.g., an MMORPG client).  A rich client stands alone; a website requires a browser.

Choice of portal means choice of technology.  The most common models are the "P" family (PHP, Perl, Python that sit on the front of a LAMP stack), ASP/.Net, and Java.  There are others growing in popularity like Ruby.  Choice of technology will drive the portlet approach which is covered next.

Portlets

Portlets are also called fragments, widgets, blocks, or CMSlets.

A portlet is used to encapsulate a service provider’s service or combination of provider services.

A portlet provides a uniform interface used by web page authors to construct web pages.
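
To make "uniform interface" a bit more concrete, here is a small Python sketch.  The `Portlet` protocol, its single `render` method, and the two example portlets are assumptions for illustration; this is not the JSR-168 API or any particular portal's interface.

```python
from html import escape
from typing import List, Protocol


class Portlet(Protocol):
    """The one interface a page author has to know about (hypothetical)."""

    def render(self, params: dict) -> str:
        ...


class MapPortlet:
    # In practice this would wrap a mapping service; here it just emits markup.
    def render(self, params: dict) -> str:
        city = escape(params.get("city", "unknown"))
        return f'<div class="map">Map of {city}</div>'


class NewsPortlet:
    # In practice this would wrap a news feed service.
    def render(self, params: dict) -> str:
        return '<div class="news">Headlines</div>'


def build_page(portlets: List[Portlet], params: dict) -> str:
    """Assemble a page body without knowing what sits behind each portlet."""
    return "\n".join(portlet.render(params) for portlet in portlets)
```

Because `build_page` only relies on `render`, a portlet backed by a different service can be swapped in without touching the page assembly code.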

It is possible that a service provider might provide a portlet directly rather than a service that the portal encapsulates as a portlet.  This is dangerous because it creates a greater degree of coupling between website construction and an underlying service.  Conversely, it might be faster as web service and portlet are collapsed into one (less code, simpler, less fragile).  Note that this is only possible if the service provider uses roughly the same technology as the portal.

Aside: A rich client may have an equivalent to a portlet, a way to encapsulate a service or combined set of services in a uniform way for the client.

Content Management System (CMS)

Sometimes a CMS is a portal!  But it shouldn’t be.

A CMS is an optional service that a portal can draw on to manage content.

A portal encapsulates CMS services into portlets so a website author can dynamically manage website content.

A CMS should contain all site content – both text and images.

A CMS supports localization of its content.

A CMS will have a backoffice interface to manage the content.  A content administrator can preview content updates in a pre-production (staging) environment prior to live publication.  Content is under version control - who/when is recorded when content is created and changed.  Other typical backoffice functions are present: workflow, audit trail, role based access controls, reports.
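
A toy version of the who/when history and the staging-versus-live split might look like the Python sketch below.  The `ContentStore` class and its method names are invented for the example, and a real CMS would add workflow, access control, and persistence on top.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Version:
    text: str
    author: str         # who made the change
    saved_at: datetime  # when it was made


class ContentStore:
    """Keeps every version of each content item; only published versions go live."""

    def __init__(self):
        self._history: Dict[str, List[Version]] = {}
        self._live: Dict[str, int] = {}  # content key -> index of the published version

    def save(self, key: str, text: str, author: str) -> int:
        versions = self._history.setdefault(key, [])
        versions.append(Version(text, author, datetime.now(timezone.utc)))
        return len(versions) - 1

    def publish(self, key: str, version_index: Optional[int] = None) -> None:
        versions = self._history[key]
        index = len(versions) - 1 if version_index is None else version_index
        if not 0 <= index < len(versions):
            raise IndexError("no such version")
        self._live[key] = index

    def get(self, key: str, staging: bool = False) -> Optional[str]:
        versions = self._history.get(key)
        if not versions:
            return None
        if staging:
            return versions[-1].text  # staging previews the latest saved version
        index = self._live.get(key)
        return None if index is None else versions[index].text

    def history(self, key: str) -> List[Version]:
        return list(self._history.get(key, []))
```

Saving a new version does not change what the live site shows until `publish` is called, which is the property the staging preview depends on.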

A portal may pull text from the CMS and mix it with transactional functionality from a service to create a customer solution.  For example a portal might have a portlet for a registration page that pulls all the field prompts from the CMS but creates the customer in the customer management system.
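
The registration example can be sketched in a few lines of Python.  The `Cms` and `CustomerSystem` classes are in-memory stand-ins, and the content keys (such as `registration.prompt.name`) are a naming convention assumed only for this example.

```python
from html import escape


class Cms:
    """Stand-in for the CMS service: returns text for a content key."""

    def __init__(self, content: dict):
        self._content = content

    def get_text(self, key: str) -> str:
        return self._content[key]


class CustomerSystem:
    """Stand-in for the customer management system."""

    def __init__(self):
        self.customers = []

    def create_customer(self, email: str, name: str) -> int:
        self.customers.append({"email": email, "name": name})
        return len(self.customers)  # simple sequential customer number


class RegistrationPortlet:
    FIELDS = ("name", "email")

    def __init__(self, cms: Cms, customers: CustomerSystem):
        self._cms = cms
        self._customers = customers

    def render(self) -> str:
        # Field prompts come from the CMS, never from text hard-coded in the page.
        rows = []
        for field in self.FIELDS:
            prompt = escape(self._cms.get_text(f"registration.prompt.{field}"))
            rows.append(f'<label>{prompt} <input name="{field}"></label>')
        return '<form method="post">' + "".join(rows) + "</form>"

    def submit(self, form: dict) -> int:
        # The transaction goes to the customer system, not to the CMS.
        return self._customers.create_customer(form["email"], form["name"])
```

The portlet is the only place that knows about both services; neither the CMS nor the customer system knows the other exists.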

Historically CMSs owned both blocks of content on a webpage and the webpage itself.  They mixed together portal and CMS concerns.  This is bad for transactional sites as it makes it more expensive to evolve website functionality.

CMSs have many other feature considerations; this is only a high level view of them when positioned in the delivery stack as a service, as relevant to portal integration.

A few best practices when working with CMSs:

  • Content seen by a customer is never written into a web page.  The web page author requests content from the CMS.  The author knows the context on the webpage and brings in content appropriate for that context.
  • A CMS should never be able to publish Javascript.  Javascript should be considered as application code and flow through a control/QA process before live site publication.  Javascript should be encapsulated in tags, not be present in the web page itself.
  • A CMS should only publish content with a limited and well-defined set of safe HTML tags around the content.
  • Sometimes content and functionality are interdependent.  For example, a new promotions page with some user interaction (e.g., enter a mobile # to access a promotion) needs to go live at the same time as updated content (e.g., T&Cs and various links to the new promo page).  To support this, a number of related content changes can be grouped as a change set, and only published all at once when the new functionality is deployed to the production environment.
  • In a shared services environment (one infrastructure and software stack running many different websites), the CMS will require strong RBACs to prevent partners seeing or accessing each other’s content.
  • Only bring in the CMS you need.  When you only need content control over a few areas, it's a viable option to use simple tools or just roll your own CMS.  Many companies make the mistake of bringing in an expensive CMS package and customizing it, when they could have a couple of web page authors make updates and hand the business a simple backoffice interface to update the content of a few portlets.
  • Ideally a CMS contains all service messaging, including error messages.  A service passes along a code for what information to display, the portlet looks up the information in the CMS using the code as a lookup key.
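
The last practice in the list above (a service passes a code, the portlet looks up the text) can be sketched as follows in Python.  The message codes, the locale fallback rule, and the in-memory `MESSAGES` table standing in for the CMS are all assumptions made for the example.

```python
# Stand-in for message content held in the CMS, keyed by (code, locale).
MESSAGES = {
    ("PAYMENT_DECLINED", "en"): "Your card was declined.",
    ("PAYMENT_DECLINED", "es"): "Su tarjeta fue rechazada.",
    ("GENERIC_ERROR", "en"): "Something went wrong. Please try again.",
}


def lookup_message(code: str, locale: str, default_locale: str = "en") -> str:
    """Return the customer-facing text for a service code.

    Tries the customer's locale first, then the default locale, and finally
    falls back to a generic message so a raw code is never shown to a customer.
    """
    for candidate in (locale, default_locale):
        text = MESSAGES.get((code, candidate))
        if text is not None:
            return text
    return MESSAGES[("GENERIC_ERROR", default_locale)]
```

With this split the payment service only ever returns something like `PAYMENT_DECLINED`, and the wording can be changed or translated in the CMS without redeploying the service.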

Web Pages

One thing that might feel uncomfortable with the above CMS definition is that the CMS only controls content as displayed in portlets rather than whole web pages.  It's ok, this should feel uncomfortable to some of you.  If you must enable end users to quickly and non-technically create and deploy new web pages to your site, your only choice is to look at a more traditional big boys CMS and hand your soul (and wallet) to the consultants to tailor the CMS to your business.  It means you're going to have to write a set of customized plug-ins for the CMS in the CMS's terms, usually something proprietary.  You may be able to hybridize your approach as well, and I'll cover that below.

However, if the view of CMS as a supplier of CMS oriented components/portlets and not necessarily whole pages is making sense to you so far, let's move into what that means for constructing web pages.

Web pages are requested by a browser.  The browser may pass in a number of parameters when requesting a web page to achieve an effect such as moving through a workflow.  Web page requests are received by the server.  The server identifies the correct page and processes it.  The page is then handed to the customer’s browser, and processed further by the browser.

The server processes the web page by substituting custom tags with HTML and Javascript.  The custom tag methodology is provided by a custom tag framework as specified by the portal technology choices.
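
Here is a deliberately small Python sketch of that substitution step.  The `<portal:...>` tag syntax, the handler names, and the script path are invented for the example; real custom tag frameworks (JSP tag libraries, for instance) are considerably richer, and a regular expression is only adequate for simple self-closing tags like these.

```python
import re
from html import escape


def render_greeting(attrs: dict) -> str:
    return f"<span>Hello, {escape(attrs.get('name', 'guest'))}!</span>"


def render_clock(attrs: dict) -> str:
    # The Javascript reference lives inside the tag handler, not in the page source.
    return '<span class="clock"></span><script src="/js/clock.js"></script>'


# Hypothetical tag library: tag name -> function producing HTML/Javascript.
TAG_HANDLERS = {"greeting": render_greeting, "clock": render_clock}

TAG_PATTERN = re.compile(r'<portal:(\w+)((?:\s+\w+="[^"]*")*)\s*/>')
ATTR_PATTERN = re.compile(r'(\w+)="([^"]*)"')


def expand_tags(page: str) -> str:
    """Replace each self-closing <portal:name attr="value"/> tag with its output."""

    def replace(match):
        handler = TAG_HANDLERS.get(match.group(1))
        if handler is None:
            return match.group(0)  # leave unknown tags untouched
        attrs = dict(ATTR_PATTERN.findall(match.group(2)))
        return handler(attrs)

    return TAG_PATTERN.sub(replace, page)
```

A page author writes `<portal:greeting name="Ada"/>` in otherwise ordinary HTML and never sees the code behind it.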

Web pages executing within a client contain a well defined and consistent interface to access dynamic functionality on the server.

Web pages are written by web page authors (aka designers) who specialize in HTML and CSS.  Custom tags aren't a big stretch for them but complex Javascript might be (they typically aren't programmers).  The closer these tags look to conventional HTML, the more effective the HTML/CSS web page authors are when using them.

A few best practices when considering web pages:

  • Web pages typically contain Javascript, and more advanced, interactive, dynamic web pages contain a lot of it.  The Javascript should be encapsulated in custom tags or portlets and substituted server side into the web page definition using a custom tag interface.  Javascript should not be added to pages directly.
  • Web page authors are most comfortable working with a whole web page inside an HTML editor as stored on a filesystem.  Submitting web pages into a database (that is, web pages pulled from DB, not served from filesystem) is painful.  Reformatting XML data extracts via XSLT is painful.  This is why web pages best live at the filesystem level and not inside of a CMS.
  • Web pages are under version control and must be checked in and out.  
  • New and updated web pages must be tested on a pre-production (test) environment prior to being placed in a production environment.  
  • Web pages should be published to production in a controlled and audited way.  There should be a toolset that supports this activity to make a web page author’s life easier.  You might almost consider this a webpage level CMS, but with different users and tighter production release style controls.
  • The CSS is a web page concern.  Designers own the specification of what the CSS markup should do.  Portlet and custom tag developers own insertion of the CSS markup itself.  Developers and designers must work together via a "CSS Contract".
  • The portal must force subordinate services to use a well-defined namespace for CSS use.
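
One way to picture the CSS namespace rule from the list above is a portal that prefixes every class name in a portlet's markup.  The Python sketch below assumes a simple `namespace-classname` convention and uses a regular expression for brevity; a production version would more likely use a proper HTML parser.

```python
import re

CLASS_ATTR = re.compile(r'class="([^"]*)"')


def namespace_classes(fragment: str, namespace: str) -> str:
    """Prefix each CSS class in an HTML fragment with the portlet's namespace."""

    def prefix(match):
        names = match.group(1).split()
        prefixed = [
            name if name.startswith(namespace + "-") else f"{namespace}-{name}"
            for name in names
        ]
        return 'class="' + " ".join(prefixed) + '"'

    return CLASS_ATTR.sub(prefix, fragment)
```

Two portlets can then both use a class called `title` without their styles colliding, because the page sees `news-title` and `promo-title`.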

A Hybrid view of CMS and Web pages

If for whatever reason an expensive enterprise grade CMS with a recognizable brand is a hard requirement for your website, you have two choices:
  • Write custom CMS plug-ins to integrate all the server side functionality you require to build the solution.  Be aware:
    • What you build will most likely be non-portable to other CMSs in the future
    • You've achieved vendor lock-in.  Vendors like this - you may not in the future.
    • You're likely going to have to (re)train a group of developers to write the plug-ins for the CMS.  The big boys CMSs are big and complex and you're going to have to develop a lot of vertical knowledge to make them successful.
    • Be prepared to bring in consultants specializing in the CMS to help your delivery team and operational support team.  Vendors also like this - your CFO won't.
    • Be prepared to closely risk manage the new CMS technology risks (like any new, big, complex technology introduction) across your IT team and into the business
    • The CMS controls the whole website page.  Designers must work with the CMS to create and update whole page designs using a proprietary approach.  Designers will be creating page "templates" which are then thrown into a stable of templates that non-technical users access to create new pages on the site.
  • Look at a hybrid view

What is the hybrid view?  Assuming you have the staff and dosh to make this happen, create two parallel delivery tracks.

Track 1 is about delivering a site the way I've described above, with strong separation of concerns between services and portal, treating the CMS as a service.  The portal owns the web pages, and specifies the technologies to be used to construct pages and support client/browser and server interaction.  Assuming you have a team of people that have done this before, risks are low, productivity is good, and the website comes into being rapidly.  Keep Track 1 focused on the web pages for the core user journeys and lightly stub all other pages.

Track 2 is about deploying the CMS, building up an understanding of it, and seeing what you can do with it while Track 1 is building the website.  To start with, Track 2 should focus on delivering content heavy pages and pages whose content changes frequently.  Track 2 pages should cover all those lightly stubbed pages you're creating in Track 1.

You then bring these two tracks together.  Ideally you can get to the point where the Track 2 CMS controls the content heavy, functionality light sections of your website, while the Track 1 website focuses on functionality heavy portions of the site.

To accomplish this, you'll at least have to write CMS plug-ins to functionally provide login, session, logged-in status, and one or more navigation blocks.  This will allow you to maintain the customer's login state as they move between Track 1 and Track 2 pages.
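
One possible way to share login state between the two tracks is a signed session token that both sides can verify.  The Python sketch below is an assumption about how that might be done, not a description of any CMS's plug-in API: the token format, the function names, and the shared secret are all hypothetical, and a real token would also carry an expiry time.

```python
import hashlib
import hmac
from typing import Optional

# Hypothetical shared secret; in practice it is loaded from secure configuration
# available to both the Track 1 portal and the Track 2 CMS plug-in.
SHARED_SECRET = b"replace-with-a-real-secret"


def sign_session(customer_id: str, secret: bytes = SHARED_SECRET) -> str:
    """Issued at login: returns 'customer_id.signature' for use as a cookie value."""
    signature = hmac.new(secret, customer_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{customer_id}.{signature}"


def verify_session(token: str, secret: bytes = SHARED_SECRET) -> Optional[str]:
    """Called by either track: returns the customer id, or None if the token is invalid."""
    customer_id, separator, signature = token.rpartition(".")
    if not separator:
        return None
    expected = hmac.new(secret, customer_id.encode("utf-8"), hashlib.sha256).hexdigest()
    # Constant-time comparison; compare bytes so odd input cannot raise an error.
    if hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return customer_id
    return None
```

The CMS login-status plug-in would call `verify_session` on the cookie set by the Track 1 login page, so neither side needs to query the other on every request.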

What are the benefits of the hybrid approach?
  • Primarily, it's risk management.  The approach de-risks the use of a new, big, complex technology.
  • You can get a basic site up and running quickly, assuming you are building websites in a standardized way, using the separation of concerns as outlined in this blog.
  • You can incrementally shift functionality from Track 1 to Track 2 based on growing depth of knowledge of both approaches.
  • But you don't have to shift everything from Track 1 to Track 2 - you can take a balanced view of priorities, efficiency, and cost when deciding how to evolve the two tracks.

Conclusion

Portal, CMS, and web pages are not the same thing.  Each has its own concerns.

Portals mix together various types of functionality to enable an end-user facing solution.

CMSs manage content and provide services to the portal to access content.

Web pages are the practical implementation of portal provided functionality.

If you architecturally mix the three together, perhaps to save initial coding time, you are creating extensibility and team scaling limitations for yourself later.  There is potentially nothing wrong with that - just go into it with your eyes open.