Posts Tagged ‘Cloud’

Cloud Based BI – Understanding The Options Is the Biggest Barrier

Tuesday, June 22nd, 2010

Last week I was in Munich to present at the annual TDWI (The Data Warehousing Institute) conference on “Business Intelligence and Data Management in a Cloud Computing Environment”. It was a very well attended conference with some great speakers and sessions. My session focused on the following:

  • What is Cloud Computing and why use it as a deployment option?
  • Why Cloud BI? – What are the requirements for a public cloud or externally hosted BI system?
  • Understanding what is on offer – The Cloud BI Marketplace
  • Getting data into a cloud based BI system
  • Managing access to cloud based BI systems and analytic applications
  • Integrating cloud based BI systems with on-premise systems
  • Pros and cons of deploying on the cloud
  • Getting started with Cloud based BI

Bear in mind that both public cloud and private cloud based BI were under discussion, even though the hype seems to be all about public cloud or externally hosted BI systems. Looking at these points, it is the third bullet that for me is the clear inhibitor to cloud based BI adoption: in other words, the lack of understanding as to what exactly is on offer. And there is a lot on offer. On the public cloud we have everything from plain Infrastructure as a Service (IaaS) all the way through to Software as a Service (SaaS) based packaged analytic applications. On the private cloud several BI platforms are already running on virtualisation software such as VMware and/or Microsoft Hyper-V. However, there seems to be very little in the way of best practice advice on do’s and don’ts when it comes to deploying BI systems on a private cloud based virtualised environment.

In total I came up with 6 options, the last of which is simply where many of us are today i.e. BI systems not deployed on a cloud (whether it be public or private).  The options are as follows:

1.     Public cloud based IaaS for a BI system

2.     Public cloud or externally hosted BI/DW PaaS for building your own cloud-based BI system

  • Multi-vendor or single-vendor BI PaaS offerings

3.     Public cloud or externally hosted SaaS BI packaged analytical applications

4.     Public cloud or externally hosted SaaS BI for operational reporting on cloud based operational data

5.     Private cloud based BI system running internally

6.     Dedicated hardware based BI system (this is what most companies have today)

Option 1 is simply subscribing to an IaaS vendor like Savvis, Amazon, Rackspace or GoGrid, where you pay as you go for hardware and systems software, and then buying and deploying your own ETL, DBMS and BI software (assuming the vendor has no restrictions on what it will support). I am not sure that this is attractive enough on its own without a BI/DW Platform as a Service (PaaS) as well.

Option 2 is the BI/DW Platform as a Service (PaaS) option on public cloud or even externally hosted. Here you find another choice, however: should you choose a multi-vendor DW/BI PaaS or a single-vendor offering? An example of a multi-vendor option is the RightScale/Talend/Vertica/Jaspersoft PaaS offering on Amazon EC2. A single-vendor PaaS offering (of which there are several) would be GoodData or SAP BusinessObjects On-Demand. Others include Birst, Indicee and PivotLink.

A key question here is going to be “Is Data Integration included?” Clearly the multi-vendor offering mentioned above includes an ETL tool (Talend). With BI/DW PaaS vendors in general, though, data integration is very much file based, i.e. you upload files of data and then there is some processing of that data to load it into the PaaS DW/BI database. Several single-vendor PaaS offerings give you only fairly lightweight data integration once data is uploaded. It is certainly not the full blown ETL with built-in data quality that you might be used to in a data centre. In fact, if you are looking for full blown DQ you are going to be disappointed in most cases. The ‘get out’ clause is that you can add your own script, but what about metadata lineage and auditability once the script writer has left for a better job? A vendor like SAP (mentioned earlier) does have ETL (SAP BusinessObjects Data Integrator) available, but only if you subscribe to the Advanced Edition of SAP BusinessObjects On-Demand (there are 3 editions on offer). I was even more surprised to see that SAP’s BI/DW PaaS offering uses Microsoft SQL Server as the database and not BW. I would expect that to change to Sybase IQ fairly soon.

GoodData, on the other hand, have refreshingly recognised that you may want to go beyond the data integration you get out-of-the-box on subscription and have gone the extra mile to provide pre-built integration with cloud based data integration tools such as Informatica Cloud, SnapLogic and Boomi. You can therefore use these tools to integrate your data before passing the data sets to GoodData. The alternative to all of this is to do the lion’s share of the data integration in-house before uploading data files.
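To make the “do it in-house before uploading” alternative a little more concrete, here is a minimal Python sketch of the kind of check you might run on a file before it leaves the building. It is an illustration only, not any vendor’s API: the column names (customer_id, order_date, amount), the three rules and the shape of the audit record are all assumptions made up for this example. The audit record is there to show one way of keeping some lineage once the script writer has moved on.

```python
import csv
import hashlib
import io
from datetime import datetime, timezone

# Hypothetical schema for the upload file; not taken from any vendor.
REQUIRED_COLUMNS = ["customer_id", "order_date", "amount"]
RULES = ["customer_id not empty", "order_date is YYYY-MM-DD", "amount is numeric"]


def check_rows(csv_text):
    """Split the data rows into (clean, rejected) using three explicit rules."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [c for c in REQUIRED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    clean, rejected = [], []
    # Line 1 is the header, so the first data row is line 2.
    for line_no, row in enumerate(reader, start=2):
        problems = []
        if not (row["customer_id"] or "").strip():
            problems.append("empty customer_id")
        try:
            datetime.strptime(row["order_date"], "%Y-%m-%d")
        except (ValueError, TypeError):
            problems.append("bad order_date")
        try:
            float(row["amount"])
        except (ValueError, TypeError):
            problems.append("bad amount")

        if problems:
            rejected.append((line_no, problems))
        else:
            clean.append(row)
    return clean, rejected


def audit_record(csv_text, clean, rejected):
    """Describe what was checked, so the run can be traced later."""
    return {
        "sha256": hashlib.sha256(csv_text.encode("utf-8")).hexdigest(),
        "checked_at": datetime.now(timezone.utc).isoformat(),
        "rows_clean": len(clean),
        "rows_rejected": len(rejected),
        "rules": list(RULES),
    }


if __name__ == "__main__":
    sample = (
        "customer_id,order_date,amount\n"
        "C001,2010-06-01,120.50\n"
        ",2010-06-02,80\n"
        "C003,02/06/2010,abc\n"
    )
    clean, rejected = check_rows(sample)
    print(audit_record(sample, clean, rejected))
    for line_no, problems in rejected:
        print(line_no, problems)
```

Only the clean rows would then be written out and uploaded. The rejected rows, with their line numbers and reasons, go back to whoever owns the source system, and the audit record is stored alongside the uploaded file.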

Option 3 is a fast growing market with many relatively new vendors (e.g. Cloud9 Analytics, Rosslyn Analytics, Lixto) as well as traditional mainstream vendors (e.g. SAS, IBM Cognos). The attraction here is a pre-built solution ready to go. These will clearly appeal to small and medium-sized businesses (SMBs) and even lines of business in some large organisations. While today we see horizontal applications looking at Salesforce.com data, spend analysis and pricing (to name a few), I am predicting that vertical analytic apps on the cloud will appear.

Option 4 is simply using a cloud based reporting system on operational data typically from a cloud based transaction processing system such as Salesforce.com.  In fact it would seem that Salesforce.com is dominating this space. An example here is SAP BusinessObjects CrystalReports.com for Salesforce.

Option 5 is private cloud based BI systems. The largest private cloud based BI system I know of is IBM’s internal Blue Insight, which is based on IBM System z and IBM Cognos 8 BI. An estimated 200,000 IBMers are using this. IBM have since launched the Smart Analytics Cloud, a private cloud offering for large enterprises based on the same technologies. However, it is still early days for BI deployments on internal private clouds. There appears to be more support coming from developer forums than from vendors at present. From what I can see, companies are taking a ‘toe in the water’ approach to deploying on virtualized environments. No doubt confidence will grow over time. However, does everything need to move to private cloud? Many companies with very large EDW initiatives may be reluctant to move to private clouds until they prove their scalability and lower TCO. The issue here is: should ETL, DBMS and BI platform all be on the same virtual servers? Should each have its own virtual server configuration? What is that configuration? Can I adjust it? And so on. I don’t think there will be a mad rush to put a 100TB DW on virtual servers.

I do like the fact that vendors like MicroStrategy have given this some serious consideration and have released a private cloud enterprise edition of MicroStrategy 9. MicroStrategy components are packaged as Virtual Appliances and tuned for expected load. These Virtual Appliances contain fully configured software components, and the number of running virtual appliances can be adjusted to accommodate specific performance goals. This is a damn sight better than just saying to a customer “it’s up to you, just deploy it and you figure out the virtual server configuration”. What MicroStrategy have done is to allow you to adjust the underlying assigned physical resources to satisfy performance demands, and they have made available administrative facilities to control the virtualized MicroStrategy environment.

It is early days in Cloud based BI. I recommend looking at your requirements and then matching the options available to your needs.

I would be interested to hear if any of you have experiences in this area. Do’s and Don’ts. What works, what doesn’t. Please share them by leaving a comment.


MDM and Cloud Computing

Monday, September 28th, 2009

Having read David Linthicum’s blog on MDM and Cloud computing about the impact on data of applications moving off premise, I have to say that I couldn’t agree more with him. What David is pointing out is that the fracturing of data caused by the adoption of cloud computing raises the importance of MDM in keeping disparate data synchronised.

This brings back memories of Business Process Outsourcing adoption several years back and what it did to companies that had no business process integration in place before they outsourced some process activities. In many cases the result of that strategy was that it fractured processes even more and sent some of the data outside the enterprise, making it more difficult to get at. As applications go off premise there is a real danger MDM could get out of reach. To get control over that data, MDM needs to start being implemented. Salesforce.com data is already coming inside the enterprise via ETL tools into DWs. Several ETL vendors support this. I just don’t think that there have been many bringing it back in to populate MDM. Siperian has some case studies of their MDM customers working with cloud applications, in particular Salesforce.com. What all this does say is that pursuing a cloud computing strategy on external cloud based virtualized servers without a data governance strategy could very well wreak havoc on any enterprise.

With virtualization being high on the agenda of many CIOs, I would suggest that they should also keep an eye on risk management and compliance, otherwise they could well make it harder to achieve trusted data. Without MDM, a cloud computing deployment strategy certainly puts an Enterprise Data Quality Firewall and data integration services high up the priority list!

BI as a Service (BIaaS) – Will Google Move In On This Opportunity?

Thursday, August 27th, 2009

Most of you by now have probably found it difficult to avoid the hype around Software as a Service (SaaS). For many of us today this is already a reality in our business. You only have to look at the huge uptake of Salesforce.com by small and medium-sized businesses (SMBs) to realize that there is certainly a place for this in many companies. With respect to the BI market, there is no doubt that there is also considerable growth in BI as a Service (BIaaS), and it would appear that many BI vendors are eagerly setting out their stall on the net to jump into this market of hosted BI services. Given that many BI products are already service enabled and also that many BI vendors have BI portal products, there is no doubt that they are technically ready. They are however missing one thing – data, your data. Either they point their tools at your databases and access them over the net or they will need a ready supply of data from any BIaaS subscriber. If you already use Salesforce.com, you can bet that all BI vendors entering this market will do so with an ETL adapter for Salesforce to get at your data on your behalf.

Of course Salesforce.com itself is no doubt keen on the BIaaS market and is already active in offering added value in terms of BI to existing clients.

Nevertheless, while simplicity, ‘point your browser and go’ and cool pre-built reports and graphs are the obvious attraction, there are implications when adopting BI as a Service in any business. The most important of these is that companies may need to supply their data to BI SaaS providers for upload to BIaaS sites so that ‘instant BI’ can be made available back to them via hosted web enabled BI tools and pre-built reports. There are also privacy regulations that have to be adhered to in this kind of situation, not least the UK Data Protection Act. Companies need assurances on data protection as well as data security and should consider the implications of giving BIaaS providers their precious operational data to be managed off-site. These are, after all, the crown jewels of any business, and there is no doubt that BIaaS providers would jump at the chance to know much more about their clients and would be sitting on a potential gold mine with all that data. Reliability of such a service is also paramount so that BI is available when you need it. Companies considering this option should also think about what happens if they need their data back in-house and how easy it is to get it back from a BI SaaS provider. It would be madness to subscribe to such a service and overlook this requirement.

I have wondered about the potential size of the BIaaS market, but it was not until I was looking at iGoogle a while back that I realised the real potential of BIaaS. Google have been steadily adding more and more services to their portfolio and are now making these services available over the internet for you to personalise with your own portal via iGoogle. All you have to do is click “Add Stuff” on iGoogle to see the huge number of instant services you can add to your portal. So what am I driving at here? My question is this. How long before Google enters the enterprise SaaS market with a vengeance? Both Salesforce itself and a BI vendor could easily be a target for this Internet search giant. If you could get at hosted enterprise services in a SaaS offering just by using iGoogle “Add Stuff” to add them to your portal, then how many SMBs would do it? That is a huge question. My guess however is that if it is as easy as Google are making it to add stuff on iGoogle today, then the uptake by SMBs could be enormous. All Google have to do is solve the data upload problem and deliver vertical data marts for the industry of your choosing and they would no doubt get the attention of SMBs. So for those watching the BI market for mergers and acquisitions, I would not exclude Google from the mix. We may well see a very big splash if Google decides to move on the BI market and set out its stall as a BIaaS provider to SMBs. An iGoogle for Business offering would certainly do it. Who knows – I’m certainly watching with interest.