<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Mike Ferguson&#039;s Blog</title>
	<atom:link href="http://intelligentbusiness.biz/wordpress/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://intelligentbusiness.biz/wordpress</link>
	<description>Latest opinions from one of Europe&#039;s foremost authorities on BI and Data Management</description>
	<lastBuildDate>Mon, 14 Nov 2011 21:26:56 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>Big Data Analytics &#8211; A Rapidly Emerging Market</title>
		<link>http://intelligentbusiness.biz/wordpress/?p=447</link>
		<comments>http://intelligentbusiness.biz/wordpress/?p=447#comments</comments>
		<pubDate>Mon, 14 Nov 2011 17:23:48 +0000</pubDate>
		<dc:creator>mikef</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Big Data]]></category>
		<category><![CDATA[Hadoop]]></category>
		<category><![CDATA[Map Reduce]]></category>

		<guid isPermaLink="false">http://intelligentbusiness.biz/wordpress/?p=447</guid>
		<description><![CDATA[Last week in London I spoke at the IRM Data Warehousing and Business Intelligence conference on a variety of topics. One of these was Big Data which I looked at in the context of analytical processing.  There is no question &#8230; <a href="http://intelligentbusiness.biz/wordpress/?p=447">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Last week in London I spoke at the IRM Data Warehousing and Business Intelligence conference on a variety of topics. One of these was Big Data which I looked at in the context of analytical processing.  There is no question the hype around this topic is reaching fever pitch so I thought I would try to put some order on it.</p>
<p>First, like many other authors in this space, I need to define Big Data in the context of analytical processing to make it clear what we are talking about.  Big Data is a marketing term and not the best of terms at that.  A new reader in this market may well assume that this is purely about data volumes. Actually this is about being able to solve business problems that we could not solve before.  Big data can, and more often than not does, include a variety of &#8216;weird&#8217; data types. In that sense big data can be structured or poly-structured (where poly in this context means many).  The former would include high volume transaction data such as call data records in telcos, retail transaction data and pharmaceutical drug test data.  Poly-structured data is more difficult to process and includes semi-structured data like XML and HTML and unstructured data like text, image, rich media etc. Graph data is also a candidate.</p>
<p>From my experience of working in this area to date, I would say that web data, social network data and sensor data are emerging as very popular types of data in big data analytical projects.  Web data includes web logs and e-commerce logs such as those generated by on-line gaming and on-line advertising.  Social network data would include Twitter data, blogs etc. These are examples of interaction data, which has grown significantly over recent years. Sensor data is machine generated data from &#8216;An Internet of Things&#8217;. In my opinion we have only seen the beginning of it, as much of it remains un-captured. RFIDs are probably the most written about of sensors. However these days we have sensors to measure temperature, light, movement, vibration, location, airflow, liquid flow, pressure and much more. There is no doubt that sensor data is on the increase and in my opinion it is something that will dwarf pretty well everything in terms of volume.  Telcos, utilities, manufacturing, insurance, airlines, oil and gas, pharmaceuticals, cities, logistics, facilities management and retail &#8230; they are all jumping on the opportunity to use sensor data to &#8216;switch on the lights&#8217; in parts of the business where they have had no visibility before.  Sensor data is massive but we don&#8217;t want it all &#8211; it is the variance we are interested in.  Many Big Data analytical applications are emerging, or will emerge, on the back of sensor data. These include analytical applications for use in:</p>
<ul>
<li>Supply chain optimisation</li>
<li>Energy optimisation via sustainability analytics</li>
<li>Asset management</li>
<li>Location based advertising</li>
<li>Grid health monitoring</li>
<li>Fraud</li>
<li>Smart metering</li>
<li>Traffic optimisation</li>
<li>Etc., etc.</li>
</ul>
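<p>To make the point about variance concrete, here is a minimal sketch of a filter that keeps a sensor reading only when it has moved by at least a given amount since the last reading that was kept. The function name, the threshold and the sample temperatures are all illustrative assumptions of mine, not taken from any particular product.</p>

```python
def significant_changes(readings, threshold):
    """Keep a reading only if it differs from the last kept one by at least threshold."""
    kept = []
    for value in readings:
        # Always keep the first reading; after that, keep only real movement.
        if not kept or abs(value - kept[-1]) >= threshold:
            kept.append(value)
    return kept


if __name__ == "__main__":
    # Hypothetical temperature readings from a single sensor.
    temps = [20.0, 20.1, 20.0, 23.5, 23.6, 19.0]
    print(significant_changes(temps, 1.0))  # [20.0, 23.5, 19.0]
```

<p>In practice this kind of filtering would more likely be done close to the sensor or in a stream processing layer, so that only the interesting changes are passed on for storage and analysis.</p>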
<p><span class="Apple-style-span" style="font-size: 16px; color: #444444; font-family: Georgia, 'Bitstream Charter', serif; line-height: 24px;">Text, as I already mentioned, is also a prime candidate for big data analytical processing. Sentiment analysis, case management and competitor analysis are just a few examples of popular types of analysis on textual data.  Data sources like Twitter are obvious candidates but tweet stream data suffers from data quality problems that still have to be handled even in a big data environment. How many times do you see spelling mistakes in tweets, for example?  </span></p>
<p><span class="Apple-style-span" style="font-size: 16px; color: #444444; font-family: Georgia, 'Bitstream Charter', serif; line-height: 24px;">There is a lot going on that is of interest to business in big data but while all of it offers potential return on investment, it is also increasing complexity. New types of data are being captured from internal and external data sources, there is an increasing requirement for faster data capture, more complex types of analysis are now in demand and new algorithms and tools are appearing to help us do this</span><span class="Apple-style-span" style="font-size: 16px; color: #444444; font-family: Georgia, 'Bitstream Charter', serif; line-height: 24px;">. </span></p>
<p><strong><span class="Apple-style-span" style="font-size: 16px; color: #444444; font-family: Georgia, 'Bitstream Charter', serif; line-height: 24px;">So why is analytics on big data so important &#8211; or is it?</span></strong></p>
<p><span class="Apple-style-span" style="font-size: 16px; color: #444444; font-family: Georgia, 'Bitstream Charter', serif; line-height: 24px;">There are several reasons why big data is attractive to business. Perhaps for the first time, entire data sets can now be analysed and not just subsets. This is now a feasible option whereas it was not before.  So it is making enterprises ask: can we go down a level of detail? Is it worth it? Well, to many it most certainly is. Even a 1% improvement brought about by analysing much more detailed data is significant for many large enterprises and well worth doing. Also, schema variant data can now be analysed for the first time, which could add a lot of valuable insight to that offered up by traditional BI systems.  Think of an insurance company for example. Any insurer whose business primarily comes from a broker network will receive much of its data in non-standard document format. Only a small percentage of that data finds its way into underwriting transaction processing systems while much of the valuable insight is left in the documents. Being able to analyse all of the data in these documents could offer up far more business value that could improve risk management and loss ratios. </span></p>
<p><span class="Apple-style-span" style="font-size: 16px; color: #444444; font-family: Georgia, 'Bitstream Charter', serif; line-height: 24px;">At the same time there are inhibitors to big data analysis.  These include finding skilled people and a real lack of understanding around when to use Hadoop versus when to use an Analytical RDBMS versus a NoSQL DBMS.   On the skills front there is no question that the developers involved in Big Data projects are absolutely NOT your traditional DW/BI developers. Big Data developers are primarily programmers &#8211; not a skill often seen in a BI team.  Java programmers are often seen at big data meet ups.  In addition, the analysis is primarily batch oriented with map / reduce programs being run and chained together using scripting languages like Pig Latin and JAQL (if you use the Hadoop stack that is).</span></p>
<p><strong>Challenges with Big Data</strong></p>
<p>There is no question that big data offers up challenges. These include challenges in the areas of</p>
<ul>
<li>Big data  capture</li>
<li>Big data transformation and integration</li>
<li>Big data storage &#8211; where do you put it and what are the options?</li>
<li>Loading big data</li>
<li>Analysing big data</li>
</ul>
<div>
<p>Over this and my next few blogs we will look at these challenges.  Looking at the first one on big data capture, the issues are latency and scalability.  Low latency calls for techniques such as change data capture and micro batches. However I think it is fair to say that if Hadoop is chosen as the analytical platform, it is not geared up for very low latency. Very low latency would lean towards stream processing as a big data technology which I will address in another blog.  Scaling data integration to handle Big Data can be tackled in a number of ways.  You can use DI software that implements ELT processing i.e. exploits the parallel processing power of an underlying MPP based analytical database. You can make use of data integration software that has been rewritten to exploit multi-core parallelism (e.g. <a href="http://www.pervasivedatarush.com/">Pervasive DataRush</a>). Alternatively you can use data integration accelerators like <a href="http://www.syncsort.com/ProductsServices/DMExpress/Overview.aspx">Syncsort DMExpress</a> or exploit Hadoop Map/Reduce from within data integration jobs e.g. <a href="http://www.pentaho.com/explore/pentaho-data-integration/?gclid=CJnChpbNtqwCFQRP4QodnCtcGg">Pentaho Data Integration</a>. Or you could use specialist data integration software like Scribe log aggregation software (originally written by Facebook). Vendors like <a href="http://www.informatica.com/products_services/hparser/Pages/index.aspx">Informatica have also announced a new HParser</a> to help with parsing data in a Hadoop environment.</p>
<p>With respect to storing data, there are a number of storage options for analysing Big Data. They range from:</p>
<ul>
<li>Classic RDBMS (e.g. IBM DB2, Oracle, MySQL, Microsoft SQL Server)</li>
<li>Analytical RDBMS (e.g. <a href="http://www.exasol.com/en/home.html">ExaSol</a>, <a href="http://www.vertica.com/">HP Vertica</a>, <a href="http://www.netezza.com/">IBM Netezza</a>, <a href="http://www.paraccel.com/">ParAccel</a>, <a href="http://www.oracle.com/us/products/database/exadata-database-machine/overview/index.html?origref=http://www.google.co.uk/url?sa=t&amp;rct=j&amp;q=oracle%20exadata&amp;source=web&amp;cd=1&amp;ved=0CFQQFjAA&amp;url=http%3A%2F%2Fwww.oracle.com%2Fus%2Fexadata%2Findex.html&amp;ei=YUrBTtrvBoyKhQf8x_2aBA&amp;usg=AFQjCNEU0irinR_Il-5Ehwy7x2lFDXz7yg">Oracle Exadata</a>, <a href="http://www.teradata.com">Teradata</a>)</li>
<li><a href="http://hadoop.apache.org/">Hadoop</a> solutions (e.g. HDFS, HBase and Hive)</li>
<li>Analytical RDBMS with Hadoop Map/Reduce integration (e.g. <a href="http://www.asterdata.com/">Teradata AsterData</a>, <a href="http://www.greenplum.com/">EMC GreenPlum HD</a>)</li>
<li>NoSQL DBMSs.</li>
</ul>
<div><span class="Apple-style-span" style="font-size: 16px; line-height: 24px;">Let&#8217;s dispel a myth right away. The idea that relational database technology cannot be used as a DBMS option for big data analytical processing is plain nonsense.  Any analyst opinion claiming that should be ignored.  Teradata, ExaSol, ParAccel, HP Vertica and IBM Netezza are all classic examples of analytical RDBMSs that can scale to handle big data applications, with some of these vendors having customers in the Petabyte club.  Improvements such as solid state disk, columnar data, in-database analytics and in-memory processing have all helped Analytical RDBMSs scale to greater heights. So an analytical RDBMS is an option for a big data analytical project, perhaps more so with structured data.  </span></div>
<div><span class="Apple-style-span" style="font-size: 16px; line-height: 24px;">Hadoop is an analytical big data storage option that has often been associated more with poly-structured data. Text is a common candidate.  NoSQL databases like <a href="http://neo4j.org/">Neo4J</a> or <a href="http://www.infinitegraph.com/">InfiniteGraph</a> graph databases are candidates particularly in the area of Social Network influencer analysis.   So it depends on what you are analysing.</span></div>
<p>Going back to Hadoop, the stack includes HDFS &#8211; a distributed file system that partitions large files across multiple machines for high-throughput access to application data.  It allows us to exploit thousands of servers for massively parallel processing, which can be rented on a public cloud if need be. To exploit the power of Hadoop, developers code programs using a programming framework known as Map/Reduce. These programs run in batch to perform analysis and exploit the power of thousands of servers in a shared nothing architecture. Execution is done in two stages: Map and Reduce. A large file is broken into manageable chunks and the Map stage processes those chunks in parallel, emitting intermediate key/value pairs. The Reduce stage then combines the intermediate values for each key to produce results. Because every job scans its input in batch, Hadoop Map/Reduce is NOT a good match where:</p>
<ul>
<li>Low latency is critical for accessing data</li>
<li>Only a small subset of the data within a large data set needs to be processed</li>
<li>Data must be processed immediately in real time</li>
</ul>
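<p>For readers who have not seen the Map/Reduce pattern before, the following is a minimal single-machine sketch of it in Python, using the usual word count example. It is not Hadoop code and does not use the Hadoop API; the function names are my own, and the grouping step in the middle stands in for the shuffle that the framework would perform between the two stages.</p>

```python
from collections import defaultdict


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group emitted values by key, as the framework would between the stages."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: collapse all the values for one key into a single result."""
    return key, sum(values)


def word_count(chunks):
    # On a cluster each chunk could be mapped on a different server;
    # this sketch simply loops over the chunks one after another.
    pairs = [pair for chunk in chunks for pair in map_phase(chunk)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data hadoop", "big data analytics"]))
    # {'big': 2, 'data': 2, 'hadoop': 1, 'analytics': 1}
```

<p>Even in this toy form you can see why the approach is batch oriented: every chunk is read and mapped before any result appears, which is why it suits whole data set analysis rather than low latency lookups.</p>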
<p>Also Hadoop is not normally an RDBMS competitor either. On the contrary it expands the opportunity to work with a broader range of content and so Big Data analytical processing conducted on Hadoop distributions is often upstream from traditional DW/BI systems. The insight derived from that processing then often finds its way into a DW/BI system.  There are a number of Hadoop distributions out there including <a href="http://www.cloudera.com/">Cloudera</a>, EMC GreenPlum HD (a resale of MapR), <a href="http://hortonworks.com/">Hortonworks</a>, <a href="http://www-01.ibm.com/software/data/infosphere/biginsights/">IBM InfoSphere BigInsights</a>, <a href="http://www.mapr.com/">MapR</a> and <a href="http://www.oracle.com/us/technologies/big-data/index.html?origref=http://www.google.co.uk/url?sa=t&amp;rct=j&amp;q=oracle%20big%20data%20appliance&amp;source=web&amp;cd=1&amp;ved=0CEgQFjAA&amp;url=http%3A%2F%2Fwww.oracle.com%2Fus%2Fbigdata%2Findex.html&amp;ei=N0zBTp6YJ4mw8gOj1aWcBA&amp;usg=AFQjCNFHTVksrhlRO4ctOKbqcAzZr99mFA">Oracle Big Data Appliance</a>.  Hadoop is still an immature space with vendors like <a href="http://www.zettaset.com/index.php/home">ZettaSet</a> bolstering the management of this kind of environment. To appeal to the SQL developer community Hive was created with a SQL-like query language. In addition Mahout supports a lot of analytics that can be used in Map/Reduce programs.  It is an exciting space but by no means a panacea.
Vendors such as IBM, Informatica, <a href="http://radoop.eu/">Radoop</a>, Pervasive (<a href="http://www.pervasivedatarush.com/Products/TurboRushforHive.aspx">TurboRush for Hive</a> and <a href="http://www.pervasivedatarush.com/Products/DataRushforHadoop/DataRushforHadoop.aspx">DataRush for Map/Reduce</a>), <a href="http://www.hadapt.com/">Hadapt</a>, Syncsort (<a href="http://www.syncsort.com/Solutions/HadoopAcceleration.aspx">DMExpress for Hadoop Acceleration</a>), Oracle, and many others are all trying to gain competitive advantage by adding value to Hadoop. Some enhancements appeal more to Map/Reduce developers (e.g. Teradata, IBM Netezza, HP Vertica connectors to Cloudera) and some to SQL developers (e.g. Teradata AsterData SQL Map/Reduce, Hive). One thing is sure &#8211; both need to be accommodated.</p>
<p>Next time around I&#8217;ll discuss analysing big data in more detail. Look out for that, and if you need help on a Big Data strategy feel free to <a href="mailto:info@intelligentbusiness.biz">contact me</a>.</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://intelligentbusiness.biz/wordpress/?feed=rss2&#038;p=447</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>The Two Sides of Collaborative BI</title>
		<link>http://intelligentbusiness.biz/wordpress/?p=440</link>
		<comments>http://intelligentbusiness.biz/wordpress/?p=440#comments</comments>
		<pubDate>Wed, 30 Mar 2011 08:07:34 +0000</pubDate>
		<dc:creator>mikef</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[BI]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Collaborative BI]]></category>
		<category><![CDATA[Lyzasoft]]></category>
		<category><![CDATA[Mike Ferguson]]></category>

		<guid isPermaLink="false">http://intelligentbusiness.biz/wordpress/?p=440</guid>
		<description><![CDATA[While there is a lot of hype around collaborative BI today, this concept is not new. First attempts at introducing collaborative functionality into BI environments happened as far back as eight years ago or more when vendors of Corporate Performance &#8230; <a href="http://intelligentbusiness.biz/wordpress/?p=440">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>While there is a lot of hype around collaborative BI today, this concept is not new. First attempts at introducing collaborative functionality into BI environments happened as far back as eight years ago or more when vendors of Corporate Performance Management (CPM) products in particular added collaborative functionality to their products to allow users to annotate scorecards and comment on performance measures.  In addition being able to email links to report also appeared. While a lot was marketed about these kinds of features, they only achieved limited success. A key reason for this in my opinion was because collaborative functionality was ‘baked into’ BI and CPM tools. In other words vendors brought collaboration to BI.  However the MySpace and Facebook generation taught us a different approach. What these collaborative and social networking environments showed was that it is much more natural to publish content to collaborative workspaces to elicit feedback and to share that content with others who are interested in it.</p>
<p>In the context of BI, this turned the first generation collaborative BI tools on their head and said that rather than take collaboration to BI, it is far more effective to take BI to a collaborative platform where the range of collaborative tools available offers a lot more power. <a href="http://www.lyzasoft.com/">Lyzasoft</a> was a pioneer of this new generation of modern social and collaborative BI technologies.  Also new releases of more widely adopted BI platform products are now being integrated with mainstream collaborative platforms such as Microsoft SharePoint and IBM Lotus Connections.  Even cloud based collaboration technologies from vendors like Google are getting in on the act.  Mobile BI technology is taking this further by allowing people to collaborate on BI from mobile devices.</p>
<p>However, I (and others) would argue that we are still seeing only one side of the coin here with respect to BI and collaboration. That side is the classic approach of formal integration of data from multiple sources into a data warehouse, the producing of intelligence and the publishing of BI artefacts (dashboards, reports, etc.) into social and collaborative environments where they can be shared with others, rated and collaborated upon for joint decision making. But what about innovation? What about when innovative business users want to experiment, get some data and ‘play’ with it in a sandbox environment to figure out what business insight might be useful or what new metrics would be useful to the business? Do we not need collaboration here also?  Another probing question is whether this innovation should be ‘upstream’ from a data warehouse. In other words let them play with the data until there is consensus as to what is useful and then feed this into a more classic approach of data integration, storage, analysis and sharing. I am comforted by the fact that it is not only me asking this question. Others like my good friend Barry Devlin are also talking about the use of collaboration and sharing of business insight produced in an innovative environment. I know Barry will be speaking about this <a href="http://www.technologytransfer.eu/event/1036/BI%3Csup%3E2%3C/sup%3E_%28Business_Integrated_Insight%29_from_Business_Intelligence_to_Enterprise_IT_Integration.html">here</a>. The point is that in my opinion (and it is only opinion, admittedly) there is a place for collaborative and social BI in an innovative sandbox environment where BI is not yet ‘hardened’.  We need this capability in many industries. I have come across it in both retail banking and in manufacturing for example. However, what must be controlled is the release of newly formed innovation into production. This is where governance comes in. 
Data governance would allow newly created metrics to be published in a business glossary to be used by multiple BI tools in a hardened production environment for example. Also at this point, new data sources may be declared to a more formal production DW/BI environment for data acquisition.  Therefore we have two sides to collaborative BI: the innovation cycle, which needs to share ‘experimental’ information and elicit feedback from others, and the more formal production BI/DW environment, where well polished business insight is shared across the enterprise for people to use and act on.  One feeds the other, typically because innovators also need to collaborate with IT to take the innovation and move it into the mainstream environment.</p>
<p>Let me know what you are doing with social and collaborative BI. I would be grateful for your comments.</p>
]]></content:encoded>
			<wfw:commentRss>http://intelligentbusiness.biz/wordpress/?feed=rss2&#038;p=440</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Teradata Strengthens Its Position in the BIG DATA Market</title>
		<link>http://intelligentbusiness.biz/wordpress/?p=394</link>
		<comments>http://intelligentbusiness.biz/wordpress/?p=394#comments</comments>
		<pubDate>Thu, 03 Mar 2011 15:03:02 +0000</pubDate>
		<dc:creator>mikef</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Analytics]]></category>
		<category><![CDATA[BI]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Cloudera]]></category>
		<category><![CDATA[Columnar]]></category>
		<category><![CDATA[DW Appliance]]></category>
		<category><![CDATA[Hadoop]]></category>

		<guid isPermaLink="false">http://intelligentbusiness.biz/wordpress/?p=394</guid>
		<description><![CDATA[Today, my former employer (many moons ago &#8211; I left 17 years ago!) Teradata announced it is to acquire Aster Data effectively bolstering its position in the BIG DATA marketplace (See the announcement here) . Aster Data has made its &#8230; <a href="http://intelligentbusiness.biz/wordpress/?p=394">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Today, my former employer (many moons ago &#8211; I left 17 years ago!) <a href="http://www.teradata.com">Teradata</a> announced it is to acquire <a href="http://www.asterdata.com">Aster Data</a>, effectively bolstering its position in the BIG DATA marketplace (see the announcement <a href="http://www.teradata.com/t/News-Releases/2011/Teradata-to-Acquire-Aster-Data/">here</a>). Aster Data has made its mark in the big data market with its well crafted integration of Hadoop Map/Reduce and the SQL query language, allowing SQL developers to execute massively parallel Map/Reduce analytical functions on the Aster Data platform and leverage the power of Hadoop.  Aster Data also has an IDE tool to make it easier for developers unfamiliar with Map/Reduce to generate Hadoop M/R applications (e.g. analytic functions) that can then be automatically deployed in an Aster Data nCluster database and invoked via SQL. Furthermore Aster Data nCluster also supports both row AND column based storage. Of course Teradata already has a relationship with Hadoop vendor <a href="http://www.cloudera.com">Cloudera</a> to serve up data from Teradata to Map/Reduce applications running on Cloudera&#8217;s CDH platform. It is also working on interfacing Teradata with Cloudera&#8217;s Sqoop (part of the Cloudera Enterprise offering) to move data into HDFS via the Teradata Hadoop Connector.</p>
<p>Adding Aster Data to the mix means that Teradata now can potentially integrate with Hadoop deployments in both directions rather than one-way as with the Cloudera partnership.  For example, organisations could access Hadoop (Cloudera&#8217;s CDH and other offerings) from analytical queries running SQL M/R on Aster Data nCluster or indeed, I would assume, in the future on the Teradata DBMS itself.</p>
<p>There is no question this is a good move for Teradata. It gives them columnar capability, and Aster Data also has a rich library of pre-built map/reduce analytic functions to speed up M/R development, which can be invoked from SQL M/R on nCluster.  I would have to assume that Teradata would also want to open the Aster Data IDE and the M/R functions up to Teradata developers to deploy these M/R functions inside of Teradata. That is a no brainer in my opinion. You would also have to say that this takes Teradata in-database analytics to a new level of depth, opening up the door for more sophisticated analytic applications. While the Teradata/SAS partnership is a successful one, adding Aster Data will potentially give Teradata much more power in the in-database analytics area. This is an area that really matters in big data environments. It will also give them more to compete with against IBM, whose acquisitions of Netezza (particularly with its TwinFin iClass appliance) and SPSS have given IBM much more competitive muscle recently, especially against Teradata and Oracle (Exadata).   Besides competing with IBM, Aster Data will also give Teradata much more to compete with against Oracle&#8217;s Exadata.  We will have to wait to see what HP does with Vertica.</p>
<p>In addition, with the tsunami of sensor data coming over the horizon, this acquisition will help Teradata move into the world of Sensor Data Analytics which, by the way, is a battle still to be fought (<a href="http://intelligentbusiness.biz/wordpress/?p=274">see my blog on this from last year</a>).  Aster Data will help Teradata in accommodating the onslaught of data being generated by organisations increasing the instrumentation of their business operations with sensor networks.  However in this space, adding a CEP vendor technology to the Teradata portfolio would be a good move as sensor data event correlations need to be acted upon BEFORE that event data is stored in a data warehouse.  CEP, Active DW and SQL/MR. Hmm&#8230; now that is a combination worth having. It will be interesting to see what is offered across the family of Teradata Appliances and if Teradata decides to roll out nCluster on any of them.  I would also think that Teradata will make sure they carefully protect the Aster Data customer base if they bring the DBMS technologies together gradually.</p>
<p>My only question is who will acquire Cloudera, which has partnerships with BI platform vendors and other appliance vendors. That acquisition would pull the rug from under a lot of players.</p>
]]></content:encoded>
			<wfw:commentRss>http://intelligentbusiness.biz/wordpress/?feed=rss2&#038;p=394</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>EDW on a Private Cloud? Steady as you go!</title>
		<link>http://intelligentbusiness.biz/wordpress/?p=364</link>
		<comments>http://intelligentbusiness.biz/wordpress/?p=364#comments</comments>
		<pubDate>Wed, 16 Feb 2011 16:29:40 +0000</pubDate>
		<dc:creator>mikef</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[BI]]></category>
		<category><![CDATA[Cloud]]></category>
		<category><![CDATA[EDW]]></category>
		<category><![CDATA[Virtualization]]></category>

		<guid isPermaLink="false">http://intelligentbusiness.biz/wordpress/?p=364</guid>
		<description><![CDATA[I read James Kobielus&#8217;s blog on Innovation Transforms Data Warehousing today talking about EDW in the Cloud as becoming mainstream by the middle of the decade. My take on this is that configuration management and workload management are critical to &#8230; <a href="http://intelligentbusiness.biz/wordpress/?p=364">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I read James Kobielus&#8217;s blog on <a href="http://www.informationweek.com/news/software/info_management/showArticle.jhtml?articleID=229218652&amp;cid=nl_IW_ent_soft_2011-02-16_html">Innovation Transforms Data Warehousing</a> today talking about EDW in the Cloud as becoming mainstream by the middle of the decade.</p>
<p>My take on this is that configuration management and workload management are critical to large scale EDWs making it to the Cloud. I am thinking more about PRIVATE cloud than public cloud at this point as there is already an abundance of DW/BI PaaS and SaaS offerings on public clouds outside the firewall. However on-premise <strong>private</strong> cloud deployment is still very young. There is no doubt that small data marts are already moving but large scale EDWs are not. Why not? I believe the reason is simple &#8211; no one has any experience of how to configure virtual resources to make a production EDW (data integration, DBMS and BI platform) maximise its use of the underlying hardware.</p>
<p><span id="more-364"></span>The point I would make here is more that I have not yet seen any detailed best practices from any source on porting bare metal EDWs to a virtualised private cloud in terms of the number of virtual servers needed, virtual memory needed, how to divide up virtual network channels or optimise virtual I/O.  In addition there is little experience of what software to install on what virtual servers e.g. whether data integration, DBMS and BI platform should all go on the same virtual servers or on different ones and in what combination. Also, what happens in ELT processing versus ETL processing? Instead what I am seeing is pre-configured private cloud DW/BI appliances emerging where the large vendors are coming to market with appliances with the complex configuration work already done i.e. the configuration is already optimised for an analytic workload to get maximum value out of the hardware.</p>
<p>If you look at guidance documentation for virtualising RDBMSs, it is clear that to date at least, intricate knowledge of the underlying hardware is needed in order to set this up correctly. However there is another point here which is that the advice is to create virtual storage (all physical disks available as a huge storage pool visible to potentially all VMs), switch on self-tuning memory management and let multiple VM instances (each with their own DBMS instance) access the virtual storage.  That is a virtualised shared disk architecture. Yet for years we have seen scalable bare metal EDWs built on shared nothing MPP architectures.  There is no doubt that you could set up virtual resources on a private cloud to function as a virtual MPP architecture but who are you going to get to do that? Data centre cloud developers who need to second guess an MPP DBMS? Or should we leave it to MPP based DBMS vendors who know how to make that work? What happens if a cloud administrator changes the configuration underneath an EDW? Does the DBA know? Will it impact the service? What experience has a cloud developer got here?  I would argue that it is very early days. Only Teradata has any long term expertise in virtual MPP architectures for EDW, with virtual resources such as Virtual AMPs having been around for well over a decade on their platform. Even then they ship their own hardware with it balanced out of the box and provide tooling to monitor every aspect of virtual resource in detail.  Also no one has issued any guidance on balancing the right amounts of virtual resources or the best virtual configurations needed to scale data integration or BI platforms etc.  So I agree that there is no way EDW on Private Cloud at least is a done deal. Certainly not this year.  We still have a lot to learn. As one CIO put it to me recently, &#8220;why would I port a 100TB platform on dedicated hardware to a virtual server environment?&#8221;. Indeed &#8230; why would he?  
The bottom line is that for now at least he should not. There is no hard and fast rule that says everything HAS to be virtualised.  I am not saying don&#8217;t do it. What I am saying however is that people will want proof before they start moving these kinds of systems off scalable bare metal hardware to go into virtual scalability.</p>
<p>We have also learned over the years that data integration often scales better using ELT rather than ETL. Pushdown optimisation is the secret here such that the data integration server leverages the underlying massively parallel DBMS to do the heavy lifting work. So, what happens if ELT pushdown optimisation is on an MPP architecture mapped on to a shared disk virtualised environment? Would that not impact performance? I would think so. Again&#8230;very little experience here compounded by the fact that cloud developers are not in the same teams as DBAs, data integration developers, BI developers&#8230;.  So we are entering an era where the ground is being changed under our feet by developers not necessarily skilled in DW/BI system deployments.  Is it any wonder that vendors shipping private cloud analytical appliances are getting attention?</p>
<p>With a data centre team configuring VMs and virtual resources and BI, data integration and DBA developers now NOT responsible for installation, I think we have a way to go before large enterprises will just &#8216;up sticks&#8217; and move every part of a BI/DW system to the cloud. &#8216;Steady as you go&#8217; is my advice.</p>
]]></content:encoded>
			<wfw:commentRss>http://intelligentbusiness.biz/wordpress/?feed=rss2&#038;p=364</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Pervasive Rush To Take On The Challenge of Scalable Data Integration</title>
		<link>http://intelligentbusiness.biz/wordpress/?p=355</link>
		<comments>http://intelligentbusiness.biz/wordpress/?p=355#comments</comments>
		<pubDate>Wed, 18 Aug 2010 10:15:12 +0000</pubDate>
		<dc:creator>mikef</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Data Integration]]></category>
		<category><![CDATA[data quality]]></category>
		<category><![CDATA[ELT]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[Pervasive]]></category>

		<guid isPermaLink="false">http://intelligentbusiness.biz/wordpress/?p=355</guid>
		<description><![CDATA[As a member of the Boulder BI Brain Trust (BBBT), I sat in on a session given by Pervasive Software Chief Technology Officer (CTO) and Executive Vice President Mike Hoskins last week.  The session started out covering Pervasive financial performance &#8230; <a href="http://intelligentbusiness.biz/wordpress/?p=355">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>As a member of the <a href="http://www.boulderbibraintrust.org/members.html">Boulder BI Brain Trust (BBBT)</a>, I sat in on a session given by <a href="http://www.pervasivesoftware.com/Pages/default.aspx">Pervasive Software</a> Chief Technology Officer (CTO) and Executive Vice President Mike Hoskins last week.  The session started out covering Pervasive financial performance of $47.2 million revenue (Fiscal 2010) with 38 consecutive quarters of profitability before getting into the technology itself. Headquartered in Austin, Pervasive offer their PSQL embedded database, a data and application exchange (Pervasive Business Xchange) as well as their Pervasive Data Integrator and Pervasive Data Quality products which can connect to a wide range of data sources using their Pervasive Universal Connect suite of connectors.  They also offer a number of data solutions.  Pervasive has had success in embedding its technology in ISV offerings and in SaaS solutions on the Cloud.  However, what caught my eye in what was a very good session was their new scalable data integration engine DataRush.</p>
<p><span id="more-355"></span>I have had concerns for some time about how data integration tools are going to step up to the challenge of big data.  We are already in the era where hundreds of Terabytes and even Petabytes are a reality in data warehouses.  Also the volume of data needed for web analytics is massive let alone the <a href="http://intelligentbusiness.biz/wordpress/?p=274">tsunami of data being emitted by sensors that is coming over the horizon</a> (if that data ever makes it into a data warehouse we really are going to re-define large). There is no doubt that the future is constantly on the up in terms of volumes of data and the number of data sources that businesses need to integrate data from.  We all talk about how important data warehouse appliances, columnar compression and scalable MPP databases are to handle large data volumes. But what about data integration? There is not much point having the ability to manage big data in databases if we can&#8217;t get the data in there in the first place.  So data integration vendors have to step up to this challenge. Of course several already have.  Many products on the market have offered pipeline parallelism during ETL processing for many years.  We have also seen many vendors switching to an ELT model for better performance so that they can exploit parallel SQL in the target DBMS engine to deal with the problem. That of course makes the data integration engine dependent on the parallel DBMS. This is fine as long as the data integration workload can be fenced off and managed separately from query processing workloads by a DBMS workload manager.  But is that it? What about the fact that these days modern hardware consists of multi-processor multi-core systems? Can a data integration engine itself not exploit this without relying on a DBMS?  Is there any way to get more bang for your buck on this kind of hardware?</p>
<p>Step in Pervasive DataRush.  This new engine from Pervasive is designed from the ground up as an MPP data integration engine that can exploit every core on a multi-core, multi-processor server.  This means that you might potentially avoid the need to go to clustered hardware because you can scale up before you need to scale out. The DataRush architecture is shown below.</p>
<p><a href="http://intelligentbusiness.biz/wordpress/wp-content/uploads/2010/08/DataRush.png"></a><a href="http://intelligentbusiness.biz/wordpress/wp-content/uploads/2010/08/DataRush.png"><img class="alignnone size-medium wp-image-361" title="Pervasive DataRush" src="http://intelligentbusiness.biz/wordpress/wp-content/uploads/2010/08/DataRush-300x202.png" alt="" width="300" height="202" /></a><br />
<em>Source: Pervasive Software</em></p>
<p>What is interesting about this is not just that it exploits multiple cores in the DataRush engine but that it also has an analytics library and other plug-in modules. The one that caught my eye was the DataRush Recommender module.  It strikes me that this engine (which can also be extended to support user defined libraries) not only has the capability to integrate data in parallel but it can also analyse that data using analytical models (data mining models) at the same time.  Couple that with the DataRush Recommender module and we are bordering on complex event processing (CEP). It seems we just need a rules engine in there and all of a sudden we are into massively parallel CEP.  Given that data integration is already rules driven it certainly looks to me that this product could potentially go well beyond just doing integration in parallel, as important as that need is.  Pervasive has also made their products available in a PaaS offering on the Amazon EC2 Cloud as well as offering them on-premise which means that they can integrate and clean data from inside and outside the enterprise.  Given that you can also embed the technology it is certainly worth a look.  I plan to cover it in more detail in my upcoming <a href="http://www.intelligentbusiness.biz/EnterpriseDataGovernance&amp;MDM.htm">Enterprise Data Governance and Master Data Management class running in London on September 22-24</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://intelligentbusiness.biz/wordpress/?feed=rss2&#038;p=355</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>MicroStrategy Takes BI Mobile – What are The Implications of Mobile BI for BI Platforms?</title>
		<link>http://intelligentbusiness.biz/wordpress/?p=342</link>
		<comments>http://intelligentbusiness.biz/wordpress/?p=342#comments</comments>
		<pubDate>Thu, 08 Jul 2010 14:17:22 +0000</pubDate>
		<dc:creator>mikef</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[BI]]></category>
		<category><![CDATA[MicroStrategy]]></category>
		<category><![CDATA[Mobile BI]]></category>

		<guid isPermaLink="false">http://intelligentbusiness.biz/wordpress/?p=342</guid>
		<description><![CDATA[Having just got back from the MicroStrategy World Conference in beautiful Cannes, I thought I would cover what was announced this week at the event.  CEO Michael Saylor launched MicroStrategy Mobile for iPhone, iPad and Blackberry describing it as “the &#8230; <a href="http://intelligentbusiness.biz/wordpress/?p=342">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Having just got back from the MicroStrategy World Conference in beautiful Cannes, I thought I would cover what was announced this week at the event.  CEO Michael Saylor launched <a href="http://www.microstrategy.com/mobile">MicroStrategy Mobile</a> for iPhone, iPad and Blackberry describing it as “the most significant launch in MicroStrategy history”.  In his opening keynote he talked about mobile as “the 5th major wave of computing” starting with mainframes, then mini-computers, then personal computers, desktop internet and now mobile internet.  Their vision here is a good one – BI all the time, everywhere and for everyone. Mobile device access to BI has been around for a while in some offerings but I was impressed with the work MicroStrategy have put into the mobile user interface on touch sensitive ‘gesture’ devices like Apple iPhones and iPads.   They have taken advantage of the full set of Apple gestures and also added BI specific gestures including Drill down and Page By.  They have also released an Objective-C software development kit (SDK) for MicroStrategy Mobile.  This allows developers to build custom widgets and embed them in the MicroStrategy Mobile application or to embed MicroStrategy Mobile in their own application.</p>
<p><span id="more-342"></span>Also it is possible to re-brand and re-skin MicroStrategy Mobile to fit your own corporate branding look and feel standards.  Overall, the user interface appears very interactive and very natural.  To support alerting, MicroStrategy Mobile is also integrated with the Apple push notification service so that personalised alerts are possible.  Also data can be cached so that when alerted users touch “view data”, the data is already there.</p>
<p>Architecturally speaking, MicroStrategy Mobile sits on top of the MicroStrategy Intelligence Server to allow BI to be made available to mobile employees, customers, partners and suppliers. Therefore all the capabilities of the MicroStrategy platform are still there.  I particularly liked the fact that the same design tool can be used for both web browser and mobile device access. Pre-built templates are also available to get you started quickly.  Getting started seems therefore fairly straightforward.  You can download the MicroStrategy Mobile App from Apple and point it at your own server.</p>
<p>One thing that was not covered enough in my opinion was the Portable Documents capability.  This allows you to build up a library of dashboards and re-use them across projects.  The reason this is important is consistency.  We all want trusted and consistent BI. In addition re-use is better than re-invention every time and so you can gain both from a productivity perspective as well as from a consistency perspective.</p>
<p><strong>Implications of Mobile BI on Traditional BI Environments</strong><br />
Looking at this announcement there are several implications for any BI environment when implementing Mobile BI.  The first is that the number of concurrent users is likely to increase significantly.  If you intend to make mobile BI available to external users like customers then the concurrent usage could skyrocket. So your BI platform and the underlying DBMS need to be able to scale to handle concurrent usage.  64-bit support in a BI platform will help a lot as more data can be accessed in-memory. This together with caching will make a significant contribution to performance when handling more users. If your customers are worldwide and you make mobile BI available to them then it is very likely that you will also push your BI system into a 7&#215;24 hour operating environment. High availability therefore starts to become critical.   I also believe that Mobile BI will become a ‘workhorse’ in business operations with many employees making use of it.  Therefore the number of on-demand BI requests from mobile devices is set to increase. The upshot is that workload management is going to be needed both in the underlying DBMS and in the BI platform to handle concurrent user requests from mobile devices as well as the traditional reporting and analysis taking place on a BI system.    Also if your mobile dashboards on an iPad are accessing data via data federation, there are implications here too.  There are likely to be many more federated queries and so caching matters here also.</p>
<p><strong>Conclusion</strong><br />
Overall then, MicroStrategy has entered the mobile market with a robust implementation that exploits native interfaces to mobile devices.  With other BI vendors already in the mobile BI market the only question is how much of a differential will it have over its competition.  I like the offering. It also covers Blackberry which is more than Oracle BI 11g (announced yesterday) does.  MicroStrategy customers will no doubt exploit it.</p>
]]></content:encoded>
			<wfw:commentRss>http://intelligentbusiness.biz/wordpress/?feed=rss2&#038;p=342</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>BITunes on the Cloud? &#8211; The Emergence Of Subscription Based On-Demand BI</title>
		<link>http://intelligentbusiness.biz/wordpress/?p=288</link>
		<comments>http://intelligentbusiness.biz/wordpress/?p=288#comments</comments>
		<pubDate>Tue, 29 Jun 2010 14:08:56 +0000</pubDate>
		<dc:creator>mikef</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[BI]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Cloud BI]]></category>

		<guid isPermaLink="false">http://intelligentbusiness.biz/wordpress/?p=288</guid>
		<description><![CDATA[As I research more and more into the world of Cloud-based BI, it is becoming pretty evident where we are headed. In my opinion we are moving down the road to an iTunes model for BI.   Yesterday I spent some &#8230; <a href="http://intelligentbusiness.biz/wordpress/?p=288">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>As I research more and more into the world of Cloud-based BI, it is becoming pretty evident where we are headed. In my opinion we are moving down the road to an iTunes model for BI.   Yesterday I spent some time with Actuate in London looking at their <a href="http://www.birt-exchange.com/be/marketplace/paas/">BIRT On-Demand </a>platform as a service (PaaS) solution (which is very easy to use). It was only a matter of minutes before I was up and running with a Mashboard.  A few weeks back in New Orleans I used <a href="http://www.dundas.com/Dashboard/index.aspx?Campaign=DashboardMicroSite">Dundas Dashboard</a> to quickly build a dashboard from pre-built components. Similarly Microsoft SQL Server 2008 R2 has the ability in Report Builder 3.0 to quickly build up a library of components that can be dragged and dropped into a report. The more I use these products to understand their capabilities the more I see a similarity to what is happening in the information management world.  Looking at cloud-based data integration solutions like Boomi, Informatica and SnapLogic for example, you can see that what these vendors are trying to do is to create a development platform for <strong>Information as a Service.</strong> In other words you build data integration jobs and then make the results available on subscription such that companies can subscribe to information which is supplied to them by cloud based data integration workflows running on the net.   So now apply this idea to the BI produced on cloud-based PaaS solutions.  Once your reports and dashboards are built then the next thing people are going to want to do is to publish these artifacts as on-demand BI services assuming the intelligence is of business value to others.</p>
<p><span id="more-288"></span>If you combine On-demand BI running on top of on-demand information you can quickly see where we are going.  In my opinion it is only a matter of time before we see lots of intermediate companies (maybe even PaaS BI and SaaS BI vendors) making BI available on-demand in something similar to an iTunes store.  The point here is that the level of abstraction rises again such that it is the BI that is of business value while the PaaS BI or SaaS BI solution is almost forgotten about.  Think of it like this. Imagine selling intelligence on-demand for the World Cup. At $5 a subscription the fact that this is built on a specific PaaS BI or SaaS BI solution is almost irrelevant. The consumer doesn&#8217;t care, they just want the intelligence.  It is the insight that is of value. The trick is to make it really easy to share intelligence of value in the form of reports and dashboards or dashboard components once they are built so that others can consume them quickly and easily on a subscription basis.</p>
<p>So bring on the &#8220;BITunes&#8221; store where insight is available on-demand on a subscription basis.  You can apply this to BI services built on external data available on the public cloud as well as to BI services available inside the enterprise on a private cloud. Users can simply subscribe to intelligence available on-demand.  This is an iTunes model and what a model it is. The size of this market could be very very significant. I don&#8217;t think we are far off.</p>
]]></content:encoded>
			<wfw:commentRss>http://intelligentbusiness.biz/wordpress/?feed=rss2&#038;p=288</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Cloud Based BI &#8211; Understanding The Options Is the Biggest Barrier</title>
		<link>http://intelligentbusiness.biz/wordpress/?p=281</link>
		<comments>http://intelligentbusiness.biz/wordpress/?p=281#comments</comments>
		<pubDate>Tue, 22 Jun 2010 12:16:06 +0000</pubDate>
		<dc:creator>amandad</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[BI]]></category>
		<category><![CDATA[BI PaaS]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Cloud]]></category>
		<category><![CDATA[Cloud BI]]></category>
		<category><![CDATA[Mike Ferguson]]></category>
		<category><![CDATA[SaaS BI]]></category>

		<guid isPermaLink="false">http://intelligentbusiness.biz/wordpress/?p=281</guid>
		<description><![CDATA[Last week I was in Munich to present at the annual TDWI (The Data Warehouse Institute) conference on &#8220;Business Intelligence and Data Management in a Cloud Computing Environment&#8221;.  It was a very well attended conference with some great speakers and &#8230; <a href="http://intelligentbusiness.biz/wordpress/?p=281">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Last week I was in Munich to present at the annual TDWI (The Data Warehouse Institute) conference on &#8220;Business Intelligence and Data Management in a Cloud Computing Environment&#8221;.  It was a very well attended conference with some great speakers and sessions.  My session focused on the following:</p>
<ul class="unIndentedList">
<li> What is Cloud Computing and why use it as a deployment option?</li>
<li> Why Cloud BI? &#8211; What are the requirements for a public cloud or externally hosted BI system?</li>
<li>Understanding what is on offer &#8211; The Cloud BI Marketplace</li>
<li> Getting data into a cloud based BI system</li>
<li> Managing access to cloud based BI systems and analytic applications</li>
<li> Integrating cloud based BI systems with on-premise systems</li>
<li> Pros and cons of deploying on the cloud?</li>
<li> Getting started with Cloud based BI</li>
</ul>
<p><span id="more-281"></span>Bear in mind that both public cloud and private cloud based BI were under discussion even though the hype seems all around public cloud or externally hosted BI systems.   Looking at these points it is the third bullet down that for me is the clear inhibitor to cloud based BI adoption. In other words, the lack of understanding as to what exactly is on offer.  And there is a lot on offer. On the public cloud we have everything from plain Infrastructure as a Service (IaaS) all the way through to Software as a Service (SaaS) based packaged analytic applications. On the private cloud several BI platforms are already running on virtualisation software such as VMware and/or Microsoft Hyper-V.  However there seems very little in the way of best practice advice on do&#8217;s and don&#8217;ts when it comes to deploying BI systems on a private cloud based virtualised environment.</p>
<p>In total I came up with 6 options, the last of which is simply where many of us are today i.e. BI systems not deployed on a cloud (whether it be public or private).  The options are as follows:</p>
<p style="padding-left: 30px;">1.     Public cloud based IaaS for a BI system</p>
<p style="padding-left: 30px;">2.     Public cloud or externally hosted BI/DW PaaS for building your own cloud-based BI system</p>
<blockquote>
<ul class="unIndentedList" style="padding-left: 30px;">
<li> Multi-vendor or single-vendor BI PaaS offerings</li>
</ul>
</blockquote>
<p style="padding-left: 30px;">3.     Public cloud or externally hosted SaaS BI packaged analytical applications</p>
<p style="padding-left: 30px;">4.     Public cloud or externally hosted SaaS BI for operational reporting on cloud based operational data</p>
<p style="padding-left: 30px;">5.     Private cloud based BI system running internally</p>
<p style="padding-left: 30px;">6.     Dedicated hardware based BI system (this is what most companies have today)</p>
<p>Option 1 is simply subscribing to an IaaS vendor like Savvis, Amazon, Rackspace or GoGrid  where you pay as you use on hardware and systems software and then buying and deploying your own ETL, DBMS and BI software (assuming they have no restrictions on what they will support).  I am not sure that this is attractive enough on its own without a BI/DW Platform as a Service (PaaS) as well.</p>
<p>Option 2 is the BI/DW Platform as a Service (PaaS) option on public cloud or even externally hosted.  Here you find another choice however. Should you choose a multi-vendor DW/BI PaaS or a single-vendor offering?  An example of a multi-vendor option is the <a href="http://www.rightscale.com/lp/bi-stack.php">RightScale/Talend/Vertica/Jaspersoft</a> PaaS offering on Amazon EC2.  A single vendor PaaS offering (of which there are several on offer) would be <a href="http://www.gooddata.com">GoodData</a>, or <a href="http://www.ondemand.com/businessintelligence/">SAP BusinessObjects On-Demand</a>. Others include <a href="http://www.birst.com">Birst</a>, <a href="http://www.indicee.com">Indicee</a> and <a href="http://www.pivotlink.com">PivotLink</a>.  A key question here is going to be &#8220;Is Data Integration included?&#8221;  Clearly the multi-vendor offering mentioned above includes an ETL solution (Talend in that example).   Data integration is very much file based with BI/DW PaaS vendors i.e. you upload files of data and then there is some processing of that data to load it into the PaaS DW/BI database.  Several single-vendor PaaS offerings give you only fairly lightweight data integration once data is uploaded.  Certainly not the full blown ETL with built-in data quality that you might be used to in a data centre. In fact if you are looking for full blown DQ you are going to be disappointed in most cases.  The &#8216;get out&#8217; clause is that you can add your own script but what about metadata lineage and auditability once the script writer has left for a better job?  A vendor like SAP (mentioned earlier) does have ETL (SAP BusinessObjects Data Integrator) available but only if you subscribe to their Advanced Edition of SAP BusinessObjects On-Demand (there are 3 editions on offer).  I was even more surprised to see that SAP&#8217;s BI/DW PaaS offering uses Microsoft SQL Server as the database and not BW.  I would expect that to change to Sybase IQ fairly soon. 
GoodData on the other hand have refreshingly recognised that you may want to go beyond the data integration you get out-of-the-box on subscription and have gone the extra mile to provide pre-built integration with cloud based data integration tools such as <a href="http://www.informaticacloud.com/">Informatica Cloud</a>, <a href="http://www.snaplogic.com">SnapLogic</a> and <a href="http://www.boomi.com/">Boomi</a>. Therefore you can use these tools to integrate your data before passing the data sets to them. The alternative to all of this is to do the lion&#8217;s share of the data integration in-house before uploading data files.</p>
<p>Option 3 is a fast growing market with many relatively new vendors (e.g. <a href="http://www.cloud9analytics.com/">Cloud9 Analytics</a>, <a href="http://www.rosslynanalytics.com/">Rosslyn Analytics</a>, <a href="http://www.lixto.com/?page_id=13">Lixto</a>) as well as traditional mainstream vendors e.g. <a href="http://www.sas.com/solutions/ondemand/index.html">SAS</a>, IBM Cognos.  The attraction here is a pre-built solution ready to go. These will clearly appeal to small and medium size businesses (SMBs) and even lines of business in some large organisations.  While we see horizontal applications looking at Salesforce.com data, spend analysis and pricing (to name a few), I am predicting that vertical analytic apps on the cloud will appear.</p>
<p>Option 4 is simply using a cloud based reporting system on operational data typically from a cloud based transaction processing system such as Salesforce.com.  In fact it would seem that Salesforce.com is dominating this space. An example here is SAP BusinessObjects CrystalReports.com for Salesforce.</p>
<p>Option 5 is private cloud based BI systems. The largest private cloud based BI system I know of is IBM&#8217;s internal Blue Insight which is based on IBM System Z and IBM Cognos 8 BI.  An estimated 200,000 IBMers are using this.  IBM have since launched the <a href="http://www-03.ibm.com/systems/z/solutions/cloud/smart.html">Smart Analytics Cloud</a>, a private cloud offering for large enterprises based on the same technologies.  However it is still early days for BI deployments on internal private clouds. There appears to be more support coming from developer forums than vendors at present.  From what I can see, companies are taking a &#8216;toe in the water&#8217; approach to deploying on virtualized environments. No doubt, confidence will grow over time.  However does everything need to move to private cloud? Many companies with very large EDW initiatives may be reluctant to move to private clouds until they prove their scalability and lower TCO.   The issue here is whether ETL, DBMS and BI platform should all be on the same virtual servers. Should each have their own virtual server configuration? What is that configuration? Can I adjust it? etc. etc. I don&#8217;t think there will be a mad rush to put a 100TB DW on virtual servers.  I do like the fact that vendors like <a href="http://www.microstrategy.com">MicroStrategy</a> have given this some serious consideration and have released a private cloud enterprise edition of MicroStrategy 9.  MicroStrategy components are packaged as Virtual Appliances and tuned for expected load. These Virtual Appliances contain fully configured software components and the number of running virtual appliances can be adjusted to accommodate specific performance goals. 
This is a damn sight better than just saying to a customer &#8220;it&#8217;s up to you, just deploy it and you figure out the virtual server configuration&#8221;.  What MicroStrategy have done is to allow you to adjust the underlying assigned physical resources to satisfy performance demands and have made available administrative facilities to control the virtualized MicroStrategy environment.</p>
<p>It is early days in Cloud based BI. I recommend looking at your requirements and then matching the options available to your needs.</p>
<p>I would be interested if any of you have experiences in this area. Do&#8217;s and Don&#8217;ts. What works, what doesn&#8217;t.  Please share them by placing your comments.</p>
<p><a href="http://www.twitter.com/mikeferguson1">Follow me on twitter </a></p>
]]></content:encoded>
			<wfw:commentRss>http://intelligentbusiness.biz/wordpress/?feed=rss2&#038;p=281</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Chasm Not Crossed as A Sensor Data Tsunami Comes Over The Horizon</title>
		<link>http://intelligentbusiness.biz/wordpress/?p=274</link>
		<comments>http://intelligentbusiness.biz/wordpress/?p=274#comments</comments>
		<pubDate>Mon, 21 Jun 2010 14:26:38 +0000</pubDate>
		<dc:creator>amandad</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[CEP]]></category>
		<category><![CDATA[Complex Event Processing]]></category>
		<category><![CDATA[Mike Ferguson]]></category>
		<category><![CDATA[Operational BI]]></category>

		<guid isPermaLink="false">http://intelligentbusiness.biz/wordpress/?p=274</guid>
		<description><![CDATA[Just over a week ago I spent a day at SensorExpo in Chicago to present on Complex Event Processing (CEP) discussing how CEP engines, Predictive Analytics, business rules can be used to analyse sensor emitted event data in-motion to facilitate &#8230; <a href="http://intelligentbusiness.biz/wordpress/?p=274">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Just over a week ago I spent a day at <a href="http://www.sensorsexpo.com/">SensorExpo</a> in Chicago to present on Complex Event Processing (CEP) discussing how CEP engines, Predictive Analytics and business rules can be used to analyse sensor emitted event data in-motion to facilitate business optimisation.  This was a very busy conference.  I estimated at least 2000-3000 people on the exhibition floor with maybe 400 at the conference.  I found around 100 vendors with all kinds of sensor devices on show exhibiting their products and services.  To my surprise however I had only heard of 2 of the vendors: IBM and Texas Instruments.  The floor was heaving with people looking to instrument their business operations to measure everything from movement, temperature, energy consumption, stress, heat, fluid volumes and pipeline flows to RFIDs.  There were analog devices and digital devices.  When talking to the vendors the big common denominator was that they are all trying to collect the data from sensor networks and RFIDs to analyse it.  Yet other than IBM there was not a single BI vendor in sight. Not even a single complex event processing (CEP) vendor in sight.   I was shocked because this market is clearly booming.   What was even more surprising was that I could not find an IT professional anywhere. 99.9% of all delegates and speakers were engineers.</p>
<p><span id="more-274"></span>Attending some of the case studies I found some fantastic applications of sensor networks and RFIDs. One was in healthcare, with sensors all over hospitals and with equipment and patients all tagged with RFIDs. The return on investment in this case was fraud prevention on equipment (theft mainly) and process improvement for patients. Another session I attended was on monitoring stress in all the bridges in the US &#8211; over 700,000 of them. Some of the stats being quoted by the speakers were staggering. &#8220;Well, we are emitting 3 events per minute from every sensor on a 7&#215;24 hour basis. After 6 months operating like this we have over 20 PETABYTES of data&#8221;. You read it right: 20 PETABYTES. A lot of the technical focus at the conference was on energy harvesting to prolong sensor battery life, but the business message was clear as a bell. Process optimisation, preventative maintenance and cost reduction come from instrumenting business operations: manufacturing production lines, supply chains, product distribution, asset management. You name it, they&#8217;re measuring it.</p>
<p>So I have to ask, where are all the BI vendors? Where are all the analytical DBMS vendors? Where are the CEP products, the real-time dashboards and predictive analytical models for automated analysis? This is an operational BI gold mine. Yet there are no mainstream vendors in sight bar IBM (at least someone there is switched on to what is happening). The volume of data coming over the horizon from the adoption of sensor networks and RFIDs is nothing short of massive. What is also clear is that this is already going on in enterprises and IT are blissfully unaware of it in the main. Clearly IT BI professionals have got to get in touch with their Engineering colleagues, and engineers have got to be made aware of mainstream data integration, analytical database and BI platform technologies as well as CEP software of course. In my 29 years in the industry, I don&#8217;t think I have ever seen a chasm between IT and business that has not even been explored, never mind crossed. Yet the value of CEP and mainstream DW/BI to this market is nothing short of enormous. It is symptomatic of a young market heaving with engineers that has yet to be tied into mainstream IT to exploit far more robust software than is being used on this data at present. What an opportunity. What a huge opportunity. It most certainly is going to re-define large databases when we have to set them up for analysis of historical event data emitted by these devices. CEP <span style="text-decoration: underline;">has to</span> go there. CEP vendors have to get out of just being in the financial markets and wake up to a ton of data in motion being emitted by the growing number of devices. An article I read recently said that <a href="http://www.edn.com/article/509123-Sensors_empower_the_Internet_of_Things_.php">Sensors empower an Internet of Things</a>. Well, those things are coming over the horizon emitting a tsunami of data.
It is time CEP and DW/BI vendors woke up and smelt the coffee and became aware of this rapidly growing market. CIOs had better take heed too because they are going to have to integrate it into mainstream IT.</p>
]]></content:encoded>
			<wfw:commentRss>http://intelligentbusiness.biz/wordpress/?feed=rss2&#038;p=274</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Microsoft Opens Up Collaborative and Self-Service BI</title>
		<link>http://intelligentbusiness.biz/wordpress/?p=225</link>
		<comments>http://intelligentbusiness.biz/wordpress/?p=225#comments</comments>
		<pubDate>Mon, 21 Jun 2010 11:38:36 +0000</pubDate>
		<dc:creator>amandad</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[BI]]></category>
		<category><![CDATA[Collaborative BI]]></category>
		<category><![CDATA[CPM]]></category>
		<category><![CDATA[Data Governance]]></category>
		<category><![CDATA[Microsoft]]></category>
		<category><![CDATA[Mike Ferguson]]></category>
		<category><![CDATA[SharePoint]]></category>

		<guid isPermaLink="false">http://intelligentbusiness.biz/wordpress/?p=225</guid>
		<description><![CDATA[Just over a week ago I was invited to attend an analyst briefing at the Microsoft BI conference in New Orleans that was running alongside the Microsoft TechEd conference.  The conference itself was very well attended with several thousand delegates.  &#8230; <a href="http://intelligentbusiness.biz/wordpress/?p=225">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>Just over a week ago I was invited to attend an analyst briefing at the Microsoft BI conference in New Orleans that was running alongside the Microsoft TechEd conference. The conference itself was very well attended with several thousand delegates. Several things were on show at this event including SharePoint 2010, SQL Server 2008 R2, Office 2010, PowerPivot and PerformancePoint Services 2010. Also on show was SQL Server Data Warehousing Edition (also known as the Madison project) &#8211; the massively parallel edition of SQL Server that will be shipped later this year.</p>
<p><span id="more-225"></span>The one thing that stood out for me was the seismic shift towards collaborative BI. As my friend Colin White so aptly put it in the analyst briefing, &#8220;Microsoft have brought BI to collaboration rather than collaboration to BI&#8221;. This is an important point because what it says is that there is little point adding collaborative features to a BI platform if these are not the services associated with a mainstream collaborative platform. There is far more value in integrating a BI platform with the company collaboration software to tap into things like collaborative workspaces, presence awareness, unified communication, shared calendars etc. In Microsoft&#8217;s case this is of course the SharePoint product, which has become viral in most organisations.</p>
<p>It is no surprise therefore that Microsoft&#8217;s BI initiative is built around 3 main components and not just SQL Server.  These are:</p>
<ul>
<li>Office</li>
<li>SharePoint</li>
<li>Microsoft SQL Server 2008 R2</li>
</ul>
<p>Note that SQL Server 2008 R2 includes StreamInsight, Microsoft&#8217;s complex event processing (CEP) engine, and Microsoft Master Data Services.</p>
<p>While there we were taken through an excellent demo to show the power of collaboration and what it can do when integrated with BI. It even included the Microsoft Round Table device; although it has been available for some four years, this was the first time I have actually encountered one.</p>
<p>What the demo showed me was the speed with which BI and BI &#8216;components&#8217; can be spread among a community of users. My conclusion is that integration of SQL Server 2008 R2 with SharePoint 2010 takes this to another level, in that the rate at which business intelligence can be shared is almost &#8216;twitter speed&#8217;. For those of you using twitter, you will know that as soon as something of interest breaks, re-tweets can spread it across masses of people in a matter of minutes. This is the feeling I got during the demo. It fuels mass sharing, mass reuse and mass development of BI applications and artifacts, in particular reports and dashboards. It certainly fits with Microsoft&#8217;s vision of BI for everyone.</p>
<p>Several new features open up the flood gates for collaborative BI to share intelligence with others without the need for IT. For example,</p>
<p>BI reports can be managed by SharePoint in document libraries. You can also preview reports before opening them up.</p>
<p>Also Microsoft is fueling development by business users on the back of what power users have done, thereby bypassing IT. This is because there is now a capability whereby Microsoft ReportBuilder 3.0 can access PowerPivot workbooks uploaded to SharePoint sites. You can also export to Excel from PowerPivot. Power users using PowerPivot (originally referred to as Gemini) can take data from different data sources (including newly supported Atom feeds), then merge and join that data. Relationships between tables can be managed inside of PowerPivot. PowerPivot power users can then create workbooks that process this data and can upload these to SharePoint sites. ReportBuilder 3.0 (or any BI client) can then treat the PowerPivot workbook as a data source. Not only that, but ReportBuilder can create report parts which are sharable in a report part gallery so that other users can reuse them by simply dragging and dropping the report parts onto a new report for rapid development without having to know the detail underneath.</p>
<p>Hopefully by now you have got the picture &#8211; power users building their own workbooks in PowerPivot, publishing them to SharePoint, other users using them as data sources in reports, report parts created, and a gallery of parts to be shared across a community of users. Powerful stuff, and we are not done yet.</p>
<p>In SharePoint 2010 there is a new site template called Business Intelligence Center. What you can now do is create a new site in SharePoint using the Business Intelligence Center template. This template includes chart web parts and Excel Services workbook access. It also includes a PerformancePoint library so that you can start building your dashboard very rapidly, including access to reports and report parts. With this mechanism, Microsoft is opening up dashboard development to the masses and also allowing &#8216;social&#8217; performance management whereby dashboards and/or dashboard components can be rated. All this integrated with SharePoint and Office is, in my opinion, going to take self-service BI development to another level, so much so that it could easily have a &#8216;popcorn effect&#8217; with masses of BI being produced rapidly and IT nowhere in sight. There is no doubt that it opens up the flood gates for business innovation and sharing, with personalised dashboard development using PerformancePoint Services 2010 integrated with SharePoint 2010.</p>
<p><strong>A Question of Governance?</strong></p>
<p>My only concern with this is the issue of governance. What Microsoft have done is to put mass development in the hands of the business. If you think you have seen anything on self-service BI, just wait until SharePoint 2010, Office 2010 and SQL Server 2008 R2 move into production in your shop. You ain&#8217;t seen nothing yet.</p>
<p>However, I see very little with respect to data governance. What about business glossaries? What about metadata lineage? In a world of increasing regulation and legislation to prevent corporate catastrophes, can anything be audited? Can it be tracked back to where the data came from? How has the data been transformed by the power users? What does the data mean? I have as yet seen little from Microsoft in the form of metadata management and data governance despite the fact that Master Data Services is also delivered as part of this SQL Server release. While there is no doubt that this is coming (confirmed by the Microsoft guys I spoke with at the exhibition floor booth), my only fear is that it will be too late. Will the horses have already bolted, with self-service BI unstoppable and off down a track without lineage to help users know that the data is trusted?</p>
<p>Equally, scorecard and dashboard development is bottom-up. Everyone (with authority) can create their own scorecards and dashboards rapidly, but there appears to be no framework whereby these can be slotted into multi-level strategy management, unlike, say, SAP with SAP Strategy Management. So what is the answer? Is it all bets are off and just let the business figure out the best way to manage on the back of socially rated scorecards and dashboards? What happened to business strategy? Many companies set a strategy at executive level and want enterprise-wide business strategy execution. This latter approach is top-down. What Microsoft is fueling is bottom-up. My opinion is we need both and not one or the other.</p>
<p><strong>Freedom Versus Governance &#8211; A Delicate Balancing Act</strong></p>
<p>It is pretty clear then that, setting aside the new SQL Server Data Warehousing Edition, this is very much a Collaborative BI release by Microsoft. It is a major leap forward in what business users can do for themselves. We have two forces at work here: freedom versus governance. We have to get the balance right. Too much freedom and we could have chaos with no ability to audit what has been done or whether the BI is trusted. Too much governance and we put innovation in a straitjacket or kill it altogether. All I would say is that IT had better get a data governance program underway soon to control data all the way out to data marts and cubes. If that is done then there is no doubt that the business can be empowered to innovate, which is what should happen. Without a data governance program, however, I think it is really going to be hard to get alignment with what the business is doing given the sheer speed of development that is now possible with this release. Let&#8217;s hope governance, innovation and collaboration are a winning combination.</p>
]]></content:encoded>
			<wfw:commentRss>http://intelligentbusiness.biz/wordpress/?feed=rss2&#038;p=225</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

