
“Magnetic, Agile, Deep” (MAD) approach to data




MAD Skills: New Analysis Practices for Big Data (link updated to VLDB version).  The paper does a few controversial things (if you’re the kind of person who finds data management a source of controversy):

  • It takes on “data warehousing” and “business intelligence” as outmoded, low-tech approaches to getting value out of Big Data. Instead, it advocates a “Magnetic, Agile, Deep” (MAD) approach to data that shifts the locus of power from what Brian Dolan calls the “DBA priesthood” to the statisticians and analysts who actually like to crunch the numbers.  This is a good thing, on many fronts.
  • It describes a state-of-the-art parallel data warehouse that sits on 800TB of disk, using 40 dual-processor dual-core Sun Thumper boxes.
  • It presents a set of general-purpose, hardcore, massively parallel statistical methods for big data.  They’re expressed in SQL (OMG!) but could be easily translated to MapReduce if that’s your bag.
  • It argues for a catholic (small-c) approach to programming Big Data, including SQL & MapReduce, Java & R, Python & Perl, etc.  If you already have a parallel database, it just shouldn’t be that hard to support all those things in a single engine.
  • It advocates a similarly catholic approach to storage.  Use your parallel filesystem, or your traditional database tables, or your compressed columnstore formats, or what have you.  These should not be standalone “technologies”, they are great features that should — no, will — get added to existing parallel data systems.  (C’mon, you know it’s true… )
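As a toy illustration of the idea (my own sketch, not code from the paper), summary statistics like mean and variance can be expressed as ordinary SQL aggregates, the kind of single-pass computation a parallel engine can run independently on every segment and then combine:

```python
# Illustrative sketch: statistical aggregates as plain SQL,
# run here against an in-memory SQLite database.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (x REAL)")
conn.executemany("INSERT INTO measurements VALUES (?)",
                 [(v,) for v in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]])

# Mean and (population) variance in a single SQL pass -- the kind of
# aggregate a parallel database can compute per-segment and combine.
mean, var = conn.execute(
    "SELECT AVG(x), AVG(x * x) - AVG(x) * AVG(x) FROM measurements"
).fetchone()
print(mean, var)  # 5.0 4.0
```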

I started to write the paper because it was just too cool what Brian Dolan was doing with Greenplum at Fox Interactive Media (parent company of MySpace.com) — e.g., writing Support Vector Machines in SQL and running them over dozens of TB of data.  Brian was a great sport about taking his real-world experience and good ideas and putting them down on paper for others to read.  Along the way I learned a lot about the data architecture ideas he’s been cooking up with Mark Dunlap, which are a real thumb in the eye of the warehouse orthodoxy, and make eminently good sense in today’s world.  Finally, it was nice to get to write about the good things that Jeff Cohen and Caleb Welton have been doing at Greenplum to cut through the hype and shrink the distance between SQL and MapReduce.  I’m hoping those guys will have time to sit down one of these days and patiently write up how they’ve done it … it’s really very elegant.

And it still warms my heart that it’s Postgres code underneath all that.  Time to resurrect the xfunc code!



Groundbreaking new technology puts business users in touch with big data, eliminates the data warehouse, and makes Hadoop ready to disrupt the $35B business analytics market

San Mateo, CA, October 23, 2012 – Platfora revealed the first scale-out in-memory business intelligence platform for Hadoop today at the Strata + Hadoop World Conference in New York. For the first time, users can change their business intelligence questions without involving teams of IT staff to re-organize a data warehouse. Platfora will forever change the way businesses analyze data by allowing them to determine the questions after data is collected, essentially eliminating the need for a data warehouse or ETL (extract, transform, and load) software.


Today’s Big Data Technology: Hadoop

Today, businesses perform a complex and rigid set of steps between the customer interactions that generate data and analyzing that data with Business Intelligence (BI) software. Businesses have been rapidly turning to Apache Hadoop(TM), a relatively new, open source technology, to solve the big data storage and processing problem. A recent forecast from IDC shows that revenues for the worldwide Hadoop-MapReduce ecosystem software market are expected to grow at a compound annual growth rate (CAGR) of 60.2% over the 2011-2015 forecast period.[1]

“Hadoop is changing the economics of big data for businesses. Companies now want to store everything and be ready for unanticipated questions,” said Merv Adrian, Research VP, Gartner. “Hadoop does not replace technologies like business intelligence software or the data warehouse today; it creates a foundation on which applications can deliver the full potential of new and underexploited information assets.”

A New Breed of Product

Platfora is a fundamentally different kind of BI platform. It transforms raw data in Hadoop into interactive, in-memory BI without the need for a traditional data warehouse. End users are presented with a beautiful web-based analysis interface using HTML5 technology — the first of its kind.

Platfora works by pulling data out of Hadoop into a scale-out in-memory data processing engine, making access to the data extremely fast. Platfora leverages the power of Hadoop to perform the heavy lifting, processing massive data sets into highly efficient, in-memory data stores, which can seamlessly span dozens or hundreds of servers to utilize their collective memory and processing.

Platfora’s biggest innovation lies in how the layers work together seamlessly. It automatically refines the in-memory data based on the questions being asked by end users.
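The refinement idea can be sketched in a few lines (a hypothetical illustration, not Platfora’s actual implementation): an aggregate is materialized from raw records the first time a question is asked, and repeat questions are answered from memory.

```python
# Hypothetical sketch of question-driven refinement; names, data, and
# structure are invented for illustration, not Platfora's implementation.
RAW_EVENTS = [
    {"country": "US", "clicks": 3},
    {"country": "US", "clicks": 1},
    {"country": "DE", "clicks": 2},
]

_cache = {}  # question -> materialized answer

def clicks_by(dimension):
    key = ("clicks_by", dimension)
    if key not in _cache:                 # the "heavy lifting" happens once
        agg = {}
        for event in RAW_EVENTS:
            agg[event[dimension]] = agg.get(event[dimension], 0) + event["clicks"]
        _cache[key] = agg
    return _cache[key]                    # repeat questions served from memory

print(clicks_by("country"))  # {'US': 4, 'DE': 2}
```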

“Everyone talks about the promise of big data, but the real innovation lies in the data agility and exploration made possible by Hadoop,” said Ben Werther, founder and CEO, Platfora. “Businesses want immediate answers – they cannot wait days or months for insight into their own data. Unless businesses can explore and interact with this data, the promise of Hadoop is lost.”

“As new software models continue to change the data center and drive better business outcomes, Platfora’s approach removes the requirement for a data warehouse and the accompanying limitations,” said Scott Weiss, partner, Andreessen Horowitz and Platfora board member. “Platfora up-ends traditional business intelligence, unlocks the potential of Hadoop and puts the power of data in the hands of the end user.”

“Platfora helps us gain insights into our business and make decisions more quickly,” said Ray Duong, founder and CTO at AdMobius, the world’s first mobile audience management platform for publishers and advertisers. “We have been using Hadoop to store tens of billions of advertising events combined with data from hundreds of millions of users. Without Platfora it could take us weeks of time to answer simple questions. Platfora makes it possible for us to work with the data in ways that were just not possible before.”

Product Overview 

Platfora is built from the ground up to present raw Hadoop data in entirely new ways. Key components of Platfora include:

● Unbounded, In-Memory BI: Platfora Vizboards(TM) allow business users to interactively build stunning visualizations. Vizboards are web-based, using HTML5 canvas technology, and feature a built-in layer for sharing and collaboration.

● Scale-Out In-Memory Data Processing Engine: Using Fractal Cache(TM) technology, Platfora’s in-memory data processing engine is natively connected with Hadoop. As the powerful core of Platfora’s platform, Fractal Cache automatically refines itself, without the intervention of IT.

● Hadoop Data Refinery: Adaptive Job Synthesis(TM) automatically builds MapReduce jobs and pushes them down to Hadoop to produce aggregated, in-memory representations of the data.

● Platfora works with the following Hadoop platforms: Cloudera, Amazon Web Services, MapR and Hortonworks.
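The job-synthesis pattern described above can be sketched in miniature (purely illustrative; the names and data are invented): a map phase emits key/value pairs from raw records, a shuffle groups them by key, and a reduce phase produces the aggregate that would be loaded into memory.

```python
# Minimal map/reduce sketch of building an aggregate from raw records.
# Purely illustrative; not Platfora's Adaptive Job Synthesis code.
from itertools import groupby
from operator import itemgetter

raw_records = [
    ("2012-10-23", "pageview"), ("2012-10-23", "click"),
    ("2012-10-24", "pageview"), ("2012-10-24", "pageview"),
]

def map_phase(record):
    date, _event = record
    return (date, 1)              # emit (key, value)

pairs = sorted(map(map_phase, raw_records))        # map + shuffle (sort by key)
aggregate = {k: sum(v for _, v in g)               # reduce per key
             for k, g in groupby(pairs, key=itemgetter(0))}
print(aggregate)  # {'2012-10-23': 2, '2012-10-24': 2}
```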

Platfora will be licensed on a per-server, not per-user, basis and can be deployed either on premises in an enterprise data center or in the cloud.

About Platfora, Inc.

Platfora changes the way businesses use data. Headquartered in San Mateo, CA, we put the power of big data in the hands of people who need it most, the end users. Platfora makes Hadoop amazing. Platfora is funded by Andreessen Horowitz, Sutter Hill Ventures and In-Q-Tel. For more details, visit www.platfora.com and follow us @platfora.


Jessica Waight
Nectar PR for Platfora

The End of the Data Warehouse

October 23, 2012

Today is a major milestone on the Platfora journey. But it is more than that. Today we reach out beyond our early beta customers and share what we know is possible.

We’ve been living in the dark ages of data management. We’ve been conditioned to believe that it is right and proper to spend a year or more architecting and implementing a data warehouse and business intelligence solution. That you need teams of consultants and IT people to make sense of data. We are living in the status quo of practices developed 30 years ago — practices that are the lifeblood of companies like Oracle, IBM and Teradata.

When I ran product at Greenplum, we understood this reality. Working with brilliant folks like Joe Hellerstein (UC Berkeley) and Brian Dolan (then at Fox Interactive), the team developed practices to navigate around the outmoded approaches of the past. Joe coined the name ‘MAD Skills’ (Magnetic, Agile and Deep).

But we could only distort reality so far. At the end of the day it was still a big relational database. When the rubber met the road, DBAs were doing what they always do — designing data models, building ETL jobs, and tuning indexes and aggregates.

The insight for Platfora came a number of months after I left Greenplum (post EMC acquisition). I’d been spending a lot of time thinking about Hadoop and why it was gaining so much momentum. Clearly it was cost-effective and scalable, and was intimately linked in people’s minds to companies like Google, Yahoo and Facebook. But there was more to it. Everywhere I looked, companies were generating more and more data — interactions, logs, views, purchases, clicks, etc. These were being linked with increasing numbers of new and interesting datasets — location data, purchased user demographics, Twitter sentiment, etc. The questions that these swirling datasets would one day support couldn’t be known yet. And yet to build a data warehouse I’d be expected to perfectly predict what data would be important and how I’d want to question it, years in advance, or spend months rearchitecting every time I was wrong. This is actually considered ‘best practice’.

The brilliance of what Hadoop does differently is that it doesn’t ask for any of these decisions up front. You can land raw data, in any format and at any size, in Hadoop with virtually no friction. You don’t have to think twice about how you are going to use the data when you write it. No more throwing away data because of cost, friction or politics.

Which brings us to the insight.

In the view of the status-quo players, Hadoop is just another data source. It is a dumping ground, and from there you can pull chunks into their carefully architected data warehouses – their ‘system of record’. They’ll even provide you a ‘connector’ to make the medicine go down sweet. Sure, you are back in the land of consultants and 12-18 month IT projects, but you can rest easy because you know the ‘important’ data is safely being pumped into your multi-million dollar database box. Just don’t change your mind about what data or questions are important.

But let’s go through the looking glass. The database isn’t the ‘system of record’ — it is just a shadow of the data in Hadoop. In fact there is nothing more authentic than all of that raw data sitting in Hadoop. With just a bit of metadata to describe the data, it’d be possible to materialize any ‘data warehouse’ from that data in a completely automated way. These ephemeral ‘data warehouses’ could be built, maintained, and disposed of with a click of a button.
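That idea can be sketched concretely (a hypothetical illustration, not Platfora code): given raw records and a small piece of metadata naming the dimensions and measures, an aggregated ‘warehouse’ table can be materialized automatically, and just as easily thrown away and rebuilt differently.

```python
# Hypothetical sketch: a metadata description drives automatic
# materialization of an aggregated table from raw records.
metadata = {"dimensions": ["region"], "measures": ["revenue"]}

raw = [
    {"region": "west", "product": "a", "revenue": 10.0},
    {"region": "west", "product": "b", "revenue": 5.0},
    {"region": "east", "product": "a", "revenue": 7.0},
]

def materialize(records, meta):
    """Build an ephemeral 'warehouse' table: one row per dimension key."""
    table = {}
    for r in records:
        key = tuple(r[d] for d in meta["dimensions"])
        row = table.setdefault(key, {m: 0.0 for m in meta["measures"]})
        for m in meta["measures"]:
            row[m] += r[m]
    return table

print(materialize(raw, metadata))
# {('west',): {'revenue': 15.0}, ('east',): {'revenue': 7.0}}
```

Change the metadata (say, group by product instead of region) and a different ‘warehouse’ materializes from the same raw data, with no rearchitecting.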

Imagine what is possible. Raw data of any kind or type lands in Hadoop with no friction. Everyday business users can interactively explore, visualize and analyze any of that data immediately, with no waiting for an IT project. One question can lead to the next and take them anywhere through the data. And the connective tissue that makes this possible — bridging between lumbering batch-processing Hadoop and this interactive experience — are ‘software defined’ scale-out in-memory data marts that automatically evolve with users’ questions and interests. Enter… Platfora.

Through the looking glass, there is no need for a traditional data warehouse. It is an inflexible, expensive relic of a bygone age. It is time to leave the dark ages.

PLATFORA: Makes Hadoop amazing


Platfora is the world’s first in-memory business intelligence platform for Hadoop. Our software works with Hadoop and transforms it from infrastructure into a visually stunning and remarkably powerful business analytics platform.

This innovative technology provides complete agility with raw data, essentially eliminating the need for a data warehouse or ETL software.


Platfora transforms Apache Hadoop™ from batch engine into a subsecond-interactive, exploratory business intelligence and analytics platform designed for business analysts. This has never been done before, and getting there involves far more than putting a pretty user interface on top of Hadoop.


Platfora is a fully featured, enterprise-class business intelligence application. Built on HTML5, it is accessible anywhere and there are no per-user licensing limits. Collaboration and sharing are built in. With Platfora, analysts can drill into detail or dimensions only available in raw data without the IT friction and complexity of traditional solutions.



Powered by Fractal Cache™ technology, Platfora automatically transforms massive datasets into highly responsive lenses. The high-performance query engine makes access to big data fast and scales out to terabytes of in-memory data. Lenses are refined and updated based on the needs of the end user.



Platfora transforms Apache Hadoop into a work engine. Adaptive Job Synthesis™ synthesizes specialized MapReduce jobs that automatically extract detail, perform calculations and downstream analysis, building lenses efficiently and on demand. All datasets in Hadoop are cataloged and can be securely browsed by any Platfora user.




How Platfora Compares

● Versus direct Hadoop and Hive: Direct Hadoop requires complex MapReduce programs or Hive queries, and each query executes an inefficient, slow batch job from scratch. Platfora does not require an expert to return straightforward results: its visual interface is designed for business users and IT data administrators, and its Fractal Cache technology makes data interactive and fast.

● Versus the data warehouse: Unlike a data warehouse, Platfora’s lenses are not limited by size or a fixed schema. Business users can make schema changes and integrate new data sets, all without IT intervention.

● Versus desktop BI connectors: Connectors fail with large data volumes because they pull chunks of data to the desktop; in practice, users still need to build a data warehouse. Platfora is 100% HTML5 and natively designed for Hadoop and big data.

● Versus batch Hadoop job builders: These tools have no interactive engine. Platfora seamlessly connects business users to data through visual interactions, unlike a spreadsheet.

Top 5 Technology Trends in Financial Services – October 2012

October 30th, 2012


Trying to stay ahead of the curve when it comes to IT issues is a challenging task. Emerging technology forces in the financial services industry are already impacting business. The convergence of these forces does present challenges; however, it also provides a window of opportunity for financial institutions to elevate business performance and gain a competitive advantage. Perficient provides a monthly perspective on some of the most talked about IT issues and emerging trends to help industry professionals identify and rationalize their IT investments.

Mobile Banking

Financial services organizations are mapping their customer-centric initiatives to mobile banking solutions in order to capture market share and gain a competitive advantage. Today’s consumer expects technology at their fingertips whenever they want it and however they need it.

Customer Analytics

In Financial Services, business intelligence can give organizations the ability to use their data to improve customer retention, increase efficiency and operational effectiveness, explore new services, and support enterprise risk management strategies. Analytics can provide the critical insights needed to meet the organization’s goals and help it gain a competitive advantage.


Big Data

Big Data is a current hot topic in IT departments all across the world, but beyond the hype, what does it mean for the Financial Services industry? It means the ability to use tools to understand and interpret Voice of Customer (VOC) and risk and fraud patterns, and, in concert with mobile banking apps, to offer real-time segmentation that builds customer loyalty and retention and delivers the highest return in terms of value.


To remain “top of mind” and “top of wallet” in consumers’ perception, banks must define their tactical and strategic plans in concrete terms and decide which mobile capabilities best represent customer needs. Banks must rationalize mobile payments as a key customer channel to be competitive against emerging capabilities of non-bank payment providers.

2013 Trends and Spend

Over the next 12 months we will see the growth and maturation of several technologies, including mobility, business intelligence, and big data, enabling multi-channel capabilities such as loyalty and rewards, real-time offers, mobile payments and expanded mobile banking services. Data shows that financial institutions are still creating or refining segmentation strategies to leverage technologies that support their business. In doing so, the industry will also see increased investment in updating and replacing core systems to support these emerging technology forces. It is important that CIOs focus on and help lead their organizations in these areas to be successful in the digital age of banking.

Gartner Identifies the Top 10 Strategic Technology Trends for 2013


Analysts Examine Top Industry Trends at Gartner Symposium/ITxpo, October 21-25 in Orlando

ORLANDO, Fla., October 23, 2012—  


Gartner, Inc. today highlighted the top 10 technologies and trends that will be strategic for most organizations in 2013. Analysts presented their findings during Gartner Symposium/ITxpo, being held here through October 25.

 Gartner defines a strategic technology as one with the potential for significant impact on the enterprise in the next three years. Factors that denote significant impact include a high potential for disruption to IT or the business, the need for a major dollar investment, or the risk of being late to adopt.

 A strategic technology may be an existing technology that has matured and/or become suitable for a wider range of uses. It may also be an emerging technology that offers an opportunity for strategic business advantage for early adopters or with potential for significant market disruption in the next five years. These technologies impact the organization’s long-term plans, programs and initiatives.

“We have identified the top 10 technologies that will be strategic for most organizations, and that IT leaders should factor into their strategic planning processes over the next two years,” said David Cearley, vice president and Gartner fellow. “This does not necessarily mean enterprises should adopt and invest in all of the listed technologies; however, companies need to be making deliberate decisions about how they fit with their expected needs in the near future.”

Mr. Cearley said that these technologies are emerging amidst a nexus of converging forces – social, mobile, cloud and information. Although these forces are innovative and disruptive on their own, together they are revolutionizing business and society, disrupting old business models and creating new leaders. As such, the Nexus of Forces is the basis of the technology platform of the future.

 The top 10 strategic technology trends for 2013 include:

 Mobile Device Battles
Gartner predicts that by 2013 mobile phones will overtake PCs as the most common Web access device worldwide and that by 2015 over 80 percent of the handsets sold in mature markets will be smartphones. However, only 20 percent of those handsets are likely to be Windows phones. By 2015 media tablet shipments will reach around 50 percent of laptop shipments and Windows 8 will likely be in third place behind Google’s Android and Apple iOS operating systems. Windows 8 is Microsoft’s big bet and Windows 8 platform styles should be evaluated to get a better idea of how they might perform in real-world environments as well as how users will respond. Consumerization will mean enterprises won’t be able to force users to give up their iPads or prevent the use of Windows 8 to the extent consumers adopt consumer-targeted Windows 8 devices. Enterprises will need to support a greater variety of form factors, reducing the ability to standardize PC and tablet hardware. The implication for IT is that the era of PC dominance with Windows as the single platform will be replaced with a post-PC era where Windows is just one of a variety of environments IT will need to support.

Mobile Applications and HTML5
The market for tools to create consumer and enterprise facing apps is complex, with well over 100 potential tools vendors. Currently, Gartner separates mobile development tools into several categories. For the next few years, no single tool will be optimal for all types of mobile application, so expect to employ several. Six mobile architectures – native, special, hybrid, HTML5, Message and No Client – will remain popular. However, there will be a long-term shift away from native apps to Web apps as HTML5 becomes more capable. Nevertheless, native apps won’t disappear, and will always offer the best user experiences and most sophisticated features. Developers will also need to develop new design skills to deliver touch-optimized mobile applications that operate across a range of devices in a coordinated fashion.

Personal Cloud
The personal cloud will gradually replace the PC as the location where individuals keep their personal content, access their services and personal preferences and center their digital lives. It will be the glue that connects the web of devices they choose to use during different aspects of their daily lives. The personal cloud will entail the unique collection of services, Web destinations and connectivity that will become the home of their computing and communication activities. Users will see it as a portable, always-available place where they go for all their digital needs. In this world no one platform, form factor, technology or vendor will dominate and managed diversity and mobile device management will be an imperative. The personal cloud shifts the focus from the client device to cloud-based services delivered across devices.

Enterprise App Stores
Enterprises face a complex app store future as some vendors will limit their stores to specific devices and types of apps forcing the enterprise to deal with multiple stores, multiple payment processes and multiple sets of licensing terms. By 2014, Gartner believes that many organizations will deliver mobile applications to workers through private application stores. With enterprise app stores the role of IT shifts from that of a centralized planner to a market manager providing governance and brokerage services to users and potentially an ecosystem to support apptrepreneurs.

The Internet of Things
The Internet of Things (IoT) is a concept that describes how the Internet will expand as physical items such as consumer devices and physical assets are connected to the Internet. Key elements of the IoT which are being embedded in a variety of mobile devices include embedded sensors, image recognition technologies and NFC payment. As a result, mobile no longer refers only to use of cellular handsets or tablets. Cellular technology is being embedded in many new types of devices including pharmaceutical containers and automobiles. Smartphones and other intelligent devices don’t just use the cellular network, they communicate via NFC, Bluetooth, LE and Wi-Fi to a wide range of devices and peripherals, such as wristwatch displays, healthcare sensors, smart posters, and home entertainment systems. The IoT will enable a wide range of new applications and services while raising many new challenges.

Hybrid IT and Cloud Computing
As staffs have been asked to do more with less, IT departments must play multiple roles in coordinating IT-related activities, and cloud computing is now pushing that change to another level. A recently conducted Gartner IT services survey revealed that the internal cloud services brokerage (CSB) role is emerging as IT organizations realize that they have a responsibility to help improve the provisioning and consumption of inherently distributed, heterogeneous and often complex cloud services for their internal users and external business partners. The internal CSB role represents a means for the IT organization to retain and build influence inside its organization and to become a value center in the face of challenging new requirements relative to increasing adoption of cloud as an approach to IT consumption.

 Strategic Big Data
Big Data is moving from a focus on individual projects to an influence on enterprises’ strategic information architecture. Dealing with data volume, variety, velocity and complexity is forcing changes to many traditional approaches. This realization is leading organizations to abandon the concept of a single enterprise data warehouse containing all information needed for decisions. Instead they are moving towards multiple systems, including content management, data warehouses, data marts and specialized file systems tied together with data services and metadata, which will become the “logical” enterprise data warehouse.

Actionable Analytics
Analytics is increasingly delivered to users at the point of action and in context. With the improvement of performance and costs, IT leaders can afford to perform analytics and simulation for every action taken in the business. The mobile client linked to cloud-based analytic engines and big data repositories potentially enables use of optimization and simulation everywhere and every time. This new step provides simulation, prediction, optimization and other analytics, to empower even more decision flexibility at the time and place of every business process action. 

In Memory Computing
In memory computing (IMC) can also provide transformational opportunities. The execution of certain types of hours-long batch processes can be squeezed into minutes or even seconds, allowing these processes to be provided in the form of real-time or near-real-time services that can be delivered to internal or external users in the form of cloud services. Millions of events can be scanned in a matter of a few tens of milliseconds to detect correlations and patterns pointing at emerging opportunities and threats “as things happen.” The possibility of concurrently running transactional and analytical applications against the same dataset opens unexplored possibilities for business innovation. Numerous vendors will deliver in-memory-based solutions over the next two years, driving this approach into mainstream use.
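The kind of in-memory event scan described here can be illustrated with a trivial sketch (invented data and pattern): once events are resident in memory, detecting a simple correlation such as repeated failures from one source is a single pass over a list.

```python
# Illustrative sketch of an in-memory event scan: a single pass over
# resident data flags a simple pattern (repeated failures from one source).
# Data and threshold are invented for illustration.
events = [("login_fail", "10.0.0.1"), ("login_ok", "10.0.0.2"),
          ("login_fail", "10.0.0.1"), ("login_fail", "10.0.0.1")]

counts = {}
for kind, source in events:
    if kind == "login_fail":
        counts[source] = counts.get(source, 0) + 1

# Sources crossing the threshold are flagged "as things happen".
suspicious = [s for s, n in counts.items() if n >= 3]
print(suspicious)  # ['10.0.0.1']
```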

 Integrated Ecosystems
The market is undergoing a shift to more integrated systems and ecosystems and away from loosely coupled heterogeneous approaches. Driving this trend is the user desire for lower cost, simplicity, and more assured security. Driving the trend for vendors is the ability to have more control of the solution stack and obtain greater margin on the sale, as well as to offer a complete solution stack in a controlled environment, without the need to provide any actual hardware. The trend is manifested in three levels. Appliances combine hardware and software, and software and services are packaged to address an infrastructure or application workload. Cloud-based marketplaces and brokerages facilitate the purchase, consumption and/or use of capabilities from multiple vendors and may provide a foundation for ISV development and application runtime. In the mobile world, vendors including Apple, Google and Microsoft drive varying degrees of control across an end-to-end ecosystem extending from the client through the apps.

About Gartner Symposium/ITxpo
Gartner Symposium/ITxpo is the world’s most important gathering of CIOs and senior IT executives. This event delivers independent and objective content with the authority and weight of the world’s leading IT research and advisory organization, and provides access to the latest solutions from key technology providers. Gartner’s annual Symposium/ITxpo events are key components of attendees’ annual planning efforts. IT executives rely on Gartner Symposium/ITxpo to gain insight into how their organizations can use IT to address business challenges and improve operational efficiency.

Additional information about Gartner Symposium/ITxpo in Orlando is available at www.gartner.com/symposium/us. Video replays of keynotes and sessions are available on Gartner Events on Demand at www.gartnerondemand.com. Follow news, photos and video coming from Gartner Symposium/ITxpo on Facebook at http://www.facebook.com/GartnerSymposium, and on Twitter at http://twitter.com/Gartner_inc and using #GartnerSym.

Upcoming dates and locations for Gartner Symposium/ITxpo include:

October 29-31, Sao Paulo, Brazil: www.gartner.com/br/symposium
November 5-8, Barcelona, Spain: www.gartner.com/eu/symposium
November 12-15, Gold Coast, Australia: www.gartner.com/au/symposium
 5-7, 2013, Dubai, UAE: www.gartner.com/technology/symposium/dubai/


The Perfect Storm: The Impact of Analytics, Big Data and Cloud.

Barry Morris presented at the October 23 edition of The Briefing Room, in a session entitled “The Perfect Storm: The Impact of Analytics, Big Data and Cloud.” In this presentation, Morris introduces the NuoDB solution, an asynchronous, peer-to-peer database specifically designed to meet 21st-century database requirements.

NuoDB is 100% SQL compliant and 100% ACID, but scales elastically in the cloud or on premises.