We are all pumped about the Strata Summit next week (9/20-21) here in New York City. A.) It’s in our town, B.) one of our brightest minds was asked to speak, and C.) we are exhibiting. The theme is “the business of data,” which makes it a natural for us, and “Big Data” is all over the agenda. Double natch.
It’s an O’Reilly conference, but unlike most of them, it is not aimed at developers. The presenters include thought leaders from McKinsey, the Economist, MIT Media Lab, the Guardian, Amazon, and Google; and the audience is mostly business execs of the kind that might watch the occasional TED video – our kind of folks.
There are a number of good sessions and interesting exhibitors on deck (including ours and us, of course), so here are some of the must-sees, IOHO.
1010data’s own Robert Lefkowitz, A.K.A. “r0ml” (long story), will be giving a talk on Tuesday at 10:50 a.m. entitled “Turning Their Data Into Your Money (And Vice Versa).” The talk will examine how Big Data is driving the emergence of a new business ecosystem linking data owners, analysts, and business decision makers in new and interesting ways. r0ml is one of O’Reilly’s highest rated conference speakers, so it should be a great talk.
In the 1010data demo pod, we will be showing the latest version of our cloud-based analytical platform (A.K.A. the Trillion-Row Spreadsheet™), munching on billions of rows of business data and spitting out fascinating insights with breathtaking speed. If you are a data person, your spine will tingle. If you are a businessperson, your mind might be blown.
Looking through the program, there is tons more good stuff at the show that we are also looking forward to.
Big Thinker Alert
The opening session, featuring McKinsey’s Michael Chui, is titled “Big Data: The Next Frontier” and follows up on the recent landmark Big Data study by McKinsey Global Institute by looking at “data innovation, challenges and competitive advantages.”
Attention, Sports Quants
One such recent competitive data innovation is the use of analytics in professional sports. It was the subject of the very witty Michael Lewis book Moneyball and the eponymous Brad Pitt movie premiering next week. The Tuesday afternoon keynoter, Paul DePodesta, is one of the actual Moneyball guys. He’s a cum laude Harvard economics grad who, with Billy Beane (Brad Pitt), pioneered the use of statistical analysis in baseball scouting, team management, and, most importantly, winning games. His talk will show a riveting example of a team gaining competitive advantage through data analytics.
Big Data Lessons Learned
If you don’t ask a good question, you won’t get a good answer, of course. But how do business analysts know if they are asking the right questions? Not so obvious. Monica Rogati of LinkedIn is giving what should be an enlightening talk on this topic entitled “Lies, Damned Lies, and the Data Scientist.” She will discuss some of the lessons learned about how to carefully ask Big Data questions.
Big Data, like much of IT, is constantly changing. As the colossal volume of data continues to grow, companies are sprouting up to harness it with innovation. One such company, PeopleBrowsr, is using Big Data to take on Nielsen. Its founder, Jodee Rich, is giving an interesting talk on how they are doing it.
There will also be a number of compelling Big Data analytics companies presenting and attending the summit as part of the Startup Launchpad.
Overall, it shapes up to be an incredible event. Whether you are a CIO of a company beginning to learn about Big Data or an executive who has been discussing data equity for years, we hope to see you at the Strata Summit.
Don’t forget to stop by and listen to r0ml at 10:50 on Tuesday, as well as come to our exhibit, booth number 14, and say hi.
There was a great article in the Economist about the challenges and opportunities of big data entitled Building with Big Data; The Data Revolution is Changing the Landscape of Business. The article describes the growth in various types of data, and cites a McKinsey study on the potential enterprise value to be gained by better understanding that data:
Last year people stored enough data to fill 60,000 Libraries of Congress. The world’s 4 billion mobile-phone users (12% of whom own smartphones) have turned themselves into data-streams. YouTube claims to receive 24 hours of video every minute. Manufacturers have embedded 30m sensors into their products, converting mute bits of metal into data-generating nodes in the internet of things. The number of smartphones is increasing by 20% a year and the number of sensors by 30%.
Clearly the amount of data businesses deal with every day is exploding. The article continues…
In a suitably fact-packed new report, “Big data: the next frontier for innovation, competition and productivity”, [McKinsey] argues that data are becoming a factor of production, like physical or human capital. Companies that can harness big data will trample data-incompetents. Data equity, to coin a phrase, will become as important as brand equity. MGI insists that this is not just idle futurology: businesses are already adapting to big data.
We agree with this assessment, and believe that companies need to use their rapidly expanding data to create data equity. It’s not enough just to have the data stored in legacy systems or remote silos; you must have easy access to the data and the analytic tools necessary to leverage that data to create real value. We have been speaking and blogging about related topics, like data monetization (see this recent post).
Our own VP of Marketing Tim Negris will be speaking about creating data equity through big data analysis at the Cloud Computing Expo in New York this week: Collaborative Big Data Analytics: It takes a Cloud. If you are attending the show, we invite you to come listen to Tim or stop by our booth to learn more.
Some people confuse the term “Web analytics” with the more general fields of BI and analytics. While BI and analytics can help make sense out of a wide range of transactional data, Web analytics, as its name implies, deals specifically with clicks, links and other types of Web activity.
These interactions between the user and website can be modeled as transactional data. Both are time-series based and contain information on the interaction between the user and the business. Because of the need to leverage historical data to evaluate advertising effectiveness and social impact, to optimize e-commerce offerings, and to A/B test websites, we have seen a growing interest in cloud-based, big data analytics solutions like 1010data’s for Web analytics. The explosion of Web data caused by the growth of social media, streaming and rich media, and the real-time Web is driving the need for technology that can handle massive amounts of Web-driven data and bridge the divide separating more general business analytics and Web analytics.
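To make the idea concrete, here is a minimal sketch in Python of Web interactions treated as time-stamped transactional records and rolled up for a simple A/B comparison. The field names (ts, user, variant, action) and the sample rows are hypothetical, made up for illustration; this is plain Python, not 1010data's platform or query language.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical click-stream rows: each Web interaction is a time-stamped
# record, much like a row in a sales transaction log.
events = [
    {"ts": datetime(2011, 9, 1, 9, 0), "user": "u1", "variant": "A", "action": "view"},
    {"ts": datetime(2011, 9, 1, 9, 2), "user": "u1", "variant": "A", "action": "purchase"},
    {"ts": datetime(2011, 9, 1, 9, 5), "user": "u2", "variant": "A", "action": "view"},
    {"ts": datetime(2011, 9, 1, 9, 7), "user": "u3", "variant": "A", "action": "view"},
    {"ts": datetime(2011, 9, 1, 9, 9), "user": "u4", "variant": "A", "action": "view"},
    {"ts": datetime(2011, 9, 1, 9, 10), "user": "u5", "variant": "B", "action": "view"},
    {"ts": datetime(2011, 9, 1, 9, 12), "user": "u5", "variant": "B", "action": "purchase"},
    {"ts": datetime(2011, 9, 1, 9, 15), "user": "u6", "variant": "B", "action": "view"},
]

def conversion_by_variant(rows):
    """Return {variant: share of distinct visitors who purchased}."""
    visitors = defaultdict(set)
    buyers = defaultdict(set)
    for row in rows:
        visitors[row["variant"]].add(row["user"])
        if row["action"] == "purchase":
            buyers[row["variant"]].add(row["user"])
    return {v: len(buyers[v]) / len(visitors[v]) for v in visitors}

rates = conversion_by_variant(events)
print(rates)
```

With real traffic the same roll-up would run over far more rows, and the two rates would need a significance test before anyone declared a winner.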
Solutions that can help e-commerce sites and Web publishers better understand the data are a step towards getting value from, or monetizing, the data. I was reminded of this by an article in DIGIDAY: The Big Bet on Social Data. Brian Morrissey writes:
At a time when content businesses grope for business models, it’s clear where the real money remains: data. The best part about data: no need to publish content or even attract users. The latest evidence: Clearspring Technologies, a company that started as a widget maker back in the heady days of MySpace, closed a big $20 million funding round, led by former Twitter and Zynga investor Institutional Venture Partners, to build out its social data business. Clearspring boasts it sees a billion Internet users — all without operating a single Website… “What social has done is unlock data,” he said. “The biggest outcome of social media is big data.”
Just how much data is big data? What does cloud computing really mean? How is this different than SQL? Oracle? Can you do analytics? A whirlwind of questions can be asked of many “big data” companies. Companies that help others manage large volumes of data have long had trouble describing what they truly are. This is partly because the range of solutions offered is immense. These solutions fall into two main categories.
First, data warehousing companies help companies deal with data overload. The companies may have recognized they wanted to keep their data, but couldn’t manage it themselves. They choose to buy hardware or occasionally a cloud offering to collect and maintain historical data. This offering often allows for simple analytical queries like indexing and basic aggregations. However, these offerings lack analytic power, and users either need to pre-aggregate and index before each complex query or split the data set into smaller segments for analysis.
Second, there are analytic offerings. These include Excel, and for the more technical, SQL. While Excel is very flexible, it is hampered by its inability to work on large data sets and relative lack of power. SQL has relatively strong analytic capabilities, but forces data to be stored relationally. This makes the data take up much more space than it needs to, and makes many types of queries, especially time series, very sluggish on large data sets.
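As an illustration of the kind of time-series question meant here, the sketch below computes the number of days between each customer's consecutive purchases by walking the rows in time order. The customers, dates, and function name are hypothetical, and the code is plain Python rather than any vendor's query language; it only shows the shape of an ordered, row-to-previous-row calculation.

```python
from datetime import date
from itertools import groupby
from operator import itemgetter

# Hypothetical purchase log: (customer, purchase date), in arrival order.
purchases = [
    ("alice", date(2011, 1, 3)),
    ("bob", date(2011, 1, 5)),
    ("alice", date(2011, 1, 10)),
    ("alice", date(2011, 2, 1)),
    ("bob", date(2011, 1, 6)),
]

def days_between_purchases(rows):
    """Return {customer: [days between consecutive purchases]}."""
    # Order by customer, then by date, so each customer's rows are
    # adjacent and in time order.
    ordered = sorted(rows, key=itemgetter(0, 1))
    gaps = {}
    for customer, group in groupby(ordered, key=itemgetter(0)):
        dates = [d for _, d in group]
        # Compare each purchase with the one immediately before it.
        gaps[customer] = [(b - a).days for a, b in zip(dates, dates[1:])]
    return gaps

gaps = days_between_purchases(purchases)
print(gaps)
```

The calculation depends on row order within each customer, which is why questions like this tend to be awkward to express and slow to run when the data is treated as an unordered set.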
Companies are struggling to use these solutions to solve their problems. It’s important to consider that the biggest concern of most customers is how to leverage data to drive business decisions. The proper solution tackles this problem by providing robust analytic flexibility and power, enabling both technical and non-technical decision makers to drive strategic vision and change. This requires an incredibly fast big data platform that combines the flexibility and simplicity of Excel with the ability to do even more powerful analytics than SQL.
To truly meet our customers’ goals we have created a Big Data Analytics Platform (BigDap). We can take all the data any customer could throw at us (big data), and empower users to do lightning-fast and powerful analyses that drive business results (analytics). Perhaps most importantly, we offer non-technical end users a flexible, spreadsheet-like Web browser interface as intuitive as Excel, and technical users an incredibly powerful XML-based query language accessible via several programmatic interfaces such as APIs, ODBC, an MS Excel add-in, and an SDK that supports most modern programming languages, all delivered securely via a managed cloud (platform).
Using the managed cloud means that users can access the data and analytic power securely from any web browser. This ensures that there is exactly one version of the truth, and everyone working together will see the same data at the same time. It also reduces hardware costs and lets the company focus on its business needs, not on how to store data.
BigDap is more than just a way to manage your data. It is more than an on-demand tool to analyze some of it. This is a Big Data Analytics Platform: let your data lead the way!
E-Commerce Times had a nice story about 9 Ways to Sharpen Business Intelligence. David Carr writes:
In a time of economic turmoil, business intelligence (BI) initiatives stand out for their potential to improve corporate performance… Compared with many enterprise IT projects, BI requires relatively modest investments… [which can] pay off the most for organizations that are serious about doing it right…
He goes on to list 9 ways to maximize BI investments (please visit the link to read the details of each):
1. Customize user interfaces/dashboards for specific roles
2. Integrate data across departments and applications
3. Foster a culture of data-driven decision making
4. Implement processes for continuous data quality improvement
5. Demonstrate improved planning, operations and other outcomes
6. Implement a formal KPI methodology (e.g., Balanced Scorecard, Six Sigma, etc)
7. Deploy alerts/notifications
8. Implement employee training
9. Improve analytics capabilities.
In addition to data warehousing, 1010data lets IT professionals build customizable UIs that provide simple and advanced reporting and querying capabilities; this directly addresses the need in the first point for roles-based UIs and dashboards.
Regarding the second point, Carr explains the problems of having data silos and disparate tools. We have found that providing analytics and data warehousing capabilities as a service makes it easier to offer clients an enterprise-wide system with a common interface and a unified store of data (which helps them to arrive at “a single version of the truth,” as we like to say).
He also talks about challenges inherent in extracting and rolling up data, and about data quality issues; our clients appreciate that the 1010data platform works directly on raw data, so there is no need for extensive data pre-processing, grooming, aggregation, etc. This helps accelerate “time-to-insight.”
Finally, in reference to the last point, please see our post A Modest Proposal Regarding Advanced Analytics; also take a look at the news we announced: 1010data Advances Hosted Analytics and Reporting.
Recently telecommunications companies have grown in size and complexity. Many large operators like Orange, Vodafone, Telenor, and Deutsche Telekom have dozens of subsidiaries in various countries and even on different continents.
Many subsidiaries have their own data warehousing technology. Often, these technologies aren’t easily compatible, and require extreme amounts of overhead. This can cause subsidiary data to feel like a local silo, inaccessible to the group headquarters, let alone other subsidiaries. Performing group-wide analyses requires jumping through hoops to force a common structure to aggregate the data.
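A minimal sketch of what forcing a common structure can look like: each subsidiary's export uses its own column names, so every row has to be mapped onto a shared schema before a group-wide total is possible. The subsidiary labels, column names, and figures below are hypothetical, and the example assumes all revenue is already in one currency.

```python
# Hypothetical exports from two subsidiaries with different column names.
subsidiary_rows = {
    "sub_a": [
        {"msisdn": "111", "rev_eur": 20.0},
        {"msisdn": "112", "rev_eur": 35.5},
    ],
    "sub_b": [
        {"subscriber": "211", "revenue": 12.0},
    ],
}

# Per-subsidiary mapping from local column names to the shared schema.
FIELD_MAPS = {
    "sub_a": {"msisdn": "subscriber_id", "rev_eur": "revenue"},
    "sub_b": {"subscriber": "subscriber_id", "revenue": "revenue"},
}

def to_common(subsidiary, row):
    """Rename one row's columns to the shared schema and tag its origin."""
    mapping = FIELD_MAPS[subsidiary]
    common = {mapping[name]: value for name, value in row.items()}
    common["subsidiary"] = subsidiary
    return common

# Only after every row is in the shared schema can the group aggregate.
unified = [
    to_common(name, row)
    for name, rows in subsidiary_rows.items()
    for row in rows
]
group_revenue = sum(row["revenue"] for row in unified)
print(group_revenue)
```

Every new subsidiary or schema change means another mapping to write and maintain, which is the overhead described above.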
At 1010data we believe in the idea of “one version of the truth.” We allow a large telecommunications operator to host all of its data in a private cloud, ensuring simple transparency between the subsidiary and group. This greatly shortens the information relay chain between strategic decision makers and the data they need to drive their decisions.
This transparency will allow the group headquarters to have simple and easy access to each subsidiary’s data. They will be able to see key metrics and reports in seconds, with no technical experience necessary. The group can even enable individual subsidiaries to view operator-wide data when deemed beneficial.
We have seen that empowering business users with real data leads to a more complete picture of the company, and ultimately helps drive results. Over the next few weeks we will add posts further explaining our offering in telecom, why it is so fast and flexible, and how it can help your business.
Dr. Dobb’s had a nice story (Big Data Inside Stormy Clouds) about how the cloud is driving improvements in BI, e.g. offering more flexibility for developers. A source quoted in the article said that: …cloud BI represents a way for software engineers to build reporting and analysis solutions more easily. We have found that building BI/analytics solutions on top of a platform like 1010data’s allows developers to focus on the business needs they are trying to address, and not on performance or hardware issues.
The article also cited an IDC study that was bullish on the use of the cloud for BI and analytics: A recent IDC survey and presentation, The Maturing Cloud: What the Grateful Dead Can Teach Us About Cloud Economics.. showed that 50 percent of respondents said it was highly likely they would pursue the public cloud for BI and analytics — and nearly 70 percent said it was likely they would pursue a private cloud deployment.
Another quote in the article said: Many large enterprises are interested in cloud BI as a horizontal tool to provide a simple, distinct, affordable ‘IT sandbox’ where software developers can work on project experimentation and evaluation can occur far from the production environment. The organization may also want to use cloud BI to develop and deliver a departmental BI project more quickly and inexpensively than other options. In both cases, avoiding the time and expense of buying and configuring server hardware, operating software, and database software holds a strong appeal and greatly accelerates the evaluation-to-deployment cycle
Dr. Dobb’s is written for developers, and the article is written to appeal to that type of audience. We agree with the benefits cited above, and would add that cloud-based BI can offer direct benefits for a wider range of users too. For example, business users can also use cloud-based BI to simplify reporting and analytics. Further, large enterprises are increasingly using the cloud not just as a sandbox; a number of our clients have used it to deploy production enterprise data warehouses.
On Friday, February 11th, 2011, the Obama administration released its plan for reforming the US housing market. The bulk of the plan deals with the future of the Government-Sponsored Enterprises (GSEs) – FNMA, FHLMC and GNMA. As part of the release, the administration reiterated its commitment to winding down Fannie Mae and Freddie Mac and proposed several potential future models for housing finance in the US. All three models are dependent on the return of private capital to the housing market to replace the GSEs, which currently insure or guarantee more than nine out of ten loans made in the wake of the credit crisis.
As we have expressed in previous postings (most notably “Can You Hear Us Now, Mr. Geithner?” ), data transparency in the mortgage markets is of utmost importance, and the reluctance of the GSEs to provide such data has been a thorn in the industry’s side for quite some time. A market shift from public to private financing would have many benefits, not least of which would be the reduction of federal exposure to the ups and downs of the mortgage market. But of equal importance would be the increased availability of loan-level data for mortgage-backed securities, which would allow investors to truly understand the risk composition of their investments. 1010data applauds the intent of the administration’s plan for reform, and looks forward to seeing the results of its implementation.
The Harvard Business Review blog recently posted an article entitled “21st Century Medicine, 19th Century Practices.” In the article, Ashish Jha, MD, MPH, discusses the dichotomy of medical professionals having “the latest in cutting-edge devices and surgical therapies… while the system that helps us deliver that care has changed very little.”
In recent years we have actually seen a greater adoption of electronic health records (EHR). Jha remarks “The federal government has gotten involved as well, offering nearly $30 billion in incentives (as part of the 2009 stimulus bill) for doctors and hospitals that adopt and ‘meaningfully’ use EHRs.”
However, Jha also discusses the difficulties both culturally and economically of using that data effectively and sharing it across various groups. At 1010 we empathize with this concern, and have experience bringing data analysis across disparate user types with different needs.
He ends with “To really transform healthcare, we need a 21st century health care system where incentives encourage sharing of data and collaboration between providers, not just care in silos. So yes, the U.S. healthcare system is at a crossroads — but we all know which path we’re going to follow. Despite the naysayers, we will modernize healthcare through information technology. We have no choice; we simply can’t improve the efficiency of the healthcare system without it.”
We at 1010 believe that our model of cloud-based analytics can provide an incredibly easy way to share data across groups. The incentives can be properly aligned by using a vendor portal configuration where major health organizations will store HIPAA-compliant anonymized data. Pharmaceutical companies and medical research groups can access this data for analytical research purposes. Our lightning-fast platform can also provide an analytic and data warehousing backend to current EHR applications used in hospitals.
The major health organizations, including insurance companies, will gain a unified data center to improve subscriber care, increase revenue opportunities with pharmaceutical companies and hospitals, and cut their data costs. The users of the data will gain a central location to analyze huge amounts of data.
Andy Hayler of research firm Information Difference wrote a nice article for CIO: Lots in Store for Data Warehousing (I am not sure whether the pun in the title is intended, but it does work well as one). He covers some of the history of the technology (citing relational databases and OLTP, for example) and discusses what is hot today and how this relates to where the technology is going. Andy writes:
Some markets which appear to be mature can suddenly become exciting once more. One of the earliest mainstream enterprise applications was the database… But once the relational database became widely accepted there was only a brief period of competition before the market was carved up between Oracle, IBM and Microsoft… Yet in the last five years or so there has been a flood of new entrants to the market, some using quite different database designs from traditional ones. What happened?
He then focuses on some of the forces that have led to disruption and new competition: chiefly, the growth of data and strides in computer design and database architectures. Hayler continues:
This combination of specialist software and hardware aimed at data warehousing has become known as an appliance, though the definition is a little blurry, as some appliance offerings… can operate in the cloud, so do not require hardware on site.
The article focuses more on hardware and software and mentions the cloud only in passing. We believe that the cloud can be not just a delivery mechanism and another way to deploy an appliance for data warehousing – cloud-based analytics can open up new opportunities, and lead to wider adoption of self-service, turnkey analytical tools across the enterprise. They can help achieve another long-sought-after goal that Hayler writes about:
Sheer size of data has not been the only issue — the need to analyse large volumes of data in something close to real time has allowed further specialization…
…the increasing desire for near real-time analysis and the inexorable rise in the volumes of data that organisations need to handle, promise to keep things lively in the data warehouse market for some time.
1010data and others have proved that cloud-based analytics are one way to achieve these goals.