My little “Big Data” story
I recently attended an internal conference called CA Exchange in Orlando, Florida. It’s for CA Technologies employees in engineering and product management to gather, meld minds, network and make our solutions better. It was a great event for us “CAers” to share some pretty innovative tools and solutions. On a whim, I attended a few of the “Big Data” sessions at this conference to get a better understanding of what our internal teams are working on to help us make sense of our customer data.
It just so happens that I had a little Big Data problem/story of my own to share: one where we never knew where the information could take us until we began to look at it on different levels.
One of the products I manage is called CA AppLogic. It’s a turn-key cloud platform enabling a true software-defined datacenter. CA AppLogic is used by many Managed Service Providers (MSPs) who need to build and deliver cloud services for their customers quickly, efficiently and reliably. Now, I won’t get into the detail on everything the technology can do, but the point I’m getting to is this: AppLogic runs applications in the cloud, but it’s an on-premises solution (AppLogic must run on bare metal). So what kind of data can this cloud platform gather? About 200 data points and growing. What makes the data both valuable and a problem is its frequency and depth:
AppLogic gathers data from each of its installs every 12 minutes. Every 24 hours, AppLogic compresses those 12-minute “coin drops” and sends the compressed file back to the CA Technologies metering server. Here’s the kicker: this has been happening for (GASP!) almost 7 years, since the inception of the product at a start-up called 3tera.
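To make the cadence concrete, here is a minimal sketch of that kind of daily roll-up. Everything in it is hypothetical for illustration: the field names, the JSON-plus-gzip format and the `collect_sample` helper are my own stand-ins, not AppLogic’s actual schema or wire format.

```python
import gzip
import json
import time

# A 12-minute sampling interval yields 120 "coin drops" per 24-hour window.
SAMPLE_INTERVAL_MIN = 12
SAMPLES_PER_DAY = 24 * 60 // SAMPLE_INTERVAL_MIN  # 120

def collect_sample(ts):
    """Stand-in for one metering snapshot (~200 data points in the real product).

    The fields below are invented for illustration only.
    """
    return {"timestamp": ts, "cpu_pct": 42.0, "mem_mb": 2048}

def roll_up_day(day_start):
    """Gather a day's worth of 12-minute samples and compress them into one blob."""
    samples = [collect_sample(day_start + i * SAMPLE_INTERVAL_MIN * 60)
               for i in range(SAMPLES_PER_DAY)]
    payload = json.dumps(samples).encode("utf-8")
    # This compressed blob is what would be shipped back to a metering server.
    return gzip.compress(payload)

blob = roll_up_day(int(time.time()))
print(SAMPLES_PER_DAY)  # 120 samples rolled into one daily file
```

The point of the sketch is just the shape of the pipeline: many small, frequent samples, batched and compressed once a day, accumulating for years.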
So what does one do with all this information? Is it even valuable? Well, one very smart guy, Bert Armijo, has been pushing for this data to continue to be collected for a reason. Initially, the goal was for the 3tera start-up to understand what services their customers were deploying on the technology and adjust product strategy accordingly. Then, over time, as more MSPs wanted to be able to bill their customers for different types of services on different measurement metrics, the data set was expanded and the information was used to facilitate “Metered Billing,” or “pay for what you use” cloud services.
Until recently, only a few of our MSP partners were using this data to bill their customers. These few built elaborate systems to ingest the massively compressed flat files that AppLogic served up and punch out meaningful billing information. And then there were my colleagues and me, attempting to figure out how many of our customers were on the most recent version of our product, or whether or not we should support older operating systems or browsers. When we had a question like this, we’d go back to those lovely flat files, pull them together for a specific customer and repeat hundreds of times over to get the information we desired. Then, when the executives said something like “Oh, this is wonderful. Can you add another field for my presentation tomorrow?” we knew our data was not only useful and valuable, but it was giving us a HUGE Big Data headache. We had a volume, velocity and variety problem like no other, and the data wasn’t even in a database yet! Oh, and did I mention that those lovely flat files were compressed, in some cases, eight layers deep?
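For the curious, peeling back nested compression layers is simple in principle. This is a hypothetical sketch only: I’m assuming gzip at every layer purely for illustration, which is not necessarily how the real AppLogic flat files are packed.

```python
import gzip

# The two-byte magic number that marks the start of a gzip stream.
GZIP_MAGIC = b"\x1f\x8b"

def unwrap(blob, max_layers=8):
    """Keep decompressing until plain data emerges or max_layers is hit.

    Purely illustrative; real multi-layer files could mix formats.
    """
    layers = 0
    while blob[:2] == GZIP_MAGIC and layers < max_layers:
        blob = gzip.decompress(blob)
        layers += 1
    return blob, layers

# Demo: wrap a payload three times, then unwrap it.
data = b"metering,record,1"
for _ in range(3):
    data = gzip.compress(data)
inner, n = unwrap(data)
print(n, inner)  # 3 b'metering,record,1'
```

The repetitive, manual version of this, run hundreds of times per customer per question, is exactly the headache described above.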
So what did we do, you ask? Well, we grabbed one of our well-known data guys, a UI guy and a product manager (me), and we went to work creating a tool that could gather the compressed flat files and show basic reports from them. At first it was just to get over our Big Data hangover, but then, as we started seeing the information come forth, we realized this information was a goldmine for our customers!
Not only would our customers be able to see what they have running, understand their capacity, manage their AppLogic licenses, or bill their customers, but they could use this metering data to boost revenue too. Bert Armijo captured a few of these very valuable points in his MSP Mentor guest blog entitled “Mining Metering Data to Boost Revenue” back in February of this year.
Fast forward a few months from Bert’s blog post: we re-dedicated the team to creating a portal into this information that our customers could access. It’s up and running today in beta. One thing is for certain: every time we look at the information and fields being gathered, we can think up hundreds of reports we want to write, trends we want to track and statistics we want to pull. Our executives want more and more each day. Information is key; data is the new gold. Or, as Debra Danielson put it at the CA Exchange event, “Data is the new oil.”
So, back to the event and my attendance in the Big Data sessions. I found out there that I’m not alone in this dilemma and that we have a team of experts called Data Scientists who can help us find the hidden trends in our data or determine things we can predict from the information we collect. I’m excited at the prospect of people (other than our team of three) looking at this information to help our customers gain more value from their data. Imagine if we could predict when a customer would need more capacity or new services and get in front of that with a phone call or touch point. That’s customer engagement taken to the next level, and we’re not even retail shops predicting the next thing to show you on the shelves based on your browsing habits.
My little Big Data dilemma is not as rare as I thought. Though we are not gathering exabytes of data each hour, we have a lot of data and too little capacity to consume it. But once we dug in, we found value right at the surface. This is why I agree with Bernard Marr’s blog “Why the ‘Big Data’ Hype is NOT About Big or Data!”
Big Data is not about how big the data set is. It’s about what you can get from the information that you never thought of before. It’s about how much more educated you can be in managing your business and understanding trends that can affect your bottom line.
My advice to software product managers and engineers: find a way to gather data now. It doesn’t need to have a complete architecture or be standardized on the latest platform. Instead of focusing on the tools, focus on getting the information and making it meaningful for you and your customers.