
The following is an excerpt from the book "Data Juice: 101 Real-World Stories of How Organizations Are Squeezing Value From Available Data Assets."

In today's world, energy efficiency and cost savings are crucial factors for businesses to consider. The real estate industry is no exception, and companies are exploring various ways to reduce energy consumption and increase energy efficiency. Millennium Partners, a leading mixed-use developer, is using predictive analytics to achieve these goals. The company owns and operates a 10-story office building in Miami, and with the help of South Florida Controls, it was able to retrofit the building's HVAC system for better control and energy efficiency.

South Florida Controls used CopperTree Analytics' CopperCube and Kaizen to extract current and historical trend logs from the building's automation system. These tools enabled them to view energy trends and predict the potential return on investment from the renovation project. The analysis showed that the project would not only pay for itself through energy cost savings but also generate a healthy return on investment. After the implementation of the retrofit, the building saw a 64% reduction in its electrical energy consumption, saving nearly $60,000 per year in electricity costs.

This use case highlights the benefits of using predictive analytics in the real estate industry. Predictive analytics helps companies make informed decisions by providing data-driven insights into energy consumption and cost savings. However, it is important to incorporate weather data and mean time between failure (MTBF) data from the original equipment manufacturer; without them, maintenance predictions can be inaccurate.
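
To make the MTBF point concrete, here is a minimal sketch in Python. It treats failures as exponentially distributed around a manufacturer-quoted MTBF and adds a weather-driven stress factor; all figures and the derating factor are hypothetical illustrations, not data from the Millennium Partners project:

```python
import math

def failure_prob(hours_ahead: float, mtbf_hours: float, stress: float = 1.0) -> float:
    """Probability of at least one failure within `hours_ahead`,
    assuming exponentially distributed failures with the given MTBF.
    `stress` > 1 models harsher operating conditions (e.g., a heat
    wave) that effectively shorten the MTBF."""
    rate = stress / mtbf_hours  # effective failure rate per hour
    return 1.0 - math.exp(-rate * hours_ahead)

# Hypothetical numbers: a chiller with a manufacturer-rated MTBF of
# 20,000 hours, assessed over the next 30 days (720 hours).
baseline = failure_prob(720, 20_000)               # mild weather
heat_wave = failure_prob(720, 20_000, stress=1.5)  # sustained high load

print(f"30-day failure probability, baseline:  {baseline:.1%}")
print(f"30-day failure probability, heat wave: {heat_wave:.1%}")
```

Even this toy model shows how weather materially shifts the failure probability that a maintenance schedule should react to.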

Analytics is the Future of Retail

In the digital age, e-commerce has become a complex and highly competitive business. For online business owners, the key to success lies in the ability to connect with customers in real time across multiple channels and to tailor offerings to their needs. However, with millions of online customers, keeping track of their behavior, preferences, and activities can be a daunting task. This is where data analytics comes into play.

The case of Future Group, India's leading conglomerate, is a prime example of how analytics and big data can drive business success. By partnering with Manthan, a US-based analytics and AI solutions provider, Future Group has been able to identify customer opportunities at a deeper and more granular level, and tailor their offerings and promotions accordingly. Manthan's analytics tools track and monitor over 10 million customer transactions every month, providing Future Group with valuable, precise, and real-time actionable information.

One of the key benefits of data analytics for Future Group is the ability to personalize messaging and marketing promotions across multiple channels. By analyzing customer behavior and preferences, Future Group can deliver personalized offers to its 30 million loyalty program customers. Moreover, analytics-driven solutions have enabled Future Group to adopt rapid predictive modeling for specific strategic and tactical business objectives.

The success of Future Group's partnership with Manthan is largely due to two crucial actions. First, Future Group chose to partner with a company that already had a proven product, rather than building their own system. This helped them generate business value quickly and avoid competitive disadvantages. Second, they joined a data ecosystem, which allowed them to contribute their data and receive valuable insights in return. This enabled them to know more about their customers and reach additional target customers, which is the key to growth.

The case of Future Group and Manthan provides valuable insights for other businesses looking to leverage data analytics for success. Companies need to consider what data they have and what additional data they need, plan what they want to do with the data, and how to manage it. Moreover, they need to plan for delivering and using the data at the point of decision, such as during a real-time web shopping session or checkout. Real-time call-to-action with up-to-date data is crucial for maximizing the investment value in analytics.

National Library’s Search Index Goes on a Data Diet

The National Diet Library (NDL) in Tokyo serves as a repository of knowledge, language, and culture, collecting and preserving both traditional books and paper materials, as well as digital information from Japan and other countries worldwide. To achieve its objective of making this vast amount of information accessible to everyone, anywhere and anytime, the NDL embarked on the development of a digital archive. The development of the NDL Search began with the use of open-source software called Apache Hadoop, which was used to speed up full-text search indexing and automatic bibliographic grouping. The system used more than 30 Hadoop nodes and processed around 5 TB of data, equivalent to tens of millions of items.
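
The NDL has not published its indexing pipeline, but the inverted-index pattern that Hadoop's MapReduce parallelizes is easy to sketch. The single-process Python below uses a toy corpus and mimics the map, shuffle, and reduce phases that Hadoop would distribute across its 30-plus nodes:

```python
from collections import defaultdict

# Toy corpus standing in for bibliographic records (hypothetical data).
documents = {
    "doc1": "meiji era newspapers tokyo",
    "doc2": "edo era maps tokyo kyoto",
    "doc3": "meiji government records",
}

# Map phase: emit (term, doc_id) pairs, one per word occurrence.
pairs = [(term, doc_id)
         for doc_id, text in documents.items()
         for term in text.split()]

# Shuffle phase: group pairs by term (Hadoop does this between nodes).
grouped = defaultdict(set)
for term, doc_id in pairs:
    grouped[term].add(doc_id)

# Reduce phase: produce the posting list for each term.
inverted_index = {term: sorted(docs) for term, docs in grouped.items()}

print(inverted_index["tokyo"])  # ['doc1', 'doc2']
print(inverted_index["meiji"])  # ['doc1', 'doc3']
```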

By leveraging online tools and technology, the NDL was able to capture and manage a huge volume of data, create a search index from all its documents, and reduce the time for manual searching of materials and information. The benefits of this digital archive are numerous, not only enabling library users and readers to access information more efficiently but also creating new knowledge by reusing existing information and building a usable knowledge infrastructure.

Creating a massive online volume of data is laudable, but the greater value lies in the ability to distill large data sets down to their meaningful essence. Any organization with large sets of data should expand access using alternative data structures, such as graphs, to make the exploration of data more fluid. Large sets of data also lend themselves to the use of machine learning (e.g., clustering, classification) to guide users to the most significant bits of information more readily, as sketched below. Furthermore, organizations should capture usage metadata so that, over time, they can highlight the most popular paths through the data taken by other users. Finally, governance processes and standards will be important to maintaining and increasing the value of the data and insights over time.
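
As a hedged illustration of the clustering idea, the sketch below groups a handful of invented catalog titles with TF-IDF and k-means via scikit-learn; a real archive would cluster full bibliographic records and surface the groups as browsing entry points:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Hypothetical catalog titles; a real archive would use full records.
titles = [
    "history of meiji era railways",
    "railway maps of the tokyo region",
    "folk tales collected in kyoto",
    "annotated kyoto folk tale anthology",
]

# Vectorize the titles and group them into two topical clusters,
# which a catalog UI could present as navigation starting points.
vectors = TfidfVectorizer().fit_transform(titles)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)

for title, label in zip(titles, labels):
    print(label, title)
```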

The NDL's digital archive serves as an excellent example of the power of data and analytics in managing large volumes of information. By leveraging open-source software and online tools, the NDL was able to create a search index that reduced manual searching and increased the efficiency of access to information. The benefits of this digital archive go beyond efficiency, as it enables the creation of new knowledge by reusing existing information and building a usable knowledge infrastructure.

For organizations with large sets of data, creating a massive online volume of data is just the first step. They must also leverage alternative data structures, machine learning, and capture usage metadata to extract meaningful insights from the data. Finally, governance processes and standards are crucial to maintaining and increasing the value of the data and insights over time. By following these steps, organizations can create a robust digital archive that not only streamlines access to information but also creates new knowledge and value.

Big Data is Just the Ticket for Reducing Fraud

StubHub, an eBay company and one of the world's largest ticket marketplaces, enables users to buy and sell different types of tickets to various sports and entertainment events worldwide. Managing global customers and monitoring millions of transactions coming from over 25 different data sources is a significant challenge for the platform. The company needed to process and analyze volumes of big data efficiently to address daily churn, fraudulent activity, customer concerns, and more. To help address these issues, StubHub tapped into big data to acquire valuable insights about its customers' ticket-buying patterns and behaviors.

To achieve this, StubHub implemented a single data warehouse to store and process information on millions of customers from multiple data sources. The system delivered insights about churn prediction, fraud notification and alerts, and product recommendations. It also enabled faster, smarter, and more efficient analysis of customer transactions and online shopping behaviors. StubHub gained quick access to customer-related data, including ticket purchase history, patterns, and demographics, and could explore this data to build and deploy multiple data-mining models, create predictions, and improve responsiveness. Furthermore, the system made it possible to calculate lifetime value for all 180 million customers, compared with the 20,000 values at a time previously possible, and fraud issues were reduced by up to 90%.
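
StubHub's actual models are not public, but the jump from scoring 20,000 customers at a time to all 180 million is essentially the jump to set-based, vectorized computation over a single warehouse table. A minimal sketch, with hypothetical columns and a deliberately naive lifetime-value formula:

```python
import pandas as pd

# Hypothetical per-customer aggregates; StubHub's real model is not public.
customers = pd.DataFrame({
    "customer_id": [101, 102, 103],
    "avg_order_value": [85.0, 220.0, 40.0],  # dollars per order
    "orders_per_year": [4, 1, 9],
    "expected_years": [3, 5, 2],             # projected retention
})

# Naive lifetime value: value per year times expected retention,
# computed for every row at once rather than in small batches.
customers["ltv"] = (customers["avg_order_value"]
                    * customers["orders_per_year"]
                    * customers["expected_years"])

print(customers[["customer_id", "ltv"]])
```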

By leveraging the power of big data and analytics, StubHub continues to grow and expand in the online ticket marketplace for sports, concerts, theater, and other live entertainment, serving millions of customers worldwide. According to Dr. James Short, Director of the Center for Large Scale Data Systems Research (CLDS) at the San Diego Supercomputer Center, University of California San Diego, StubHub's evolution from a data warehouse to the operating platform for the world's largest ticket market shows what can be accomplished in building out a modern, real-time data analytics platform: market leadership, business return on investment, customer knowledge, and predictive modeling about where to go next.

For organizations like StubHub that need to manage vast amounts of data, starting with scale is essential. A data systems infrastructure that can scale as the business pressure-tests it is crucial. Going forward, StubHub's business systems will need to support the full range of analytics use cases, from self-service visualization and exploration to guided analytics apps and dashboards, custom and embedded analytics, mobile analytics, and reporting for its business ecosystem partners and their customers. By starting with scale and sustaining investment and strategy intent, StubHub can grow its platform across other live-entertainment ticket marketplaces, leveraging its real-time capabilities.

Package Delivery Company Ups its Fuel Efficiency

UPS, the world's largest package delivery company, uses data and analytics to optimize its delivery routes for reduced fuel consumption and CO2 emissions, and faster package delivery. Its fleet management system, ORION, uses telematics and advanced algorithms to create optimal routes for delivery drivers, taking into consideration changing road conditions and commitments in real time. Since its initial deployment, ORION has saved UPS about 100 million miles and 10 million gallons of fuel per year, which translates into 100,000 metric tons of reduced CO2 emissions.
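
ORION's algorithms are proprietary, so the following conveys only the flavor of the underlying problem: ordering stops to minimize distance. The greedy nearest-neighbor heuristic below, run on made-up coordinates, is the simplest possible baseline; production route optimizers layer in road networks, delivery commitments, and live conditions:

```python
import math

# Hypothetical delivery stops as (x, y) grid coordinates; ORION's real
# inputs (road networks, commitments, live conditions) are far richer.
depot = (0.0, 0.0)
stops = [(2.0, 3.0), (5.0, 1.0), (1.0, 7.0), (6.0, 6.0)]

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

# Greedy nearest-neighbor heuristic: always drive to the closest
# remaining stop. Fast and simple, but usually not optimal.
route, remaining, here = [depot], set(stops), depot
while remaining:
    nxt = min(remaining, key=lambda s: dist(here, s))
    route.append(nxt)
    remaining.remove(nxt)
    here = nxt

total = sum(dist(a, b) for a, b in zip(route, route[1:]))
print(f"route: {route}")
print(f"total distance: {total:.2f}")
```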

ORION is a notable achievement in analytics and data science due to its scale and cost. It is estimated to have cost between $400 and $500 million. However, the large annual savings in fuel and labor costs make it worth the substantial investment. The application is also a triumph of incremental planning and capability-building. UPS has spent the last twenty years implementing the IT systems that make ORION possible, from online tracking and tracing, to the creation of detailed maps including pickup and delivery locations, to the universal deployment of five generations of DIADs (Delivery Information Acquisition Devices) among drivers.

But perhaps most of all, the application is a change management success. UPS worked with drivers, planners, supervisors, and other personnel to convince them that ORION would make their jobs better. The company even worked with the PBS science show NOVA to create a documentary on ORION and created various training programs. Securing the understanding and buy-in of those who will use the system is critical to success when pushing out data and analytics to the front lines.

In addition to ORION, UPS is developing a chatbot with AI capability to help customers search for information about their packages. The company continues to work on other initiatives for a smart logistics network to improve decision-making across package delivery networks.

UPS's use of data and analytics to optimize delivery routes has led to significant cost savings and reduced environmental impact. ORION is a notable achievement in analytics and data science due to its scale, cost, and incremental planning. The company's success in convincing personnel to adopt the new technology highlights the importance of securing understanding and buy-in from those who will use the system. UPS's ongoing development of a chatbot and other initiatives for a smart logistics network demonstrates its commitment to quality service and environmental sustainability.

Training Locomotive Performance Analytics

DB Cargo, the management company for Deutsche Bahn’s Rail Freight Business Unit, faces a serious logistics challenge: transporting at least 300 million tons of cargo every year across Europe and Asia. To drive efficiency across its operations, DB Cargo turned to Splunk Enterprise to handle its large volume of diverse data, providing real-time insights across fleet control, operations, maintenance, and engineering. Splunk alerts are tied to a rules engine based on failure code tables, enabling the locomotive team to decide the best action to take when a failure occurs.
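
A failure-code rules engine can be as simple as a lookup table mapping codes to severities and recommended actions. The sketch below is a hypothetical stand-in; DB Cargo's actual tables and Splunk alert configuration are not public:

```python
# Hypothetical failure-code table; real tables would be far larger
# and maintained by the engineering organization.
FAILURE_RULES = {
    "E042": {"severity": "low",      "action": "log and monitor"},
    "E107": {"severity": "medium",   "action": "schedule workshop visit"},
    "E215": {"severity": "critical", "action": "remove from service now"},
}

def handle_failure(locomotive_id: str, code: str) -> str:
    """Look up a failure code and return the recommended action,
    as a rules engine behind an alert might."""
    rule = FAILURE_RULES.get(code)
    if rule is None:
        return f"{locomotive_id}: unknown code {code}, escalate to engineer"
    return f"{locomotive_id}: {rule['severity']} -> {rule['action']}"

print(handle_failure("loco-4711", "E107"))
print(handle_failure("loco-4711", "E999"))
```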

By analyzing data coming from many sources, DB Cargo has been able to keep locomotives in service longer, reduce maintenance costs, and ultimately deliver better service to its customers, making DB Cargo more competitive. Locomotive manufacturers can also use the data to identify occasions when locomotives can stay in service longer and to recommend whether a locomotive needs to go to the maintenance workshop.

DB Cargo’s approach, gaining real-time visibility into the data and tying that data to action via a business rules engine, gives the locomotive team clear direction. This approach applies to industries that deal with large, expensive, and complex machinery, such as mining, construction, shipping, and transportation. Rules engines are effective but reactive and based on expert knowledge. A more proactive approach would augment rules with machine learning that predicts future failures well before they occur.
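
A hedged sketch of that proactive step: train a classifier on historical sensor snapshots labeled by whether a failure followed. The data below is synthetic and the features are invented, but the pattern, scoring a unit's failure risk before any failure code fires, is the one described:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic sensor snapshots: [motor temperature (C), vibration (mm/s)].
# In this toy setup, hot and vibrating units are more likely to fail.
X = rng.normal(loc=[60.0, 2.0], scale=[10.0, 0.8], size=(500, 2))
risk = 0.08 * (X[:, 0] - 60) + 1.5 * (X[:, 1] - 2)
y = (risk + rng.normal(scale=0.5, size=500)) > 1.0  # 1 = failed soon after

model = LogisticRegression().fit(X, y)

# Score a unit that looks marginal today; a high probability would
# trigger maintenance before any failure code ever fires.
print(model.predict_proba([[72.0, 3.1]])[0, 1])
```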

However, there are challenges: sensors and log files do not always tell the whole story, legacy technology limits what critical data can be logged, and organizations must assess which data is most useful for solving problems. Simplifying the problem helps: pilot the collection of many streams of information on a few locomotives, experiment to find what predicts outcomes, and then roll out widespread data collection based on what has proven utility.

DB Cargo’s wealth of data will also be useful in optimizing capital investment decisions. Collecting detailed insights on locomotives’ actual performance in different weather, humidity, air quality, and track conditions over time will enable DB to better procure and deploy equipment based on its best use. DB will also develop detailed insights into supplier and manufacturer quality, which are potentially marketable back to manufacturers or to non-competitors in the same industry and could prove useful in procurement decisions and negotiations.

DB Cargo’s approach to real-time analytics of locomotive data has enabled the company to optimize locomotive operations, reduce costs, and deliver better service to customers. The challenges posed by legacy technology, human input, and determining which data is most useful can be overcome by adopting a proactive approach that leverages machine learning and by simplifying the data collection process. DB Cargo can also optimize capital investment decisions by analyzing locomotive performance data and supplier and manufacturer quality.

Dialling Up Big Data to Manage Annual Data Growth

Aircel Limited, a leading provider of innovative communications and mobile services in India, understood the significance of big data for generating accurate business forecasts and making well-informed decisions. Aircel required an efficient solution to improve database performance, increase scalability, and analyze structured data with an annual growth rate of 10 to 15 percent.

To meet these requirements, Aircel selected the Vertica Analytics Platform, an efficient system designed to handle multiple workloads and complex processes and to run faster queries despite a larger database size. Vertica enabled Aircel to gather intelligent insights from structured data, resulting in a 10 to 15 percent decrease in both total cost of ownership and annual data growth.

In addition, the platform's support for a large customer base allowed Aircel to analyze up to 200 GB of summarized data every day. Business Intelligence Head of Aircel Limited, Sanjeev Chaudhary, stated, "We chose Vertica as our short-term and long-term solution as we scale. This big data platform is vital to our ability to innovate and solve business problems."

The Vertica Analytics Platform produced tangible results, including increased speed, decreased ownership costs, and simplified management tasks. These outcomes could go a long way toward promoting data democratization in the organization, allowing every Aircel employee to gain a greater understanding of the power of data to solve business problems.

To remain competitive, however, businesses must begin to leverage the insights derived from unstructured and semi-structured data, such as email, social media, video, voice, and the Internet of Things. As the size of these types of data dwarfs that of structured data, storage and scalability become crucial. Aircel could become a data powerhouse in the telecom industry by mastering these less structured data types.

Aircel's implementation of the Vertica Analytics Platform has enabled the company to gain intelligent, data-driven insights from structured data while simultaneously reducing total cost of ownership and annual data growth. To maintain a competitive advantage in the telecom industry, organizations must investigate the insights that can be gleaned from unstructured and semi-structured data types. Adopting effective big data analytics tools is essential for organizations to generate accurate business forecasts, make well-informed decisions, and innovate.

Analytics Helps Marketing Click and Convert

TalkTalk, a UK-based provider of TV, broadband, and mobile services, wanted to acquire more business from new and existing customers. To achieve this goal, TalkTalk integrated its CRM data into Google Analytics 360 and other advanced analytics tools, including Google AdWords and DoubleClick Bid Manager. This allowed them to gather customer insights that could be used for targeted marketing and remarketing using display and video ads.

Analytics 360 has a custom dimensions feature that enabled TalkTalk to build, monitor, and measure non-standard dimensions relevant to their brand. Whereas the standard free version of Google Analytics is limited to 20 custom dimensions, Analytics 360 offers 200 that TalkTalk could take advantage of. Some examples of TalkTalk's custom dimensions included customers' existing products, their eligibility to purchase other products, and the method by which customers were recruited or invited to transact with the brand.
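
Custom dimensions in that generation of Google Analytics were populated per hit. As one hedged illustration, Universal Analytics' Measurement Protocol accepted them as cd1 through cd200 parameters on a server-side hit; the property ID, dimension slot, and values below are placeholders, not TalkTalk's configuration:

```python
import requests

# Placeholder IDs throughout; cd1 is assumed here to be mapped to
# "existing products" in the property's admin settings.
payload = {
    "v": "1",             # Measurement Protocol version
    "tid": "UA-XXXXX-Y",  # property ID (placeholder)
    "cid": "555.1234",    # anonymous client ID
    "t": "event",         # hit type
    "ec": "crm-sync",     # event category
    "ea": "profile-update",  # event action
    "cd1": "broadband+tv",   # custom dimension 1: existing products
}

resp = requests.post("https://www.google-analytics.com/collect", data=payload)
print(resp.status_code)  # the endpoint returns 200 even for malformed hits
```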

To measure the effectiveness of this new analytics-based strategy, TalkTalk and m/SIX ran a remarketing campaign with a test group and a control group. The test group used remarketing lists from Analytics 360, built on landing-page combinations and abandoned-cart visits, while the control group used only standard URL-based remarketing. The test group showed a 63% higher click-through rate (CTR) than the control group, a 219% increase in conversion rate, and a 77% lower cost per acquisition (CPA).

Kelle O'Neal, founder and CEO of First San Francisco Partners, commends TalkTalk and m/SIX for their successful approach to campaigns leveraging data and analytics. She recommends that they calculate the additional revenue generated by the 219% increase in conversions compared to the control group. She also suggests that they use the platform to improve their marketing to new clients, especially new clients in a household with existing clients. This strong foundation and analytics capability may also be used to improve other sorts of customer engagement beyond marketing: optimizing service, increasing the efficiency of order management and billing, optimizing products, and possibly developing new features.
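
O'Neal's revenue suggestion is simple arithmetic once a few inputs are known. The sketch below works through it with placeholder figures; none of these are TalkTalk's actual numbers:

```python
# Hypothetical campaign inputs; substitute real figures.
sessions = 100_000                 # remarketing sessions in the test group
control_cr = 0.010                 # control conversion rate (1.0%)
test_cr = control_cr * (1 + 2.19)  # 219% uplift => 3.19%
avg_revenue = 120.0                # revenue per conversion, in pounds

extra_conversions = sessions * (test_cr - control_cr)
extra_revenue = extra_conversions * avg_revenue

print(f"extra conversions: {extra_conversions:,.0f}")
print(f"incremental revenue: £{extra_revenue:,.0f}")
```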

TalkTalk's integration of CRM data into Analytics 360 and other advanced analytics tools helped them gain customer insights that were useful for targeted marketing and remarketing. Their approach resulted in a 63% higher CTR, a 219% increase in conversion rate, and a 77% lower CPA compared to traditional URL-based remarketing campaigns. The custom dimensions feature in Analytics 360 enabled TalkTalk to monitor and measure non-standard dimensions that were relevant to their brand. Moving forward, they can leverage this platform to improve their marketing to new clients and optimize other areas of customer engagement beyond marketing.