Is Big Data going through an object-oriented revolution?
What can data teams learn from software’s object-oriented design?
There’s no denying it – Data is complex! Despite advancements in data warehousing and business intelligence products, the ability to analyze, manipulate, and model data remains elusive for most users. The first step toward a solution is admitting that these struggles are fundamentally about complexity.
The software industry addresses the issue of complexity every day. If you remove all of the details, like tooling and programming languages, software is mainly about abstracting complexity so that we humans can code complex logic into software applications. The more complexity programmers can handle, the more competitive they, and their organizations, can be when developing software. Taking on an increasing level of complexity and managing it effectively is a matter of survival for software developers. The same can be said for data teams and data consumers in data-driven organizations.
There are many valuable lessons the data industry can learn from the software development process. There’s an object-oriented revolution in the works, with modern BI applications embracing metrics, a new kind of artifact. Just as objects manage complexity in software code bases, metrics act as building blocks in data management solutions, breaking down complexity in analytics and making it more accessible to everyone.
Object-oriented methodology – A revolution in software development and data analytics
In the software industry, complexity is often managed using object-oriented design. It's a far-reaching innovation: you'd be hard-pressed to find a modern application that wasn't written using this approach. In this methodology, programming logic is organized into well-defined parts, making it possible for developers to manage much more sophisticated code bases. This object-oriented approach has enabled the next level of software development, pushing the bounds of complexity that a software team can handle to new heights.
It’s time for data-driven businesses to embrace an object-oriented methodology. Metrics manage complexity by breaking data down into manageable, meaningful artifacts. With metrics, data teams and consumers get many of the same benefits that object-oriented programming brings to software development. Decision makers no longer need to analyze complex, raw data. Instead, they’re free to work with larger amounts of data, organized into smaller, logical parts. By solving the complexity inherent in today’s analytics, businesses can more easily make sense of their data and use it for better decisions.
Let’s take a look at some of the specific properties of object-oriented programming that have parallels in metrics-based data architectures.
Single purpose artifacts
In software development, there’s an art to designing good objects. One of the main principles to follow is that every object should have a single, easily described purpose. This helps ensure that everyone using the object understands its purpose and, as a result, uses it appropriately. Also, when programmers understand an object’s purpose, they take steps to preserve it by making sure they don’t break code that uses the object. Clearly describing and maintaining the single purpose of an object means its consumers don’t need to know the internal code for the object; they can simply trust it will work in a predictable way.
In data analytics, each metric has a single purpose, or meaning. Everyone who uses a metric can understand its purpose and use it accordingly, without having to know how the data is configured.
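To make the parallel concrete, here’s a minimal Python sketch. The class, field, and metric names are invented for illustration and don’t come from any particular BI product: one object does exactly one job, and one metric definition captures exactly one business meaning.

```python
from dataclasses import dataclass


class InvoiceTotaler:
    """Single purpose: compute the total of an invoice, nothing else."""

    def total(self, line_amounts: list[float]) -> float:
        return sum(line_amounts)


@dataclass(frozen=True)
class MetricDefinition:
    """Single purpose: describe one business measure and its meaning."""
    name: str
    description: str
    aggregation: str     # e.g. "sum", "count_distinct", "avg"
    source_column: str


monthly_recurring_revenue = MetricDefinition(
    name="monthly_recurring_revenue",
    description="Sum of active subscription fees per month",
    aggregation="sum",
    source_column="subscription_fee",
)
```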
Abstraction
In software, objects are defined in such a way that consumers don’t see their internal logic. Instead, each object exposes a set of properties and actions. This creates an abstraction that hides and separates the internal logic of the object from the code that uses it. In fact, a programmer can change an object’s internal logic without breaking the code using it, as long as they continue to meet the contract defined by the object. This allows a programmer using the object to work at a higher level of abstraction than when working with raw logic. They can design applications around these abstractions, treating the big picture and the inner workings of each object as separate areas of focus. This property of objects has enabled a whole discipline of software architecture, where applications are designed from abstract blocks with well-defined purposes. The objects, as defined in the architecture, can then be implemented by programmers, since they’ve already been broken down into manageable parts.
Organizations using metrics for data analysis are also able to define their overall data architecture through abstractions. Decision makers define the metrics they need, independently from the data, by using each metric as an abstraction. This, in essence, defines the architecture of their data-driven solutions. Data teams then connect the metrics to data, making the abstractions real and usable by consumers. Decision makers get accessible analytics without needing to understand how the data behind each metric is configured.
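Here’s a hedged Python sketch of that contract, again with invented names rather than any real tool’s API: report code depends only on the Metric abstraction, so the data team can change how the value is actually computed without breaking any consumer.

```python
from abc import ABC, abstractmethod


class Metric(ABC):
    """The abstraction consumers see: a name and a way to get a value."""

    name: str

    @abstractmethod
    def value(self, period: str) -> float:
        """Return the metric's value for a period such as '2024-01'."""


class WarehouseRevenue(Metric):
    """One possible implementation; the internals are hidden from consumers."""

    name = "revenue"

    def value(self, period: str) -> float:
        # Placeholder for real query logic; callers never see this detail,
        # so it can be rewritten (new warehouse, new SQL) without breaking them.
        sql = f"SELECT SUM(amount) FROM orders WHERE month = '{period}'"
        return self._run(sql)

    def _run(self, sql: str) -> float:
        return 0.0  # stand-in for executing the query against a warehouse


def build_report(metric: Metric, periods: list[str]) -> dict[str, float]:
    """Report code works against the abstraction, not the implementation."""
    return {period: metric.value(period) for period in periods}
```

The design move is the same one software architects make: agree on the interface first, then let the implementation behind it evolve.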
Asset reuse
In software programming, it’s a good idea to reuse logic whenever you can. Once objects are defined and implemented, they can be reused throughout an application and, in many cases, across applications. Object reuse saves time, reduces the size of the code base, and ensures that when issues are fixed, they’re fixed everywhere.
Using metrics for data analysis is no different. Creating a metric with a particular meaning allows you to reuse that metric across a whole set of reports and analyses. Metric reuse eases the burden of configuring data for presentation and enables changes in metrics to be immediately available to downstream consumers.
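In the same hypothetical Python terms as the earlier sketches, reuse might look like this: one definition, referenced by several reports, so a fix propagates everywhere automatically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    """Same hypothetical metric shape as the earlier sketch."""
    name: str
    description: str
    aggregation: str
    source_column: str


# Define the metric once...
active_users = MetricDefinition(
    name="active_users",
    description="Distinct users with at least one session in the period",
    aggregation="count_distinct",
    source_column="user_id",
)

# ...then reuse that single definition across many reports and dashboards.
executive_dashboard = [active_users]
product_health_report = [active_users]

# A fix to the definition (say, a corrected source column) is picked up by
# every report that references it, because they all share one definition.
```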
The reusability of objects and metrics has led to external libraries for both of these asset types. Software applications often use libraries containing objects with well-known purposes that have been proven over time to work well and be trustworthy. The analytics world is now starting to see similar libraries of expert-defined metrics that are recognized as effective for monitoring and managing business performance.
Modularity
Modularity is a key property that enables large, complex systems to evolve and be maintained effectively.
When code in a software application is broken down into objects, you can swap out and interchange pieces within the architecture. For example, if you want to use a better sorting algorithm, you can swap out the object or objects responsible for sorting and replace them with new, improved versions. The rest of the application will be unaffected and continue to function normally.
Metrics provide a similar modularity. If you replace the definition for a metric, all of the reports and analyses that use the metric will continue to function as expected.
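A small Python sketch of the sorting example above, under the assumption that the application talks to sorting through one narrow interface: either implementation can be swapped in without touching the calling code.

```python
from abc import ABC, abstractmethod


class Sorter(ABC):
    """The stable interface the rest of the application depends on."""

    @abstractmethod
    def sort(self, items: list[int]) -> list[int]:
        ...


class BubbleSorter(Sorter):
    """The original, simple implementation."""

    def sort(self, items: list[int]) -> list[int]:
        result = list(items)
        for i in range(len(result)):
            for j in range(len(result) - 1 - i):
                if result[j] > result[j + 1]:
                    result[j], result[j + 1] = result[j + 1], result[j]
        return result


class BuiltinSorter(Sorter):
    """An improved implementation, dropped in without changing any callers."""

    def sort(self, items: list[int]) -> list[int]:
        return sorted(items)


def rank_revenues(revenues: list[int], sorter: Sorter) -> list[int]:
    # The caller is unaffected by which Sorter implementation is injected.
    return sorter.sort(revenues)
```

Swapping BubbleSorter for BuiltinSorter is a one-line change at the point where the sorter is constructed; everything that calls rank_revenues keeps working.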
Quality control
Object-oriented software programming enables objects in the system to be tested independently, ensuring each one meets its intended purpose. Tests are often written in code along with the object to prove the object meets its described behavior and to ensure future changes to its internal logic do not break it. This is key to a programmer’s ability to trust the objects when they use them within an overall software system.
In analytics, metrics can be tested independently. These tests can even be automated in some systems, to ensure the data aligns with the meaning built into the metric. This process enables data consumers to trust these objects for use in their reports and analyses.
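A minimal pytest-style sketch of both ideas, with illustrative functions and fixture data rather than any specific tool’s API: one test pins down an object’s behavior, the other checks that a metric’s computed value matches the meaning built into its definition.

```python
# test_metrics.py -- runnable with pytest; functions and data are illustrative.


def invoice_total(line_amounts: list[float]) -> float:
    """Object under test: a single-purpose invoice totaler."""
    return sum(line_amounts)


def active_users(sessions: list[dict]) -> int:
    """Metric under test: distinct users with at least one session."""
    return len({row["user_id"] for row in sessions})


def test_invoice_total_matches_line_items():
    assert invoice_total([10.0, 2.5, 7.5]) == 20.0


def test_active_users_counts_each_user_once():
    fixture = [
        {"user_id": "a"},
        {"user_id": "a"},  # a second session for the same user
        {"user_id": "b"},
    ]
    # The metric means "distinct users", so duplicate sessions must not inflate it.
    assert active_users(fixture) == 2
```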
Separation of responsibilities
Given the complexity of software, there are many responsibilities in an organization related to its development and maintenance. Objects allow architects to define the building blocks of a system and determine how they fit into the overall code base. Developers implement the objects that the architects define. Testers write test code to ensure the objects do what the architects intended.
In analytics, metrics enable a similar separation of responsibilities, where industry experts define metrics and how they should be used. Data teams implement the data selection and query rules to fulfill each metric. Business users use the metrics regularly and provide feedback to the data team.
Version control systems
Version control systems help manage changes in software environments. In systems that are constantly evolving, it’s essential to be able to identify what has changed, especially when diagnosing and solving issues that arise from those changes. Version control systems in software keep track of, control, and manage changes to code bases.
A semantic layer is one of the key technologies used to enable metrics in a modern data stack. These technologies typically use version control systems to manage changes to the metadata described in the semantic layer and in the metric definitions. In a semantic layer scenario, version control ensures the quality of the metrics in the system and enables the metrics to evolve and improve safely as they become more complex.
The metric revolution
First, an object-oriented revolution supercharged the software world. Now, it’s energizing metrics and setting the stage for modern analytics. Today, it’s almost inconceivable for a software developer not to use object-oriented approaches for major software systems. Soon, it will be equally inconceivable for a business not to employ a metrics-based data architecture to supercharge its data-driven decisions. No matter how you look at it, metrics are changing the way businesses manage and use data.
Brought to you by Klipfolio PowerMetrics
Klipfolio has helped thousands of people worldwide succeed with data. Designed and developed by Klipfolio, PowerMetrics puts data analysis and dashboard creation into the hands of business users with curated metrics, governed by the data team. Learn more about PowerMetrics on getpowermetrics.com.