Why Graph Computing is STELLAR

Every once in a while, a technology comes along that is so revolutionary it changes the way people work and live. Blockchain, deep learning and cloud computing are recent examples. If you are searching for the next big thing, graph computing could be the answer.

Why? Because graph computing solves the most common and costly problems in enterprise systems once and for all: scalability, transparency, explainability, lineage, adaptability and reproducibility. We coined the acronym STELAR for these challenges in enterprise systems.

In almost every organization, significant engineering resources are devoted to addressing the STELAR needs of its core enterprise systems. These efforts are not portable because they are specific to the particular organization and architecture. For example, a solution that improves the scalability of the trading system at JP Morgan is not applicable to Goldman Sachs, as their technology stacks are fundamentally different. An enormous amount of time, money, energy and human talent is wasted re-creating bespoke solutions to the same STELAR problems across the industry. The world would be a much better place if we could solve these problems once, for all enterprise systems. This is the promise of graph computing.

Directed Acyclic Graph

In graph computing, the fundamental building blocks are directed acyclic graphs (DAGs) rather than the functions of the traditional programming paradigm. A DAG is a special type of graph: it consists of a collection of nodes and directed connections between them. Acyclic means that these connections do not form any loops. Because of this, a DAG is a generic abstraction for any causal relationship or dependency, such as a family tree. Circular dependencies make no sense in these causal relationships.
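To make the definition concrete, here is a minimal sketch in Python (illustrative only, not any particular graph computing product) that represents a DAG as a list of directed edges and verifies the acyclic property with Kahn’s algorithm, using a family tree as the example:

```python
from collections import defaultdict, deque

def is_acyclic(edges):
    """Check that directed edges (parent -> child) form no cycle,
    using Kahn's algorithm: repeatedly remove nodes with no incoming edges.
    If every node can be removed this way, the graph is a DAG."""
    children = defaultdict(list)
    indegree = defaultdict(int)
    nodes = set()
    for parent, child in edges:
        children[parent].append(child)
        indegree[child] += 1
        nodes.update((parent, child))
    queue = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for c in children[node]:
            indegree[c] -= 1
            if indegree[c] == 0:
                queue.append(c)
    return removed == len(nodes)

# A family tree is a DAG: edges point from parent to child.
family = [("grandma", "mom"), ("grandpa", "mom"),
          ("mom", "kid"), ("dad", "kid")]
print(is_acyclic(family))                    # True
print(is_acyclic([("a", "b"), ("b", "a")]))  # False: circular dependency
```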

A DAG can also be used as a generic representation of any kind of computing, which is fundamentally a collection of data transformations performed by computers; and every transformation is nothing but a causal dependency between its inputs and outputs. Conceptually, any computation, from the simplest formula in a spreadsheet to the most complex enterprise system, reduces to a DAG. In graph computing, complex DAGs representing entire applications or systems are built by composing smaller, modular DAGs, analogous to function composition in the traditional programming paradigm.
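As an illustration of this reduction, the following hypothetical sketch models a spreadsheet-style computation as a DAG, where each node carries a transformation and references to its input nodes:

```python
class Node:
    """A DAG node: a transformation plus references to its input nodes."""
    def __init__(self, name, fn, inputs=()):
        self.name, self.fn, self.inputs = name, fn, list(inputs)

def evaluate(node, cache=None):
    """Evaluate a DAG bottom-up, memoizing shared sub-DAGs by node name."""
    cache = {} if cache is None else cache
    if node.name not in cache:
        args = [evaluate(i, cache) for i in node.inputs]
        cache[node.name] = node.fn(*args)
    return cache[node.name]

# Spreadsheet-style example: C = A + B, then D = C * 2.
a = Node("A", lambda: 3)
b = Node("B", lambda: 4)
c = Node("C", lambda x, y: x + y, [a, b])
d = Node("D", lambda x: x * 2, [c])
print(evaluate(d))  # 14
```

Larger DAGs compose from smaller ones the same way: `d` above is itself a sub-DAG that another node could take as an input.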

Being a generic representation of computing doesn’t by itself make DAGs better or more useful than the traditional function-centric representation. The real value of the DAG is that it is a much better and more convenient representation for building generic solutions to STELAR. Let’s explore why.

Scalability is one of the most challenging and costly problems in modern enterprise systems. Scaling is not a one-time investment; it is an ongoing and recurring maintenance cost. There is a common misconception that scalability can be achieved by simply adding more hardware, which could not be further from the truth. The bottleneck is rarely a lack of hardware; rather, it is the software’s inability to utilize more hardware in parallel. To improve scalability, enterprise software has to be constantly updated, tuned and optimized so the workload can be distributed efficiently across more hardware. Eventually, a fundamental redesign and reengineering of the entire software stack has to take place when business demands outgrow the capacity of the old system architecture.

Why can’t software systems auto-scale without the need for manual tweaks and redesign? The reason is deeply rooted in functions, the fundamental building blocks of the traditional programming paradigm. A function’s runtime behavior is highly unpredictable because of the ubiquitous presence of loops and conditional branches. For example, the same function could branch to iterate over 10 data sources or a million, and there is no way to predict which will happen ahead of its execution. The optimal distribution strategies for these two outcomes are obviously different. Therefore, to scale a system in the traditional way, a developer often has to rely upon their specialized knowledge to predict the system’s actual runtime behavior and distribute the workload accordingly. This reliance on specialized knowledge of a particular system is the main reason why generic auto-scaling is extremely difficult in practice.

Generic auto-scaling is much easier to achieve with a DAG representation. Since a DAG has no loops or conditional branches, its runtime behavior is highly predictable. In the previous example, the DAG for iterating over 10 data sources will have 10 nodes, and the one for a million data sources will have a million nodes. A clever algorithm can analyze a DAG’s topology, accurately predict its runtime behavior, and use that information to distribute the DAG’s execution efficiently across multiple computers. This distribution strategy works for any DAG, regardless of its size and topology. Auto-scaling offers tremendous benefits in practice by removing much of the manual scaling effort required in the traditional programming paradigm.
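One simple form of such topology analysis can be sketched as follows: group a DAG’s nodes into levels, where every node depends only on earlier levels, so all nodes within a level can execute in parallel. This is an illustrative simplification, not a production scheduler:

```python
from collections import defaultdict

def parallel_levels(edges):
    """Group DAG nodes (edges are parent -> child) into levels.
    Each node's level is one more than its deepest parent's level,
    so all nodes in the same level can run in parallel."""
    parents = defaultdict(set)
    nodes = set()
    for p, c in edges:
        parents[c].add(p)
        nodes.update((p, c))
    level = {}
    def depth(n):
        if n not in level:
            level[n] = 1 + max((depth(p) for p in parents[n]), default=-1)
        return level[n]
    for n in nodes:
        depth(n)
    grouped = defaultdict(list)
    for n, l in level.items():
        grouped[l].append(n)
    return [sorted(grouped[l]) for l in sorted(grouped)]

# Two independent data loads feed one aggregation: the loads parallelize.
edges = [("load1", "agg"), ("load2", "agg"), ("agg", "report")]
print(parallel_levels(edges))  # [['load1', 'load2'], ['agg'], ['report']]
```

A scheduler could dispatch each level across as many machines as there are nodes in it, with no system-specific knowledge required.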

The lack of transparency and lineage is another serious problem in today’s enterprise systems. These systems can live for a long time, and their complexity grows as new functionality and patches are added. Most systems today are designed and built without proper lineage support. As a result, once they grow large and complex, they effectively become black boxes. Without lineage, there is no easy way to trace a particular output back to the inputs it depends on, or to access the intermediate results needed to explain the system’s behavior. Therefore, it is extremely difficult and costly to understand, support and modify these complex black boxes.

Graph computing solves the lineage problem at the root. A DAG representation captures the end-to-end runtime data and analytical lineage, providing full transparency and explainability for the entire system.

Given that lineage is such a crucial need for enterprise systems, why can’t it be supported in the traditional programming setup? A DAG is actually the only way to implement true lineage; there is simply no way around it. But functions, the fundamental building blocks of the traditional programming paradigm, do not automatically create DAGs to track lineage at runtime. To support lineage in a traditional system, developers have to manually create and maintain many DAGs to track the runtime lineage of every system component, then connect them together to form a system-wide DAG. This approach is manual, ad hoc and time-consuming. Furthermore, there is no guarantee that these manually created DAGs are a complete or accurate representation of the true lineage as determined by the underlying functions of the system.

In graph computing, since the entire system is represented and runs as a DAG, lineage comes for free and is guaranteed to be 100% complete and accurate. Once we have lineage and scalability, the rest of STELAR easily follows. A transparent system with full lineage is much easier to visualize, explain, change and reproduce than a complex black-box system.
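A sketch of why lineage falls out of the DAG representation: tracing any output back to every upstream node it depends on is just a walk over parent links. The node names here are hypothetical, chosen only to suggest a trading pipeline:

```python
from collections import defaultdict

def lineage(edges, output):
    """Return every upstream node that `output` depends on,
    by walking parent links backwards through the DAG."""
    parents = defaultdict(set)
    for p, c in edges:
        parents[c].add(p)
    seen, stack = set(), [output]
    while stack:
        node = stack.pop()
        for p in parents[node]:
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

# Hypothetical pipeline: raw trades and FX rates -> cleaned data -> PnL -> report.
edges = [("raw_trades", "cleaned"), ("fx_rates", "cleaned"),
         ("cleaned", "pnl"), ("pnl", "report")]
print(sorted(lineage(edges, "report")))
# ['cleaned', 'fx_rates', 'pnl', 'raw_trades']
```

Because the system itself runs as this graph, the trace is always in sync with what actually executed; there is no separate, manually maintained lineage record that can drift out of date.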

A DAG is therefore the ideal representation for building generic solutions to STELAR. Once built, these solutions are applicable to any enterprise system, as they all reduce to DAGs. But how do we create the DAGs in the first place? The DAG for a real system could comprise millions or even billions of nodes, and creating such a large DAG is often more time-consuming and computationally intensive than executing it. If we had to use traditional programming languages to create DAGs, we would face the same STELAR challenges, which would defeat the once-and-for-all promise of graph computing.

To realize the full potential of graph computing, we therefore cannot rely upon traditional programming languages for DAG creation. We need a special graph programming syntax that can create large and complex DAGs easily and quickly. Even though a DAG can be arbitrarily large and complex, its underlying construct is extremely simple, consisting only of nodes and connections. A low-code declarative syntax is therefore well suited for defining and creating DAGs; it does not require any of the complex constructs of a conventional programming language, such as loops, functions, variables or conditional branches. We now add another L, for low-code, to complete the acronym, making graph computing truly STELLAR.
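To illustrate what a low-code declarative syntax might look like (this spec format is hypothetical, not the actual Julius syntax), a DAG can be declared as plain data, naming each node, its operation and its inputs, and then interpreted by a small runtime:

```python
# Hypothetical declarative spec: pure data, no loops, variables or branches.
spec = {
    "price": {"op": "const", "value": 100.0},
    "qty":   {"op": "const", "value": 3},
    "gross": {"op": "mul", "inputs": ["price", "qty"]},
    "net":   {"op": "scale", "inputs": ["gross"], "factor": 0.99},
}

# The operations themselves live in a library; the spec only wires them up.
OPS = {
    "const": lambda node, args: node["value"],
    "mul":   lambda node, args: args[0] * args[1],
    "scale": lambda node, args: args[0] * node["factor"],
}

def build_and_run(spec):
    """Interpret the declarative spec as a DAG and evaluate every node."""
    cache = {}
    def value(name):
        if name not in cache:
            node = spec[name]
            args = [value(i) for i in node.get("inputs", [])]
            cache[name] = OPS[node["op"]](node, args)
        return cache[name]
    return {name: value(name) for name in spec}

print(build_and_run(spec)["net"])  # ≈ 297.0
```

Because the spec is just data, it can be generated, validated and versioned mechanically, which is what makes very large DAGs cheap to create.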

To benefit from graph computing, a company doesn’t need to overhaul its existing systems or start from scratch. Graph computing can coexist with traditional technology stacks. A DAG can easily interoperate with functions: each node in a DAG can be a function written in a traditional programming language, so existing data processing and analytical libraries can be reused. A DAG can also be wrapped behind an API, to be called and consumed by functions. This bi-directional interop between DAGs and functions allows maximum flexibility and enables a piecemeal transition: existing applications and services can keep running while their functionality and components are gradually replaced by DAGs and graph computing.
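A minimal sketch of this bi-directional interop, reusing a standard library function as a DAG node and wrapping the resulting two-node DAG behind an ordinary function call (all names are illustrative):

```python
import statistics  # an existing "traditional" library reused as a node

def dag_as_function(prices):
    """Wrap a tiny two-node DAG (mean -> rounded) behind a plain function,
    so traditional code can call it like any other API."""
    nodes = {
        "mean":    lambda: statistics.mean(prices),          # reused library call
        "rounded": lambda: round(nodes["mean"](), 2),        # depends on "mean"
    }
    return nodes["rounded"]()

print(dag_as_function([10.0, 20.0, 30.0]))  # 20.0
```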

Let’s imagine a future in which graph computing solves the STELLAR problems once and for all. In that future, it takes only a small amount of code to build stellar enterprise systems, and existing systems can become stellar by gradually migrating to graph computing. That future is now! Julius Technologies has created a STELLAR graph computing and graph programming solution that delivers all the promises above. You can sign up for a demo today at http://juliustech.co, and start your journey with the next big thing.
