CaixaBank Tech | Graphs and Neural Networks: How AI Transforms Connections into Value

15/01/2025

Discover the main types of graphs and their importance in artificial intelligence. Learn how these key structures optimise connections and data analysis.

In today’s digital age, where data flows and is generated in unimaginable quantities and at dizzying speeds, the ability to understand and analyse the connections between different elements has become crucial. How can we make sense of all this tangled data?

Imagine being able to visualise the network of interactions on a social platform or untangle the complex financial relationships of a banking institution. This is where graphs come into play, a mathematical tool that has become the backbone of technological innovations such as Graph Neural Networks (GNNs).

These models are not only revolutionising artificial intelligence, but also opening new frontiers in sectors like finance, enabling deeper and more meaningful data analysis. In this article, we explain what GNNs are, their application in the financial sector, and how we use this innovative technology at CaixaBank Tech.

What is a graph?

A graph is a mathematical structure used to model relationships between different elements. To visualise it, imagine you have a group of friends and you want to represent who is friends with whom. Each friend becomes a node, and each friendship is an edge or link connecting two nodes. This representation allows us to visualise and analyse how friends are interconnected within the group.

A common example can be found in social media. On platforms like Facebook or Instagram, users interact in various ways: sending messages, liking posts, or attending events. Here, each user is a node, and each interaction is an edge. For example, if user A is friends with user B, and B is friends with C, we can represent this in a graph where A is connected to B, and B is connected to C, through edges that represent “friendship”.
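The A, B, C friendship example can be sketched in a few lines of Python. This is only an illustrative representation (an adjacency map); the function name `build_friend_graph` is our own and not part of any library.

```python
from collections import defaultdict


def build_friend_graph(friendships):
    """Build an undirected graph as an adjacency map: node -> set of neighbours."""
    graph = defaultdict(set)
    for a, b in friendships:
        graph[a].add(b)
        graph[b].add(a)  # friendship is mutual, so store both directions
    return dict(graph)


# A is friends with B, and B is friends with C.
friend_graph = build_friend_graph([("A", "B"), ("B", "C")])
print(sorted(friend_graph["B"]))  # ['A', 'C']
```

Note that A and C are not directly connected: they are only linked through B.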

Types of Graphs: Essential structures in AI

However, not all graphs are the same; they can be either homogeneous or heterogeneous. In homogeneous graphs, all the nodes and edges are of the same type, like in a simple social network where only friendships between people are represented. In contrast, in heterogeneous graphs, nodes and edges can represent different types of entities and relationships. For example, in a social network, nodes can represent users, posts, and events, while edges can represent friendships, followers, likes, and event participations, creating a rich and complex web of interconnections.

It is also important to distinguish between directed and undirected graphs. In a directed graph, edges have a direction, such as a “like” or a message sent from one user to another. In an undirected graph, edges have no direction, like a mutual friendship between two users, where the relationship is symmetrical.
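To make these distinctions concrete, here is a minimal sketch of a heterogeneous graph with typed nodes and typed edges, some directed and some not. The class names, field names, and sample data (`alice`, `post_1`, and so on) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    node_id: str
    node_type: str  # e.g. "user", "post", "event"


@dataclass(frozen=True)
class Edge:
    source: str
    target: str
    edge_type: str  # e.g. "likes", "attends", "friend"
    directed: bool = True


nodes = [
    Node("alice", "user"),
    Node("bob", "user"),
    Node("post_1", "post"),
    Node("event_1", "event"),
]
edges = [
    Edge("alice", "post_1", "likes"),              # directed: user -> post
    Edge("bob", "event_1", "attends"),             # directed: user -> event
    Edge("alice", "bob", "friend", directed=False),  # undirected: mutual
]


def out_neighbours(node_id, edges):
    """Return (neighbour, edge_type) pairs reachable from node_id."""
    result = []
    for e in edges:
        if e.source == node_id:
            result.append((e.target, e.edge_type))
        elif not e.directed and e.target == node_id:
            # An undirected edge can be traversed from either end.
            result.append((e.source, e.edge_type))
    return result
```

With this data, `out_neighbours("bob", edges)` includes `alice` through the undirected friendship, while `out_neighbours("post_1", edges)` is empty because a “like” only points from the user to the post.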

Below, we see an example of a graph of a social network with heterogeneous nodes and relationships (users, posts, and events) and edges with directionality.


Example of a (heterogeneous and directed) graph representing interactions between users in a social network.

As we can see, graphs allow us to represent data and its interconnections in a much richer and more nuanced way. These representations help us understand how different elements within a network interact, whether it’s a social network or a financial network, providing a clearer view of the dynamics and relationships between them.

However, to train models on these graphs, we need to adapt traditional machine learning algorithms to work with nodes and edges, which is where Graph Neural Networks (GNNs) come into play!

What are Graph Neural Networks (GNNs)?

Graph Neural Networks (GNNs) are machine learning algorithms designed to work directly with data in graph format, rather than traditional tabular formats, like rows and columns in a spreadsheet.

Unlike traditional models that analyse each piece of data in isolation, GNNs capture information through the connections between the data. This is possible because they combine deep learning techniques with graph theory, allowing them to understand both local structures (e.g., the relationship between a user and their close friends) and global structures (communities or groups in a social network).

How do GNNs work?

The fundamental principle behind GNNs is “message passing”, which can be thought of as a constant conversation between the elements of a graph. This is how it works:

Aggregation of information: Each node (such as a user in a social network) collects information from its immediate neighbouring nodes (its friends).
Knowledge update: The node uses this information to update its own state or internal representation. It’s like adjusting your own perspective by learning your friends’ opinions.
Repetition of the process: This exchange and update repeats several times, allowing information to flow throughout the entire network. So, even if you’re not directly connected to someone, the information can reach you through intermediaries.
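The three steps above can be sketched as follows. This is a deliberately simplified illustration: each node holds a single number as its state, neighbours are averaged, and the update is a fixed 50/50 blend. Real GNNs use feature vectors, learned weights, and non-linear functions; the function name and the blend factor are our own assumptions.

```python
def message_passing_round(graph, states):
    """One round: each node averages its neighbours' states and blends the result with its own."""
    new_states = {}
    for node, neighbours in graph.items():
        if neighbours:
            # Aggregation: collect and average the neighbours' current states.
            aggregated = sum(states[n] for n in neighbours) / len(neighbours)
        else:
            aggregated = states[node]
        # Update: blend the node's own state with the aggregated message.
        new_states[node] = 0.5 * states[node] + 0.5 * aggregated
    return new_states


# Chain A - B - C; only A starts with a non-zero signal.
graph = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
states = {"A": 1.0, "B": 0.0, "C": 0.0}

# Repetition: run the same round twice.
after_one = message_passing_round(graph, states)
after_two = message_passing_round(graph, after_one)

print(after_one["C"])  # 0.0   (A's signal has not reached C yet)
print(after_two["C"])  # 0.125 (A's signal arrived through B)
```

After one round, C still knows nothing about A, because they are not direct neighbours. After two rounds, part of A's signal has reached C through B, which is the “intermediary” effect described above.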

Visual example of how a node improves its prediction by adding information from its neighbours flowing through the graph.

This process allows GNNs to capture complex relationships and high-level dependencies between nodes, which is especially useful in large, densely connected networks. However, these models face scalability challenges, especially on graphs with millions of nodes. For this reason, modified GNN variants have been proposed to make them more efficient in large-scale, real-world applications.

GraphSAGE: Expanding the Scope of GNNs

Once we understand how Graph Neural Networks (GNNs) capture and process relationships between interconnected data, the question arises: how do we apply these techniques to large-scale graphs, such as social networks with millions of users?

This is where GraphSAGE (Graph Sample and Aggregate), one of the most popular variants of GNNs, comes into play. GraphSAGE is an innovative method that allows GNNs to be efficiently trained on large graphs by sampling and aggregating information from neighbouring nodes.

In simple terms, GraphSAGE works like this:

Sampling of neighbours: Instead of analysing all neighbouring nodes (which would be computationally expensive), GraphSAGE selects a random sample of neighbours for each node. This significantly reduces the amount of computation and makes it feasible to scale to larger graphs.
Aggregation of information: It combines the information from the nodes sampled to update the representation of the central node. This process captures the essential features of its local environment.
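The two steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the reference GraphSAGE implementation: it uses a mean aggregator, concatenates the node's own features with the neighbour mean, and leaves out the learned weight matrix and non-linearity that would normally follow. The function names and sample data are hypothetical.

```python
import random


def sample_neighbours(graph, node, k, rng):
    """Pick at most k neighbours of node uniformly at random."""
    neighbours = graph[node]
    if len(neighbours) <= k:
        return list(neighbours)
    return rng.sample(neighbours, k)


def sage_mean_step(graph, features, node, k, rng):
    """Concatenate the node's own features with the mean of its sampled neighbours."""
    sampled = sample_neighbours(graph, node, k, rng)
    dim = len(features[node])
    if sampled:
        neighbour_mean = [
            sum(features[n][i] for n in sampled) / len(sampled) for i in range(dim)
        ]
    else:
        neighbour_mean = [0.0] * dim
    # A trained model would now apply learned weights and a non-linearity.
    return features[node] + neighbour_mean


# A hub with five neighbours; only two are sampled per step.
graph = {"hub": ["n1", "n2", "n3", "n4", "n5"]}
features = {"hub": [1.0, 0.0]}
for i in range(1, 6):
    features[f"n{i}"] = [float(i), 1.0]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
representation = sage_mean_step(graph, features, "hub", k=2, rng=rng)
print(len(representation))  # 4 (two own features + two aggregated features)
```

The cost of each step depends on the sample size `k` rather than on the node's full number of neighbours, which is what keeps the computation bounded for highly connected nodes.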

Additionally, one of the key aspects of GraphSAGE is its inductive capability, meaning it can generate representations for nodes that were not seen when training the model. This is especially useful in scenarios where the graph constantly changes, such as when new users are added or new connections are made in a social network.
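The inductive idea can be illustrated with a small sketch. The key point is that the representation is computed by a function of a node's features and its neighbours' features, rather than looked up in a per-node table, so the same function applies to a node that appears later. The fixed weights below stand in for parameters that a real model would learn during training; the function name and data are assumptions for illustration.

```python
def embed(node, graph, features, weight_self=0.5, weight_neigh=0.5):
    """Compute a representation from features only; the weights stand in for trained parameters."""
    neighbours = graph.get(node, [])
    if neighbours:
        neigh_mean = sum(features[n] for n in neighbours) / len(neighbours)
    else:
        neigh_mean = 0.0
    return weight_self * features[node] + weight_neigh * neigh_mean


# Graph as it looked when the (hypothetical) model was trained.
graph = {"A": ["B"], "B": ["A"]}
features = {"A": 1.0, "B": 3.0}

# Later, a new user D joins and connects to B.
graph["D"] = ["B"]
graph["B"].append("D")
features["D"] = 5.0

# No retraining needed: the same function handles the unseen node.
new_embedding = embed("D", graph, features)
print(new_embedding)  # 4.0
```

By contrast, a method that stores one learned vector per node identifier would have no entry for D and would need to be retrained.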

Practical applications of GNNs: From Spotify to Proteins

So where can GNNs be applied? A clear example of how GNNs impact our daily lives is in product recommendation systems. Platforms like Spotify, Pinterest, and Amazon use GNNs to better understand relationships between users and content. Have you ever wondered how these platforms know exactly what to recommend? By analysing how users interact with different songs, images, or products, GNNs can more accurately recommend things that might interest you, based not only on your own preferences, but also on those of similar users connected to you.

Another interesting example is found in protein analysis. GNNs are used to model and predict interactions between proteins, which is crucial for understanding complex biological processes and developing new drugs. By treating atoms and molecules as nodes and the bonds between them as edges, GNNs are used to identify patterns and relationships within structures that help uncover how certain molecules behave or how harmful interactions could be inhibited for health benefits.

Driving innovation at CaixaBank Tech

At CaixaBank Tech, as drivers of financial innovation, we work to integrate GNNs in various areas to improve our services and operations, allowing us to enhance processes, make more informed decisions, and ultimately create products and services that make a difference.

For example, we are investigating how to apply GNNs in the field of cybersecurity to detect anomalies in network traffic. By modelling the connections between different devices and users in our network as a graph, GNNs can identify unusual or suspicious patterns that could indicate a threat. This allows us to detect potential cyberattacks before they cause significant damage, thus protecting both our systems and our customers’ information.

In summary, graphs and GNNs are powerful tools that allow us to explore and understand connections between data. With applications ranging from recommendation systems to cybersecurity, their ability to analyse complex relationships is revolutionising multiple sectors, and the financial sector is no exception. The potential of GNNs is vast and continually evolving, and as graphs and data grow in complexity, GNNs will become increasingly essential for extracting value.

The future of interconnected data is promising, and GNNs are the key to unlocking its full potential!

Gonzalo Recio Domènech Innovation

tags:

#IA
