From 82941f750dbefee804ad5611f1a3ebf9c28bc8f2 Mon Sep 17 00:00:00 2001
From: Mohd Hassan <109505395+Mohd-Hassan@users.noreply.github.com>
Date: Wed, 29 Oct 2025 20:40:22 +0530
Subject: [PATCH 1/3] Delete
UseCases/Graph_Analysis/Graph_Analysis_PY_SQL.ipynb
---
.../Graph_Analysis_PY_SQL.ipynb | 740 ------------------
1 file changed, 740 deletions(-)
delete mode 100644 UseCases/Graph_Analysis/Graph_Analysis_PY_SQL.ipynb
diff --git a/UseCases/Graph_Analysis/Graph_Analysis_PY_SQL.ipynb b/UseCases/Graph_Analysis/Graph_Analysis_PY_SQL.ipynb
deleted file mode 100644
index d9a59a58..00000000
--- a/UseCases/Graph_Analysis/Graph_Analysis_PY_SQL.ipynb
+++ /dev/null
@@ -1,740 +0,0 @@
-{
- "cells": [
- {
- "cell_type": "markdown",
- "id": "5f462bad-9f88-4416-a3de-bae6324319cb",
- "metadata": {},
- "source": [
- "\n",
- " \n",
- " Graph Analysis using Script Table Operator(STO)\n",
- "
\n",
- "
\n",
- "
\n",
- ""
- ]
- },
- {
- "cell_type": "markdown",
- "id": "05bc242a-7ee9-4b33-b837-ff52a9cd76d2",
- "metadata": {},
- "source": [
- "
Introduction
\n",
- "\n",
- "Call Detail Records (CDRs) contain valuable information about communication patterns and interactions between users. By leveraging community detection algorithms on CDR data, businesses can gain insights into the underlying network structure and uncover meaningful communities within their user base.
\n",
- "\n",
- "The objective of this analysis is to identify distinct communities or groups of users within the CDR network. Communities are like smaller social circles or friend groups within the larger group of friends. It helps us understand how people naturally form different clusters based on their interactions and relationships. This analysis also identifies influential people in the graph.\n",
- "
\n",
- "
\n",
- "By grouping users into communities based on their calling patterns, the business can better understand the dynamics and relationships among users, leading to several potential applications and benefits like Customer Segmentation, Fraud Detection, Network Optimization, Cross-Selling and Upselling Opportunities, Customer Support and Retention, etc.\n",
- "
\n",
- "\n",
- "In this demo, we'll be using Script Table Operator(STO) to execute custom python scripts on Vantage. The STO operates by executing R and Python scripts from the command line of the Advanced SQL Engine underlying operating system, according to\n",
- "the following sequence:\n",
- "
\n",
- "\n",
- "\n",
- " - The language script is installed on the Advanced SQL Engine of the target Teradata Vantage system via a call to\n",
- "an External Stored Procedure (XSP)
\n",
- " - The script is invoked by executing a SQL query that calls the STO
\n",
- " - Each Advanced SQL Engine AMP provides its own portion of input table data, if any, to the script. The script\n",
- "reads its input from the standard input STDIN
\n",
- " - Each Advanced SQL Engine AMP runs a different instance of the same script. Hence, the script execution is an\n",
- "operation that scales through system architecture as the same script is run concurrently on all AMPs
\n",
- " - The script executes its code and sends its results to the standard output STDOUT. Each Advanced SQL Engine\n",
- "AMP individually picks up the corresponding script instance results, and returns them to the STO
\n",
- "
"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "92c9fc4a-42bb-454a-8f7e-5b0983004501",
- "metadata": {},
- "source": [
- "
\n",
- "Downloading and installing additional software needed"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "c2c074f7-9866-4af0-b822-434435c361a2",
- "metadata": {},
- "outputs": [],
- "source": [
- "%%capture\n",
- "# '%%capture' suppresses the display of installation steps of the following packages\n",
- "!pip install python-louvain\n",
- "!pip install mplcursors"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "d035bbaa-84c0-4401-af88-8bff031caa9d",
- "metadata": {},
- "source": [
- "
\n",
- "
Note: The above statements may need to be uncommented if you run the notebooks on a platform other than ClearScape Analytics Experience that does not have the libraries installed. If you uncomment those installs, be sure to restart the kernel after executing those lines to bring the installed libraries into memory. The simplest way to restart the Kernel is by typing zero zero: 0 0
\n",
- "
\n",
- "\n",
- "Here, we import the required libraries, set environment variables and environment paths (if required).
"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "01e63639-879d-4c8c-afd4-981d8aa49f79",
- "metadata": {},
- "outputs": [],
- "source": [
- "import sys\n",
- "import pandas as pd\n",
- "import networkx as nx\n",
- "from teradataml import *\n",
- "import teradataml\n",
- "\n",
- "import community\n",
- "import matplotlib.pyplot as plt\n",
- "import warnings\n",
- "import mplcursors\n",
- "\n",
- "# Suppress warnings\n",
- "warnings.filterwarnings('ignore')"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "ced12adf-b2c4-49aa-b5c8-67dd090ff080",
- "metadata": {},
- "source": [
- "
\n",
- "1. Connect to Vantage\n",
- "You will be prompted to provide the password. Enter your password, press the Enter key, and then use the down arrow to go to the next cell.
"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "3d506591-4e0e-4139-bb6a-b96f41377227",
- "metadata": {},
- "outputs": [],
- "source": [
- "%run -i ../startup.ipynb\n",
- "eng = create_context(host = 'host.docker.internal', username='demo_user', password = password)\n",
- "print(eng)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "e7579c1f-eafd-4dc4-9ba9-f1396da4cde6",
- "metadata": {},
- "outputs": [],
- "source": [
- "%%capture\n",
- "execute_sql('''SET query_band='DEMO=Graph_Analysis_PY_SQL.ipynb;' UPDATE FOR SESSION; ''')"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "3732eef0-96bd-417b-95e3-403c1999afa8",
- "metadata": {},
- "source": [
- "Begin running steps with Shift + Enter keys.
"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "1a96f1c6-cb43-440f-9682-a851fe09165f",
- "metadata": {},
- "source": [
- "Getting Data for This Demo
\n",
- "We have provided data for this demo on cloud storage. You can either run the demo using foreign tables to access the data without any storage on your environment or download the data to local storage, which may yield faster execution. Still, there could be considerations of available storage. Two statements are in the following cell, and one is commented out. You may switch which mode you choose by changing the comment string.
"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "57a490a6-75b8-4d4d-8986-46b03fc57081",
- "metadata": {},
- "outputs": [],
- "source": [
- "# %run -i ../run_procedure.py \"call get_data('DEMO_GraphAnalysis_cloud');\" # Takes 30 seconds\n",
- "%run -i ../run_procedure.py \"call get_data('DEMO_GraphAnalysis_local');\" # Takes 1 minute"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "658a3111-ab72-451c-85c2-de0a5f7a0188",
- "metadata": {},
- "source": [
- "Next is an optional step – if you want to see the status of databases/tables created and space used.
"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "34c657c8-2122-4426-9d38-c5a619f7817a",
- "metadata": {},
- "outputs": [],
- "source": [
- "%run -i ../run_procedure.py \"call space_report();\" # Takes 10 seconds"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "e601e5e6-c6b3-466a-9a7b-f5ff16d2610a",
- "metadata": {},
- "source": [
- "
\n",
- "2. Using Script Table Operator\n",
- "Below is a sample of the data provided, where 'fromuserid' represents the caller and 'touserid' represents the callee.
"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "7142faca-a943-4bef-bb62-1b09db8fff1d",
- "metadata": {},
- "outputs": [],
- "source": [
- "graph_data = DataFrame(in_schema('DEMO_GraphAnalysis', 'graph_data'))"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "97340221-0690-4a0f-814b-5a25c14a5c8a",
- "metadata": {},
- "source": [
- "
\n",
- "Community Detection using Louvain Algorithm
\n",
- "The below cell will perform the following steps:
\n",
- "\n",
- " - Set SEARCHUIFDBPATH to demo_user
\n",
- " - Install the communities.py file on Vantage
\n",
- " - If the file is already installed, it will remove the file and install it again. This ensures we always have latest script in Vantage.
\n",
- "
\n",
- "\n",
- "Code: [communities.py](./communities.py)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "c04d90c3-03d9-4f9b-8fc7-174dc014e5e9",
- "metadata": {},
- "outputs": [],
- "source": [
- "sto = Script(\n",
- " data = graph_data,\n",
- " script_name = 'communities.py',\n",
- " files_local_path = '/home/jovyan/JupyterLabRoot/UseCases/Graph_Analysis/',\n",
- " script_command = 'tdpython3 ./demo_user/communities.py',\n",
- " returns = OrderedDict([('graph_node', INTEGER()), ('community_id', INTEGER())])\n",
- ")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "3a857307-6441-4546-ba2b-3aad2fc68dc8",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "# Install file on Vantage\n",
- "out = execute_sql(\"SET SESSION SEARCHUIFDBPATH = demo_user;\")\n",
- "\n",
- "try:\n",
- " sto.install_file(file_identifier = 'communities', file_name = 'communities.py')\n",
- "except:\n",
- " sto.remove_file(file_identifier = 'communities')\n",
- " sto.install_file(file_identifier = 'communities', file_name = 'communities.py')"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "5916690f-42d6-40b9-a663-669cd0349c1c",
- "metadata": {},
- "source": [
- "The below cell will run the script installed in the above step and store the result in the communities variable.
"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "e797867b-ba7f-48ea-815a-2863140ea646",
- "metadata": {
- "tags": []
- },
- "outputs": [],
- "source": [
- "communities = sto.execute_script()\n",
- "communities"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "f15a9780-7fb3-4994-9d77-0e4d7a6be492",
- "metadata": {},
- "source": [
- "\n",
- "We have a big group of customers, and they all like to talk to each other on the phone. When we talk about communities, we are interested in finding groups of users who are closely connected to each other and interact more frequently among themselves.\n",
- "
\n",
- "
\n",
- "In our usecase, communities would be like different groups of users who are really close and talk to each other frequently. It's like having different circles of friends within the larger group. Some friends may belong to multiple communities if they have connections with people in different groups.
"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "b95dfc16-8e0d-4729-80d7-58d582ab8a75",
- "metadata": {},
- "source": [
- "
\n",
- "Eigenvector Centrality
\n",
- "Eigenvector Centrality is an algorithm that measures the transitive influence of nodes. Relationships originating from high-scoring nodes contribute more to the score of a node than connections from low-scoring nodes. A high eigenvector score means that a node is connected to many nodes who themselves have high scores.\n",
- "
\n",
- "
\n",
- "The below cell will perform the following steps:
\n",
- "\n",
- " - Set SEARCHUIFDBPATH to demo_user
\n",
- " - Install the centralities.py file on Vantage
\n",
- " - If the file is already installed, it will remove the file and install it again. This ensures we always have latest script in Vantage.
\n",
- "
\n",
- "\n",
- "Code: [centralities.py](./centralities.py)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "dce5e556-9cb9-461a-927b-46ed62a24667",
- "metadata": {},
- "outputs": [],
- "source": [
- "sto = Script(\n",
- " data = graph_data,\n",
- " script_name = 'centralities.py',\n",
- " files_local_path = '/home/jovyan/JupyterLabRoot/UseCases/Graph_Analysis/',\n",
- " script_command = 'tdpython3 ./demo_user/centralities.py',\n",
- " returns = OrderedDict([('graph_node', INTEGER()), ('centrality', FLOAT())])\n",
- ")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "1157a8d1-e59c-422b-b7b4-e480e59a259c",
- "metadata": {},
- "outputs": [],
- "source": [
- "# Install file on Vantage\n",
- "out = execute_sql(\"SET SESSION SEARCHUIFDBPATH = demo_user;\")\n",
- "\n",
- "try:\n",
- " sto.install_file(file_identifier = 'centralities', file_name = 'centralities.py')\n",
- "except:\n",
- " sto.remove_file(file_identifier = 'centralities')\n",
- " sto.install_file(file_identifier = 'centralities', file_name = 'centralities.py')"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "07ae94f0-8368-4a44-98d9-c8e26c45679c",
- "metadata": {},
- "source": [
- "The below cell will run the script installed in the above step and store the result in the centralities variable.
"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "ac294e74-30c5-46ed-8f80-1bec9ea20cae",
- "metadata": {},
- "outputs": [],
- "source": [
- "centralities = sto.execute_script()\n",
- "centralities.sort('centrality', ascending = False)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "b51bfbc5-66a4-4eda-b7f8-b5a7bd93abb0",
- "metadata": {},
- "source": [
- "\n",
- "We have a large group of customers, and they to talk to each other on the phone. Some of the customers are very popular and talk to lots of other customers, while others talk to only a few customers. Eigenvector centrality is a way to measure how important or popular each person is in this group based on the phone calls they make. So, in our graph with phone calls, eigenvector centrality helps us identify the people who are most connected to others and who have important connections. These people are considered more influential or popular in the group. This information can be used to efficiently target the the influential users and in turn the respective communities.
"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "4e691b4f-0599-40cc-992d-2d12b02544a1",
- "metadata": {},
- "source": [
- "
\n",
- "Betweenness Centrality
\n",
- "Betweenness centrality is a way of detecting the amount of influence a node has over the flow of information in a graph. It is often used to find nodes that serve as a bridge from one part of a graph to another. The algorithm calculates shortest paths between all pairs of nodes in a graph.\n",
- "
\n",
- "
\n",
- "The below cell will perform the following steps:
\n",
- "\n",
- " - Set SEARCHUIFDBPATH to demo_user
\n",
- " - Install the betweenness.py file on Vantage
\n",
- " - If the file is already installed, it will remove the file and install it again. This ensures we always have latest script in Vantage.
\n",
- "
\n",
- "\n",
- "Code: [betweenness.py](./betweenness.py)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "078b0fbb-f7e8-4dfb-867e-767f0ef410f9",
- "metadata": {},
- "outputs": [],
- "source": [
- "sto = Script(\n",
- " data = graph_data,\n",
- " script_name = 'betweenness.py',\n",
- " files_local_path = '/home/jovyan/JupyterLabRoot/UseCases/Graph_Analysis/',\n",
- " script_command = 'tdpython3 ./demo_user/betweenness.py',\n",
- " returns = OrderedDict([('graph_node', INTEGER()), ('betweenness', FLOAT())])\n",
- ")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "985bf6a9-741b-4e6a-b29a-cbf5e9eb1743",
- "metadata": {},
- "outputs": [],
- "source": [
- "# Install file on Vantage\n",
- "out = execute_sql(\"SET SESSION SEARCHUIFDBPATH = demo_user;\")\n",
- "\n",
- "try:\n",
- " sto.install_file(file_identifier = 'betweenness', file_name = 'betweenness.py')\n",
- "except:\n",
- " sto.remove_file(file_identifier = 'betweenness')\n",
- " sto.install_file(file_identifier = 'betweenness', file_name = 'betweenness.py')"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "446805c5-c139-4e11-ab36-0df3cb5c5bb5",
- "metadata": {},
- "source": [
- "The below cell will run the script installed in the above step and store the result in the betweenness variable.
"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "e9caddb0-2c92-479c-814c-51c05af828fd",
- "metadata": {},
- "outputs": [],
- "source": [
- "betweenness = sto.execute_script()\n",
- "betweenness.sort('betweenness', ascending = False)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "7ae8c526-04af-441f-aa8f-2a8180b49c60",
- "metadata": {},
- "source": [
- "\n",
- "We have a group of customers, and they all like to talk to each other on the phone. Betweenness centrality is a way to measure how important or influential you are in this group based on the phone calls made by everyone. So, if you have a lot of customers who rely on you to connect with each other, it means you have high betweenness centrality. You're like a central hub in the group, helping users communicate and making sure everyone stays connected.\n",
- "
\n",
- "
\n",
- "Betweenness centrality helps us identify the people who are important in keeping the group connected and maintaining communication. They are like the go-to person when someone needs to reach another user. They play a special role in making sure that everyone in the group stays connected and can talk to each other easily.
"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "9bf951e4-016b-4fbf-b82f-29e3aac86ca0",
- "metadata": {},
- "source": [
- "
\n",
- "Closeness Centrality
\n",
- "Closeness centrality is a way of detecting nodes that are able to spread information very efficiently through a graph. The closeness centrality of a node measures its average farness (inverse distance) to all other nodes. Nodes with a high closeness score have the shortest distances to all other nodes.\n",
- "
\n",
- "
\n",
- "The below cell will perform the following steps:
\n",
- "\n",
- " - Set SEARCHUIFDBPATH to demo_user
\n",
- " - Install the closeness.py file on Vantage
\n",
- " - If the file is already installed, it will remove the file and install it again. This ensures we always have latest script in Vantage.
\n",
- "
\n",
- "\n",
- "Code: [closeness.py](./closeness.py)"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "f2b2f20b-67a6-4d78-b242-a19dcbad333a",
- "metadata": {},
- "outputs": [],
- "source": [
- "sto = Script(\n",
- " data = graph_data,\n",
- " script_name = 'closeness.py',\n",
- " files_local_path = '/home/jovyan/JupyterLabRoot/UseCases/Graph_Analysis/',\n",
- " script_command = 'tdpython3 ./demo_user/closeness.py',\n",
- " returns = OrderedDict([('graph_node', INTEGER()), ('closeness', FLOAT())])\n",
- ")"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "9c530787-b523-43b1-9fd8-90a8e0df904f",
- "metadata": {},
- "outputs": [],
- "source": [
- "# Install file on Vantage\n",
- "out = execute_sql(\"SET SESSION SEARCHUIFDBPATH = demo_user;\")\n",
- "\n",
- "try:\n",
- " sto.install_file(file_identifier = 'closeness', file_name = 'closeness.py')\n",
- "except:\n",
- " sto.remove_file(file_identifier = 'closeness')\n",
- " sto.install_file(file_identifier = 'closeness', file_name = 'closeness.py')"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "56692b56-b859-4091-9152-c6b49420fd5d",
- "metadata": {},
- "source": [
- "The below cell will run the script installed in the above step and store the result in the closeness variable.
"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "90deea2e-9a1d-4f15-8823-c4a7d53d743b",
- "metadata": {},
- "outputs": [],
- "source": [
- "closeness = sto.execute_script()\n",
- "closeness.sort('closeness', ascending = False)"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "418b16e6-6090-4169-ad45-066df91b56d3",
- "metadata": {},
- "source": [
- "\n",
- "We have a group of customers, and you all enjoy talking to each other on the phone. Closeness centrality is a way to measure how close or connected you are to all users in the group. When we talk about closeness centrality, we are interested in figuring out how quickly you can reach all the users when you make a phone call. If you can reach the users easily and quickly, then you have high closeness centrality.\n",
- "
\n",
- "
\n",
- "Think of it like this: Imagine you want to invite all your friends to a party at your house. If your house is located in the middle of the neighborhood and all your friends live nearby, it will be easy and quick for them to come to your party. In the same way, if you can reach most of your friends with just a few phone calls, it means you are centrally located in the group and have high closeness centrality.\n",
- "
"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "2b5b1d8b-18b5-4236-b05e-83d265030882",
- "metadata": {},
- "source": [
- "
\n",
- "3. Visualization"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "a9b00e2a-6c38-4fb5-8424-b591dc0c5ec6",
- "metadata": {},
- "outputs": [],
- "source": [
- "%matplotlib widget"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "8f00c200-60d3-473e-9a5f-30613f946a62",
- "metadata": {},
- "outputs": [],
- "source": [
- "G = nx.from_pandas_edgelist(\n",
- " DataFrame(in_schema('DEMO_GraphAnalysis', 'graph_data')).to_pandas().reset_index(),\n",
- " source = 'fromuserid',\n",
- " target = 'touserid',\n",
- " create_using = nx.Graph()\n",
- ")\n",
- "centrality = centralities.to_pandas().reset_index()\n",
- "cdf = communities.to_pandas().reset_index()\n",
- "\n",
- "# Define the leader nodes\n",
- "df_sorted = communities.merge(\n",
- " centralities,\n",
- " left_on = 'graph_node',\n",
- " right_on = 'graph_node',\n",
- " how = 'inner',\n",
- " lsuffix = 'community',\n",
- " rsuffix = 'centrality'\n",
- ").to_pandas().sort_values('centrality', ascending = False)\n",
- "\n",
- "leader_nodes = df_sorted.groupby('community_id')['graph_node_community'].first().tolist()\n",
- "\n",
- "# Draw the graph with nodes colored based on communities\n",
- "pos = nx.spring_layout(G)\n",
- "cmap = plt.get_cmap('tab10') # Color map for communities\n",
- "\n",
- "plt.figure(figsize=(10, 6))\n",
- "\n",
- "# Create a dictionary to store community colors\n",
- "community_colors = {}\n",
- "\n",
- "init_nodes = nx.draw_networkx_nodes(G, pos, node_size=200)\n",
- "\n",
- "for community_id in set(cdf.community_id):\n",
- " nodes = list(cdf[cdf['community_id'] == community_id].graph_node)\n",
- "\n",
- " # Check if a node is a leader node\n",
- " node_sizes = [800 if node in leader_nodes else 200 for node in nodes]\n",
- " node_colors = [cmap(community_id) for node in nodes] \n",
- " scatter = nx.draw_networkx_nodes(G, pos, nodelist = nodes, node_color = node_colors, node_size = node_sizes)\n",
- "\n",
- " # Store community color in the dictionary\n",
- " community_colors[community_id] = scatter.get_facecolor()[0]\n",
- "\n",
- "nx.draw_networkx_edges(G, pos, width=0.5, alpha=0.5)\n",
- "\n",
- "# Create a custom legend with community colors\n",
- "legend_labels = [f'Community {community_id}' for community_id in set(cdf.community_id)]\n",
- "custom_legend = [plt.Line2D([], [], marker='o', markersize=10, color=community_colors[community_id], linestyle='', label=label) for community_id, label in zip(set(cdf.community_id), legend_labels)]\n",
- "plt.legend(handles=custom_legend)\n",
- "\n",
- "plt.title('Graph with Communities')\n",
- "plt.axis('off')\n",
- "\n",
- "# Add hover functionality to nodes\n",
- "def update_annot(sel):\n",
- " node_index = sel.target.index\n",
- " node_name = list(G.nodes)[node_index]\n",
- " text = 'Customer ID: ' + str(node_name) + '\\n EigenVector Centrality Score: ' + str(max(centrality[centrality['graph_node'] == node_name]['centrality']))\n",
- " sel.annotation.set_text(text)\n",
- "\n",
- "cursor = mplcursors.cursor(init_nodes, hover=True)\n",
- "cursor.connect('add', update_annot)\n",
- "\n",
- "plt.show()"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "ad1c330f-85d2-4022-88bd-acb4700d52a8",
- "metadata": {},
- "source": [
- "\n",
- "
Note: Please hover over the nodes to see additional information.
\n",
- "
"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "5db12928-d747-4f04-bcd3-671092f1aab7",
- "metadata": {},
- "source": [
- "The above graph displays the data in graph format. On hovering on the node, you might see Customer ID and the EigenVector Centrality Score i.e., the influence score. The larger nodes are influential and are connected to other influential nodes. These are the leader nodes of the respective communities.\n",
- "
\n",
- "
Targeting the leader of the communities in a telecom dataset can provide several benefits to a telecom company. Here are some ways it can help:
\n",
- "\n",
- "\n",
- " - Influencing the community: Leaders of communities often hold significant influence over their members. By targeting and engaging with these leaders, a telecom company can leverage their influence to promote their products or services within the community. This can lead to increased brand awareness, customer acquisition, and loyalty.
\n",
- " \n",
- " - Word-of-mouth marketing: Community leaders are typically respected and trusted individuals within their communities. When they endorse or recommend a telecom company's offerings, it can have a powerful impact on the community members' decisions. This word-of-mouth marketing can result in positive brand perception, organic referrals, and a higher likelihood of community members becoming customers.
\n",
- " \n",
- " - Feedback and insights: Community leaders often have a deep understanding of their community's needs, preferences, and pain points. By establishing a relationship with these leaders, a telecom company can gain valuable insights and feedback about their products, services, and customer experiences. This information can be used to refine offerings, address specific community concerns, and tailor marketing strategies to better serve the target audience.
\n",
- " \n",
- " - Co-creation and partnerships: Collaborating with community leaders can lead to mutually beneficial partnerships. Telecom companies can work with community leaders to co-create content, events, or initiatives that cater to the community's interests. This collaborative approach enhances engagement, fosters a sense of community, and establishes the telecom company as a trusted ally within the community.
\n",
- " \n",
- " - Customer retention and satisfaction: Targeting community leaders allows a telecom company to prioritize customer satisfaction and address any issues or concerns promptly. By providing personalized support and assistance to these influential individuals, the company can improve overall customer experience, potentially leading to higher customer retention rates within the community.
\n",
- "
\n",
- "\n",
- "In summary, targeting leaders of telecom communities enables a company to tap into their influence, leverage word-of-mouth marketing, gain valuable insights, foster partnerships, and enhance customer satisfaction. These efforts can result in increased brand visibility, customer acquisition, and long-term success for the telecom company.
\n"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "4a811fed-0b58-4770-ae1f-aae0e6b9be23",
- "metadata": {},
- "source": [
- "
\n",
- "4. Cleanup"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "23c6332a-22e7-422b-a121-0b68540ca5e8",
- "metadata": {},
- "source": [
- " Databases and Tables
\n",
- "The following code will clean up tables and databases created above.
"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "9472c2ca-65d0-4116-9c45-6411fcea52ef",
- "metadata": {},
- "outputs": [],
- "source": [
- "%run -i ../run_procedure.py \"call remove_data('DEMO_GraphAnalysis');\" # Takes 5 seconds"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "adjacent-romantic",
- "metadata": {},
- "outputs": [],
- "source": [
- "remove_context()"
- ]
- },
- {
- "cell_type": "markdown",
- "id": "592b1e85-5365-43e1-a7cc-e07eb1a35210",
- "metadata": {},
- "source": [
- ""
- ]
- }
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3 (ipykernel)",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.9.10"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 5
-}
From 416aaf5ca55a6918763c7c33d2587046a140ef37 Mon Sep 17 00:00:00 2001
From: Mohd Hassan <109505395+Mohd-Hassan@users.noreply.github.com>
Date: Wed, 29 Oct 2025 20:46:35 +0530
Subject: [PATCH 2/3] Add interactive hover tooltips and improve graph
formatting.
1.Implemented mplcursors hover functionality to show Customer ID, community, and centrality scores.
2.Improved layout consistency, legend formatting, and overall graph readability.
---
.../Graph_Analysis_PY_SQL.ipynb | 801 ++++++++++++++++++
1 file changed, 801 insertions(+)
create mode 100644 UseCases/Graph_Analysis/Graph_Analysis_PY_SQL.ipynb
diff --git a/UseCases/Graph_Analysis/Graph_Analysis_PY_SQL.ipynb b/UseCases/Graph_Analysis/Graph_Analysis_PY_SQL.ipynb
new file mode 100644
index 00000000..1ef237ee
--- /dev/null
+++ b/UseCases/Graph_Analysis/Graph_Analysis_PY_SQL.ipynb
@@ -0,0 +1,801 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "5f462bad-9f88-4416-a3de-bae6324319cb",
+ "metadata": {},
+ "source": [
+ "\n",
+ " \n",
+ " Graph Analysis using Script Table Operator(STO)\n",
+ "
\n",
+ "
\n",
+ "
\n",
+ ""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "05bc242a-7ee9-4b33-b837-ff52a9cd76d2",
+ "metadata": {},
+ "source": [
+ "Introduction
\n",
+ "\n",
+ "Call Detail Records (CDRs) contain valuable information about communication patterns and interactions between users. By leveraging community detection algorithms on CDR data, businesses can gain insights into the underlying network structure and uncover meaningful communities within their user base.
\n",
+ "\n",
+ "The objective of this analysis is to identify distinct communities or groups of users within the CDR network. Communities are like smaller social circles or friend groups within the larger group of friends. It helps us understand how people naturally form different clusters based on their interactions and relationships. This analysis also identifies influential people in the graph.\n",
+ "
\n",
+ "
\n",
+ "By grouping users into communities based on their calling patterns, the business can better understand the dynamics and relationships among users, leading to several potential applications and benefits like Customer Segmentation, Fraud Detection, Network Optimization, Cross-Selling and Upselling Opportunities, Customer Support and Retention, etc.\n",
+ "
\n",
+ "\n",
+ "In this demo, we'll be using Script Table Operator(STO) to execute custom python scripts on Vantage. The STO operates by executing R and Python scripts from the command line of the Advanced SQL Engine underlying operating system, according to\n",
+ "the following sequence:\n",
+ "
\n",
+ "\n",
+ "\n",
+ " - The language script is installed on the Advanced SQL Engine of the target Teradata Vantage system via a call to\n",
+ "an External Stored Procedure (XSP)
\n",
+ " - The script is invoked by executing a SQL query that calls the STO
\n",
+ " - Each Advanced SQL Engine AMP provides its own portion of input table data, if any, to the script. The script\n",
+ "reads its input from the standard input STDIN
\n",
+ " - Each Advanced SQL Engine AMP runs a different instance of the same script. Hence, the script execution is an\n",
+ "operation that scales through system architecture as the same script is run concurrently on all AMPs
\n",
+ " - The script executes its code and sends its results to the standard output STDOUT. Each Advanced SQL Engine\n",
+ "AMP individually picks up the corresponding script instance results, and returns them to the STO
\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "92c9fc4a-42bb-454a-8f7e-5b0983004501",
+ "metadata": {},
+ "source": [
+ "
\n",
+ "Downloading and installing additional software needed"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c2c074f7-9866-4af0-b822-434435c361a2",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%capture\n",
+ "# '%%capture' suppresses the display of installation steps of the following packages\n",
+ "!pip install python-louvain\n",
+ "!pip install mplcursors\n",
+ "!pip install ipympl"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d035bbaa-84c0-4401-af88-8bff031caa9d",
+ "metadata": {},
+ "source": [
+ "
\n",
+ "
Note: The above statements may need to be uncommented if you run the notebooks on a platform other than ClearScape Analytics Experience that does not have the libraries installed. If you uncomment those installs, be sure to restart the kernel after executing those lines to bring the installed libraries into memory. The simplest way to restart the Kernel is by typing zero zero: 0 0
\n",
+ "
\n",
+ "\n",
+ "Here, we import the required libraries, set environment variables and environment paths (if required).
"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "01e63639-879d-4c8c-afd4-981d8aa49f79",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "import pandas as pd\n",
+ "import networkx as nx\n",
+ "from teradataml import *\n",
+ "import teradataml\n",
+ "\n",
+ "import community\n",
+ "import matplotlib.pyplot as plt\n",
+ "import warnings\n",
+ "import mplcursors\n",
+ "\n",
+ "# Suppress warnings\n",
+ "warnings.filterwarnings('ignore')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ced12adf-b2c4-49aa-b5c8-67dd090ff080",
+ "metadata": {},
+ "source": [
+ "
\n",
+ "1. Connect to Vantage\n",
+ "You will be prompted to provide the password. Enter your password, press the Enter key, and then use the down arrow to go to the next cell.
"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "3d506591-4e0e-4139-bb6a-b96f41377227",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%run -i ../startup.ipynb\n",
+ "eng = create_context(host = 'host.docker.internal', username='demo_user', password = password)\n",
+ "print(eng)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e7579c1f-eafd-4dc4-9ba9-f1396da4cde6",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%%capture\n",
+ "execute_sql('''SET query_band='DEMO=Graph_Analysis_PY_SQL.ipynb;' UPDATE FOR SESSION; ''')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "3732eef0-96bd-417b-95e3-403c1999afa8",
+ "metadata": {},
+ "source": [
+ "Begin running steps with Shift + Enter keys.
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "1a96f1c6-cb43-440f-9682-a851fe09165f",
+ "metadata": {},
+ "source": [
+ "Getting Data for This Demo
\n",
+ "We have provided data for this demo on cloud storage. You can either run the demo using foreign tables to access the data without any storage on your environment or download the data to local storage, which may yield faster execution. Still, there could be considerations of available storage. Two statements are in the following cell, and one is commented out. You may switch which mode you choose by changing the comment string.
"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "57a490a6-75b8-4d4d-8986-46b03fc57081",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# %run -i ../run_procedure.py \"call get_data('DEMO_GraphAnalysis_cloud');\" # Takes 30 seconds\n",
+ "%run -i ../run_procedure.py \"call get_data('DEMO_GraphAnalysis_local');\" # Takes 1 minute"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "658a3111-ab72-451c-85c2-de0a5f7a0188",
+ "metadata": {},
+ "source": [
+ "Next is an optional step – if you want to see the status of databases/tables created and space used.
"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "34c657c8-2122-4426-9d38-c5a619f7817a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%run -i ../run_procedure.py \"call space_report();\" # Takes 10 seconds"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e601e5e6-c6b3-466a-9a7b-f5ff16d2610a",
+ "metadata": {},
+ "source": [
+ "
\n",
+ "2. Using Script Table Operator\n",
+ "Below is a sample of the data provided, where 'fromuserid' represents the caller and 'touserid' represents the callee.
"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "7142faca-a943-4bef-bb62-1b09db8fff1d",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "graph_data = DataFrame(in_schema('DEMO_GraphAnalysis', 'graph_data'))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "97340221-0690-4a0f-814b-5a25c14a5c8a",
+ "metadata": {},
+ "source": [
+ "
\n",
+ "Community Detection using Louvain Algorithm
\n",
+ "The below cell will perform the following steps:
\n",
+ "\n",
+ " - Set SEARCHUIFDBPATH to demo_user
\n",
+ " - Install the communities.py file on Vantage
\n",
+ " - If the file is already installed, it will remove the file and install it again. This ensures we always have latest script in Vantage.
\n",
+ "
\n",
+ "\n",
+ "Code: [communities.py](./communities.py)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "c04d90c3-03d9-4f9b-8fc7-174dc014e5e9",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sto = Script(\n",
+ " data = graph_data,\n",
+ " script_name = 'communities.py',\n",
+ " files_local_path = '/home/jovyan/JupyterLabRoot/UseCases/Graph_Analysis/',\n",
+ " script_command = 'tdpython3 ./demo_user/communities.py',\n",
+ " returns = OrderedDict([('graph_node', INTEGER()), ('community_id', INTEGER())])\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "3a857307-6441-4546-ba2b-3aad2fc68dc8",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "# Install file on Vantage\n",
+ "out = execute_sql(\"SET SESSION SEARCHUIFDBPATH = demo_user;\")\n",
+ "\n",
+ "try:\n",
+ " sto.install_file(file_identifier = 'communities', file_name = 'communities.py')\n",
+ "except:\n",
+ " sto.remove_file(file_identifier = 'communities')\n",
+ " sto.install_file(file_identifier = 'communities', file_name = 'communities.py')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "5916690f-42d6-40b9-a663-669cd0349c1c",
+ "metadata": {},
+ "source": [
+ "The below cell will run the script installed in the above step and store the result in the communities variable.
"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e797867b-ba7f-48ea-815a-2863140ea646",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [],
+ "source": [
+ "communities = sto.execute_script()\n",
+ "communities"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f15a9780-7fb3-4994-9d77-0e4d7a6be492",
+ "metadata": {},
+ "source": [
+ "\n",
+ "We have a big group of customers, and they all like to talk to each other on the phone. When we talk about communities, we are interested in finding groups of users who are closely connected to each other and interact more frequently among themselves.\n",
+ "
\n",
+ "
\n",
+ "In our usecase, communities would be like different groups of users who are really close and talk to each other frequently. It's like having different circles of friends within the larger group. Some friends may belong to multiple communities if they have connections with people in different groups.
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b95dfc16-8e0d-4729-80d7-58d582ab8a75",
+ "metadata": {},
+ "source": [
+ "
\n",
+ "Eigenvector Centrality
\n",
+ "Eigenvector Centrality is an algorithm that measures the transitive influence of nodes. Relationships originating from high-scoring nodes contribute more to the score of a node than connections from low-scoring nodes. A high eigenvector score means that a node is connected to many nodes who themselves have high scores.\n",
+ "
\n",
+ "
\n",
+ "The below cell will perform the following steps:
\n",
+ "\n",
+ " - Set SEARCHUIFDBPATH to demo_user
\n",
+ " - Install the centralities.py file on Vantage
\n",
+ " - If the file is already installed, it will remove the file and install it again. This ensures we always have latest script in Vantage.
\n",
+ "
\n",
+ "\n",
+ "Code: [centralities.py](./centralities.py)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "dce5e556-9cb9-461a-927b-46ed62a24667",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sto = Script(\n",
+ " data = graph_data,\n",
+ " script_name = 'centralities.py',\n",
+ " files_local_path = '/home/jovyan/JupyterLabRoot/UseCases/Graph_Analysis/',\n",
+ " script_command = 'tdpython3 ./demo_user/centralities.py',\n",
+ " returns = OrderedDict([('graph_node', INTEGER()), ('centrality', FLOAT())])\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "1157a8d1-e59c-422b-b7b4-e480e59a259c",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Install file on Vantage\n",
+ "out = execute_sql(\"SET SESSION SEARCHUIFDBPATH = demo_user;\")\n",
+ "\n",
+ "try:\n",
+ " sto.install_file(file_identifier = 'centralities', file_name = 'centralities.py')\n",
+ "except:\n",
+ " sto.remove_file(file_identifier = 'centralities')\n",
+ " sto.install_file(file_identifier = 'centralities', file_name = 'centralities.py')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "07ae94f0-8368-4a44-98d9-c8e26c45679c",
+ "metadata": {},
+ "source": [
+ "The below cell will run the script installed in the above step and store the result in the centralities variable.
"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "ac294e74-30c5-46ed-8f80-1bec9ea20cae",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "centralities = sto.execute_script()\n",
+ "centralities.sort('centrality', ascending = False)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "b51bfbc5-66a4-4eda-b7f8-b5a7bd93abb0",
+ "metadata": {},
+ "source": [
+ "\n",
+ "We have a large group of customers, and they to talk to each other on the phone. Some of the customers are very popular and talk to lots of other customers, while others talk to only a few customers. Eigenvector centrality is a way to measure how important or popular each person is in this group based on the phone calls they make. So, in our graph with phone calls, eigenvector centrality helps us identify the people who are most connected to others and who have important connections. These people are considered more influential or popular in the group. This information can be used to efficiently target the the influential users and in turn the respective communities.
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4e691b4f-0599-40cc-992d-2d12b02544a1",
+ "metadata": {},
+ "source": [
+ "
\n",
+ "Betweenness Centrality
\n",
+ "Betweenness centrality is a way of detecting the amount of influence a node has over the flow of information in a graph. It is often used to find nodes that serve as a bridge from one part of a graph to another. The algorithm calculates shortest paths between all pairs of nodes in a graph.\n",
+ "
\n",
+ "
\n",
+ "The below cell will perform the following steps:
\n",
+ "\n",
+ " - Set SEARCHUIFDBPATH to demo_user
\n",
+ " - Install the betweenness.py file on Vantage
\n",
+ " - If the file is already installed, it will remove the file and install it again. This ensures we always have latest script in Vantage.
\n",
+ "
\n",
+ "\n",
+ "Code: [betweenness.py](./betweenness.py)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "078b0fbb-f7e8-4dfb-867e-767f0ef410f9",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sto = Script(\n",
+ " data = graph_data,\n",
+ " script_name = 'betweenness.py',\n",
+ " files_local_path = '/home/jovyan/JupyterLabRoot/UseCases/Graph_Analysis/',\n",
+ " script_command = 'tdpython3 ./demo_user/betweenness.py',\n",
+ " returns = OrderedDict([('graph_node', INTEGER()), ('betweenness', FLOAT())])\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "985bf6a9-741b-4e6a-b29a-cbf5e9eb1743",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Install file on Vantage\n",
+ "out = execute_sql(\"SET SESSION SEARCHUIFDBPATH = demo_user;\")\n",
+ "\n",
+ "try:\n",
+ " sto.install_file(file_identifier = 'betweenness', file_name = 'betweenness.py')\n",
+ "except:\n",
+ " sto.remove_file(file_identifier = 'betweenness')\n",
+ " sto.install_file(file_identifier = 'betweenness', file_name = 'betweenness.py')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "446805c5-c139-4e11-ab36-0df3cb5c5bb5",
+ "metadata": {},
+ "source": [
+ "The below cell will run the script installed in the above step and store the result in the betweenness variable.
"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "e9caddb0-2c92-479c-814c-51c05af828fd",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "betweenness = sto.execute_script()\n",
+ "betweenness.sort('betweenness', ascending = False)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "7ae8c526-04af-441f-aa8f-2a8180b49c60",
+ "metadata": {},
+ "source": [
+ "\n",
+ "We have a group of customers, and they all like to talk to each other on the phone. Betweenness centrality is a way to measure how important or influential you are in this group based on the phone calls made by everyone. So, if you have a lot of customers who rely on you to connect with each other, it means you have high betweenness centrality. You're like a central hub in the group, helping users communicate and making sure everyone stays connected.\n",
+ "
\n",
+ "
\n",
+ "Betweenness centrality helps us identify the people who are important in keeping the group connected and maintaining communication. They are like the go-to person when someone needs to reach another user. They play a special role in making sure that everyone in the group stays connected and can talk to each other easily.
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9bf951e4-016b-4fbf-b82f-29e3aac86ca0",
+ "metadata": {},
+ "source": [
+ "
\n",
+ "Closeness Centrality
\n",
+ "Closeness centrality is a way of detecting nodes that are able to spread information very efficiently through a graph. The closeness centrality of a node measures its average farness (inverse distance) to all other nodes. Nodes with a high closeness score have the shortest distances to all other nodes.\n",
+ "
\n",
+ "
\n",
+ "The below cell will perform the following steps:
\n",
+ "\n",
+ " - Set SEARCHUIFDBPATH to demo_user
\n",
+ " - Install the closeness.py file on Vantage
\n",
+ " - If the file is already installed, it will remove the file and install it again. This ensures we always have latest script in Vantage.
\n",
+ "
\n",
+ "\n",
+ "Code: [closeness.py](./closeness.py)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "f2b2f20b-67a6-4d78-b242-a19dcbad333a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "sto = Script(\n",
+ " data = graph_data,\n",
+ " script_name = 'closeness.py',\n",
+ " files_local_path = '/home/jovyan/JupyterLabRoot/UseCases/Graph_Analysis/',\n",
+ " script_command = 'tdpython3 ./demo_user/closeness.py',\n",
+ " returns = OrderedDict([('graph_node', INTEGER()), ('closeness', FLOAT())])\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "9c530787-b523-43b1-9fd8-90a8e0df904f",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Install file on Vantage\n",
+ "out = execute_sql(\"SET SESSION SEARCHUIFDBPATH = demo_user;\")\n",
+ "\n",
+ "try:\n",
+ " sto.install_file(file_identifier = 'closeness', file_name = 'closeness.py')\n",
+ "except:\n",
+ " sto.remove_file(file_identifier = 'closeness')\n",
+ " sto.install_file(file_identifier = 'closeness', file_name = 'closeness.py')"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "56692b56-b859-4091-9152-c6b49420fd5d",
+ "metadata": {},
+ "source": [
+ "The below cell will run the script installed in the above step and store the result in the closeness variable.
"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "90deea2e-9a1d-4f15-8823-c4a7d53d743b",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "closeness = sto.execute_script()\n",
+ "closeness.sort('closeness', ascending = False)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "418b16e6-6090-4169-ad45-066df91b56d3",
+ "metadata": {},
+ "source": [
+ "\n",
+ "We have a group of customers, and you all enjoy talking to each other on the phone. Closeness centrality is a way to measure how close or connected you are to all users in the group. When we talk about closeness centrality, we are interested in figuring out how quickly you can reach all the users when you make a phone call. If you can reach the users easily and quickly, then you have high closeness centrality.\n",
+ "
\n",
+ "
\n",
+ "Think of it like this: Imagine you want to invite all your friends to a party at your house. If your house is located in the middle of the neighborhood and all your friends live nearby, it will be easy and quick for them to come to your party. In the same way, if you can reach most of your friends with just a few phone calls, it means you are centrally located in the group and have high closeness centrality.\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "2b5b1d8b-18b5-4236-b05e-83d265030882",
+ "metadata": {},
+ "source": [
+ "
\n",
+ "3. Visualization"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "13ab2a61-3575-4714-83c1-ad8499c34ba0",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import ipympl\n",
+ "import plotly.io as pio\n",
+ "pio.renderers.default = 'notebook'"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "a9b00e2a-6c38-4fb5-8424-b591dc0c5ec6",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%matplotlib inline"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "4cde90b6-81cc-49eb-96f8-dc5e4b18a941",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%matplotlib widget"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "dd68f930-2bb9-4c56-a94b-a518be9384e0",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import matplotlib.pyplot as plt\n",
+ "import networkx as nx\n",
+ "import mplcursors\n",
+ "from teradataml import DataFrame, in_schema\n",
+ "\n",
+ "# Load data from Teradata to pandas DataFrames\n",
+ "graph_df = DataFrame(in_schema('DEMO_GraphAnalysis', 'graph_data')).to_pandas().reset_index()\n",
+ "centrality = centralities.to_pandas().reset_index()\n",
+ "cdf = communities.to_pandas().reset_index()\n",
+ "\n",
+ "# Build graph from edge list\n",
+ "G = nx.from_pandas_edgelist(\n",
+ " graph_df,\n",
+ " source='fromuserid',\n",
+ " target='touserid',\n",
+ " create_using=nx.Graph()\n",
+ ")\n",
+ "\n",
+ "# Determine leader nodes (highest centrality per community)\n",
+ "df_sorted = communities.merge(\n",
+ " centralities,\n",
+ " left_on='graph_node',\n",
+ " right_on='graph_node',\n",
+ " how='inner',\n",
+ " lsuffix='community',\n",
+ " rsuffix='centrality'\n",
+ ").to_pandas().sort_values('centrality', ascending=False)\n",
+ "\n",
+ "leader_nodes = df_sorted.groupby('community_id')['graph_node_community'].first().tolist()\n",
+ "\n",
+ "# Create a mapping from node → community for hover\n",
+ "node_community_map = dict(zip(cdf.graph_node, cdf.community_id))\n",
+ "\n",
+ "# Plot setup\n",
+ "pos = nx.spring_layout(G, seed=42) # fixed seed for consistent layout\n",
+ "cmap = plt.get_cmap('tab10')\n",
+ "\n",
+ "plt.figure(figsize=(10, 6))\n",
+ "\n",
+ "community_colors = {}\n",
+ "scatter_plots = []\n",
+ "\n",
+ "# Draw community nodes separately to assign colors and sizes\n",
+ "for community_id in sorted(set(cdf.community_id)):\n",
+ " nodes = list(cdf[cdf['community_id'] == community_id].graph_node)\n",
+ " node_sizes = [800 if node in leader_nodes else 200 for node in nodes]\n",
+ " node_colors = [cmap(community_id) for _ in nodes]\n",
+ "\n",
+ " scatter = nx.draw_networkx_nodes(\n",
+ " G, pos,\n",
+ " nodelist=nodes,\n",
+ " node_size=node_sizes,\n",
+ " node_color=node_colors,\n",
+ " label=f'Community {community_id}'\n",
+ " )\n",
+ " scatter_plots.append((scatter, nodes))\n",
+ " community_colors[community_id] = scatter.get_facecolor()[0]\n",
+ "\n",
+ "# Draw edges\n",
+ "nx.draw_networkx_edges(G, pos, alpha=0.5, width=0.5)\n",
+ "\n",
+ "# Create legend with uniform marker size\n",
+ "legend_labels = [f'Community {cid}' for cid in sorted(set(cdf.community_id))]\n",
+ "custom_legend = [plt.Line2D([], [], marker='o', color=community_colors[cid],\n",
+ " linestyle='', markersize=10, label=label)\n",
+ " for cid, label in zip(sorted(set(cdf.community_id)), legend_labels)]\n",
+ "plt.legend(handles=custom_legend, loc='best')\n",
+ "\n",
+ "plt.title('Graph with Communities')\n",
+ "plt.axis('off')\n",
+ "\n",
+ "# Hover function factory\n",
+ "def make_update_annot(scatter, nodes):\n",
+ " def update_annot(sel):\n",
+ " node_index = sel.index\n",
+ " node_name = nodes[node_index]\n",
+ " community_id = node_community_map[node_name]\n",
+ " centrality_score = max(centrality[centrality['graph_node'] == node_name]['centrality'])\n",
+ " sel.annotation.set_text(\n",
+ " f'Customer ID: {node_name}\\nCommunity: {community_id}\\nEigenVector Centrality Score: {centrality_score:.4f}'\n",
+ " )\n",
+ " sel.annotation.get_bbox_patch().set_alpha(0.9) # slightly transparent background\n",
+ " return update_annot\n",
+ "\n",
+ "# Attach cursor with hover to each community scatter\n",
+ "for scatter, nodes in scatter_plots:\n",
+ " cursor = mplcursors.cursor(scatter, hover=True)\n",
+ " cursor.connect('add', make_update_annot(scatter, nodes))\n",
+ "\n",
+ "plt.show()\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ad1c330f-85d2-4022-88bd-acb4700d52a8",
+ "metadata": {},
+ "source": [
+ "\n",
+ "
Note: Please hover over the nodes to see additional information.
\n",
+ "
"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "8e316db8-c74b-4da8-976b-1824e284ff0d",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from IPython.display import clear_output\n",
+ "plt.close('all')\n",
+ "clear_output(wait=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "5db12928-d747-4f04-bcd3-671092f1aab7",
+ "metadata": {},
+ "source": [
+ "The above graph displays the data in graph format. On hovering on the node, you might see Customer ID and the EigenVector Centrality Score i.e., the influence score. The larger nodes are influential and are connected to other influential nodes. These are the leader nodes of the respective communities.\n",
+ "
\n",
+ "
Targeting the leader of the communities in a telecom dataset can provide several benefits to a telecom company. Here are some ways it can help:
\n",
+ "\n",
+ "\n",
+ " - Influencing the community: Leaders of communities often hold significant influence over their members. By targeting and engaging with these leaders, a telecom company can leverage their influence to promote their products or services within the community. This can lead to increased brand awareness, customer acquisition, and loyalty.
\n",
+ " \n",
+ " - Word-of-mouth marketing: Community leaders are typically respected and trusted individuals within their communities. When they endorse or recommend a telecom company's offerings, it can have a powerful impact on the community members' decisions. This word-of-mouth marketing can result in positive brand perception, organic referrals, and a higher likelihood of community members becoming customers.
\n",
+ " \n",
+ " - Feedback and insights: Community leaders often have a deep understanding of their community's needs, preferences, and pain points. By establishing a relationship with these leaders, a telecom company can gain valuable insights and feedback about their products, services, and customer experiences. This information can be used to refine offerings, address specific community concerns, and tailor marketing strategies to better serve the target audience.
\n",
+ " \n",
+ " - Co-creation and partnerships: Collaborating with community leaders can lead to mutually beneficial partnerships. Telecom companies can work with community leaders to co-create content, events, or initiatives that cater to the community's interests. This collaborative approach enhances engagement, fosters a sense of community, and establishes the telecom company as a trusted ally within the community.
\n",
+ " \n",
+ " - Customer retention and satisfaction: Targeting community leaders allows a telecom company to prioritize customer satisfaction and address any issues or concerns promptly. By providing personalized support and assistance to these influential individuals, the company can improve overall customer experience, potentially leading to higher customer retention rates within the community.
\n",
+ "
\n",
+ "\n",
+ "In summary, targeting leaders of telecom communities enables a company to tap into their influence, leverage word-of-mouth marketing, gain valuable insights, foster partnerships, and enhance customer satisfaction. These efforts can result in increased brand visibility, customer acquisition, and long-term success for the telecom company.
\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4a811fed-0b58-4770-ae1f-aae0e6b9be23",
+ "metadata": {},
+ "source": [
+ "
\n",
+ "4. Cleanup"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "23c6332a-22e7-422b-a121-0b68540ca5e8",
+ "metadata": {},
+ "source": [
+ " Databases and Tables
\n",
+ "The following code will clean up tables and databases created above.
"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "9472c2ca-65d0-4116-9c45-6411fcea52ef",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "%run -i ../run_procedure.py \"call remove_data('DEMO_GraphAnalysis');\" # Takes 5 seconds"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "id": "adjacent-romantic",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "remove_context()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "592b1e85-5365-43e1-a7cc-e07eb1a35210",
+ "metadata": {},
+ "source": [
+ ""
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.11.14"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
From 0a6dbe2935123031c7547d09705459886152e423 Mon Sep 17 00:00:00 2001
From: Mohd-Hassan
Date: Fri, 7 Nov 2025 06:39:57 +0000
Subject: [PATCH 3/3] Address Visuals for PR review comments for #1025
---
.../Graph_Analysis_PY_SQL.ipynb | 205 +++++++-----------
1 file changed, 82 insertions(+), 123 deletions(-)
diff --git a/UseCases/Graph_Analysis/Graph_Analysis_PY_SQL.ipynb b/UseCases/Graph_Analysis/Graph_Analysis_PY_SQL.ipynb
index 1ef237ee..245bc87b 100644
--- a/UseCases/Graph_Analysis/Graph_Analysis_PY_SQL.ipynb
+++ b/UseCases/Graph_Analysis/Graph_Analysis_PY_SQL.ipynb
@@ -64,9 +64,7 @@
"source": [
"%%capture\n",
"# '%%capture' suppresses the display of installation steps of the following packages\n",
- "!pip install python-louvain\n",
- "!pip install mplcursors\n",
- "!pip install ipympl"
+ "!pip install python-louvain"
]
},
{
@@ -95,9 +93,7 @@
"import teradataml\n",
"\n",
"import community\n",
- "import matplotlib.pyplot as plt\n",
"import warnings\n",
- "import mplcursors\n",
"\n",
"# Suppress warnings\n",
"warnings.filterwarnings('ignore')"
@@ -549,53 +545,18 @@
{
"cell_type": "code",
"execution_count": null,
- "id": "13ab2a61-3575-4714-83c1-ad8499c34ba0",
+ "id": "e49df417-7d32-43ab-84ac-21d9acee4e50",
"metadata": {},
"outputs": [],
"source": [
- "import ipympl\n",
- "import plotly.io as pio\n",
- "pio.renderers.default = 'notebook'"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "a9b00e2a-6c38-4fb5-8424-b591dc0c5ec6",
- "metadata": {},
- "outputs": [],
- "source": [
- "%matplotlib inline"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "4cde90b6-81cc-49eb-96f8-dc5e4b18a941",
- "metadata": {},
- "outputs": [],
- "source": [
- "%matplotlib widget"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "dd68f930-2bb9-4c56-a94b-a518be9384e0",
- "metadata": {},
- "outputs": [],
- "source": [
- "import matplotlib.pyplot as plt\n",
- "import networkx as nx\n",
- "import mplcursors\n",
- "from teradataml import DataFrame, in_schema\n",
+ "import plotly.graph_objects as go\n",
"\n",
- "# Load data from Teradata to pandas DataFrames\n",
- "graph_df = DataFrame(in_schema('DEMO_GraphAnalysis', 'graph_data')).to_pandas().reset_index()\n",
+ "# --- Load data from Teradata ---\n",
+ "graph_df = graph_data.to_pandas().reset_index()\n",
"centrality = centralities.to_pandas().reset_index()\n",
"cdf = communities.to_pandas().reset_index()\n",
"\n",
- "# Build graph from edge list\n",
+ "# --- Build NetworkX graph ---\n",
"G = nx.from_pandas_edgelist(\n",
" graph_df,\n",
" source='fromuserid',\n",
@@ -603,78 +564,88 @@
" create_using=nx.Graph()\n",
")\n",
"\n",
- "# Determine leader nodes (highest centrality per community)\n",
- "df_sorted = communities.merge(\n",
- " centralities,\n",
- " left_on='graph_node',\n",
- " right_on='graph_node',\n",
- " how='inner',\n",
- " lsuffix='community',\n",
- " rsuffix='centrality'\n",
- ").to_pandas().sort_values('centrality', ascending=False)\n",
- "\n",
- "leader_nodes = df_sorted.groupby('community_id')['graph_node_community'].first().tolist()\n",
+ "# --- Build position layout ---\n",
+ "pos = nx.spring_layout(G, seed=42)\n",
"\n",
- "# Create a mapping from node → community for hover\n",
+ "# --- Create community and centrality maps for quick lookup ---\n",
"node_community_map = dict(zip(cdf.graph_node, cdf.community_id))\n",
+ "node_centrality_map = dict(zip(centrality.graph_node, centrality.centrality))\n",
+ "\n",
+ "# --- Build edge traces ---\n",
+ "edge_x, edge_y = [], []\n",
+ "for u, v in G.edges():\n",
+ " if u not in pos or v not in pos:\n",
+ " continue\n",
+ " x0, y0 = pos[u]\n",
+ " x1, y1 = pos[v]\n",
+ " edge_x += [x0, x1, None]\n",
+ " edge_y += [y0, y1, None]\n",
+ "\n",
+ "edge_trace = go.Scatter(\n",
+ " x=edge_x,\n",
+ " y=edge_y,\n",
+ " line=dict(width=0.5, color='#999'),\n",
+ " hoverinfo='none',\n",
+ " mode='lines'\n",
+ ")\n",
"\n",
- "# Plot setup\n",
- "pos = nx.spring_layout(G, seed=42) # fixed seed for consistent layout\n",
- "cmap = plt.get_cmap('tab10')\n",
- "\n",
- "plt.figure(figsize=(10, 6))\n",
- "\n",
- "community_colors = {}\n",
- "scatter_plots = []\n",
- "\n",
- "# Draw community nodes separately to assign colors and sizes\n",
- "for community_id in sorted(set(cdf.community_id)):\n",
- " nodes = list(cdf[cdf['community_id'] == community_id].graph_node)\n",
- " node_sizes = [800 if node in leader_nodes else 200 for node in nodes]\n",
- " node_colors = [cmap(community_id) for _ in nodes]\n",
- "\n",
- " scatter = nx.draw_networkx_nodes(\n",
- " G, pos,\n",
- " nodelist=nodes,\n",
- " node_size=node_sizes,\n",
- " node_color=node_colors,\n",
- " label=f'Community {community_id}'\n",
+ "# --- Build node traces ---\n",
+ "node_x, node_y, hover_text, node_color, node_size = [], [], [], [], []\n",
+ "\n",
+ "for node in G.nodes():\n",
+ " if node not in pos:\n",
+ " continue\n",
+ " x, y = pos[node]\n",
+ " node_x.append(x)\n",
+ " node_y.append(y)\n",
+ "\n",
+ " comm = node_community_map.get(node, 'Unknown')\n",
+ " cent = node_centrality_map.get(node, 0)\n",
+ "\n",
+ " node_color.append(comm)\n",
+ " node_size.append(10) # All nodes same size, no leader emphasis\n",
+ " hover_text.append(\n",
+ " f\"Customer ID: {node}\"\n",
+ " f\"
Community ID: {comm}\"\n",
+ " f\"
Eigenvector Centrality: {cent:.4f}\"\n",
" )\n",
- " scatter_plots.append((scatter, nodes))\n",
- " community_colors[community_id] = scatter.get_facecolor()[0]\n",
- "\n",
- "# Draw edges\n",
- "nx.draw_networkx_edges(G, pos, alpha=0.5, width=0.5)\n",
"\n",
- "# Create legend with uniform marker size\n",
- "legend_labels = [f'Community {cid}' for cid in sorted(set(cdf.community_id))]\n",
- "custom_legend = [plt.Line2D([], [], marker='o', color=community_colors[cid],\n",
- " linestyle='', markersize=10, label=label)\n",
- " for cid, label in zip(sorted(set(cdf.community_id)), legend_labels)]\n",
- "plt.legend(handles=custom_legend, loc='best')\n",
- "\n",
- "plt.title('Graph with Communities')\n",
- "plt.axis('off')\n",
- "\n",
- "# Hover function factory\n",
- "def make_update_annot(scatter, nodes):\n",
- " def update_annot(sel):\n",
- " node_index = sel.index\n",
- " node_name = nodes[node_index]\n",
- " community_id = node_community_map[node_name]\n",
- " centrality_score = max(centrality[centrality['graph_node'] == node_name]['centrality'])\n",
- " sel.annotation.set_text(\n",
- " f'Customer ID: {node_name}\\nCommunity: {community_id}\\nEigenVector Centrality Score: {centrality_score:.4f}'\n",
- " )\n",
- " sel.annotation.get_bbox_patch().set_alpha(0.9) # slightly transparent background\n",
- " return update_annot\n",
+ "node_trace = go.Scatter(\n",
+ " x=node_x,\n",
+ " y=node_y,\n",
+ " mode='markers',\n",
+ " hoverinfo='text',\n",
+ " text=hover_text,\n",
+ " marker=dict(\n",
+ " size=node_size,\n",
+ " color=node_color,\n",
+ " colorscale='Turbo',\n",
+ " colorbar=dict(title='Community ID'),\n",
+ " line_width=1\n",
+ " )\n",
+ ")\n",
"\n",
- "# Attach cursor with hover to each community scatter\n",
- "for scatter, nodes in scatter_plots:\n",
- " cursor = mplcursors.cursor(scatter, hover=True)\n",
- " cursor.connect('add', make_update_annot(scatter, nodes))\n",
+ "# --- Combine and show ---\n",
+ "fig = go.Figure(\n",
+ " data=[edge_trace, node_trace],\n",
+ " layout=go.Layout(\n",
+ " title='Graph with Communities ',\n",
+ " showlegend=False,\n",
+ " hovermode='closest',\n",
+ " margin=dict(b=20, l=5, r=5, t=40),\n",
+ " height=500, # Maintain taller plot\n",
+ " annotations=[\n",
+ " dict(\n",
+ " text='Interactive visualization of communities',\n",
+ " showarrow=False,\n",
+ " xref='paper', yref='paper',\n",
+ " x=0, y=-0.1\n",
+ " )\n",
+ " ]\n",
+ " )\n",
+ ")\n",
"\n",
- "plt.show()\n"
+ "fig.show()\n"
]
},
{
@@ -687,24 +658,12 @@
""
]
},
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "8e316db8-c74b-4da8-976b-1824e284ff0d",
- "metadata": {},
- "outputs": [],
- "source": [
- "from IPython.display import clear_output\n",
- "plt.close('all')\n",
- "clear_output(wait=True)"
- ]
- },
{
"cell_type": "markdown",
"id": "5db12928-d747-4f04-bcd3-671092f1aab7",
"metadata": {},
"source": [
- "The above graph displays the data in graph format. On hovering on the node, you might see Customer ID and the EigenVector Centrality Score i.e., the influence score. The larger nodes are influential and are connected to other influential nodes. These are the leader nodes of the respective communities.\n",
+ "
The above graph displays the data in graph format. On hovering on the node, you might see Customer ID, the EigenVector Centrality Score i.e., the influence score and the Community ID. The larger nodes are influential and are connected to other influential nodes. These are the leader nodes of the respective communities.\n",
"
\n",
"
Targeting the leader of the communities in a telecom dataset can provide several benefits to a telecom company. Here are some ways it can help:
\n",
"\n",