Translator optimizations #19
Open
paul-tqh-nguyen wants to merge 8 commits into metagraph-dev:main from paul-tqh-nguyen:master
…tor ; optimize CuDFEdgeSet->ScipyEdgeSet translator

The CuDFEdgeSet->ScipyEdgeSet translator used to compute the unique node values on the CPU, which is expensive. This is now done on the GPU, which shaves some time off the part of the translator that collects the unique nodes.

The translation time drops from 43.716705560684204 seconds to 41.033896923065186 seconds using the code below:

    import random
    import time
    from contextlib import contextmanager
    from typing import Callable, Generator, Optional

    import cudf
    import metagraph as mg


    @contextmanager
    def timer(
        section_name: Optional[str] = None,
        exitCallback: Optional[Callable[[float], None]] = None,
    ) -> Generator[None, None, None]:
        """Time the enclosed block; report via exitCallback or by printing."""
        start_time = time.time()
        yield
        elapsed_time = time.time() - start_time
        if exitCallback is not None:
            exitCallback(elapsed_time)
        elif section_name:
            print(f"{section_name} took {elapsed_time} seconds.")
        else:
            print(f"Execution took {elapsed_time} seconds.")


    r = mg.resolver

    print("Generating graph.")
    num_nodes = int(1e7)
    source_nodes = list(range(num_nodes))
    target_nodes = list(range(num_nodes))
    random.shuffle(source_nodes)
    random.shuffle(target_nodes)
    df = cudf.DataFrame({"source": source_nodes, "target": target_nodes})

    print("Loading into metagraph.")
    edge_list = r.wrappers.EdgeSet.CuDFEdgeSet(
        df, src_label="source", dst_label="target", is_directed=False
    )

    print("Timing translation.")
    with timer(section_name="cupy"):
        r.translate(edge_list, r.types.EdgeSet.ScipyEdgeSetType)
This commit moves translation to the GPU for the CuDFEdgeSet -> ScipyEdgeSet translator. This gives a roughly 10x speedup, from 3.3889002799987793 seconds to 0.3172187805175781 seconds.
This commit moves translation to the GPU for the CuDFEdgeMap -> ScipyEdgeMap translator. This gives a roughly 10x speedup, from 3.7685587406158447 seconds to 0.3261408805847168 seconds, using the following code:

    import random
    import time
    from contextlib import contextmanager
    from typing import Callable, Generator, Optional

    import cudf
    import metagraph as mg


    @contextmanager
    def timer(
        section_name: Optional[str] = None,
        exitCallback: Optional[Callable[[float], None]] = None,
    ) -> Generator[None, None, None]:
        """Time the enclosed block; report via exitCallback or by printing."""
        start_time = time.time()
        yield
        elapsed_time = time.time() - start_time
        if exitCallback is not None:
            exitCallback(elapsed_time)
        elif section_name:
            print(f"{section_name} took {elapsed_time} seconds.")
        else:
            print(f"Execution took {elapsed_time} seconds.")


    r = mg.resolver

    print("Generating graph.")
    num_nodes = int(1e6)
    source_nodes = list(range(num_nodes))
    target_nodes = list(range(num_nodes))
    weights = list(range(num_nodes))
    random.shuffle(source_nodes)
    random.shuffle(target_nodes)
    df = cudf.DataFrame(
        {"source": source_nodes, "target": target_nodes, "weight": weights}
    )

    print("Loading into metagraph.")
    edge_list = r.wrappers.EdgeMap.CuDFEdgeMap(
        df, src_label="source", dst_label="target", is_directed=False
    )

    print("Timing translation.")
    with timer(section_name="cupy"):
        r.translate(edge_list, r.types.EdgeMap.ScipyEdgeMapType)
* CuDFNodeMap -> PythonNodeMapType translator: speedup from 24.827038526535034 seconds to 0.006268501281738281 seconds on a node map of size 1e4.
* ScipyEdgeMap -> CuDFEdgeMap translator: speedup from 0.037711374759674072 seconds to 0.003190011978149414 seconds on an edge map of size 1e4.
* ScipyEdgeSet -> CuDFEdgeSet translator: speedup from 0.033192789554595947 seconds to 0.002845954895019531 seconds on an edge set of size 1e4.
This commit optimizes the ScipyEdgeSet -> CuGraphEdgeSet translator.

Original time: 0.08482241868972779 seconds
New time: 0.05516828298568725 seconds

Both numbers are the mean over 100 runs. We used this code to get them:

    import random
    import time
    from contextlib import contextmanager
    from statistics import mean
    from typing import Callable, Generator, Optional

    import cudf
    import metagraph as mg


    @contextmanager
    def timer(
        section_name: Optional[str] = None,
        exitCallback: Optional[Callable[[float], None]] = None,
    ) -> Generator[None, None, None]:
        """Time the enclosed block; report via exitCallback or by printing."""
        start_time = time.time()
        yield
        elapsed_time = time.time() - start_time
        if exitCallback is not None:
            exitCallback(elapsed_time)
        elif section_name:
            print(f"{section_name} took {elapsed_time} seconds.")
        else:
            print(f"Execution took {elapsed_time} seconds.")


    r = mg.resolver

    print("Generating data.")
    num_nodes = int(1e4)
    source_nodes = list(range(num_nodes))
    target_nodes = list(range(num_nodes))
    random.shuffle(source_nodes)
    random.shuffle(target_nodes)
    df = cudf.DataFrame({"source": source_nodes, "target": target_nodes})

    print("Loading into metagraph.")
    edge_list = r.wrappers.EdgeSet.CuDFEdgeSet(
        df, src_label="source", dst_label="target", is_directed=False
    )
    # Convert to the SciPy type first so that only ScipyEdgeSet -> CuGraphEdgeSet is timed.
    edge_list = r.translate(edge_list, r.types.EdgeSet.ScipyEdgeSetType)

    print("Timing translation.")
    times = []
    for _ in range(100):
        with timer(exitCallback=times.append):
            r.translate(edge_list, r.types.EdgeSet.CuGraphEdgeSetType)
    # Named mean_time so the result does not shadow the time module.
    mean_time = mean(times)
    print(f"time {mean_time!r}")
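For comparison with the repeated-run measurement above, here is a minimal standard-library sketch of the same idea. It is not the code used for the numbers in this PR: the `benchmark` helper, its `warmup` parameter, and the stand-in workload are assumptions made for illustration. It uses `time.perf_counter`, which is intended for measuring short intervals, and discards a few untimed warm-up calls so that one-time initialization does not skew the mean.

```python
import time
from statistics import mean
from typing import Callable, List


def benchmark(fn: Callable[[], object], repeats: int = 100, warmup: int = 3) -> List[float]:
    """Return the wall-clock duration of each timed call to fn, in seconds.

    Hypothetical helper: `warmup` untimed calls run first, then `repeats` timed calls.
    """
    for _ in range(warmup):
        fn()  # untimed: absorbs lazy initialization and cold caches
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


# Stand-in workload; in the PR this would be a call such as r.translate(...).
durations = benchmark(lambda: sorted(range(10_000, 0, -1)), repeats=20)
print(f"mean {mean(durations)!r} seconds over {len(durations)} runs")
```

Reporting the minimum or the spread alongside the mean may also help when individual runs are noisy.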
cugraph.Graph data structures can store data as adjacency lists (CSR) rather than edge lists (COO). When a cugraph graph is stored this way and we're translating to a SciPy sparse matrix, we shouldn't go out of our way to convert the CSR data to COO first, especially when the cugraph data structure has not already computed the edge list / COO data.

This commit makes the CuGraph* -> Scipy* translators stop converting to COO unconditionally before creating the SciPy sparse matrix. They now translate the CSR data directly if it is already available and fall back to COO otherwise. As a result, these translators no longer perform any COO<->CSR conversion.
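The "use whichever layout is already there" rule can be sketched in plain Python. This is only an illustration of the control flow, not the translator code from this PR: `SparseGraphData` and `export_for_sparse_matrix` are hypothetical names, and plain lists stand in for GPU arrays.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SparseGraphData:
    """Hypothetical stand-in for a graph object that may hold CSR data, COO data, or both."""
    num_nodes: int
    csr: Optional[Tuple[List[int], List[int]]] = None  # (indptr, indices)
    coo: Optional[Tuple[List[int], List[int]]] = None  # (rows, cols)


def export_for_sparse_matrix(graph: SparseGraphData) -> Tuple[str, Tuple[List[int], List[int]]]:
    """Return (format_name, arrays) for the layout already stored, without converting."""
    if graph.csr is not None:
        return "csr", graph.csr  # adjacency-list data is present: hand it over as-is
    if graph.coo is not None:
        return "coo", graph.coo  # otherwise fall back to the edge list
    raise ValueError("graph holds neither CSR nor COO data")


# Edges 0->1 and 1->2 stored only as CSR: no COO data is ever materialized.
fmt, arrays = export_for_sparse_matrix(SparseGraphData(3, csr=([0, 1, 2, 2], [1, 2])))
print(fmt, arrays)
```

The consumer then builds the matching sparse matrix type for whichever format name it receives.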
This commit includes 2 cugraph translator improvements:

* Removing calls to the method tocsr when converting to SciPy graphs, as it's not always clear that this is necessary or will lead to better performance. There's no reason to pay this O(n) translation cost unless it's necessary.
* For the Scipy* -> CuGraph* translators, if the Scipy* matrix is in CSR format, we now translate it directly into the CuGraph*. Before, we converted the CSR data to COO before translating it to the CuGraph*.

Tests were updated as a result of these changes.
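To make the O(n) conversion cost concrete, here is a pure-Python sketch of a COO -> CSR conversion. It is not SciPy's or cugraph's implementation; `coo_to_csr` is a hypothetical helper that only shows that every edge has to be visited (here twice) to build the row pointers and column indices.

```python
from typing import List, Tuple


def coo_to_csr(num_rows: int, rows: List[int], cols: List[int]) -> Tuple[List[int], List[int]]:
    """Convert an edge list (rows, cols) into CSR arrays (indptr, indices).

    Illustrative only: linear in num_rows + number of edges.
    """
    # Pass 1: count the edges in each row.
    indptr = [0] * (num_rows + 1)
    for r in rows:
        indptr[r + 1] += 1
    # Prefix sum turns the counts into row start offsets.
    for i in range(num_rows):
        indptr[i + 1] += indptr[i]
    # Pass 2: place each column index into the next free slot of its row.
    next_slot = indptr[:-1]  # slicing copies, so indptr is left unchanged
    indices = [0] * len(cols)
    for r, c in zip(rows, cols):
        indices[next_slot[r]] = c
        next_slot[r] += 1
    return indptr, indices


indptr, indices = coo_to_csr(3, rows=[2, 0, 1, 0], cols=[1, 2, 0, 1])
print(indptr, indices)  # → [0, 2, 3, 4] [2, 1, 0, 1]
```

Skipping this step when the target can consume the existing layout directly avoids both passes and the extra allocations.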
This PR contains several translator optimizations.
Some timings are below.
CuDFEdgeSet -> ScipyEdgeSet (1e6 edges)
CuDFEdgeMap -> ScipyEdgeMap (1e6 edges)
CuDFNodeMap -> PythonNodeMap (1e4 nodes)
ScipyEdgeMap -> CuDFEdgeMap (1e4 edges)
ScipyEdgeSet -> CuDFEdgeSet (1e4 edges)
ScipyEdgeSet -> CuGraphEdgeSet (1e4 edges)
ScipyEdgeMap -> CuGraphEdgeMap (1e4 edges)
CuGraphEdgeSet -> ScipyEdgeSet (1e4 edges)
CuGraphEdgeMap -> ScipyEdgeMap (1e4 edges)
CuGraph -> ScipyGraph (1e4 edges)
ScipyGraph -> CuGraph (1e4 edges)