graphlearn_torch.loader

neighbor_loader

class NeighborLoader(data: Dataset, num_neighbors: List[int] | Dict[Tuple[str, str, str], List[int]], input_nodes: Tensor | str | Tuple[str, Tensor], neighbor_sampler: NeighborSampler | None = None, batch_size: int = 1, shuffle: bool = False, drop_last: bool = False, with_edge: bool = False, strategy: str = 'random', device: device = device(type='cuda', index=0), as_pyg_v1: bool = False, **kwargs)

Bases: NodeLoader

A data loader that performs node neighbor sampling for mini-batch training of GNNs on large-scale graphs.
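To make the idea of node neighbor sampling concrete, the following is a minimal pure-Python sketch, not graphlearn_torch's implementation. It assumes a per-hop fanout list in which -1 keeps every neighbor, and it uses a hypothetical helper `sample_neighbors` together with a toy adjacency dictionary `adj`; neither comes from the library.

```python
import random
from typing import Dict, List, Set


def sample_neighbors(
    adj: Dict[int, List[int]],
    seeds: List[int],
    num_neighbors: List[int],
    rng: random.Random,
) -> List[Set[int]]:
    """Return the set of newly reached nodes for each hop.

    `adj` maps a node to its neighbor list. `num_neighbors[i]` is the
    fanout for hop i; -1 means "keep every neighbor" (an assumption
    made for this illustration).
    """
    frontier = list(seeds)
    visited = set(seeds)
    hops: List[Set[int]] = []
    for fanout in num_neighbors:
        reached: Set[int] = set()
        for node in frontier:
            nbrs = adj.get(node, [])
            # Subsample only when the node has more neighbors than the fanout.
            if fanout != -1 and len(nbrs) > fanout:
                nbrs = rng.sample(nbrs, fanout)
            reached.update(nbrs)
        # Nodes seen in earlier hops are not expanded again.
        new_nodes = reached - visited
        visited |= new_nodes
        hops.append(new_nodes)
        frontier = sorted(new_nodes)
    return hops


# Toy adjacency list (hypothetical data, not from the library).
adj = {
    0: [1, 2, 3],
    1: [0, 4],
    2: [0, 5],
    3: [0],
    4: [1],
    5: [2],
}

# Two hops with every neighbor kept.
full_hops = sample_neighbors(adj, seeds=[0], num_neighbors=[-1, -1], rng=random.Random(0))

# Two hops with at most 2 neighbors in hop one and 1 in hop two.
capped_hops = sample_neighbors(adj, seeds=[0], num_neighbors=[2, 1], rng=random.Random(0))
```

With the fanout capped, each mini-batch touches only a bounded neighborhood around its seed nodes, which is what makes this style of training feasible on large graphs.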

Parameters:

data (Dataset) – The graphlearn_torch.data.Dataset object.
num_neighbors (List[int] or Dict[Tuple[str, str, str], List[int]]) – The number of neighbors to sample for each node in each iteration. For heterogeneous graphs, this may also be a dictionary giving the number of neighbors to sample for each individual edge type. If an entry is set to -1, all neighbors will be included.
input_nodes (torch.Tensor or str or Tuple[str, torch.Tensor]) – The indices of the nodes for which neighbors are sampled to create mini-batches. Must be given as either a torch.LongTensor or a torch.BoolTensor. For heterogeneous graphs, must be passed as a tuple that holds the node type and the node indices.
batch_size (int) – How many samples per batch to load (default: 1).
shuffle (bool) – Set to True to have the data reshuffled at every epoch (default: False).
drop_last (bool) – Set to True to drop the last incomplete batch if the dataset size is not divisible by the batch size. If False and the dataset size is not divisible by the batch size, the last batch will be smaller. (default: False).
with_edge (bool) – Set to True to sample with edge ids and also include them in the sampled results. (default: False).
strategy (str) – The sampling strategy used by the default neighbor sampler provided by graphlearn-torch. (default: "random").
as_pyg_v1 (bool) – Set to True to return results in the same form as the NeighborSampler in PyG v1. (default: False).
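The interaction of batch_size, shuffle and drop_last described above can be sketched in plain Python. This is only an illustration of the documented semantics, not the loader's actual code; the helper name `seed_batches` and its `rng` argument are hypothetical.

```python
import random
from typing import List, Optional


def seed_batches(
    input_nodes: List[int],
    batch_size: int = 1,
    shuffle: bool = False,
    drop_last: bool = False,
    rng: Optional[random.Random] = None,
) -> List[List[int]]:
    """Split seed node indices into mini-batches for one epoch."""
    nodes = list(input_nodes)
    if shuffle:
        # Reshuffle the seed order; called once per epoch in this sketch.
        (rng or random).shuffle(nodes)
    batches = [nodes[i:i + batch_size] for i in range(0, len(nodes), batch_size)]
    # Drop the trailing batch only if it is incomplete.
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches


seeds = list(range(7))  # 7 seed nodes, not divisible by 3

kept = seed_batches(seeds, batch_size=3)
dropped = seed_batches(seeds, batch_size=3, drop_last=True)
shuffled = seed_batches(seeds, batch_size=3, shuffle=True, rng=random.Random(0))
```

Here `kept` ends with a smaller batch holding the single leftover seed, while `dropped` contains only the two full batches. Shuffling changes the order of seeds across batches but not the batch sizes.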

transform

to_data(sampler_out: SamplerOutput, batch_labels: Tensor | None = None, node_feats: Tensor | None = None, edge_feats: Tensor | None = None, **kwargs) → Data

to_hetero_data(hetero_sampler_out: HeteroSamplerOutput, batch_label_dict: Dict[str, Tensor] | None = None, node_feat_dict: Dict[str, Tensor] | None = None, edge_feat_dict: Dict[Tuple[str, str, str], Tensor] | None = None, **kwargs) → HeteroData