Sampling
The graph-based machine learning community has developed various strategies for training and using models on large graphs. One of the most common approaches is to train on subgraphs (also called communities) that are pseudorandomly sampled from the larger graph.
In general, the specifics of the sampling strategy depend on the graph and the intended application. This dependence is even more pronounced for the Bitcoin Graph, given its scale, the heterogeneity of its nodes and edges, and its long time span of more than a decade.
We provide customizable community sampling strategies that can be adjusted to fit the data requirements of a wide spectrum of application areas. The strategies we currently provide are:
Additionally, we provide the following ready-to-use sampled communities:
- 200k communities sampled using the Forest Fire method, containing only Script-to-Script edges.
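
For readers unfamiliar with Forest Fire sampling, the following is a minimal, generic sketch of the idea in plain Python. It is not the implementation used to produce the communities above. The function names (`forest_fire_sample`, `induced_subgraph`), the adjacency-dictionary input format, and the default `p_forward` value are all assumptions made for illustration. The sketch treats the graph as a mapping from each node to its out-neighbours, which is one plausible way to represent directed Script-to-Script edges.

```python
import random
from collections import deque


def forest_fire_sample(adj, target_size, p_forward=0.4, seed=None):
    """Return a set of nodes sampled with a simple Forest Fire process.

    adj         -- dict mapping node -> iterable of out-neighbours (assumed format)
    target_size -- number of nodes to sample (capped at the number of keys in adj)
    p_forward   -- forward burning probability, must be in [0, 1)
    seed        -- optional seed for the pseudorandom generator
    """
    if not 0.0 <= p_forward < 1.0:
        raise ValueError("p_forward must be in [0, 1)")

    rng = random.Random(seed)
    nodes = list(adj)
    target_size = min(target_size, len(nodes))
    burned = set()

    while len(burned) < target_size:
        # Ignite a new fire at a random node that has not burned yet.
        start = rng.choice([n for n in nodes if n not in burned])
        burned.add(start)
        frontier = deque([start])

        while frontier and len(burned) < target_size:
            node = frontier.popleft()
            # Unburned out-neighbours, de-duplicated while keeping order.
            unburned = list(dict.fromkeys(
                n for n in adj.get(node, ()) if n not in burned
            ))
            # Draw a geometric count (mean p / (1 - p)) of neighbours to burn.
            k = 0
            while rng.random() < p_forward:
                k += 1
            k = min(k, len(unburned), target_size - len(burned))
            for neighbour in rng.sample(unburned, k):
                burned.add(neighbour)
                frontier.append(neighbour)

    return burned


def induced_subgraph(adj, nodes):
    """Keep only the edges whose two endpoints are both in `nodes`."""
    return {u: {v for v in adj.get(u, ()) if v in nodes} for u in nodes}


# Toy usage on a directed ring of 10 nodes (hypothetical data).
toy_graph = {i: {(i + 1) % 10} for i in range(10)}
community_nodes = forest_fire_sample(toy_graph, target_size=4, seed=0)
community = induced_subgraph(toy_graph, community_nodes)
```

When a fire dies out before the target size is reached, the sketch ignites a new fire at another unburned node, so a sampled community is not guaranteed to be connected. Whether restarts are acceptable depends on the intended application.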