Sampling

The graph-based machine learning community has developed various strategies for training and using models on large graphs. One of the most common approaches is to train on subgraphs (aka communities) that are pseudorandomly sampled from the larger graph.
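To make the idea of sampling a community concrete, here is a minimal sketch in Python. It is an illustration only and not the sampling method implemented by this project: it assumes a toy in-memory adjacency list, picks a seed node pseudorandomly, and collects a bounded neighborhood around it with a breadth-first search. The function name `sample_community` and the parameters `max_nodes` and `max_hops` are hypothetical.

```python
import random
from collections import deque


def sample_community(adjacency, seed_node, max_nodes, max_hops, rng):
    """Collect up to max_nodes nodes within max_hops of seed_node using BFS.

    Hypothetical helper for illustration; not part of any real tool.
    Returns the sampled node set and the edges induced among those nodes.
    """
    visited = {seed_node}
    queue = deque([(seed_node, 0)])
    while queue and len(visited) < max_nodes:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand beyond the hop limit
        neighbors = list(adjacency.get(node, ()))
        rng.shuffle(neighbors)  # pseudorandom order decides which neighbors get in
        for neighbor in neighbors:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, depth + 1))
                if len(visited) >= max_nodes:
                    break
    # Induced subgraph: keep only edges whose endpoints were both sampled.
    edges = {
        (u, v)
        for u in visited
        for v in adjacency.get(u, ())
        if v in visited
    }
    return visited, edges


# Toy graph (assumed data, unrelated to the real Bitcoin Graph).
graph = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}

rng = random.Random(42)  # fixed seed makes the sample reproducible
seed = rng.choice(sorted(graph))
nodes, edges = sample_community(graph, seed, max_nodes=3, max_hops=2, rng=rng)
print(sorted(nodes), sorted(edges))
```

Passing an explicit `random.Random` instance keeps the sample reproducible, which is useful when the same communities need to be regenerated for training and evaluation.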

In general, the specifics of the sampling strategy depend on the graph and the intended application. This dependence is even more pronounced for the Bitcoin Graph, given its scale, the heterogeneity of its nodes and edges, and its longitudinal span of more than a decade.

We provide customizable community sampling strategies that can be adjusted to fit the data requirements of a wide spectrum of application areas. The strategies we currently provide are:

Additionally, we provide the following ready-to-use sampled communities:

Sample Your Own Communities

  • Set up a Neo4j database

  • Run the sampling method. For documentation of the command's arguments, run the following command:

    .\eba.exe bitcoin sample --help