CMiner is an algorithm for mining patterns from graphs using a user-defined support technique. This implementation provides a command-line interface for running the algorithm, with configurable options such as the minimum and maximum number of nodes, the support, and the search approach.
Make sure you have the following requirements to run the project:
- Python: Version 3.11.6
- pip: Version 24.2

- Clone the repository:
  git clone https://github.com/SimoneAvellino/CMiner
- Alternatively, download the repository from https://github.com/SimoneAvellino/CMiner.
- Move into the repository folder:
  cd CMiner
- Install the dependencies:
  pip install -r requirements.txt
- Install the library in editable mode:
  pip install -e .

CMiner <db_file> <support> [options]

- db_file: Absolute path to the graph database file.
- support: Minimum support for pattern extraction. Specify a value between 0 and 1 to represent a percentage of the graphs (e.g., 0.2 for 20%), or a value greater than 1 to represent an absolute number of graphs (e.g., 20 for at least 20 graphs). To find patterns present in all graphs, use 1 (100%). To find patterns present in at least one graph, use a value slightly greater than 1 (e.g., 1.1).

- -l, --min_nodes: Minimum number of nodes in the pattern (default: 1).
- -u, --max_nodes: Maximum number of nodes in the pattern (default: infinite).
- -n, --num_nodes: Exact number of nodes in the pattern (if this option is set, -l and -u are ignored).
- -d, --directed: Flag indicating whether the graphs are directed (default: 1, directed).
- -m, --show_mappings: Display the mappings of the patterns found (default: 0, not displayed).
- -t, --templates_path: File paths of the templates from which to start the search. Node indices must start from 0.
- -f, --with_frequencies: Display, for each pattern, its frequency in each graph (default: 0, not displayed).
- -x, --pattern_type: Type of patterns that CMiner returns. It can be 'all' or 'maximum' (default: all). NOTE: this feature is under development and may contain bugs.
- -o, --output_path: File path where the results are saved. If not set, the results are shown in the console.
- -w, --worker: Number of parallel workers used to mine the patterns.
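
The two ways of reading the support value can be illustrated with a small Python sketch. This is not CMiner's actual code: the function name `min_graph_count`, the rounding up of fractional thresholds, and the rounding down of non-integer absolute values (so that 1.1 means "at least one graph") are assumptions made for illustration only.

```python
import math


def min_graph_count(support: float, num_graphs: int) -> int:
    """Hypothetical helper: turn a support value into a minimum graph count.

    Assumed interpretation, following the description of the support argument:
    - 0 < support <= 1: fraction of the graphs in the database
    - support > 1: absolute number of graphs
    """
    if support <= 0:
        raise ValueError("support must be positive")
    if support <= 1:
        # Fraction of the database. Rounding up is an assumption;
        # round() guards against floating-point noise (e.g. 0.7 * 10).
        return math.ceil(round(support * num_graphs, 9))
    # Absolute count. Flooring makes 1.1 mean "at least one graph".
    return math.floor(support)
```

With a database of 10 graphs, this sketch maps 0.5 to 5 graphs, 1 to all 10 graphs, and 1.1 to a single graph.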

- Mine patterns from 2 up to 3 nodes, present in at least 50% of graphs in the database.
  CMiner /path/to/db.data 0.5 -l 2 -u 3
- Mine all patterns present in at least 2 graphs in the database that have exactly 5 nodes.
  CMiner /path/to/db.data 2 -n 5

Some usage examples from the folder experiments/Datasets/OntoUML:
- Mine all patterns present in at least 2 graphs in the database that match the template defined in S1.txt:
  CMiner ./ontographs.data 2 -t ./S1.txt -n 3
  Note: we specify -n 3 so that only solutions that are exactly the template are returned.
- Same as before, but this time node labels are not specified:
  CMiner ./ontographs.data 2 -t ./S2.txt -n 3
- You can also partially or completely omit labels for both nodes and edges:
  CMiner ./ontographs.data 2 -t ./S3.txt -n 3
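
If you want to launch these commands from a script, one possible approach is to assemble the argument list in Python and hand it to `subprocess.run`. The sketch below is only an illustration: `build_cminer_command` is a hypothetical helper that is not part of CMiner, and it only uses the flags documented above (`-l`, `-u`, `-n`, `-t`, `-o`). It assumes the `CMiner` executable is available on the PATH after installation.

```python
import shlex


def build_cminer_command(db_file, support, *, min_nodes=None, max_nodes=None,
                         num_nodes=None, templates_path=None, output_path=None):
    """Hypothetical helper: build the CMiner command line as a list of arguments."""
    cmd = ["CMiner", str(db_file), str(support)]
    if templates_path is not None:
        cmd += ["-t", str(templates_path)]
    if num_nodes is not None:
        # -n takes precedence, so -l and -u are not passed at all
        cmd += ["-n", str(num_nodes)]
    else:
        if min_nodes is not None:
            cmd += ["-l", str(min_nodes)]
        if max_nodes is not None:
            cmd += ["-u", str(max_nodes)]
    if output_path is not None:
        cmd += ["-o", str(output_path)]
    return cmd


if __name__ == "__main__":
    cmd = build_cminer_command("/path/to/db.data", 0.5, min_nodes=2, max_nodes=3)
    # Show the command as it would be typed in a shell
    print(shlex.join(cmd))
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces.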

