FluidC on large and real complex networks

This small project evaluates the FluidC algorithm for community detection on large, real-world complex networks. It includes Python scripts to run experiments, analyze results, and generate plots.

The project was developed in July 2025 for the exam of the computer science doctoral course "Graph Theory and Algorithms" at the University of Milano-Bicocca (2024-2025 academic year). A brief report on the experiments and their results is available for download, along with the raw result files (plots and CSV files). The project will not be updated after submission, and the code is provided as-is.

The Git repository is available on both GitLab and GitHub; the GitHub repository is only a mirror of the GitLab one.

How to run

Prepare the virtual environment (Python 3 is required):

```
$ git clone https://gitlab.com/ema-pe/fluidc-large-network-analysis.git
$ cd fluidc-large-network-analysis
$ python3 -m venv .env
$ source .env/bin/activate
$ pip install -r requirements.txt
```

Download the complex networks and the ground-truth communities (from SNAP). A Bash script is provided to automatically download them:
```
$ chmod u+x dataset/download.sh
$ ./dataset/download.sh
```
Get the number of ground-truth communities for each network using the ground_truth.py script. This number is a required parameter (k) for the FluidC algorithm. You may need to update the networks variable in run.py with the correct ground-truth values for each network.
```
$ python ground_truth.py --communities dataset/com-amazon.all.dedup.cmty.txt.gz
# Example output: "dataset/com-amazon.all.dedup.cmty.txt.gz": 75149
```
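To illustrate what this count represents: SNAP ground-truth files typically list one community per line, with the member node IDs separated by whitespace. Under that assumption, k is simply the number of non-empty lines. The sketch below is not the code of ground_truth.py, and the function name `count_communities` is hypothetical.

```python
import gzip


def count_communities(path):
    """Count the non-empty lines of a gzip-compressed community file.

    Assumption: one community per line, node IDs separated by whitespace.
    """
    count = 0
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():  # skip blank lines
                count += 1
    return count
```

For example, `count_communities("dataset/com-amazon.all.dedup.cmty.txt.gz")` would return the value of k to use for that network, provided the file follows the assumed layout.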
Run the FluidC algorithm with various configurations. Warning: the runs are executed sequentially and can be very time-consuming, depending on the size of the networks. The detected communities are saved in the results/ directory. If you want to run just a single FluidC execution, you can call the fluidc.py script directly.
```
$ python run.py
```
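For orientation, the core update rule of FluidC can be sketched in pure Python. This is a simplified illustration and not the code in fluidc.py: the function name `fluid_communities`, its parameters, and the adjacency-dict input are assumptions made for the example. It also assumes an undirected, connected graph without self-loops.

```python
import random
from collections import defaultdict


def fluid_communities(adjacency, k, max_iter=100, seed=None):
    """Simplified asynchronous fluid-communities sketch.

    adjacency: dict mapping each vertex to the set of its neighbours.
    Returns a dict mapping vertices to a community index in range(k).
    """
    rng = random.Random(seed)
    vertices = list(adjacency)
    if not 0 < k <= len(vertices):
        raise ValueError("k must be between 1 and the number of vertices")

    # Seed each community on a distinct random vertex.
    rng.shuffle(vertices)
    community = {v: i for i, v in enumerate(vertices[:k])}
    size = {i: 1 for i in range(k)}

    for _ in range(max_iter):
        changed = False
        rng.shuffle(vertices)
        for v in vertices:
            # Sum community densities (1 / size) over v and its neighbours.
            score = defaultdict(float)
            for u in {v, *adjacency[v]}:
                if u in community:
                    score[community[u]] += 1.0 / size[community[u]]
            if not score:
                continue  # no community has reached this vertex yet
            best = max(score.values())
            candidates = [c for c, s in score.items() if best - s < 1e-9]
            current = community.get(v)
            if current in candidates:
                continue  # ties favour the vertex's current community
            new = rng.choice(candidates)
            if current is not None:
                size[current] -= 1
            community[v] = new
            size[new] += 1
            changed = True
        if not changed:
            break  # converged: a full pass changed nothing
    return community


# Toy graph: two 4-cliques joined by the single edge (3, 4).
graph = {v: set() for v in range(8)}
edges = [(a, b) for group in (range(4), range(4, 8))
         for a in group for b in group if a < b]
edges.append((3, 4))
for a, b in edges:
    graph[a].add(b)
    graph[b].add(a)

labels = fluid_communities(graph, k=2, seed=42)
print(labels)
```

Because both the seeding and the tie-breaking are random, different seeds can produce different partitions, which is one reason to repeat runs across several configurations.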
Finally, use the plot.py script to analyze the output of the experiments. The script computes the metrics, writes them to CSV files, and generates plots, saving the output in the results/ and results/plots/ directories. The metrics are execution time, normalized mutual information (NMI), adjusted Rand index (ARI), and cluster purity, each computed for every FluidC execution against the ground truth.
```
$ python plot.py --results-dir results --graph-name com-amazon com-dblp com-youtube
```
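As an illustration of one of these metrics, cluster purity can be sketched as follows. The sketch assumes that every node carries exactly one ground-truth label; how plot.py handles overlapping ground-truth communities is not shown here, and the function name `purity` is hypothetical.

```python
from collections import Counter, defaultdict


def purity(predicted, truth):
    """Cluster purity of `predicted` against `truth`.

    Both arguments map each node to a single label and must cover
    the same set of nodes.
    """
    if not predicted or predicted.keys() != truth.keys():
        raise ValueError("both labelings must cover the same non-empty node set")
    # For every predicted cluster, count the ground-truth labels it contains.
    clusters = defaultdict(Counter)
    for node, cluster in predicted.items():
        clusters[cluster][truth[node]] += 1
    # Each cluster contributes the size of its majority ground-truth label.
    majority = sum(max(counts.values()) for counts in clusters.values())
    return majority / len(predicted)
```

A purity of 1.0 means every detected cluster contains nodes from a single ground-truth community. The score tends to rise as the number of clusters grows, which is why it is usually read alongside NMI and ARI.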
