Building and Analyzing RNA Co-Expression Networks

Project Overview

Given RNA-seq gene expression quantification files (TSV format), this project:

Computes pairwise gene–gene Pearson correlations across samples
Constructs a weighted, undirected gene co-expression network
Detects gene modules using the Louvain community detection algorithm
Computes network topology statistics (e.g., degree distribution and clustering behavior)
Compares real networks to Fixed-m random graph models and, when two datasets are provided, performs cross-dataset comparisons

An R Shiny interface is included that allows users to run the Go analysis pipeline and interactively explore the resulting networks. Users can switch between datasets, view Louvain communities ranked by density, filter edges by correlation sign (positive/negative), customize colors, toggle gene labels, and export network visualizations.

Pipeline Overview

Top-down pipeline structure:

Input: RNA-seq gene expression TSV files
Preprocessing: filtering and organization of expression data
Correlation: computation of Pearson gene–gene correlations
Graph Construction: weighted, undirected co-expression network
Community Detection: Louvain modularity optimization
Network Analysis: topology statistics and statistical comparisons
Visualization: interactive exploration using R Shiny

Repository Structure

main.go
Orchestrates the full analysis pipeline: loads datasets, builds networks, runs Louvain community detection, computes statistics, performs comparisons, and writes output CSV files.
functions.go
Contains core helper functions for data parsing, preprocessing, correlation computation, graph construction, community analysis, and statistical testing.
io.go
File input/output operations for reading TSV files and writing CSV outputs.
datatypes.go
Defines data structures used throughout the pipeline.
louvain/
Implementation of the Louvain community detection algorithm used by the pipeline.
GeneExpressionData/
Directory for input datasets:
- GeneExpressionData/Dataset1/
- GeneExpressionData/Dataset2/
ShinyApp/
R Shiny interface (app.R) and CSV output files generated by the Go pipeline.

Installation

Prerequisites

Go 1.24+ (tested with Go 1.24.5)
R 4.2+

Required R Packages

```
install.packages(c(
  "shiny",
  "visNetwork",
  "colourpicker",
  "shinycssloaders",
  "here"
))
```

Usage

Input Data

Place RNA-seq gene expression quantification TSV files into:

GeneExpressionData/Dataset1/ (required)
GeneExpressionData/Dataset2/ (optional, for cross-dataset comparison)

Each TSV file should have:

First column: Gene identifiers
Subsequent columns: Sample names with unstranded TPM (Transcripts Per Million) values
Tab-separated format
Note: The pipeline parses only the unstranded TPM values from the expression files.

Each dataset directory may contain one or more TSV files corresponding to samples from the same condition (e.g., cancer type).

Running the Go Analysis

Make sure you have Go 1.24 or higher installed and Go modules enabled.

Clone this repository:

```
git clone https://github.com/efranken-25/02-601_Project_Fall2025.git
cd 02-601_Project_Fall2025
```

Install any Go dependencies (if not already in go.mod):
```
go get ./...
```
Build the Go executable:
```
go build -o 02-601_Project_Fall2025
```
Run the main Go program:
- For one dataset:
```
./02-601_Project_Fall2025 1
```
- For two datasets:
```
./02-601_Project_Fall2025 2
# or simply:
./02-601_Project_Fall2025
```
Note: When no argument is given, the program runs on two datasets by default.
The Go program prints progress updates and summary statistics to the terminal and writes output CSV files to the ShinyApp/ directory.
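
The argument handling described above could look roughly like the following. This is only a sketch: the function name `datasetCount` is hypothetical, and rejecting any argument other than `1` or `2` is an assumption, since the behavior of `main.go` for other inputs is not documented here.

```go
package main

import (
	"fmt"
	"os"
)

// datasetCount interprets the optional first command-line argument:
// "1" selects one dataset, while "2" or no argument selects two.
// Treating any other value as an error is an assumption of this sketch.
func datasetCount(args []string) (int, error) {
	if len(args) == 0 {
		return 2, nil
	}
	switch args[0] {
	case "1":
		return 1, nil
	case "2":
		return 2, nil
	default:
		return 0, fmt.Errorf("expected 1 or 2, got %q", args[0])
	}
}

func main() {
	n, err := datasetCount(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("running analysis on %d dataset(s)\n", n)
}
```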

Running the Shiny App (R)

Note: The Go executable must be built before launching the Shiny app, as the app calls the compiled Go program directly.

Open the Shiny script (ShinyApp/app.R) in R or RStudio.

Make sure you have the required packages installed:

```
install.packages(c("shiny", "visNetwork", "colourpicker", "shinycssloaders", "here"))
```

Set your working directory to the project folder (so that the Shiny app can access the CSVs):
```
setwd("/path/to/02-601_Project_Fall2025")
```

Launch the Shiny app:

```
shiny::runApp("ShinyApp", launch.browser = TRUE)
```

Note: If your R working directory is not set to the project root, you may alternatively provide the full path:

```
shiny::runApp("/path/to/02-601_Project_Fall2025/ShinyApp", launch.browser = TRUE)
```

The app will open in your web browser.
Click "Run Full Analysis in Go" to execute the Go pipeline.
Progress updates and analysis summaries will print to the R console.

Once the analysis completes, you can:
- Switch between datasets
- Select Louvain communities
- Filter edges by correlation sign
- Customize visualization settings
- Export network images

Features

The Shiny app allows users to:

Run the Go pipeline
Switch between datasets
Explore Louvain communities ranked by density
Filter edges by correlation sign
Customize visualization options
Toggle labels
Export network images

Outputs

After running the Go analysis pipeline, the following outputs are generated:

File Outputs

The Go pipeline writes CSV files to the ShinyApp/ directory, including:

Node tables: gene identifiers with assigned Louvain community labels
Edge tables: weighted, undirected edges with Pearson correlation values
Community statistics: per-module size, density, and related structural metrics

These CSV files are used by the R Shiny interface for visualization and interaction.

Console Output

The Go pipeline also prints human-readable summaries during execution.
These summaries appear:

in the terminal when Go is run directly, or
in the R console when Go is executed through the Shiny app.

Printed summaries include:

Graph properties: number of nodes and edges, mean degree, edge density, and proportions of positive vs. negative edges
Module analysis: number of modules, module size statistics (min / median / mean ± SD / max), and largest module details
Module structure: per-module tables reporting node counts, edge counts, and densities
Network measures: Louvain modularity scores and global clustering coefficients (mean ± standard deviation)
Statistical comparisons (KS tests):
- Real vs. Fixed-m random graph degree distributions (per dataset)
- Cross-dataset degree distribution comparisons (when two datasets are provided)
- Cross-dataset clustering coefficient distribution comparisons (when two datasets are provided)
- Corresponding D-statistics, p-values, and significance interpretations

Coding Demonstration

A short recorded coding demonstration showing how to run the pipeline and explore results is available here:

https://drive.google.com/file/d/1mlOepfbSLY5wKjs8FrVG2FCqNKTcczVb/view?usp=sharing

Authors

Beth Vazquez Smith - @bvazquezsmith
Noemi Banda - @b-noemi
Emma Franken - @efranken-25
