Skip to content

Hi everyone archr in scenicplus ??,  #631

@A-legac45

Description

@A-legac45

Hi everyone,

I am trying to understand properly what should I give from my archr project build in R to scenic+ in order to make it run.

My Archr object is already clean, with a UMAP ATAC and UMAP RNA (miltiomic data) and preprocess with cisbp annotation .... How can I implement this informations to scenic plus ??

Thanks in advance for your advice and your time

I saw this previous issue but not get the point of everything

ISSUE OLD FROM DEC 6 2023 l @laurajohanne

I am working with an ArchR project with 64654 cells.

Obtaining count matrix (for peaks) by converting the count matrix to a h5ad file formate (due to size) ;

matrix <- ArchRtoSignac::getPeakMatrix(ArchRProject= ArchRPro)
scenic_plus_obj <- Seurat::CreateSeuratObject(counts = matrix, project = "count_matrix")
sceasy::convertFormat(scenic_plus_obj, from="seurat", to ="anndata",
outFile='/work/ArchR/data_trine_for_scenicplus/scenic_plus_obj.h5ad')

Next I follow the tutorial; https://pycistopic.readthedocs.io/en/latest/Toy_melanoma-RTD.html with the exception of when creating the cisTopic object since the count matrix is a Sparse matrix.

Create cisTopic object

cistopic_obj = create_cistopic_object(fragment_matrix= count_matrix, cell_names = cell_annotations_list , region_names = region_annotations_list, path_to_blacklist='/work/Scenicplus_pipeline/data/mm10-blacklist.v2.bed')

Adding cell information

cistopic_obj.add_cell_data(cell_data)

pickle.dump(cistopic_obj,
open(os.path.join(outDir, 'cistopic_obj.pkl'), 'wb'))
print(cistopic_obj)

Run models

models=run_cgs_models_mallet(path_to_mallet_binary,
cistopic_obj,
n_topics=[2,5,10,15,20,25,30,35,40,45,50],
n_cpu=55,
n_iter=500,
random_state=1,
alpha=50,
alpha_by_topic=True,
eta=0.1,
eta_by_topic=False,
tmp_path= tmpDir, #Use SCRATCH if many models or big data set
save_path = None)

with open(outDir+'Mallet_models_500.pkl', 'wb') as f:
pickle.dump(models, f)

model=evaluate_models(models,
select_model=None,
return_model=True,
metrics=['Arun_2010','Cao_Juan_2009', 'Minmo_2011', 'loglikelihood'],
plot_metrics= False,
save= outDir + 'models/model_selection.pdf')

cistopic_obj.add_LDA_model(model)

Skærmbillede 2023-12-06 kl  14 01 53

Find clusters

from pycisTopic.clust_vis import *
find_clusters(cistopic_obj,
target = 'cell',
k = 40,
res = [0.6, 1.2],
seed = 1,
prefix = 'pycisTopic_',
scale = True,
split_pattern = '-')

UMAP

from pycisTopic.clust_vis import *
run_umap(cistopic_obj, target = 'cell', scale = True, random_state = 1)

plot_metadata(cistopic_obj,
reduction_name = 'UMAP',
variables = ['OrderCelltype_merged', 'pycisTopic_leiden_40_0.6', 'pycisTopic_leiden_40_1.2'],
num_columns=3,
text_size=10,
dot_size=0.5,
figsize=(15,7),
save= '/work/Scenicplus_pipeline/dataoutput/visualization/dimensionality_reduction_label_uncorrected_merged_merged.pdf')

I do not encounter any issues in the next steps of the tutorial until I reach visualization with UMAP where I am having issues.

When using the cell annotation from the ArchR project the cells seems to be randomly clustered together. When using their own clustering method leiden they cluster nicely visualize but it does not match the cell annotation.

Skærmbillede 2023-12-06 kl  13 44 37

When using ArchR to cluster the data it also appear that the annotation is correct. The data have already been preprocessed and used by multiple people and I am confident that the annotation is correct.

UMAP from ArchR

Skærmbillede 2023-12-06 kl  13 46 49

And when using the UMAP coordinates from ArchR within pycisTopic it appear that the two methods find quite different coordinates.

UMAP with coordinates from ArchR in pycisTopic

Skærmbillede 2023-12-06 kl  13 49 28

I have already checked if the order of the cells are the same in both the ArchR project and the cisTopic object, which they are. And I have used same seed parameter and number of neighbors within both methods.

Does anyone might have an idea what my next step should be?

Thanks in advance,

Laura

Originally posted by @laurajohanne in #266

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions