Errors and Warnings with ImmuCellAI_new and ImmuCellAI functions when requesting ICB response predictions #2

@s-bell

Hello,

Thank you for putting together this package. Unfortunately, I've encountered a few issues when attempting to use it, particularly when invoking the option to calculate ICB response predictions.

Issue description:
Both ImmuCellAI_new and ImmuCellAI produce a series of errors and warnings when the input is a filtered matrix. These range from a missing object ('train_data') to NA coercion warnings and a deprecation warning about array recycling.

Reproducible example:

# Read the data into a data frame
data_df <- read.table("http://bioinfo.life.hust.edu.cn/static/ImmuCellAI/data/AML_sample.txt", header = TRUE, sep = "\t", stringsAsFactors = FALSE)

# Convert the data frame to a matrix
data_matrix <- as.matrix(data_df)

# Set row and column names
rownames(data_matrix) <- data_matrix[,1]
colnames(data_matrix) <- names(data_df)

# Remove the first column as it's now the row names
data_matrix <- data_matrix[,-1]

# Identify columns that start with a specific letter, e.g., 'X'
cols_to_remove <- grepl("^X", colnames(data_matrix))

# Remove those columns
data_matrix <- data_matrix[, !cols_to_remove]

# Filter out genes with constant expression across all samples
non_constant_genes <- which(apply(data_matrix, 1, function(x) length(unique(x)) > 1))
filtered_expression_matrix <- data_matrix[non_constant_genes, ]

# Try analysis
test <- ImmuCellAI_new(filtered_expression_matrix, "rnaseq", 0, 1)

# Setting parallel calculations through a MulticoreParam back-end
# with workers=20 and tasks=100.
# Estimating ssGSEA scores for 10 gene sets.
# [1] "Calculating ranks..."
# [1] "Calculating absolute values from ranks..."
# |=========================================================================================| 100%
# 
# [1] "Normalizing..."
# Setting parallel calculations through a MulticoreParam back-end
# with workers=20 and tasks=100.
# Estimating ssGSEA scores for 14 gene sets.
# [1] "Calculating ranks..."
# [1] "Calculating absolute values from ranks..."
# |=========================================================================================| 100%
# 
# [1] "Normalizing..."
# Error in result_layer(t(round(all_norm, 3)), group_tag, response_tag) : 
#   object 'train_data' not found
# In addition: Warning message:
#   In .filterFeatures(expr, method) :
#   1 genes with constant expression values throuhgout the samples.

# Turn off ICB response prediction (fourth argument set to 0)

test2 <- ImmuCellAI_new(filtered_expression_matrix, "rnaseq",0,0)

# Setting parallel calculations through a MulticoreParam back-end
# with workers=20 and tasks=100.
# Estimating ssGSEA scores for 10 gene sets.
# [1] "Calculating ranks..."
# [1] "Calculating absolute values from ranks..."
# |=========================================================================================| 100%
# 
# [1] "Normalizing..."
# Setting parallel calculations through a MulticoreParam back-end
# with workers=20 and tasks=100.
# Estimating ssGSEA scores for 14 gene sets.
# [1] "Calculating ranks..."
# [1] "Calculating absolute values from ranks..."
# |=========================================================================================| 100%
# 
# [1] "Normalizing..."
# Warning message:
#   In .filterFeatures(expr, method) :
#   1 genes with constant expression values throuhgout the samples.

# Find genes with constant expression across all samples
constant_gene <- which(apply(filtered_expression_matrix, 1, function(x) length(unique(x)) == 1))

# Get the names of these genes
constant_gene_name <- rownames(filtered_expression_matrix)[constant_gene]

# Report the genes
print(constant_gene_name)
# character(0)

# Training data is not loaded by the ImmuCellAI_new function (see https://github.com/lydiaMyr/ImmuCellAI/blob/main/R/ImmuCellAI.R, beginning at line 317), so load it manually
data("train_data")
data("train_tag")

test3 <- ImmuCellAI_new(filtered_expression_matrix, "rnaseq", 0, 1)

# Setting parallel calculations through a MulticoreParam back-end
# with workers=20 and tasks=100.
# Estimating ssGSEA scores for 10 gene sets.
# [1] "Calculating ranks..."
# [1] "Calculating absolute values from ranks..."
# |=========================================================================================| 100%
# 
# [1] "Normalizing..."
# Setting parallel calculations through a MulticoreParam back-end
# with workers=20 and tasks=100.
# Estimating ssGSEA scores for 14 gene sets.
# [1] "Calculating ranks..."
# [1] "Calculating absolute values from ranks..."
# |=========================================================================================| 100%
# 
# [1] "Normalizing..."
# Error in if (any(as.integer(y) != y)) stop("dependent variable has to be of factor or integer type for classification mode.") : 
#   missing value where TRUE/FALSE needed
# In addition: Warning messages:
#   1: In .filterFeatures(expr, method) :
#   1 genes with constant expression values throuhgout the samples.
# 2: In cret$cresults * scale.factor :
#   Recycling array of length 1 in vector-array arithmetic is deprecated.
# Use c() or as.vector() instead.
# 
# 3: In svm.default(x, y, scale = scale, ..., na.action = na.action) :
#   NAs introduced by coercion

# Use original function
test4 <- ImmuCellAI(filtered_expression_matrix, "rnaseq", 0, 1, 0)

# Setting parallel calculations through a MulticoreParam back-end
# with workers=20 and tasks=100.
# Estimating ssGSEA scores for 24 gene sets.
# [1] "Calculating ranks..."
# [1] "Calculating absolute values from ranks..."
# |=========================================================================================| 100%
# 
# [1] "Normalizing..."
# Error in if (any(as.integer(y) != y)) stop("dependent variable has to be of factor or integer type for classification mode.") : 
#   missing value where TRUE/FALSE needed
# In addition: Warning messages:
#   1: In .filterFeatures(expr, method) :
#   1 genes with constant expression values throuhgout the samples.
# 2: In cret$cresults * scale.factor :
#   Recycling array of length 1 in vector-array arithmetic is deprecated.
# Use c() or as.vector() instead.
# 
# 3: In svm.default(x, y, scale = scale, ..., na.action = na.action) :
#   NAs introduced by coercion

# Convert train tag to a dataframe and set as factor
train_tag <- as.data.frame(train_tag, stringsAsFactors = TRUE)

# Use new function
test5 <- ImmuCellAI_new(filtered_expression_matrix, "rnaseq", 0, 1)

# Setting parallel calculations through a MulticoreParam back-end
# with workers=20 and tasks=100.
# Estimating ssGSEA scores for 10 gene sets.
# [1] "Calculating ranks..."
# [1] "Calculating absolute values from ranks..."
# |=========================================================================================| 100%
# 
# [1] "Normalizing..."
# Setting parallel calculations through a MulticoreParam back-end
# with workers=20 and tasks=100.
# Estimating ssGSEA scores for 14 gene sets.
# [1] "Calculating ranks..."
# [1] "Calculating absolute values from ranks..."
# |=========================================================================================| 100%
# 
# [1] "Normalizing..."
# Warning messages:
#   1: In .filterFeatures(expr, method) :
#   1 genes with constant expression values throuhgout the samples.
# 2: In cret$cresults * scale.factor :
#   Recycling array of length 1 in vector-array arithmetic is deprecated.
# Use c() or as.vector() instead.

# Sanity check for original function
test6 <- ImmuCellAI(filtered_expression_matrix, "rnaseq", 0, 1, 0)

# Setting parallel calculations through a MulticoreParam back-end
# with workers=20 and tasks=100.
# Estimating ssGSEA scores for 24 gene sets.
# [1] "Calculating ranks..."
# [1] "Calculating absolute values from ranks..."
# |=========================================================================================| 100%
# 
# [1] "Normalizing..."
# Error in if (any(as.integer(y) != y)) stop("dependent variable has to be of factor or integer type for classification mode.") : 
#   missing value where TRUE/FALSE needed
# In addition: Warning messages:
#   1: In .filterFeatures(expr, method) :
#   1 genes with constant expression values throuhgout the samples.
# 2: In cret$cresults * scale.factor :
#   Recycling array of length 1 in vector-array arithmetic is deprecated.
# Use c() or as.vector() instead.
# 
# 3: In svm.default(x, y, scale = scale, ..., na.action = na.action) :
#   NAs introduced by coercion

# The original function still reads train_tag as a series of 'R' and 'NR' character values
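
The "missing value where TRUE/FALSE needed" error can be reproduced in base R without the package. The following is a minimal sketch, assuming (as the log suggests) that the labels reach the classification check as plain 'R'/'NR' strings. The names `labels`, `coerced`, `check` and `codes` are hypothetical and not taken from ImmuCellAI or e1071; only the `any(as.integer(y) != y)` check is copied from the error message.

```r
# Hypothetical labels, standing in for train_tag read as plain strings
labels <- c("R", "NR", "R", "NR")

# "R" and "NR" are not numbers, so integer coercion yields NA for every element
# (this is the step that emits "NAs introduced by coercion")
coerced <- suppressWarnings(as.integer(labels))

# Comparing NA with anything gives NA, so any() is NA as well
check <- any(coerced != labels)

# if (NA) is what raises the error seen in the log
err <- tryCatch(
  {
    if (check) stop("not reached")
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(err)

# A factor carries integer codes, so the same coercion is well defined
labels_factor <- factor(labels)
codes <- as.integer(labels_factor)
print(codes)
```

This is consistent with test5 succeeding once train_tag was converted with stringsAsFactors = TRUE, although it does not explain why the original function still fails.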

Expected behaviour:
The function should have been able to handle the filtered matrix without errors or warnings.
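
One input-side detail that may be worth ruling out: calling as.matrix() on a data frame that still contains a character column returns a character matrix, not a numeric one. The sketch below uses a toy data frame (`toy_df`, with made-up gene names and values) and assumes the real input has gene names in its first column; whether that holds for AML_sample.txt has not been verified here.

```r
# Toy data frame shaped like the assumed input: gene names first, then samples
toy_df <- data.frame(
  gene = c("A", "B", "C"),
  s1 = c(1.5, 2.0, 0.0),
  s2 = c(3.0, 2.0, 0.0),
  stringsAsFactors = FALSE
)

# as.matrix() on mixed column types yields a character matrix,
# and it stays character after the name column is dropped
char_matrix <- as.matrix(toy_df)
rownames(char_matrix) <- char_matrix[, 1]
char_matrix <- char_matrix[, -1]
print(storage.mode(char_matrix))

# Building the matrix from the numeric columns only keeps it numeric
num_matrix <- as.matrix(toy_df[, -1])
rownames(num_matrix) <- toy_df$gene
print(storage.mode(num_matrix))
```

Checking storage.mode(filtered_expression_matrix) before calling either function would show which of the two cases applies.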

Actual behaviour:

  1. Error in result_layer: object 'train_data' not found.
  2. Warning: 1 genes with constant expression values throughout the samples.
  3. Error: missing value where TRUE/FALSE needed.
  4. Warning: Recycling array of length 1 in vector-array arithmetic is deprecated.
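
The recycling warning in item 4 can also be illustrated in isolation. This is a minimal sketch, assuming that `scale.factor` in the log is a length-1 array multiplied by a longer plain vector; `results` and `scale_factor` below are hypothetical stand-ins, not objects from the package. On R 4.3.1 the first multiplication is expected to emit the deprecation warning, and the handlers capture whatever condition is raised rather than assuming its exact text.

```r
# Hypothetical stand-ins: a plain numeric vector and a length-1 array
results <- c(1, 2, 3)
scale_factor <- array(2, dim = 1)

# Capture the condition message, if any, instead of letting it print
condition_text <- tryCatch(
  {
    results * scale_factor
    NA_character_
  },
  warning = function(w) conditionMessage(w),
  error = function(e) conditionMessage(e)
)
print(condition_text)

# Dropping the dim attribute first gives the same numbers without the warning
fixed <- results * as.vector(scale_factor)
print(fixed)
```

Wrapping the length-1 array in as.vector() (or c()) is the fix the warning text itself recommends.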

Environment:

> sessionInfo()
R version 4.3.1 (2023-06-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 22.04.3 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3 
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0

locale:
 [1] LC_CTYPE=en_GB.UTF-8       LC_NUMERIC=C               LC_TIME=en_GB.UTF-8       
 [4] LC_COLLATE=en_GB.UTF-8     LC_MONETARY=en_GB.UTF-8    LC_MESSAGES=en_GB.UTF-8   
 [7] LC_PAPER=en_GB.UTF-8       LC_NAME=C                  LC_ADDRESS=C              
[10] LC_TELEPHONE=C             LC_MEASUREMENT=en_GB.UTF-8 LC_IDENTIFICATION=C       

time zone: Europe/London
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ImmuCellAI_0.1.0 quadprog_1.5-8   pracma_2.4.2     e1071_1.7-13     gridExtra_2.3   
[6] GSVA_1.48.3      devtools_2.4.5   usethis_2.2.2   

Thank you for your assistance with this issue.

All the best,

Steven.
