@pranavm2109

Background

The NormalizeData generic dispatches to a different implementation depending on the class of its first argument: NormalizeData.Assay, NormalizeData.default, NormalizeData.Seurat, NormalizeData.StdAssay, and NormalizeData.V3Matrix*. Each implementation supports three built-in normalization methods: "LogNormalize", "CLR", and "RC". All except "CLR" take a scale.factor parameter, by which the values are multiplied during normalization. It defaults to 1e4 for LogNormalize and 1 for RelativeCounts (RC).
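For readers less familiar with where scale.factor enters, here is a minimal base R sketch of the two scaled methods. The helper names `relative_counts` and `log_normalize` are hypothetical and written only for illustration; they are not Seurat's internal code, and they assume a dense matrix with genes in rows, cells in columns, and no all-zero cells.

```r
# Toy counts matrix: rows are genes, columns are cells.
counts <- matrix(
  c(1, 0, 3,
    2, 4, 1,
    1, 0, 6),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:3))
)

# Hypothetical helper: divide each cell (column) by its total count,
# then multiply by scale.factor.
relative_counts <- function(x, scale.factor = 1) {
  sweep(x, 2, colSums(x), "/") * scale.factor
}

# Hypothetical helper: the same scaling followed by log1p.
log_normalize <- function(x, scale.factor = 1e4) {
  log1p(relative_counts(x, scale.factor = scale.factor))
}

rc <- relative_counts(counts)  # each column sums to scale.factor (1 here)
ln <- log_normalize(counts)    # log1p of the counts rescaled by 1e4
```

With the default scale.factor of 1, each column of `rc` sums to 1; with 1e4, the values passed to `log1p` are counts per 10,000.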

Updates

LogNormalize, RelativeCounts, and .SparseNormalize now accept scale.factor = "median". When this value is passed, they compute the median of the counts across all columns (cells), or across rows (genes) when margin = 1L in LogNormalize.default, and use that median as the scale.factor.
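One way to picture this behavior is the base R sketch below. It is not the code in this PR: `resolve_scale_factor` is a hypothetical helper, and it assumes that "the median of the counts" means the median of the total counts per cell (column sums), or per gene (row sums) when margin = 1L.

```r
# Hypothetical helper: turn scale.factor = "median" into a number.
# Assumption: the median is taken over per-cell totals (column sums),
# or over per-gene totals (row sums) when margin = 1L.
resolve_scale_factor <- function(x, scale.factor, margin = 2L) {
  if (identical(scale.factor, "median")) {
    totals <- if (margin == 1L) rowSums(x) else colSums(x)
    return(stats::median(totals))
  }
  if (!is.numeric(scale.factor) || length(scale.factor) != 1L) {
    stop("scale.factor must be a single number or \"median\"")
  }
  scale.factor
}

# Toy counts matrix: rows are genes, columns are cells.
counts <- matrix(
  c(1, 0, 3,
    2, 4, 1,
    1, 0, 6),
  nrow = 3, byrow = TRUE
)

sf_cells <- resolve_scale_factor(counts, "median")               # column totals: 4, 4, 10
sf_genes <- resolve_scale_factor(counts, "median", margin = 1L)  # row totals: 4, 7, 7
sf_fixed <- resolve_scale_factor(counts, 1e4)                    # numeric values pass through

# Log-normalize across cells using the median-derived scale factor.
normalized <- log1p(sweep(counts, 2, colSums(counts), "/") * sf_cells)
```

Resolving the string to a number once, before the scaling step, keeps the rest of the normalization identical to the numeric case.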

I have also added unit tests in test_preprocessing.R that check that the median is computed correctly when scale.factor = "median" is passed to these functions.
