All notable changes will be documented in this file.
- Fixed column name setting in `fit_nb_offset`
- Verbose messages when invoking `v2`: messages are only shown if verbosity > 1.
- Add `fit_nb_offset` to support vst.flavor='v2' by default
- Updated cpp utilities to adhere to C++17 standards (`std::random_shuffle` -> `std::shuffle`)
- Handling of extra variables in `latent_var` when `vst.flavor="v2"`
- Changed `get_nz_median2` to support `genes` argument; thanks @boomanaiden154 and @ScreachingFire. #155
- Replaced `get_nz_median` with faster alternative `get_nz_median2` across all calls
- Removed `get_nz_median`
- Updated `make_cell_attr` to be flexible for named vectors; thanks @moi-taga #171
- Specify required Matrix version to >= 1.5.0
- Add `make.sparse` to handle `dgCMatrix` coercions
- Convert bitwise operators to boolean operators in utils.cpp
- `vst.flavor` argument to `vst()` to allow invoking the updated regularization (sctransform v2, proposed in Satija and Choudhary, 2021). See paper for details.
- `scale_factor` to `correct()` to allow for a custom library size when correcting counts
- Add future.seed = TRUE to all `future_lapply()` calls
- Wrap MASS::theta.ml() in suppressWarnings()
- Fix logical comparison of vectors of length one in `diff_mean_test()`
- `compare` argument to the nonparametric differential expression test `diff_mean_test()` to allow for multiple comparisons and various ways to specify which groups to compare
- Input checking at various places in `vst()` and `diff_mean_test()`
- Major speed improvements for `diff_mean_test()`
- Changed the `labels` argument to `group_labels` in `diff_mean_test()`
- Fix bug where factors in cell attributes gave an error when checking for NA, NaN, Inf
- Ability to control the values of latent variables when calculating corrected counts
- Offset model as method, including the ability to use a single estimated theta for all genes
- Nonparametric differential expression test for sparse non-negative data
- Improve poor coefficient initialization in quasi poisson regression
- When plotting model, do not show density by default; change bandwidth to `bw.nrd0`
- Updates to C++ code to use sparse matrices as S4 objects
- Add check for NA, NaN, Inf values in cell attributes
- Remove biocViews from DESCRIPTION - not needed and was causing problems with deploying shiny apps
- Fix bug where a coefficient was given the wrong name when using `glmGamPoi` (only affected runs with a batch variable set)
- Add a `qpoisson` method for parameter estimation that uses fast Rcpp quasi poisson regression where possible (based on `Rfast` package); this adds `RcppArmadillo` dependency
- Remove `poisson_fast` method (replaced by `qpoisson`)
- Use `matrixStats` package and remove `RcppEigen` dependency
- Use quasi poisson regression where possible
- Define cell detection event as counts >= 0.01 (instead of > 0) - this only matters to people playing around with fractional counts (see issue #65)
- Internal code restructuring and improvements
- Fix inefficiency of using `match.call()` in `vst()` when called via `do.call`
- Add support for `glmGamPoi` as method to estimate the model parameters; thanks @yuhanH for his pull request
- Add option to use `theta.mm` or `theta.ml` to estimate theta when `method = 'poisson'` or `method = 'nb_fast'`
- Add a `poisson_fast` method for parameter estimation that uses the `speedglm` package and `theta.mm` by default
- Add ability to plot overdispersion factor in `plot_model_pars`
- Add and return time stamps at various steps in the `vst` function
- Add functions to calculate grouped arithmetic and geometric mean per row for sparse matrices (`dgCMatrix`) - might come in handy some time
- Default theta regularization is now based on overdispersion factor (`1 + m / theta` where m is the geometric mean of the observed counts) not `log10(theta)`; old behavior available via `theta_regularization` parameter
- Refactored model fitting code - is now more efficient when using parallel processing
- Changed how message and progress bar output is controlled; integer `verbosity` parameter controls all output: 0 for no output, 1 for only messages, 2 for messages and progress bars
- Increased default bin size (genes being processed simultaneously) from 256 to 500
- Better input checking for cell attributes; more efficient calculation of missing ones
- Some non-regularized model parameters were not plotted
- Add function to generate data given the output of a vst run
- Add cpp support for dense integer matrices
- Minimum variance parameter added to vst function
- Rcpp versions of utility functions
- Helper functions to get corrected UMI and variance of pearson residuals for large UMI matrices
- lots of things
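
For readers unfamiliar with the C++17 change listed above (`std::random_shuffle` -> `std::shuffle`), the following is a minimal sketch of what such a migration typically looks like. It is not the package's actual code: the helper name `shuffled_indices`, the seed parameter, and the choice of `std::mt19937` as the generator are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical helper (not taken from the package): returns the indices
// 0..n-1 in a random order determined by `seed`.
std::vector<int> shuffled_indices(int n, unsigned seed) {
    assert(n >= 0);
    std::vector<int> idx(n);
    std::iota(idx.begin(), idx.end(), 0);  // fill with 0, 1, ..., n-1

    // std::random_shuffle was deprecated in C++14 and removed in C++17.
    // Older code would have called:
    //   std::random_shuffle(idx.begin(), idx.end());
    // std::shuffle instead requires an explicit random bit generator.
    std::mt19937 gen(seed);  // generator choice is an assumption of this sketch
    std::shuffle(idx.begin(), idx.end(), gen);
    return idx;
}
```

Passing the generator explicitly makes the source of randomness visible at the call site, so the same seed yields the same permutation on a given standard library implementation.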
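
The overdispersion factor mentioned above, `1 + m / theta` with m the geometric mean of the observed counts, can be illustrated with a short sketch. The function names below are hypothetical, and the pseudo-count `eps` used to keep the geometric mean defined for zero counts is an assumption of this example, not something stated in this changelog.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Geometric mean of a gene's counts. A pseudo-count `eps` is added before
// taking logs and subtracted afterwards so that zeros are allowed
// (this handling of zeros is an assumption of the sketch).
double geometric_mean(const std::vector<double>& counts, double eps = 1.0) {
    assert(!counts.empty());
    double log_sum = 0.0;
    for (double x : counts) {
        log_sum += std::log(x + eps);
    }
    return std::exp(log_sum / static_cast<double>(counts.size())) - eps;
}

// Overdispersion factor as described in the changelog entry: 1 + m / theta.
double overdispersion_factor(double m, double theta) {
    assert(theta > 0.0);
    return 1.0 + m / theta;
}
```

For example, a gene with geometric mean 2 and theta 4 has an overdispersion factor of 1.5, while the same gene under the old scheme would have been regularized on `log10(4)`.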