This extends vctrs::vec_slice to the S4Vectors::Vector class by masking
vec_slice with S7::new_generic. Atomic vectors and other base S3 classes
(list, data.frame, factor, Date, POSIXct) dispatch to the
vctrs::vec_slice method as normal. Dispatch support for the
S4Vectors::Vector and S4Vectors::DataFrame classes provides a unified
framework for working with base R vectors and S4Vectors.
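As a point of reference for the base-class behavior, the following minimal sketch calls vctrs::vec_slice directly on a few base types. It assumes the vctrs package is installed; the object names are illustrative only, and the masked generic is expected (per the description above) to give the same results for these inputs.

```r
library(vctrs)

# Atomic vector: slicing by position and by name selects the same elements.
x <- c(a = 10, b = 20, c = 30)
by_pos <- vec_slice(x, c(1L, 3L))
by_name <- vec_slice(x, c("a", "c"))

# Factor: the full set of levels is kept after slicing.
f <- factor(c("lo", "mid", "hi"), levels = c("lo", "mid", "hi"))
f_sliced <- vec_slice(f, 2:3)

# Date: the class is preserved.
d <- as.Date("2024-01-01") + 0:2
d_sliced <- vec_slice(d, 2L)

# data.frame: vec_slice() selects rows, not columns.
df <- data.frame(id = 1:3, label = c("x", "y", "z"))
df_sliced <- vec_slice(df, c(1L, 3L))
```

Note that for a data.frame the "observations" being sliced are rows, which is the behavior the DataFrame method described later mirrors.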
S4Vectors::Vector Implementation
This method naively calls the [ method for any S4 class that inherits
from the S4Vectors::Vector class. This may not be the most efficient way to
slice an S4 object, but it works.
With this implementation, the metadata columns of x (mcols(x), stored in
x@elementMetadata) are expected to be retained after a call to
plyxp::vec_slice(x, i).
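The retention of metadata columns can be illustrated with the underlying [ method itself. This is a sketch, not the plyxp implementation: it assumes the IRanges package is installed (IRanges is used here only as a convenient class that inherits from S4Vectors::Vector), and the "score" column is a made-up example.

```r
library(S4Vectors)
library(IRanges)

# An IRanges object is an S4Vectors::Vector; attach one metadata column.
ir <- IRanges(start = c(1L, 11L, 21L, 31L), width = 5L)
mcols(ir)$score <- c(0.1, 0.2, 0.3, 0.4)

# Slice with `[`, which is what the Vector method is described as calling.
i <- c(2L, 4L)
sliced <- ir[i]

# The metadata columns are sliced along with the ranges.
length(sliced)
mcols(sliced)$score
```

Since the Vector method is described as delegating to [, plyxp::vec_slice(ir, i) would be expected to return the same object as ir[i].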
S4Vectors::DataFrame Implementation
The DataFrame implementation works similarly to how vctrs::vec_slice works
on a data.frame object: the rows of x@listData are what get sliced.
To maintain the size stability of the DataFrame object, @nrows is set
to the appropriate value, and a recursive call is made if @elementMetadata
is not NULL.
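The expected outcome of row slicing can be checked with the standard S4Vectors row subsetting, used here as a stand-in for plyxp::vec_slice. This is a minimal sketch with hypothetical column names; it only shows that every column is sliced consistently and that the reported row count follows.

```r
library(S4Vectors)

df <- DataFrame(id = 1:4, label = c("a", "b", "c", "d"))

# Select rows 1 and 3; drop = FALSE keeps the result a DataFrame.
sliced <- df[c(1L, 3L), , drop = FALSE]

# Each column has been sliced, and the row count matches the index length.
nrow(sliced)
sliced$id
sliced$label
```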
Performance
Depending on the size and complexity of your S4 Vector object, you may find
the standard subset operation is extremely slow. For example, consider a
SummarizedExperiment whose rowData contains a CompressedGRangesList
object assigned to the name "exons", whose length is 250,000 and whose
underlying @unlistData has length 1,600,000. Performing a grouping
operation by .features and attempting to evaluate exons within the row
context would force the CompressedGRangesList object to be
chopped element-wise.
Unfortunately, there is a massive performance hit in attempting to construct
250,000 GRanges. Unless you do not mind waiting over an hour for each
dplyr verb in which exons gets evaluated, doing so is not recommended.
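To make "chopped element-wise" concrete, the sketch below builds a much smaller compressed list and times extracting every element as its own S4 object against accessing the flat underlying data once. It is an illustration under assumptions: IRanges stands in for GRanges, the sizes are hypothetical and far below those quoted above, and timings will vary by machine.

```r
library(IRanges)

set.seed(1)
n_ranges <- 20000L
n_groups <- 5000L

# Flat ranges plus a grouping factor; split() yields a compressed list.
ir <- IRanges(start = sample.int(1000000L, n_ranges), width = 100L)
grp <- factor(sample.int(n_groups, n_ranges, replace = TRUE),
              levels = seq_len(n_groups))
irl <- split(ir, grp)

# Element-wise chopping: one new S4 object is constructed per list element.
t_chop <- system.time(
  chopped <- lapply(seq_along(irl), function(k) irl[[k]])
)

# Whole-object access: the flat data is retrieved in a single call.
t_flat <- system.time(
  flat <- unlist(irl, use.names = FALSE)
)

t_chop[["elapsed"]]
t_flat[["elapsed"]]
```

The per-element cost in the first approach is what grows into the long waits described above once the list holds hundreds of thousands of elements.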
The plyxp package is planning to export a new generic
named plyxp_s4_proxy_vec().
This attempts to reconstruct certain standard S4Vectors::Vector objects as
standard vectors or tibbles. The equivalent exons object would use
much more memory, but would take only several seconds to
construct. When you are done, you can attempt to restore the original S4
Vector with plyxp_restore_s4_proxy().
In development, plyxp_s4_proxy_vec() is faster to work with because there
are fewer checks on object validity and all @elementMetadata and
@metadata are dropped from the objects.
Arguments
- x
A vector
- i
An integer, character or logical vector specifying the locations or names of the observations to get/set. Specify
TRUE to index all elements (as in x[]), or NULL, FALSE or integer() to index none (as in x[NULL]).
- ...
These dots are for future extensions and must be empty.
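The special values of i described above can be tried directly with vctrs::vec_slice. This is a small sketch assuming vctrs is installed; the variable names are illustrative.

```r
library(vctrs)

x <- c(a = 1L, b = 2L, c = 3L)

# TRUE selects every element, as x[] would.
all_elems <- vec_slice(x, TRUE)

# NULL, FALSE and integer() all select nothing, as x[NULL] would.
none_null <- vec_slice(x, NULL)
none_false <- vec_slice(x, FALSE)
none_int <- vec_slice(x, integer())

# A full-length logical vector selects the positions that are TRUE.
by_lgl <- vec_slice(x, c(TRUE, FALSE, TRUE))
```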