Miscellaneous Functions
StatsBase.rle
— Function
rle(v) -> (vals, lens)
Return the run-length encoding of a vector as a tuple. The first element of the tuple is a vector of the values of the input, one per run, and the second is a vector giving the number of consecutive occurrences of each value.
Examples
julia> using StatsBase
julia> rle([1,1,1,2,2,3,3,3,3,2,2,2])
([1, 2, 3, 2], [3, 2, 4, 3])
StatsBase.inverse_rle
— Function
inverse_rle(vals, lens)
Reconstruct a vector from its run-length encoding (see rle). vals is a vector of the values and lens is a vector of the corresponding run lengths.
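As a small illustration of how rle and inverse_rle relate, the following sketch encodes a vector and decodes it again. The input vector is the one from the rle example above; the variable names are only illustrative.

```julia
using StatsBase

v = [1, 1, 1, 2, 2, 3, 3, 3, 3, 2, 2, 2]

# Encode: one value and one run length per run of equal elements
vals, lens = rle(v)                  # ([1, 2, 3, 2], [3, 2, 4, 3])

# Decode: repeat each value by its run length
restored = inverse_rle(vals, lens)

restored == v                        # true
sum(lens) == length(v)               # true
```

The run lengths always sum to the length of the original vector, which is a quick sanity check when the encoding is stored or transmitted separately.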
StatsBase.levelsmap
— Function
levelsmap(a)
Construct a dictionary that maps each of the n unique values in a to a number between 1 and n.
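A minimal sketch of using levelsmap to turn arbitrary labels into integer codes. The data are illustrative, and the sketch relies only on what the description states (distinct values receive distinct numbers in 1:n); which number each value receives is not assumed here.

```julia
using StatsBase

a = ["b", "a", "b", "c"]        # illustrative input

d = levelsmap(a)                # Dict from each unique value to a number in 1:n
n = length(unique(a))           # 3

# Recode the input as integers in 1:n
codes = [d[x] for x in a]

length(d) == n                  # true
codes[1] == codes[3]            # true: both positions hold "b"
```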
StatsBase.indexmap
— Function
indexmap(a)
Construct a dictionary that maps each unique value in a to the index of its first occurrence in a.
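For comparison with levelsmap, here is a short sketch of indexmap on illustrative data. The mapped values are positions in the input, so they need not be contiguous.

```julia
using StatsBase

a = ["b", "a", "b", "c"]        # illustrative input

d = indexmap(a)

d["b"]                          # 1: "b" first appears at position 1
d["a"]                          # 2
d["c"]                          # 4: position 3 holds a repeated "b"

# Every stored index points back at its key
all(a[i] == k for (k, i) in d)  # true
```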
StatsBase.indicatormat
— Function
indicatormat(x, k::Integer; sparse=false)
Construct a boolean matrix I of size (k, length(x)) such that I[x[i], i] = true and all other elements are set to false. If sparse is true, the output will be a sparse matrix, otherwise it will be dense (default).
Examples
julia> using StatsBase
julia> indicatormat([1 2 2], 2)
2×3 Matrix{Bool}:
1 0 0
0 1 1
indicatormat(x, c=sort(unique(x)); sparse=false)
Construct a boolean matrix I of size (length(c), length(x)). Let ci be the index of x[i] in c. Then I[ci, i] = true and all other elements are false.
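The following sketch illustrates the second method on non-integer data, first with the default c and then with an explicit ordering of the categories. The input values are illustrative only.

```julia
using StatsBase

x = ["a", "b", "a", "c"]                 # illustrative input

# Default categories: sort(unique(x)) == ["a", "b", "c"]
I1 = indicatormat(x)

# Explicit categories: row order follows c
I2 = indicatormat(x, ["c", "b", "a"])

size(I1)                                 # (3, 4)
I1 == [1 0 1 0; 0 1 0 0; 0 0 0 1]        # true
I2 == [0 0 0 1; 0 1 0 0; 1 0 1 0]        # true
```

Each column contains exactly one true entry, in the row of the category that the corresponding element of x belongs to.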
StatsBase.midpoints
— Function
StatsBase.midpoints(v)
Calculate the midpoints (pairwise mean of consecutive elements).
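A brief sketch of midpoints applied to a set of bin edges, which is one plausible use (the edge values are illustrative). The call is written with the module prefix, as in the signature above.

```julia
using StatsBase

edges = [1, 2, 4, 8]                  # illustrative bin edges

mids = StatsBase.midpoints(edges)     # [1.5, 3.0, 6.0]

length(mids) == length(edges) - 1     # true: one midpoint per consecutive pair
```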
StatsAPI.pairwise
— Function
pairwise(f, x[, y];
         symmetric::Bool=false, skipmissing::Symbol=:none)
Return a matrix holding the result of applying f to all possible pairs of entries in iterators x and y. Rows correspond to entries in x and columns to entries in y. If y is omitted then a square matrix crossing x with itself is returned.
As a special case, if f is cor, diagonal cells for which entries from x and y are identical (according to ===) are set to one even in the presence of missing, NaN or Inf entries.
Keyword arguments
symmetric::Bool=false: If true, f is only called for the lower triangle of the matrix, and these values are copied to fill the upper triangle. Only allowed when y is omitted. Defaults to true when f is cor or cov.
skipmissing::Symbol=:none: If :none (the default), missing values in inputs are passed to f without any modification. Use :pairwise to skip entries with a missing value in either of the two vectors passed to f for a given pair of vectors in x and y. Use :listwise to skip entries with a missing value in any of the vectors in x or y; note that this might drop a large share of the entries. Only allowed when entries in x and y are vectors.
Examples
julia> using StatsBase, Statistics
julia> x = [1 3 7
2 5 6
3 8 4
4 6 2];
julia> pairwise(cor, eachcol(x))
3×3 Matrix{Float64}:
1.0 0.744208 -0.989778
0.744208 1.0 -0.68605
-0.989778 -0.68605 1.0
julia> y = [1 3 missing
2 5 6
3 missing 2
4 6 2];
julia> pairwise(cor, eachcol(y), skipmissing=:pairwise)
3×3 Matrix{Float64}:
1.0 0.928571 -0.866025
0.928571 1.0 -1.0
-0.866025 -1.0 1.0
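The examples above cross x with itself. As a sketch of the two-iterator form, the following applies a user-defined function to every pair drawn from x and y. The helper dotprod and the data are hypothetical and exist only for this illustration; the sketch assumes a StatsBase version that provides pairwise.

```julia
using StatsBase

x = [[1, 2], [3, 4]]                 # 2 entries -> 2 rows
y = [[1, 2], [0, 0], [3, 4]]         # 3 entries -> 3 columns

# Hypothetical helper: plain dot product of two vectors
dotprod(a, b) = sum(a .* b)

m = pairwise(dotprod, x, y)

size(m)                              # (2, 3)
m == [5 0 11; 11 0 25]               # true: m[i, j] == dotprod(x[i], y[j])
```

Because y is supplied, symmetric must stay at its default of false, and f is evaluated for every cell.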
StatsAPI.pairwise!
— Function
pairwise!(f, dest::AbstractMatrix, x[, y];
          symmetric::Bool=false, skipmissing::Symbol=:none)
Store in matrix dest the result of applying f to all possible pairs of entries in iterators x and y, and return it. Rows correspond to entries in x and columns to entries in y, and dest must therefore be of size length(x) × length(y). If y is omitted then x is crossed with itself.
As a special case, if f is cor, diagonal cells for which entries from x and y are identical (according to ===) are set to one even in the presence of missing, NaN or Inf entries.
Keyword arguments
symmetric::Bool=false: If true, f is only called for the lower triangle of the matrix, and these values are copied to fill the upper triangle. Only allowed when y is omitted. Defaults to true when f is cor or cov.
skipmissing::Symbol=:none: If :none (the default), missing values in inputs are passed to f without any modification. Use :pairwise to skip entries with a missing value in either of the two vectors passed to f for a given pair of vectors in x and y. Use :listwise to skip entries with a missing value in any of the vectors in x or y; note that this might drop a large share of the entries. Only allowed when entries in x and y are vectors.
Examples
julia> using StatsBase, Statistics
julia> dest = zeros(3, 3);
julia> x = [1 3 7
2 5 6
3 8 4
4 6 2];
julia> pairwise!(cor, dest, eachcol(x));
julia> dest
3×3 Matrix{Float64}:
1.0 0.744208 -0.989778
0.744208 1.0 -0.68605
-0.989778 -0.68605 1.0
julia> y = [1 3 missing
2 5 6
3 missing 2
4 6 2];
julia> pairwise!(cor, dest, eachcol(y), skipmissing=:pairwise);
julia> dest
3×3 Matrix{Float64}:
1.0 0.928571 -0.866025
0.928571 1.0 -1.0
-0.866025 -1.0 1.0