Title: | Conditional Random Sampling Sparse Matrices |
---|---|
Description: | Conducts conditional random sampling on observed values in sparse matrices. Useful for training and test set splitting sparse matrices prior to model fitting in cross-validation procedures and estimating the predictive accuracy of data imputation methods, such as matrix factorization or singular value decomposition (SVD). Although designed for applications with sparse matrices, CRASSMAT can also be applied to complete matrices, as well as to those containing missing values. |
Authors: | Nick Kunz |
Maintainer: | Nick Kunz <[email protected]> |
License: | GPL-3 |
Version: | 0.0.6 |
Built: | 2025-01-21 06:12:03 UTC |
Source: | https://github.com/nickkunz/crassmat |
Data for implementing the example given for CRASSMAT.
data(A)
A sparse matrix with 3000 observations (rows) and 15 columns.
Nick Kunz <[email protected]>
Conducts conditional random sampling on observed values in sparse matrices. Useful for training and test set splitting sparse matrices prior to model fitting in cross-validation procedures and estimating the predictive accuracy of data imputation methods, such as matrix factorization or singular value decomposition (SVD). Although designed for applications with sparse matrices, CRASSMAT can also be applied to complete matrices, as well as to those containing missing values.
crassmat(data, sample_thres, conditional)
data | a matrix (supports sparsity, missing values, and complete matrices) |
---|---|
sample_thres | a non-negative decimal specifying the proportion of observed values to sample out (e.g. 0.20 for 20%) |
conditional | a non-negative integer specifying the minimum number of observed values to remain per row |
Takes a matrix A with entries A_ij and samples out a single observed value from the ith row (the ith observation), on the condition that the number of observed values in that row is greater than the specified conditional (the minimum number of values to remain per observation). This process repeats until the specified sampling threshold is met.
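The procedure described above can be sketched in a few lines of R. This is only an illustration, not the package source: the function name `crs_sketch` is hypothetical, and the sketch assumes that "observed" means neither NA nor zero, that a sampled-out value is set to zero, and that the threshold is converted to a count with `floor()`. The actual `crassmat()` implementation may differ on any of these points.

```r
## Hypothetical re-implementation for illustration; not the package source.
crs_sketch <- function(data, sample_thres, conditional) {
  stopifnot(is.matrix(data), sample_thres >= 0, sample_thres <= 1,
            conditional >= 0)

  ## assumption: "observed" means neither NA nor zero
  observed <- !is.na(data) & data != 0
  n_remove <- floor(sample_thres * sum(observed))  # sampling threshold as a count

  removed <- 0
  while (removed < n_remove) {
    ## rows still holding more observed values than `conditional`
    eligible <- which(rowSums(observed) > conditional)
    if (length(eligible) == 0) break  # threshold unreachable under the condition

    i <- eligible[sample.int(length(eligible), 1)]  # pick one eligible row
    cols <- which(observed[i, ])
    j <- cols[sample.int(length(cols), 1)]          # pick one observed value in it

    data[i, j] <- 0          # assumption: a sampled-out value becomes zero
    observed[i, j] <- FALSE
    removed <- removed + 1
  }
  data
}

## toy usage: row 2 has a single observed value, so it is never sampled
set.seed(42)
toy <- matrix(c(5, 0, 3, 0,
                0, 2, 0, 0,
                1, 4, 0, 6), nrow = 3, byrow = TRUE)
toy_out <- crs_sketch(toy, sample_thres = 0.5, conditional = 1)
```

Drawing indices with `sample.int()` rather than `sample()` avoids the base R behaviour where `sample(x, 1)` on a length-one numeric `x` samples from `1:x` instead of returning `x`.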
Returns a matrix object with observed values removed according to the specified sample_thres and conditional.
Nick Kunz <[email protected]>
Kunz, N. (2019). Unsupervised Learning for Submarket Modeling: A Proxy for Neighborhood Change (Master's Thesis). Columbia University, New York, NY.
library(crassmat)
data(A)

## test set
A_test <- A

## training set
A_train <- crassmat(data = A,             # matrix
                    sample_thres = 0.20,  # remove 20% of observed values
                    conditional = 1)      # keep at least 1 observed value per row
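Once a training and test pair exists, the held-out values can be used to score an imputation method, as the description suggests. The following is a minimal sketch under stated assumptions: `holdout_rmse` is a hypothetical helper (not part of CRASSMAT), held-out cells are identified as those observed in the test matrix but changed or missing in the training matrix, and the column-mean "imputation" is only a placeholder for a real method such as matrix factorization. Toy matrices stand in for `A_test` and `A_train` so that the block runs on its own.

```r
## Hypothetical helper: RMSE over the cells that were sampled out.
holdout_rmse <- function(test, train, pred) {
  ## held-out cells: observed in `test` but changed or missing in `train`
  held_out <- !is.na(test) & (is.na(train) | train != test)
  sqrt(mean((test[held_out] - pred[held_out])^2))
}

## toy stand-ins for A_test and A_train
test_toy <- matrix(c(5, 0, 3,
                     0, 2, 4,
                     1, 4, 0), nrow = 3, byrow = TRUE)
train_toy <- test_toy
train_toy[1, 3] <- 0  # pretend these two values were sampled out
train_toy[3, 2] <- 0

## placeholder imputation: mean of the non-zero training values per column
col_means <- apply(train_toy, 2, function(x) mean(x[x != 0]))
pred_toy <- matrix(col_means, nrow = nrow(train_toy),
                   ncol = ncol(train_toy), byrow = TRUE)

rmse_toy <- holdout_rmse(test_toy, train_toy, pred_toy)
```

With real data, `test_toy` and `train_toy` would be replaced by `A_test` and `A_train`, and `pred_toy` by the reconstructed matrix from the model being evaluated.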