Introduction to frequentdirections
Download example data
Here, we use the MNIST package developed by @stillmatic as sample data.
You can install this package as follows:
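A minimal sketch of the installation step, assuming the package is hosted on GitHub under stillmatic/MNIST and that the remotes package is used as the installer (devtools::install_github would work the same way):

```r
# Assumption: the data package lives at github.com/stillmatic/MNIST.
# install.packages("remotes")  # run once if remotes is not installed yet
remotes::install_github("stillmatic/MNIST")
```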
Load data
Once you install stillmatic/MNIST, the MNIST data is exported as MNIST::mnist_train.
Example: the digit 8

The data contain 60,000 records, which is a bit too many for a standard SVD on an ordinary PC.
That is why we sample 10,000 rows here.
# Draw 10,000 rows at random (without replacement)
df <- MNIST::mnist_train[sample(seq_len(nrow(MNIST::mnist_train)), size=10^4), ]
Plot SVD
Plot the original data projected onto the first and second singular vectors.
# The last column is the label column y; pixel values are scaled to [0, 1]
x <- as.matrix(df[, -ncol(df)])/255
y <- df$y
frequentdirections::plot_svd(x, y)
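To make the projection concrete, here is a small base-R sketch of what such a plot presumably computes. It is an illustration, not the code of frequentdirections::plot_svd: project_2d, its basis argument, and the synthetic x_demo and y_demo are hypothetical names, and using the right singular vectors of basis (the data itself, or a sketch of it) is an assumption.

```r
# Illustrative only: project rows of x onto the top two right singular
# vectors of `basis` (hypothetical helper, not part of frequentdirections).
project_2d <- function(x, basis = x) {
  v <- svd(basis)$v[, 1:2, drop = FALSE]  # d x 2 matrix with orthonormal columns
  x %*% v                                 # n x 2 matrix of coordinates
}

set.seed(1)
x_demo <- matrix(rnorm(200 * 10), nrow = 200)  # synthetic stand-in for the pixel matrix
y_demo <- sample(0:9, 200, replace = TRUE)     # synthetic labels
v <- svd(x_demo)$v[, 1:2, drop = FALSE]
proj <- project_2d(x_demo)

# Colour each point by its label when run interactively
if (interactive()) {
  plot(proj, col = y_demo + 1, pch = 20,
       xlab = "1st singular vector", ylab = "2nd singular vector")
}
```

Passing a smaller matrix as basis gives the coordinates of the same rows in the directions found from that smaller matrix, which is the kind of comparison the later plots make.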

Matrix Sketching
l = 8 case
eps <- 10^(-8)
# 10000 x 784 (28 x 28 pixels) -> 8 x 784 matrix
b <- frequentdirections::sketching(x, 8, eps)
frequentdirections::plot_svd(x, y, b)
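For intuition about what a sketching step like this may be doing, here is a minimal base-R sketch of the Frequent Directions algorithm as described by Liberty (2013). It is an illustration under assumptions, not the implementation of frequentdirections::sketching: fd_sketch is a hypothetical name, it has no eps argument, and it assumes an even l that is no larger than the number of columns.

```r
# Illustrative Frequent Directions sketch (hypothetical helper, base R only).
# a: n x d numeric matrix; l: number of sketch rows (even, l <= ncol(a)).
fd_sketch <- function(a, l) {
  stopifnot(l %% 2 == 0, l <= ncol(a))
  b <- matrix(0, nrow = l, ncol = ncol(a))
  next_zero <- 1                          # index of the next all-zero row of b
  for (i in seq_len(nrow(a))) {
    if (next_zero > l) {
      # The sketch is full: shrink the squared singular values by the
      # (l/2)-th one, so that rows l/2, ..., l become zero.
      s <- svd(b)
      delta <- s$d[l / 2]^2
      shrunk <- sqrt(pmax(s$d^2 - delta, 0))
      b <- diag(shrunk, nrow = l) %*% t(s$v)
      next_zero <- l / 2
    }
    b[next_zero, ] <- a[i, ]              # insert the next data row
    next_zero <- next_zero + 1
  }
  b
}

set.seed(1)
a <- matrix(rnorm(500 * 20), nrow = 500)  # synthetic data, not MNIST
b_demo <- fd_sketch(a, 8)

# Spectral-norm error of the covariance approximation and its known bound
err <- norm(crossprod(a) - crossprod(b_demo), type = "2")
bound <- 2 * norm(a, type = "F")^2 / 8
```

In this variant the error in approximating t(a) %*% a by t(b) %*% b is at most 2 * ||a||_F^2 / l in spectral norm, so larger l gives a tighter sketch at the cost of more rows.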

l = 32 case
# 10000 x 784 -> 32 x 784 matrix
b <- frequentdirections::sketching(x, 32, eps)
frequentdirections::plot_svd(x, y, b)

l = 128 case
# 10000 x 784 -> 128 x 784 matrix
b <- frequentdirections::sketching(x, 128, eps)
frequentdirections::plot_svd(x, y, b)

This result is almost the same as the SVD of the original data.
This suggests that the original data can be expressed with only