Saturday 14 November 2015

Find the k Smallest Elements in a List using R

This program is definitely not the most efficient way to find the k smallest elements in a list (here, an R vector), but it does what I need for now. I may improve it later, and you are most welcome to share any ideas for improvements.

l = c(7,3,6,1,0,2,10,9)

KSmallestElem <- function(DataList, k)
{
  n = length(DataList)
  kmin = {}
  
  initMinIndex = which.min(DataList)
  initMinVal = min(DataList)
  kmin = rbind(kmin,c(initMinIndex,initMinVal))
  
  # 1st level loop:
  #     k - 1 more passes, since the overall minimum is already in kmin
  for(i in seq_len(k - 1))
  {
    # Start from Inf so that any remaining element can become the candidate
    minIndex = NA
    minValue = Inf
    # 2nd level loop:
    #     over the original DataList
    for(j in 1:n)
    {
      # Keep the smallest value strictly greater than the previous minimum
      if(DataList[j] > kmin[i,2] &&
         DataList[j] < minValue)
      {
        minIndex = j
        minValue = DataList[j]
      }
      
    }
    
    kmin = rbind(kmin,c(minIndex,minValue))
  }
  
  return(kmin)
}


KSmallestElem(l,3)

#################################
Results:
> KSmallestElem(l,3)
     [,1] [,2]
[1,]    5    0
[2,]    4    1
[3,]    6    2
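
As one idea for improvement, here is a minimal sketch that leans on base R's order() instead of nested loops. The function name KSmallestByOrder is made up for this illustration. Note one difference in behaviour: the loop above skips repeated values because of its strict comparison, whereas this version keeps duplicates.

```r
# Hypothetical alternative to KSmallestElem, built on base R's order()
KSmallestByOrder <- function(DataList, k)
{
  # order() returns the indices that would sort DataList in ascending order;
  # head() keeps the first k of them
  idx = head(order(DataList), k)
  # Same layout as KSmallestElem: column 1 = index, column 2 = value
  cbind(idx, DataList[idx], deparse.level = 0)
}

l = c(7,3,6,1,0,2,10,9)
res = KSmallestByOrder(l, 3)
```

Sorting costs O(n log n) regardless of k, which is usually fine for small vectors like the one above.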









Saturday 7 November 2015

Reverse scaling data/ Unscaling data in R

In machine learning, data should usually be scaled before it is fed to a model for training, since the variables may have very different ranges. Without scaling, distances between data points are dominated by the variables with the largest ranges, which can make the predictions far less accurate.
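
To illustrate the point about distances, here is a small sketch with made-up data (the age and income columns are invented for this example). It only shows how one wide-ranging variable swamps the Euclidean distance and how scale() puts the columns on comparable terms.

```r
# Made-up data: two variables on very different ranges
d <- data.frame(age = c(25, 30, 60),
                income = c(50000, 51000, 50200))

# Unscaled Euclidean distance between rows 1 and 2:
# almost entirely the income difference of 1000
raw12 <- as.matrix(dist(d))[1, 2]

# After scale(), every column has mean 0 and standard deviation 1,
# so age and income contribute on comparable terms
s <- scale(d)
scaled12 <- as.matrix(dist(s))[1, 2]
```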

However, we still need the prediction results in the original range, which means reversing the scaling.

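In R, scale() stores the center and scale it used as attributes on its result, so the scaling can be reversed by multiplying by the scale and adding back the center. Here d$s.x stands for a variable that was produced by scale():

attributes(d$s.x)

d$s.x * attr(d$s.x, 'scaled:scale') + attr(d$s.x, 'scaled:center')
For example: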
> x <- 1:10
> s.x <- scale(x)
> s.x
            [,1]
 [1,] -1.4863011
 [2,] -1.1560120
 [3,] -0.8257228
 [4,] -0.4954337
 [5,] -0.1651446
 [6,]  0.1651446
 [7,]  0.4954337
 [8,]  0.8257228
 [9,]  1.1560120
[10,]  1.4863011
attr(,"scaled:center")
[1] 5.5
attr(,"scaled:scale")
[1] 3.02765
> s.x * attr(s.x, 'scaled:scale') + attr(s.x, 'scaled:center')
      [,1]
 [1,]    1
 [2,]    2
 [3,]    3
 [4,]    4
 [5,]    5
 [6,]    6
 [7,]    7
 [8,]    8
 [9,]    9
[10,]   10
attr(,"scaled:center")
[1] 5.5
attr(,"scaled:scale")
[1] 3.02765
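
The same two attributes can also be used to map new values (for example, a prediction made on the scaled range) back to the original range. Below is a minimal sketch; the helper name unscale_with and the value 0.5 are made up for illustration and are not part of base R.

```r
# Hypothetical helper: undo scale() using the attributes stored
# on a reference object that was produced by scale()
unscale_with <- function(values, scaled_ref) {
  center <- attr(scaled_ref, "scaled:center")
  spread <- attr(scaled_ref, "scaled:scale")
  # as.vector() drops the matrix shape and the scaled:* attributes
  as.vector(values) * spread + center
}

x <- 1:10
s.x <- scale(x)

# Recover the original values as a plain numeric vector
restored <- unscale_with(s.x, s.x)

# Map a value from the scaled range back to the original range
pred_scaled <- 0.5
pred_original <- unscale_with(pred_scaled, s.x)
```

Returning a plain vector avoids carrying the now misleading scaled:center and scaled:scale attributes along with the unscaled values, as happens in the console output above.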



Reference:
http://stackoverflow.com/questions/10287545/backtransform-scale-for-plotting

If it helps, you could also try the unscale function from the DMwR package:
http://www.inside-r.org/packages/cran/DMwR/docs/unscale