Monday, September 30, 2013

How to run R scripts in batch mode with arguments

Run the R script as follows (R CMD BATCH writes the output to script.Rout by default):

R CMD BATCH --no-save --no-restore '--args 1 c(2,3,4) matrix(c(5,6,7,8),nrow=2)' script.R

script.R contains:

# Read the three arguments passed from the command line
# into a character vector.
args <- commandArgs(trailingOnly = TRUE)
# Minimal check that the number of passed arguments matches what
# the script expects (check what Forester did in his example)
if (length(args) != 3) stop("expected 3 arguments, got ", length(args))

# Parse the arguments (character strings) and evaluate them as R expressions
vec1 <- eval( parse( text= args[1] ) )
vec2 <- eval( parse( text= args[2] ) )
mat1 <- eval( parse( text= args[3] ) )

# Check
print(vec1) # prints a vector of length 1
print(vec2) # prints a vector of length 3
print(mat1) # prints a 2 x 2 matrix


Reference:
http://shihho.wordpress.com/2012/11/30/r-how-to-run-r-scripts-in-batch-mode-with-arguments/

Saturday, September 21, 2013

Tuning NFS Performance

On an Ubuntu system, the NFS daemon spawns only 8 processes by default. On a heavily loaded system this may not be enough to handle connections from many NFS clients. To check whether the default is sufficient, look at the RPC statistics with the nfsstat command on an NFS client:

# nfsstat -rc
Client rpc stats:
calls      retrans    authrefrsh
236317426   2          236317430


In the example above, the retrans (retransmissions) value is larger than 0, which suggests that the number of available NFS kernel threads on the server is insufficient to handle the requests from this client.
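The check described above can be scripted. The following is a minimal sketch, not a definitive tool: the function name retrans_count is hypothetical, and it assumes the nfsstat -rc layout shown in this post (a header line starting with "calls" followed by one line of numbers). It runs against the sample output from this post so that it works without an NFS mount.

```shell
# Extract the retrans column from `nfsstat -rc` output read on stdin.
# Assumes a header line starting with "calls", followed by one line of numbers.
retrans_count() {
    awk '$1 == "calls" { getline; print $2; exit }'
}

# Sample output copied from the example above.
sample='Client rpc stats:
calls      retrans    authrefrsh
236317426   2          236317430'

retrans=$(printf '%s\n' "$sample" | retrans_count)
if [ "$retrans" -gt 0 ]; then
    echo "retrans=$retrans: the server may need more NFS threads"
else
    echo "retrans=0: the thread count looks sufficient"
fi
```

On a real client, the same function would be fed with: nfsstat -rc | retrans_count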

To increase the number of NFS threads on the server, change the value of RPCNFSDCOUNT in the files /etc/default/nfs-kernel-server and /etc/init.d/nfs-kernel-server. Increase it to 32 on a moderately busy server, or up to 128 on a more heavily used system. Restart the NFS service, then run nfsstat -rc on the client again to check whether the number of NFS threads is sufficient. If retrans stays at 0 (or, because the counters are cumulative, stops growing), the number is sufficient; otherwise, increase the number of threads further.
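As an illustration, the RPCNFSDCOUNT change can be made with sed. This sketch edits a scratch copy rather than the real file, and it assumes the setting appears as a plain RPCNFSDCOUNT=8 line, which may differ on your system.

```shell
# Work on a scratch copy so the sketch is safe to run anywhere.
conf=$(mktemp)
printf '%s\n' '# Number of servers to start up' 'RPCNFSDCOUNT=8' > "$conf"

# Replace whatever value is currently assigned with 32.
sed 's/^RPCNFSDCOUNT=.*/RPCNFSDCOUNT=32/' "$conf" > "$conf.new"
mv "$conf.new" "$conf"

# Show the resulting setting.
new_count=$(grep '^RPCNFSDCOUNT=' "$conf")
echo "$new_count"
```

To apply this for real, point conf at /etc/default/nfs-kernel-server (as root, after taking a backup) and restart the NFS service afterwards.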

On the clients, the rsize and wsize mount options can be tuned to improve transfer speeds. These two options set the size of the chunks of data that the client and server pass back and forth. By default, most clients mount remote NFS file systems with an 8 KB read/write block size, and adjusting rsize/wsize can give significant performance gains. Reference 1 suggests mounting with the following options on the client:
    rsize=32768,wsize=32768,intr,noatime
If the NFS filesystem is mounted via /etc/fstab, set the same options in its entry there, for example:
   server:/path/to/shared /shared nfs rsize=32768,wsize=32768,intr,noatime
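
After remounting, it is worth confirming which rsize and wsize values are actually in effect, since the client and server negotiate them. The sketch below pulls one option out of a line in /proc/mounts format. The function name mount_opt is hypothetical, and the sample line uses made-up option values for illustration.

```shell
# Print the value of one mount option (e.g. rsize) from a
# /proc/mounts-style line read on stdin. The options are field 4.
mount_opt() {
    awk -v key="$1" '{
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++) {
            split(opts[i], kv, "=")
            if (kv[1] == key) { print kv[2]; exit }
        }
    }'
}

# Sample line in /proc/mounts format (hypothetical values).
line='server:/path/to/shared /shared nfs rw,noatime,vers=3,rsize=32768,wsize=32768,hard 0 0'

rsize=$(printf '%s\n' "$line" | mount_opt rsize)
wsize=$(printf '%s\n' "$line" | mount_opt wsize)
echo "rsize=$rsize wsize=$wsize"
```

On a real client, the input would come from: grep ' /shared nfs' /proc/mounts | mount_opt rsize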

Reference:
1. http://www.techrepublic.com/blog/linux-and-open-source/tuning-nfs-for-better-performance/