Sunday, July 6, 2014

Creating Perl Modules

How to create a Perl Module for code reuse?

http://perlmaven.com/how-to-create-a-perl-module-for-code-reuse

José's Guide for creating Perl modules

http://www.perlmonks.org/?node_id=431702

Simple Module Tutorial

http://www.perlmonks.org/?node_id=102347

Thursday, July 3, 2014

What is principal component analysis?

A really good post by Lior Pachter
http://liorpachter.wordpress.com/2014/05/26/what-is-principal-component-analysis/

Print specific column of a text file

To print the second-to-last column of a tab-delimited file:

awk -F '\t' '{print $(NF-1)}' file

NF is a special awk variable that contains the number of fields in the current record.
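
As a quick check of how NF behaves, here is a small sketch run on a made-up three-column file. The file name and its contents are invented for illustration only.

```shell
# Build a small tab-delimited sample file (hypothetical data).
tmpdir=$(mktemp -d)
printf 'a\tb\tc\nd\te\tf\n' > "$tmpdir/sample.tsv"

# $(NF-1) is the second-to-last field; $NF is the last field.
second_last=$(awk -F '\t' '{print $(NF-1)}' "$tmpdir/sample.tsv")
last=$(awk -F '\t' '{print $NF}' "$tmpdir/sample.tsv")

printf '%s\n' "$second_last"   # b and e, one per line
printf '%s\n' "$last"          # c and f, one per line

rm -r "$tmpdir"
```

Because NF is recomputed for every record, the same command also works when lines have different numbers of columns.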

Merge multiple text files while deleting the first line of all files

tail -q -n +2 file1 file2 file3

Reference
http://stackoverflow.com/questions/10103619/unix-merge-many-files-while-deleting-first-line-of-all-files
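
If the first line of each file is a shared header, it is often useful to keep it once at the top of the merged output. The following is a minimal sketch of that variant, assuming a tail that supports -q (as in the command above); the two sample files are invented for illustration.

```shell
# Two hypothetical files that share the same header line.
tmpdir=$(mktemp -d)
printf 'id\tvalue\n1\tx\n' > "$tmpdir/file1"
printf 'id\tvalue\n2\ty\n' > "$tmpdir/file2"

# Take the header from the first file only, then the bodies of all files.
# -n +2 starts output at line 2; -q suppresses the "==> name <==" banners.
merged=$(
    head -n 1 "$tmpdir/file1"
    tail -q -n +2 "$tmpdir/file1" "$tmpdir/file2"
)

printf '%s\n' "$merged"
rm -r "$tmpdir"
```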

Tuesday, June 17, 2014

Convert VCF chromosome notation

VCF files from different sources may use different chromosome notations, either with or without the 'chr' prefix. To make the notation consistent, Vivek provided two lines of awk code that quickly convert the VCF chromosome naming from one format to the other.

1. Remove 'chr' from the chromosome notation:

awk '{gsub(/^chr/,""); print}' with_chr.vcf > no_chr.vcf

2. Add 'chr' before the chromosome ID:

awk '{if($0 !~ /^#/) print "chr"$0; else print $0}' no_chr.vcf > with_chr.vcf


Reference:
https://www.biostars.org/p/98582/
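
To see what the two commands do, here is a small round-trip sketch on an invented four-line VCF. The records are made up, and note that neither command touches contig IDs that appear inside '##contig' header lines.

```shell
# A tiny, hypothetical VCF with the 'chr' prefix.
tmpdir=$(mktemp -d)
printf '##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tG\nchrX\t200\t.\tC\tT\n' \
    > "$tmpdir/with_chr.vcf"

# 1. Strip a leading 'chr' (header lines start with '#', so they are unchanged).
awk '{gsub(/^chr/,""); print}' "$tmpdir/with_chr.vcf" > "$tmpdir/no_chr.vcf"

# 2. Put 'chr' back in front of every non-header line.
awk '{if($0 !~ /^#/) print "chr"$0; else print $0}' "$tmpdir/no_chr.vcf" \
    > "$tmpdir/roundtrip.vcf"

# Chromosome column after step 1, joined on one line.
chroms=$(grep -v '^#' "$tmpdir/no_chr.vcf" | cut -f 1 | paste -sd ' ' -)

# The round trip should reproduce the original file exactly.
if cmp -s "$tmpdir/with_chr.vcf" "$tmpdir/roundtrip.vcf"; then
    roundtrip=same
else
    roundtrip=different
fi

echo "$chroms"
echo "$roundtrip"
rm -r "$tmpdir"
```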

Thursday, June 12, 2014

Bash array examples

http://www.thegeekstuff.com/2010/06/bash-array-tutorial/

Wednesday, June 11, 2014

Sort every n lines in a file

http://edwards.sdsu.edu/labsite/index.php/robert/399-sorting-fastq-files-by-their-sequence-identifiers

Sorting FASTQ files by their sequence identifiers

In certain cases, you need to sort FASTQ files by their sequence identifiers (e.g. to fix the order of paired-end or mate-pair sequences). There are several ways of sorting FASTQ files, but the simplest way is usually the best. Here is a one-liner to do the job:
 cat file.fastq | paste - - - - | sort -k1,1 -t " " | tr "\t" "\n" > file_sorted.fastq
The cat command will print the file content (to STDOUT).
The paste command will join the four lines of a FASTQ entry into a single line, each original line separated by a tab.
The sort command will sort the joined lines by everything before the first space (which is our sequence identifier).
The tr command will replace the tabs with line breaks, which is basically an undo of the paste command (in a simplified explanation).
The ">" sign will write the sorted output to the file specified after it.
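
The pipeline can be tried on a toy file. This is a minimal sketch with two invented reads; it reads the file with input redirection instead of cat and sets LC_ALL=C as an assumption so that the sort order does not depend on the locale.

```shell
# Two hypothetical FASTQ records, deliberately out of order.
tmpdir=$(mktemp -d)
printf '@read2 extra\nACGT\n+\nIIII\n@read1 extra\nTTTT\n+\nJJJJ\n' > "$tmpdir/file.fastq"

# Join every 4 lines with tabs, sort on the text before the first space,
# then turn the tabs back into line breaks.
paste - - - - < "$tmpdir/file.fastq" \
    | LC_ALL=C sort -k1,1 -t " " \
    | tr "\t" "\n" > "$tmpdir/file_sorted.fastq"

# Identifiers (first line of each record) in their new order.
ids=$(awk 'NR % 4 == 1' "$tmpdir/file_sorted.fastq" | cut -d ' ' -f 1 | paste -sd ' ' -)
# The sequence line must have travelled with its identifier.
first_seq=$(sed -n '2p' "$tmpdir/file_sorted.fastq")

echo "$ids"
echo "$first_seq"
rm -r "$tmpdir"
```

The approach relies on each record occupying exactly four lines and on no line containing a tab.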


Thursday, June 5, 2014

Install Mac style launcher on Ubuntu


Step 1

Install the Cairo-Dock program:

sudo add-apt-repository ppa:cairo-dock-team/ppa
sudo apt-get update
sudo apt-get install cairo-dock cairo-dock-plug-ins




Step 2

Launch it for the first time by running:

cairo-dock &

Caveat: On the first launch, Cairo-Dock asks whether to enable OpenGL. Most video drivers support OpenGL well, but some support it poorly.


Step 3

Make cairo-dock start automatically after logging in to Ubuntu:

Run the command 'gnome-session-properties'
or
open System Tools -> Preferences -> Startup Applications.

Click the “Add” button, name the item “Cairo Dock”, and type “cairo-dock” (without the quotes) in the command box. The comment field can be left blank. Then click Add.


Step 4 (optional)

Hide the system default launcher panel:

Right-click on the desktop -> choose 'Change Desktop Background' -> in the Appearance settings window, go to the Behavior tab -> switch on 'Auto-hide the Launcher'.

Pressing Alt+F2 will still bring up the system launcher window.



Reference

https://help.ubuntu.com/community/CairoDock

http://glx-dock.org/ww_page.php?p=First%20Steps&lang=en

Simplify SSH Login

On a Linux machine, logging in to a remote Unix/Linux machine usually means running a command like 'ssh foo@hpc.example.com' and typing the password when prompted. The process is simple but a little annoying, because the full address and the password have to be typed every time. It would be nice to simplify the login so that we can reach the remote server without typing the full address, user ID, or password. Here is a solution (note that all of the following commands are run on the local machine).

First of all, create or edit the SSH configuration file $HOME/.ssh/config with content like the following:
Host hpc
    HostName hpc.example.com
    Port 22
    User foo
Set file mode so that only the current user can read/write this configuration file:
chmod 600 $HOME/.ssh/config

Here we have set up an alias 'hpc' for the full remote machine address, so 'ssh hpc' starts the login process in place of the lengthy 'ssh foo@hpc.example.com'. But we still need to type the password to get access.

Let us next set up passwordless ssh login.

Make a pair of private and public keys by:
ssh-keygen -t rsa

Leave the passphrase empty when prompted. By default, two files, id_rsa.pub (the public key) and id_rsa (the private key), are generated in the folder ~/.ssh/.

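Append the content of the public key to the file ~/.ssh/authorized_keys on the remote machine (the quotes make the redirection to authorized_keys happen on the remote side, while the key itself is read from the local file):
ssh hpc 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys' < ~/.ssh/id_rsa.pub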

Finally, change the file mode of the private key file so that other users cannot meddle with it:
chmod 600 ~/.ssh/id_rsa

Now everything is set and we should be able to access the remote server without a password:

ssh hpc
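
Running the append step twice leaves duplicate lines in authorized_keys. The following sketch shows one way to make the append idempotent; append_key_once is a hypothetical helper name, the key is a fake placeholder, and everything runs against temporary local files so that the logic can be tried safely. Where it is available, the ssh-copy-id tool shipped with OpenSSH does a similar job against a real server.

```shell
# append_key_once PUBKEY_FILE AUTH_FILE   (hypothetical helper)
# Appends the key only if that exact line is not already in AUTH_FILE.
append_key_once() {
    pubkey=$(cat "$1")
    touch "$2"
    # -x: match the whole line, -F: fixed string, -q: no output.
    if ! grep -qxF -- "$pubkey" "$2"; then
        printf '%s\n' "$pubkey" >> "$2"
    fi
    chmod 600 "$2"
}

tmpdir=$(mktemp -d)
# Fake public key, for illustration only.
printf 'ssh-rsa AAAAexample foo@laptop\n' > "$tmpdir/id_rsa.pub"

# Calling the helper twice must still leave a single entry.
append_key_once "$tmpdir/id_rsa.pub" "$tmpdir/authorized_keys"
append_key_once "$tmpdir/id_rsa.pub" "$tmpdir/authorized_keys"

key_count=$(grep -c 'ssh-rsa' "$tmpdir/authorized_keys")
echo "$key_count"
rm -r "$tmpdir"
```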




Wednesday, June 4, 2014

Friday, May 23, 2014

Setup color for ls command

On Linux, the dircolors utility manages the color scheme used by ls. To change the color settings for the ls output, we can do the following.

Step 1 Add the following to the file ~/.bashrc so that the changes are permanent

# Load custom ls colors unless the terminal is "dumb" (no color support)
if [ "$TERM" != "dumb" ]; then
        if [ -e "$HOME/.dircolors" ]; then
                eval "$(dircolors -b "$HOME/.dircolors")"
        fi
fi

Step 2 Print the current color scheme and save it to the file ~/.dircolors

dircolors -p >~/.dircolors

Step 3 Edit the color settings for the different file types listed in ~/.dircolors, e.g., change the entry for directories to "DIR 01;36", where 01 stands for bold font face and 36 indicates cyan text color.

Step 4 Test the changes

source ~/.bashrc
ls -l ~/
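
The edit in Step 3 can also be scripted. Below is a minimal sketch that rewrites only the DIR entry with sed; it works on a two-line stand-in for the real 'dircolors -p' output, so the sample entries are assumptions and not the full file.

```shell
tmpdir=$(mktemp -d)
# Stand-in for the output of "dircolors -p" (only two entries shown).
printf 'DIR 01;34 # directory\nLINK 01;36 # symbolic link\n' > "$tmpdir/dircolors"

# Replace the color code of the DIR entry: 01 = bold, 36 = cyan text.
# Writing to a new file avoids relying on "sed -i", which differs between systems.
sed 's/^DIR [0-9;]*/DIR 01;36/' "$tmpdir/dircolors" > "$tmpdir/dircolors.new"

dir_line=$(grep '^DIR ' "$tmpdir/dircolors.new")
echo "$dir_line"
rm -r "$tmpdir"
```

On a real setup the same sed command would be pointed at ~/.dircolors, and the result moved over the original once it looks right.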


Monday, April 7, 2014

Resolve broken dependency on Ubuntu

It happens at times that a broken package dependency stops the apt-get command from functioning properly. E.g., I recently encountered an error when running 'apt-get upgrade' on my Ubuntu 12.04 machine:

The following packages have unmet dependencies.
 libavahi-common3 : Depends: libavahi-common-data but it is not going to be installed
E: Unmet dependencies. Try 'apt-get -f install' with no packages (or specify a solution)


Running 'apt-get -f install' did not help:

dpkg: error processing libavahi-common-data (--configure):
 libavahi-common-data:amd64 0.6.30-5ubuntu2 cannot be configured because libavahi-common-data:i386 is in a different version (0.6.30-5ubuntu2.1)

'apt-get remove libavahi-common-data' did not work either, because of the same dependency problem. After trying various approaches, I finally arrived at the following solution:

sudo dpkg --force-all -P libavahi-common-data:i386 libavahi-common-data
The above command forcibly purges the two broken packages and hence fixes the broken dependency. Now we can run apt-get again to install whatever we want.
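
The root cause in this case was the same package being installed for two architectures in different versions. As a rough sketch of how such cases could be spotted, the hypothetical helper find_version_skew below scans "package architecture version" lines and reports packages that appear with more than one version. The libc6 lines in the sample are invented; the libavahi lines mirror the error message above.

```shell
# find_version_skew (hypothetical helper)
# Reads "package arch version" lines on stdin and prints every package
# that is listed with more than one distinct version.
find_version_skew() {
    awk '{
        if (($1 in ver) && ver[$1] != $3) skew[$1] = 1
        ver[$1] = $3
    }
    END { for (p in skew) print p }' | sort
}

# Sample input in the same three-column layout.
sample='libavahi-common-data amd64 0.6.30-5ubuntu2
libavahi-common-data i386 0.6.30-5ubuntu2.1
libc6 amd64 2.15-0ubuntu10
libc6 i386 2.15-0ubuntu10'

skewed=$(printf '%s\n' "$sample" | find_version_skew)
echo "$skewed"

# On a real system the input could come from dpkg-query, for example:
#   dpkg-query -W -f='${Package} ${Architecture} ${Version}\n' | find_version_skew
```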

Monday, March 31, 2014

R install package from source on Windows

To install a package from source for both the i386 and x64 architectures:
 R CMD INSTALL --compile-both Package_Directory

To install the package and build a *.zip binary archive at the same time:
 R CMD INSTALL --build --compile-both Package_Directory

Wednesday, January 29, 2014

Install CPAN modules in Perl

To install modules from CPAN, run one of the following commands as administrator:

perl -MCPAN -e shell
or
cpan
In the CPAN shell, type:

install Module_Name

For example, to support command-line history in the CPAN shell, we need to install the following modules:

install Term::ReadLine
install Term::ReadLine::Perl

To customize the installation path, e.g., when installing without root privileges, run the following commands in the CPAN shell (Ref 1):
o conf mbuildpl_arg '--install_base /home/user/.local/perl5'
o conf makepl_arg INSTALL_BASE=/home/user/.local/perl5
o conf commit

Once the installation path is customized, set the environment variable PERL5LIB to tell perl where to look for modules, e.g., add one line to ~/.bash_profile:
export PERL5LIB=/home/user/.local/perl5/lib/perl5:$PERL5LIB
Set up MANPATH in the same way so that man can find the module documentation under the install base:
export MANPATH=$MANPATH:/home/user/.local/perl5/man
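
When PERL5LIB is not set beforehand, a plain "path:$PERL5LIB" leaves a trailing colon in the value. A small sketch of a tidier form uses the shell's ${VAR:+...} expansion, which adds the separator only when the variable already has content. The /opt/other/lib path is invented for the demonstration, and the 'unset' line is there only to start from a clean state.

```shell
# Install base matching the example path used in this post.
base=/home/user/.local/perl5

# Start from an empty variable for the demonstration only.
unset PERL5LIB

# ${PERL5LIB:+:$PERL5LIB} expands to ":<old value>" only if PERL5LIB is non-empty.
PERL5LIB="$base/lib/perl5${PERL5LIB:+:$PERL5LIB}"
first=$PERL5LIB

# Prepending a second (hypothetical) directory now inserts the separator.
PERL5LIB="/opt/other/lib${PERL5LIB:+:$PERL5LIB}"
second=$PERL5LIB

export PERL5LIB
echo "$first"
echo "$second"
```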

Update
We can also use cpanm to install modules from CPAN or from a local (*.tar.gz) file. First, install cpanm itself by running this command in a terminal:

cpan App::cpanminus

Now install a module:

cpanm CPAN_Module_Name
cpanm -l /path/to/install perl_package_file   # the -l option specifies the installation location

Reference
1. Setting up customized installation path for cpan:
http://stackoverflow.com/questions/540640/how-can-i-install-a-cpan-module-into-a-local-directory