Tuesday, June 17, 2014

Convert VCF chromosome notation

VCF files from different sources may use different chromosome notations, either with or without the 'chr' prefix. To make the notation consistent, Vivek provided two awk one-liners that quickly convert a VCF file from one chromosome naming convention to the other.

1. Remove 'chr' from the chromosome notation:

awk '{gsub(/^chr/,""); print}' with_chr.vcf > no_chr.vcf

2. Add 'chr' before the chromosome ID:

awk '{if($0 !~ /^#/) print "chr"$0; else print $0}' no_chr.vcf > with_chr.vcf
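
A quick way to convince yourself that the two commands undo each other is to run them on a tiny file. The sketch below is only an illustration: the VCF content is made up and the temporary file names are my own choice.

```shell
# Build a tiny, made-up VCF (two header lines, two records) in a scratch directory.
tmpdir=$(mktemp -d)
printf '##fileformat=VCFv4.1\n#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tG\nchrX\t200\t.\tC\tT\n' > "$tmpdir/with_chr.vcf"

# Strip the prefix, then add it back with the two one-liners from this post.
awk '{gsub(/^chr/,""); print}' "$tmpdir/with_chr.vcf" > "$tmpdir/no_chr.vcf"
awk '{if($0 !~ /^#/) print "chr"$0; else print $0}' "$tmpdir/no_chr.vcf" > "$tmpdir/roundtrip.vcf"

# Collect the distinct chromosome names found in each converted file.
chroms_no_chr=$(grep -v '^#' "$tmpdir/no_chr.vcf" | cut -f1 | sort -u | tr '\n' ' ')
chroms_roundtrip=$(grep -v '^#' "$tmpdir/roundtrip.vcf" | cut -f1 | sort -u | tr '\n' ' ')
echo "without chr: $chroms_no_chr"
echo "with chr:    $chroms_roundtrip"
```

Note that both commands only touch the start of each data line, so header lines such as '##contig=<ID=chr1,...>' keep whatever notation they had. Some downstream tools compare the header against the records, so it may be worth checking the header separately.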


Reference:
https://www.biostars.org/p/98582/

Thursday, June 12, 2014

Bash array examples

http://www.thegeekstuff.com/2010/06/bash-array-tutorial/

Wednesday, June 11, 2014

Sort every n lines in a file

http://edwards.sdsu.edu/labsite/index.php/robert/399-sorting-fastq-files-by-their-sequence-identifiers

Sorting FASTQ files by their sequence identifiers

In certain cases, you need to sort FASTQ files by their sequence identifiers (e.g. to fix the order of paired-end or mate-pair sequences). There are several ways of sorting FASTQ files, but the simplest way is usually the best. Here is a one-liner that does the job:
 cat file.fastq | paste - - - - | sort -k1,1 -t " " | tr "\t" "\n" > file_sorted.fastq
The cat command will print the file content (to STDOUT).
The paste command will join the four lines of a FASTQ entry into a single line, each original line separated by a tab.
The sort command will sort the lines using everything before the first space (which is our sequence identifier).
The tr command will replace the tabs with line breaks, which is basically an undo of the paste command (in a simplified explanation).
The ">" sign will write the sorted output to the file specified after it.
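
To see the steps above in action, here is a small sketch on a made-up two-record FASTQ file (the read names, sequences and temporary paths are invented for the example). It assumes every record is exactly four lines, which is what the paste step relies on.

```shell
# Two made-up FASTQ records, deliberately out of order (read2 before read1).
tmpdir=$(mktemp -d)
printf '@read2 1:N:0\nACGT\n+\nIIII\n@read1 1:N:0\nTTGA\n+\nHHHH\n' > "$tmpdir/file.fastq"

# Same pipeline as above: join 4 lines per record, sort on the identifier, split again.
cat "$tmpdir/file.fastq" | paste - - - - | sort -k1,1 -t " " | tr "\t" "\n" > "$tmpdir/file_sorted.fastq"

# The first identifier should now be read1, and no lines should be lost.
first_id=$(head -n 1 "$tmpdir/file_sorted.fastq")
n_lines=$(wc -l < "$tmpdir/file_sorted.fastq" | tr -d ' ')
echo "$first_id ($n_lines lines)"
```

If a file has wrapped (multi-line) sequences, the four-lines-per-record assumption breaks and this approach would scramble the records.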


Thursday, June 5, 2014

Install Mac style launcher on Ubuntu


Step 1

Install the cairo-dock program:

sudo add-apt-repository ppa:cairo-dock-team/ppa
sudo apt-get update
sudo apt-get install cairo-dock cairo-dock-plug-ins




Step 2

Launch it for the first time by running:

cairo-dock &

Caveat: on the first launch, Cairo-Dock asks whether to enable OpenGL. Most video drivers support it well, but some support it badly, so answer No if your driver is one of those.


Step 3

Make cairo-dock start automatically after logging in to Ubuntu:

Run the command 'gnome-session-properties'
or
Open System Tools -> Preferences -> Startup Applications,

Click the “Add” button -> name your item “Cairo Dock” and in the command box type “cairo-dock” without the quotes. You can leave the comments field blank. Then click Add.
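
If you prefer the terminal, the same startup entry can probably be created by hand. The sketch below assumes your desktop follows the XDG autostart convention (a .desktop file in ~/.config/autostart); the file name cairo-dock.desktop is my own choice, not something the tool requires.

```shell
# Assumption: the session reads autostart entries from the XDG config directory.
autostart_dir="${XDG_CONFIG_HOME:-$HOME/.config}/autostart"
mkdir -p "$autostart_dir"

# Minimal desktop entry; Name and Exec mirror the values typed into the dialog above.
cat > "$autostart_dir/cairo-dock.desktop" <<'EOF'
[Desktop Entry]
Type=Application
Name=Cairo Dock
Exec=cairo-dock
EOF

# Show what was written.
cat "$autostart_dir/cairo-dock.desktop"
```

Deleting that file removes the startup entry again.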


Step 4 (optional)

Hide the system default launcher panel:

Right-click on the desktop -> choose 'Change Desktop Background' -> in the Appearance settings window, go to the Behavior tab -> switch on 'Auto-hide the Launcher'

Pressing Alt+F2 brings up the system launcher window.



Reference

https://help.ubuntu.com/community/CairoDock

http://glx-dock.org/ww_page.php?p=First%20Steps&lang=en

Simplify SSH Login

On a Linux machine, logging in to a remote Unix/Linux machine usually means running a command like 'ssh foo@hpc.example.com' and typing the password when prompted. This is simple, but it is a little annoying to type the full address and password every time. It would be nice to simplify the login so that we can reach the remote server without typing the full address, user ID or password. Here is a solution (note that all of the following steps are done on the local machine).

First, create or edit the SSH configuration file $HOME/.ssh/config so that it contains something like the following:
Host hpc
    HostName hpc.example.com
    Port 22
    User foo
Set file mode so that only the current user can read/write this configuration file:
chmod 600 $HOME/.ssh/config

Here we have set up the alias 'hpc' for the full remote address, so 'ssh hpc' starts the login without the lengthy 'ssh foo@hpc.example.com'. But we still need to type the password to get access.
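
To check how an alias resolves without actually connecting, reasonably recent OpenSSH clients can print the final configuration with 'ssh -G'. The sketch below writes a scratch config file (so your real ~/.ssh/config is left alone) and reads it with -F; the scratch path and variable names are only for illustration, and the -G option may be missing on older clients.

```shell
# Write a scratch config with the same alias as above (Port omitted, so the default applies).
cfg_dir=$(mktemp -d)
cat > "$cfg_dir/config" <<'EOF'
Host hpc
    HostName hpc.example.com
    User foo
EOF
chmod 600 "$cfg_dir/config"

# -F reads the scratch file instead of ~/.ssh/config; -G prints the resolved options and exits.
resolved=$(ssh -F "$cfg_dir/config" -G hpc)
resolved_host=$(printf '%s\n' "$resolved" | awk '$1 == "hostname" {print $2}')
resolved_user=$(printf '%s\n' "$resolved" | awk '$1 == "user" {print $2}')
echo "$resolved_user@$resolved_host"
```

No network connection is made here, so this is safe to run even when the server is unreachable.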

Let us next set up passwordless ssh login.

Make a pair of private and public keys by:
ssh-keygen -t rsa

Leave the passphrase empty when prompted (keep in mind that this leaves the private key unencrypted on disk). By default, two files, id_rsa.pub (the public key) and id_rsa (the private key), are generated in the folder ~/.ssh/.
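
For scripting, the prompts can be skipped by passing the answers on the command line. This sketch generates a throwaway key pair in a scratch directory rather than in ~/.ssh/, so it will not overwrite an existing key; the directory and variable names are assumptions made for the example.

```shell
# Generate a throwaway RSA key pair in a scratch directory.
key_dir=$(mktemp -d)
# -N '' sets an empty passphrase, -f names the private key file, -q silences the progress output.
ssh-keygen -t rsa -N '' -f "$key_dir/id_rsa" -q

# Two files should now exist: id_rsa (private) and id_rsa.pub (public).
ls "$key_dir"

# The public key is a single line that starts with the key type.
key_type=$(cut -d ' ' -f 1 "$key_dir/id_rsa.pub")
echo "$key_type"
```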

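Append the public key to the file ~/.ssh/authorized_keys on the remote machine (the remote command must be quoted so that the redirection happens on the server, and you will be asked for the password one last time):
ssh hpc 'mkdir -p ~/.ssh && cat >> ~/.ssh/authorized_keys' < ~/.ssh/id_rsa.pub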
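
What has to happen on the server side can be rehearsed locally. The sketch below uses a scratch directory as a stand-in for the remote ~/.ssh and a placeholder string instead of a real key; the add_key function is hypothetical, written only for this example. It appends the key once even if run twice and sets the restrictive permissions that sshd commonly insists on when its StrictModes option is enabled.

```shell
# Scratch stand-in for the remote ~/.ssh directory, and a placeholder public key line.
remote_ssh="$(mktemp -d)/.ssh"
pubkey='ssh-rsa AAAAB3NzaC1yc2EXAMPLEONLY foo@laptop'   # placeholder, not a real key

# Hypothetical helper: create the directory and file, then append the key if it is missing.
add_key() {
    mkdir -p "$remote_ssh"
    chmod 700 "$remote_ssh"
    touch "$remote_ssh/authorized_keys"
    chmod 600 "$remote_ssh/authorized_keys"
    # -x matches whole lines, -F treats the key as a fixed string.
    grep -qxF "$pubkey" "$remote_ssh/authorized_keys" || printf '%s\n' "$pubkey" >> "$remote_ssh/authorized_keys"
}

# Running it twice must not duplicate the key.
add_key
add_key
key_count=$(grep -c 'ssh-rsa' "$remote_ssh/authorized_keys")
echo "$key_count"
```

If a passwordless login still asks for a password, wrong permissions on the remote ~/.ssh or authorized_keys are a common cause.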

Finally, set the file mode of the private key file so that other users cannot read or modify it:
chmod 600 ~/.ssh/id_rsa

Now everything is set and we should be able to access the remote server without a password:

ssh hpc




Wednesday, June 4, 2014