Friday, July 11, 2014

Cool Unix Commands

I will add to this list as I discover new ones. If you have a favorite or useful command, feel free to include it in a comment on this post.


Convert a FASTQ file to FASTA (originally posted here):
sed -n '1~4s/^@/>/p;2~4p' file.fastq > file.fasta

NOTE: this assumes that each FASTQ entry spans only four lines as is customary.
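The first~step address (1~4) is a GNU sed extension, so the one-liner above may fail with other versions of sed. Here is a rough awk equivalent that should work with any POSIX awk. The function name fastq_to_fasta is made up for this example, and it makes the same four-lines-per-record assumption.

```shell
# Hypothetical helper: convert a four-line-per-record FASTQ file to FASTA.
fastq_to_fasta() {
    awk '
        NR % 4 == 1 { print ">" substr($0, 2) }   # header line: swap the leading @ for >
        NR % 4 == 2 { print }                     # sequence line: copy as-is
    ' "$1"
}
```

Usage: fastq_to_fasta reads.fastq > reads.fasta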



Convert a SAM file to FASTA

awk '!/^@/ {print ">" $1 "\n" $10}' file.sam > file.fasta

NOTE: You will lose a lot of the information in the SAM file. You can keep more of it by adding column variables to the print statement. Also, you may have to change the column numbers depending on your SAM file format. This is just a general example.
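As a sketch of keeping more of the SAM information, the version below puts the flag, reference name, and position (columns 2, 3, and 4 in the standard SAM layout) into the FASTA header and skips the @ header lines. The function name sam_to_fasta and the flag=/ref=/pos= labels are just choices made for this example.

```shell
# Hypothetical helper: SAM to FASTA, keeping a few extra columns in the header.
sam_to_fasta() {
    awk -F '\t' '
        /^@/ { next }                                       # skip SAM header lines
        {
            print ">" $1 " flag=" $2 " ref=" $3 " pos=" $4  # read name plus extra columns
            print $10                                       # the read sequence
        }
    ' "$1"
}
```

Usage: sam_to_fasta file.sam > file.fasta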



Replace spaces in file names with underscore (originally posted here)

rename ' ' '_' *

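NOTE: do NOT put spaces in file names!! This is so annoying! Also, the command above uses the util-linux version of rename, which only replaces the first space in each name, so run it again for names with several spaces. The Perl version of rename found on some systems (for example Debian and Ubuntu) uses a different syntax: rename 's/ /_/g' *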
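If you would rather not depend on which rename is installed, a plain shell loop can do the same job. This is only a sketch: replace_spaces is a made-up function name, it works on the current directory, and it skips a file rather than overwrite an existing one.

```shell
# Hypothetical helper: replace every space with an underscore
# in the names of files in the current directory.
replace_spaces() {
    for f in *' '*; do
        [ -e "$f" ] || continue                  # nothing matched the glob
        new=$(printf '%s' "$f" | tr ' ' '_')     # swap all spaces for underscores
        if [ -e "$new" ]; then
            echo "skipping '$f': '$new' already exists" >&2
        else
            mv -- "$f" "$new"
        fi
    done
}
```

Usage: cd into the directory and run replace_spaces.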



Get a histogram of sequence lengths from FASTA/Q files (from Surge Biswas)

FASTQ:  cat <fastq file> | awk '{if(NR%4==2) print length($1)}' | sort -n | uniq -c
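FASTA:  cat <fasta file> | awk '{if(NR%2==0) print length($1)}' | sort -n | uniq -c

NOTE: the FASTA version assumes each sequence sits on a single line, so every record spans exactly two lines.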
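Counting by line number stops working when FASTA sequences wrap over several lines. A possible workaround is to add up the sequence lines between headers, as sketched here. The function name fasta_length_hist is invented for this example.

```shell
# Hypothetical helper: histogram of sequence lengths for a FASTA file
# whose sequences may be wrapped over several lines.
fasta_length_hist() {
    awk '
        /^>/ { if (seen) print len; seen = 1; len = 0; next }  # new record: emit previous length
        { len += length($0) }                                  # add up wrapped sequence lines
        END { if (seen) print len }                            # emit the last record
    ' "$1" | sort -n | uniq -c
}
```

Usage: fasta_length_hist file.fasta (the first column is the count, the second is the length).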



Do arithmetic operations on the bash command line

echo $((1 + 1))
echo $((1 - 1))
echo $((1 * 1))
echo $((1 / 1))
echo $(((1+3) / (1+1)))

For floating-point operations you can use the bc tool. For example:

echo "scale=1; 1/2" | bc
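One thing to watch for: $(( )) only does integer arithmetic, so division drops the fractional part. The short sketch below shows that, plus awk as another way to get a floating-point result on a machine without bc. The awk line is an added suggestion, not part of the original tip.

```shell
# Integer arithmetic: the fractional part of a division is dropped.
echo $((7 / 2))        # prints 3
echo $((7 % 2))        # prints 1, the remainder

# Floating-point division with awk, formatted to two decimal places.
awk 'BEGIN { printf "%.2f\n", 7 / 2 }'   # prints 3.50
```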



Add a comment to a bash command on the command line

<command>; # this is a comment line

A practical example:  mv file1 old_file1; # there is now a new file1 that is a more recent version

NOTE: Do you ever have a long and complex command for which you would like to save a simple note? You can use this little trick and the note will be saved alongside your command in your history. The next time you look through your history to rerun the command, you will also see the associated note.



Count the number of bases in a FASTA file

grep -v ">" file.fasta | wc | awk '{print $3 - $1}'

(from martinghunt on SEQanswers)
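The wc trick above counts every character that is not a newline, so stray spaces or Windows-style line endings (\r) would inflate the total. A slightly more defensive variant, sketched here under the assumption that header lines start with ">", strips whitespace before counting. The name count_bases is made up for this example.

```shell
# Hypothetical helper: count bases in a FASTA file, ignoring
# header lines and any whitespace (spaces, tabs, \r, \n).
count_bases() {
    grep -v '^>' "$1" | tr -d ' \t\r\n' | wc -c
}
```

Usage: count_bases file.fasta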