Work 092412

2015-01-13

azim58 - Work 092412


There is an error with how MotifGroup extracts its own information so I
need to go back and fix this part. Then I can move on to adding bepipred
functionality, and then blast functionality.

1143 finished fixing MotifGroup. Now I can add bepipred functionality.
First I need to install bepipred.

downloaded bepipred
and tried to follow instructions here
http://www.cbs.dtu.dk/services/doc/bepipred-1.0.readme

I changed the paths as the readme instructs.
I don't know whether the "sticky" bit was set on /tmp or not.
Running ./bepipred from its own directory in Cygwin gave the following
error:
/bin/tcsh: bad interpreter: Permission denied
This site seemed to know the problem.
http://www.linuxquestions.org/questions/linux-newbie-8/bin-csh-bad-interpreter-175062/
So then I tried to install the tcsh shell by following the instructions
here
http://cs.nyu.edu/~yap/prog/cygwin/FAQs.html
However, after following the instructions and then starting the cygwin
that should use tcsh, the shell just opens and closes.
This process seems daunting. Maybe I will try to install blast and get
that working instead.
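
A minimal sketch for diagnosing a "bad interpreter" error like the one above: it reads the "#!" line of a script and reports whether the named interpreter exists and is executable. The function name check_interpreter is made up for illustration. Stripping a trailing carriage return is an extra guess on my part (Windows line endings are another common cause of this error), not something confirmed in these notes.

```shell
#!/bin/sh
# check_interpreter is a hypothetical helper name, not part of bepipred.
# $1 = path to a script such as ./bepipred
check_interpreter() {
    script=$1
    # Take the first line only and drop any Windows carriage return.
    first_line=$(head -n 1 "$script" | tr -d '\r')
    case $first_line in
        '#!'*) ;;
        *) echo "no shebang line"; return 2 ;;
    esac
    # Strip "#!", any leading spaces, and any interpreter arguments.
    interp=${first_line#'#!'}
    interp=${interp#"${interp%%[! ]*}"}
    interp=${interp%% *}
    if [ -x "$interp" ]; then
        echo "ok: $interp"
    else
        echo "missing or not executable: $interp"
        return 1
    fi
}
```

Usage would be something like: check_interpreter ./bepipred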

I found out how to check the sticky bit.
ls -ld /tmp
this gives a result like
drwxrwxrwt+ 1 kwhittem Domain Users 0 Sep 23 11:57 /tmp
The sticky bit is shown by the letter t at the end of the permission
string (drwxrwxrwt); the trailing + only indicates that an ACL is set.
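
The same check can be scripted instead of read by eye. This is a small sketch with two made-up helper names: has_sticky_bit uses the shell's -k file test, and permission_flag pulls out the tenth character of the ls -ld output, which is where the t appears.

```shell
#!/bin/sh
# Both function names are hypothetical helpers for illustration.

# Succeeds when the path exists and has its sticky bit set.
has_sticky_bit() {
    [ -k "$1" ]
}

# Prints the last character of the permission string from ls -ld
# (t or T when the sticky bit is set, otherwise x or -).
permission_flag() {
    ls -ld "$1" | cut -c10
}
```

For example: has_sticky_bit /tmp && echo "sticky bit is set on /tmp"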

Kevin helped me get a little farther

Okay, I now have the program working a little further. I reinstalled
Cygwin so that it would install more shells, and then added the
following code to the bepipred file. However, running ./bepipred now
gives the following error message.

bepipred: no "fasta2proppred" executable for CYGWIN_NT-5.1-i686

Kevin wrote a program to run Bepipred by calling the website. However,
their website has changed and his program no longer works. Here is the
directory of his program.
S:\Administration\software\custom software\IEDB Analysis

Here's an example command and some output
C:\Documents and Settings\kwhittem\My Documents>"S:\Administration\software\custom software\IEDB Analysis\ImmepRetrieveBepipred.exe" .fa --output=output.txt
Epitopes Predicted: 0


===========================================================================
It looks like the best solution for Bepipred is to run it on Linux
(perhaps in a virtual box), and then call commands to this Linux machine.
see this e-mail from Kristoffer Rapacki
https://mail.google.com/mail/u/0/?ui=2&shva=1#label/Career/139fa12684148679


===========================================================================
Okay onto blast

How to: Run BLAST software on a local computer
http://www.ncbi.nlm.nih.gov/guide/howto/run-blast-local/

Blast can be downloaded here:
ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
Preformatted databases can be downloaded or updated with the
update_blastdb.pl script; for example, this command fetches the htgs
database

./update_blastdb.pl htgs
as seen here

http://www.ncbi.nlm.nih.gov/books/NBK1763/#CmdLineAppsManual.3_Quick_start
or individual database files can be downloaded directly from

ftp://ftp.ncbi.nlm.nih.gov/blast/db/
For example, one might just want to download the nr database of
non-redundant protein sequences.

I'm having trouble getting blast set up. It looks like this website might
help me.
http://www.blaststation.com/freestuff/en/howtoNCBIBlastWin.html

Alright now it works. I just had to call the database nr.00 since that's
what the downloaded database files were called rather than just nr.
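
To see which names blastp will accept for -db, it helps to list what is actually in the database directory. A rough sketch, assuming protein database volumes each come with a .pin index file (alongside .phr and .psq); the function name list_protein_dbs is made up for illustration.

```shell
#!/bin/sh
# list_protein_dbs is a hypothetical helper name.
# $1 = directory holding the downloaded database files
# Prints one candidate -db name per line, e.g. nr.00
list_protein_dbs() {
    dir=$1
    for f in "$dir"/*.pin; do
        [ -e "$f" ] || continue    # the glob matched nothing
        name=${f##*/}              # drop the directory part
        echo "${name%.pin}"        # drop the .pin extension
    done
}
```

Running list_protein_dbs . in the directory with the downloaded files should print nr.00 if only that volume was unpacked.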

Here's an example command:
C:\cygwin\home\kwhittem\NCBI\blast-2.2.27+\bin>blastp -db nr.00 -query test_fasta.fsa -out results.out

When I try to blast with a short query sequence such as "pqregs" no hits
are found. Therefore, I need to find out how to blast short query
sequences.
I tried psiblast, but it does not find hits for short query sequences
either.

To do a short query blast I can use the regular blast program, but change
the parameters.
This page describes the process.
http://www.ncbi.nlm.nih.gov/blast/Why.shtml
This table shows how this should be done.

Program                                Word Size  Filter    E Value  Composition-based Statistics  Score Matrix
Standard protein BLAST                 3          On (SEG)  10       On                            BLOSUM62
Search for short/nearly exact matches  2          Off       20000    Off                           PAM30


Here are some BLAST parameters I might be interested in.
-word_size int_value
-evalue evalue
-threshold float_value //I'm probably interested in evalue rather than threshold
-num_descriptions int_value
-num_alignments int_value
-matrix matrix_name

Actually, the best way to search with short queries is probably to just
set the -task parameter.
This page describes it:
http://www.ncbi.nlm.nih.gov/books/NBK1763/

Using -task blastp-short did not yield any results, but this search did:
./blastp -db nr.00 -query pqregs.fsa -word_size 2 -evalue 20000


blastp parameter descriptions

Before I found the parameter descriptions I was in the process of writing
the blast help desk an e-mail. It looks like I don't need to send this
e-mail any longer though.
Blast help desk email 9-25-12

This command has all of the parameters set correctly and gave me lots of
results for a short query sequence
./blastp -db nr.00 -query pqregs.fsa -word_size 2 -seg no -evalue 20000

I can also take away all of the alignments at the bottom of the text file.
./blastp -db nr.00 -query pqregs.fsa -word_size 2 -seg no -evalue 20000 -num_alignments 0 -out results.out

I think this is the command I'll use for the final command in my program.
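
Since this command will be called repeatedly from my program, here is a sketch of a small wrapper around it. It uses only the flags from the short-query command above (-word_size 2 -seg no -evalue 20000) and passes any further arguments straight through to blastp. The BLASTP variable and the function name blastp_short are made up for illustration; BLASTP just lets the executable path be overridden.

```shell
#!/bin/sh
# BLASTP is a hypothetical override for the path to the blastp executable.
BLASTP=${BLASTP:-./blastp}

# Search a protein database with a short peptide query.
# $1 = database name (e.g. nr.00)
# $2 = FASTA query file (e.g. pqregs.fsa)
# $3 = output file
# Any remaining arguments are appended unchanged to the blastp call.
blastp_short() {
    db=$1
    query=$2
    out=$3
    shift 3
    "$BLASTP" -db "$db" -query "$query" \
        -word_size 2 -seg no -evalue 20000 \
        -out "$out" "$@"
}
```

For example: blastp_short nr.00 pqregs.fsa results.out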

Here is an example of how to blast or align two proteins against each
other.
./blastp -subject test_fasta3.fsa -query test_fasta4.fsa -out results3.out



===========================================================================

Now that I know that I can get BLAST set up and working, I can try to get
Bepipred working on a Linux virtual box that I can reach with ssh from
the same Windows computer that the virtual box is running on.

My Cygwin did not already have ssh by default. I am reinstalling it with
the appropriate packages according to the directions here:
http://animatedlight.net/node/2

I ended up getting a VirtualBox Linux machine with bepipred up and
running, controlled over ssh from Cygwin.

Now I would like to know how to copy files with ssh.
info from
http://hintsforums.macworld.com/archive/index.php/t-29244.html
copy from a remote machine to my machine:
scp remote_user@remote_host:/home/remote_user/Desktop/file.txt /home/me/Desktop/file.txt

copy from my machine to a remote machine:
scp /home/me/Desktop/file.txt remote_user@remote_host:/home/remote_user/Desktop/file.txt

copy all file*.txt from a remote machine to my machine (file01.txt,
file02.txt, etc.; note the quotation marks):
scp "remote_user@remote_host:/home/remote_user/Desktop/file*.txt" /home/me/Desktop/

copy a directory from a remote machine to my machine:
scp -r remote_user@remote_host:/home/remote_user/Desktop/files /home/me/Desktop/.

How do I exit ssh?
just type
exit
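
Putting scp and ssh together, the round trip to the Linux virtual box could look roughly like the sketch below: copy a FASTA file over, run bepipred there, and copy the result back. Everything named here is an assumption for illustration: the login user@linux-vm, the /tmp working directory, the function name remote_bepipred, and especially the way bepipred is invoked (FASTA file as an argument, predictions on standard output), which I have not checked against the readme. Passing a command to ssh runs it and returns, so no explicit exit is needed.

```shell
#!/bin/sh
# All of these defaults are placeholders, not values from my setup.
REMOTE=${REMOTE:-user@linux-vm}    # hypothetical ssh login for the VM
REMOTE_DIR=${REMOTE_DIR:-/tmp}     # hypothetical working directory on the VM
SCP=${SCP:-scp}
SSH=${SSH:-ssh}

# Copy a FASTA file to the VM, run bepipred there, and fetch the output.
# $1 = local FASTA file, $2 = local file to receive the predictions
remote_bepipred() {
    fasta=$1
    out=$2
    name=${fasta##*/}    # file name without its local directory
    "$SCP" "$fasta" "$REMOTE:$REMOTE_DIR/$name" || return 1
    # Assumption: bepipred takes the FASTA file as an argument and
    # writes its predictions to standard output.
    "$SSH" "$REMOTE" "bepipred $REMOTE_DIR/$name > $REMOTE_DIR/$name.out" || return 1
    "$SCP" "$REMOTE:$REMOTE_DIR/$name.out" "$out"
}
```

For example: remote_bepipred test_fasta.fsa bepipred_results.out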

Okay I now have all of the programs up and operational. Now I need to
code my Java program to use these programs.