Decipher Immunosignature Code 07-09-2014d0907
2015-01-13
Future code changes
- I want to be able to list what the epitope is
- I want to be able to view detailed output of alignments (optional parameter)
- I want to be able to remove redundant fragments from a database and results
- fix the BLAST error that sometimes occurs when searching for epitopes from the random peptides
- return only epitopes within a certain size range, and remove duplicates from the final population
- set up the program so that it can go straight from finding the epitopes to searching a database
- clean up code so that unused code and classes are removed
- adjust code so that when epitopes are found, the population size, number of evolution cycles, and glam2 threshold are automatically output as well
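A minimal sketch of the size filter and duplicate removal wished for in the list above, assuming epitopes are plain amino-acid strings. The class name EpitopeFilterSketch, the method name, and the length bounds are hypothetical and are not taken from the Decipher code; HEE and SS are used only as example strings.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, not part of the Decipher Immunosignature code.
final class EpitopeFilterSketch {

    /**
     * Keeps epitopes whose length lies in [minLen, maxLen] and drops
     * duplicates (case-insensitive), preserving first-seen order.
     */
    static List<String> filterBySizeAndDedupe(List<String> epitopes, int minLen, int maxLen) {
        Set<String> kept = new LinkedHashSet<>();
        for (String epitope : epitopes) {
            if (epitope == null) {
                continue; // skip missing entries
            }
            String normalized = epitope.trim().toUpperCase(Locale.ROOT);
            if (normalized.length() >= minLen && normalized.length() <= maxLen) {
                kept.add(normalized); // a Set silently ignores repeats
            }
        }
        return new ArrayList<>(kept);
    }

    public static void main(String[] args) {
        List<String> population = List.of("HEE", "hee", "SS", "HEESS", "HEESSAA");
        // Assumed bounds of 3 to 5 residues, chosen only for illustration.
        System.out.println(filterBySizeAndDedupe(population, 3, 5));
    }
}
```

A LinkedHashSet is used so that the surviving epitopes stay in the order they were found, which may matter if the population is already ranked by fitness.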
Stages for running program.
- Perform the genetic algorithm with the 100 peptides from the t-test on the immunosignature.
- Prepare the database you want to analyze by creating the curated sequences.
- Break the found motifs up into all 3-, 4-, and 5-mer epitopes and perform a genetic algorithm to match these fragments with 30-aa fragments in a database. Do this across many computers.
- Combine the results from all of the different 30-aa fragments analyzed.
- Custom Epitope Group With Protein Fitness Evaluator code to evaluate in a parallel manner
- -C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\08-06-2014d1041\CustomEpitopeGroupWithProteinFitnessEvaluator
- -Is this code faster? Not really.
- --cim_time evaluation 08-06-2014d2155
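The step that breaks motifs into 3-, 4-, and 5-mers can be sketched as a sliding window. This is only an illustration under the assumption that a motif is a plain, ungapped amino-acid string; motifs with gaps or wildcards would need extra handling. KmerSketch and kmers are hypothetical names, not classes from the project.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper, not part of the Decipher Immunosignature code.
final class KmerSketch {

    /**
     * Returns every distinct substring of the motif with length
     * between minK and maxK (inclusive), shortest lengths first.
     */
    static Set<String> kmers(String motif, int minK, int maxK) {
        Set<String> result = new LinkedHashSet<>();
        for (int k = minK; k <= maxK; k++) {
            // Slide a window of width k across the motif.
            for (int start = 0; start + k <= motif.length(); start++) {
                result.add(motif.substring(start, start + k));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "HEESS" is an illustrative motif string, not a recorded result.
        System.out.println(kmers("HEESS", 3, 5));
    }
}
```

Because the result is a set, a k-mer that occurs more than once in a motif is emitted only once, so the later database search does not repeat work on identical fragments.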
Instances of code on other computers
- C:\ativ6_storage\2014\8-24-14\decipher test 8-10-14
- /home/owner/ratel_ultra_storage/2014/8-24-14/Decipher/DecipherImmunosignature
Versions of Decipher Immunosignature Code
- current code 09-11-2014d2007
- -C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\09-11-2014d2008\Decipher Immunosignature\DecipherImmunosignature
- fixed a bug so that protein sequences containing only letters that also occur in DNA can be analyzed, by adding "--seqtype=protein" to the clustal command line part of the code
- -see C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\09-09-2014d0915\clustalo test
- current code 09-04-2014d1058
- -C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\09-04-2014d1059\09-04-2014d1059\Decipher Immunosignature
- current code 09-01-2014d1256. This version of the code did a fairly good job of screening a database. It is the version before adding code so that only the top four-mers are used rather than all of the four-mers that pass a threshold
- -C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\09-01-2014d1257\09-01-2014d1257\Decipher Immunosignature\DecipherImmunosignature
- -some results
- --C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\09-02-2014d1650
- current code 08-27-2014d1122. Saved right when I first fixed the following error: Collections.sort throws a "Comparison method violates its general contract" exception
- -C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\08-27-2014d1011\08-27-2014d1123\Decipher Immunosignature\DecipherImmunosignature
- current code 08-27-2014d1012
- -C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\08-27-2014d1011\Decipher Immunosignature\DecipherImmunosignature
- current code 08-26-2014d0930. Code before modifying alignment evaluation so that broken epitopes were not considered
- -C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\08-26-2014d0932\Decipher Immunosignature\DecipherImmunosignature
- current code 08-25-2014d0955. Code before changing epitope database search so that there is a max number of threads
- -C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\08-25-2014d0953\Decipher Immunosignature\DecipherImmunosignature
- current code 08-20-2014d1647. Version before changing database search so that all four-mer pairs are searched first for each protein fragment
- -"C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\08-20-2014d0937\08-20-2014d1647\DecipherImmunosignature"
- current code 08-20-2014d0936. Version before modifying the genetic algorithm so that previously encountered chromosomes are not re-evaluated.
- -"C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\08-20-2014d0937\DecipherImmunosignature"
- current code 08-19-2014d0939. Version before modifying the program so it could work a little better in Linux.
- -C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\08-19-2014d0923\08-19-2014d0941\DecipherImmunosignature
- current code 8-19-14d0922. The parallel multi-thread method for finding epitope groups seems to be working in this version of the code.
- -"C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\08-19-2014d0923\DecipherImmunosignature"
- current code 08-17-2014d1059.
- -C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\08-17-2014d1100\DecipherImmunosignature
- current code 08-09-2014d1125. Code before changing evolve epitope group code so that it could handle multiple threads. 08-17-2014d1059
- current code pre 08-09-2014d1124. This is the version of the code before I started to change things so that an epitope group with a protein is not evaluated so many times. I also wanted to try the genetic algorithm with multiple threads.
- -"C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\08-05-2014d1134\new\DecipherImmunosignature"
- This version of the code has an EpitopeGroupWithProteinMatchesFromOmega class, which assigns a score based on the percentage of the max possible score. I wanted to save this before I modified the class to score in my new way, which emphasizes the importance of many consecutive matches.
- -"C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\08-05-2014d1134\DecipherImmunosignature"
- I will start using multiple instances of Eclipse with the same code at a location on the hard drive (not in Eclipse). This is the version of the code which uses BLAST to make sure the current motif is not similar to one of the previously collected motifs. I also just started adding some age code to the chromosomes so that chromosomes that are too old will be discarded.
- -"C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\07-18-2014d0956\Decipher Code versions\07-18-2014d0957\DecipherImmunosignature"
- --length of path is 148 characters so this should be acceptable
- working version which checks if a chromosome has a score greater than 33 from GLAM2. If it does, and it matches a motif already found, the chromosome is given a low fitness score. This program found the HEE motif and SS motif in the top 10 best motifs. I would like to adjust the program, though, so that a motif is checked against the existing best collected motifs. If it is very similar to an existing motif, it should get a low fitness score so that new motifs can be found. Maybe I will use BLAST to perform this comparison.
- -C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\07-11-2014d0935\DecipherImmunosignature
- --some results obtained with this code
- ---C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\07-08-2014d0933\decipher work 07-08-2014d0933\evolution with avoidance of high scoring already found motifs 07-11-2014d0950
- working version which can take a list of peptides and evolve them, starting with a population with no more than an allowed number of positive genes (the min is specified as well, with a value of 2) 07-09-2014d1050
- -"C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\07-09-2014d0909\07-09-2014d1050\DecipherImmunosignature"
- working version 07-09-2014d0908
- -C:\Users\kurtw_000\Documents\kurt\storage\CIM Research Folder\DR\2014\07-09-2014d0909\DecipherImmunosignature
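One of the version notes above mentions changing the genetic algorithm so that previously encountered chromosomes are not re-evaluated. The notes do not say how this was done; the sketch below shows one common way, a fitness cache keyed on a string encoding of the chromosome. FitnessCacheSketch, the string key, and the toy evaluator are all assumptions for illustration and not the project's actual classes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

// Hypothetical helper, not part of the Decipher Immunosignature code.
final class FitnessCacheSketch {
    private final Map<String, Double> cache = new HashMap<>();
    private final ToDoubleFunction<String> evaluator;
    private int evaluations = 0;

    /** The evaluator stands in for the expensive fitness calculation. */
    FitnessCacheSketch(ToDoubleFunction<String> evaluator) {
        this.evaluator = evaluator;
    }

    /** Returns the cached score if this chromosome was seen before. */
    double fitness(String chromosomeKey) {
        Double cached = cache.get(chromosomeKey);
        if (cached != null) {
            return cached; // no re-evaluation needed
        }
        double score = evaluator.applyAsDouble(chromosomeKey);
        evaluations++;
        cache.put(chromosomeKey, score);
        return score;
    }

    /** Number of times the real evaluator was actually called. */
    int evaluations() {
        return evaluations;
    }

    public static void main(String[] args) {
        // Toy evaluator: counts '1' genes; a real one would score alignments.
        FitnessCacheSketch cache = new FitnessCacheSketch(
                key -> key.chars().filter(c -> c == '1').count());
        cache.fitness("0110");
        cache.fitness("0110"); // served from the cache
        cache.fitness("1001");
        System.out.println("evaluator calls: " + cache.evaluations());
    }
}
```

This version is single-threaded. Since later versions of the code evaluate epitope groups on multiple threads, a shared cache would probably need a java.util.concurrent.ConcurrentHashMap instead of a plain HashMap.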