work analyzing most significant peptides 10-31-13
2014-08-29
++ work analyzing most significant peptides 10-31-13

Need to analyze the timecourse data in more detail. I'd like to look at when they just analyzed the data over one month. This seems to have occurred around November 2011.

Cluster order for individual 84 for one month
entropy
normalized_shannon_entropy
ninety_fifth_percentile
mean
median
fifth_percentile
stdev
max
fifth_percentile_normalized
min
min_normalized
kurtosis
skew
dynamic_range
ninety_fifth_percentile_normalized
cv
stdev_normalized
mean_normalized
max_normalized
-----
copying info
blinded samples created 10-26-13 5:39 PM
too long training samples 10-26-13 5:40 PM
final point by point response to send to reviewers 10-26-13 7:12 PM
-------
cluster order for individual 43
entropy
normalized_shannon_entropy
mean
median
fifth_percentile
ninety_fifth_percentile
stdev
max
min
min_normalized
fifth_percentile_normalized
kurtosis
skew
cv
stdev_normalized
max_normalized
mean_normalized
ninety_fifth_percentile_normalized
dynamic_range
---------
10-27-13
Datasets I want to include in final analysis
-mouse young vs old experiment
-young vs old data from Muskan
-Tiger's mouse tumor samples (FVBN time series data)
-Bart's dog lymphoma samples
-human normals at different ages (possibly; I think this data is across many different wafers so I may not want to include it)
-First Chip Disease Dataset
-LLNL dataset
-Valley fever data from Krupa (10kv2 data)
-Alzheimers data from Lucas
-same individual monitored over time
-antibody mix experiment from Josh
-antibody mix experiment from Heidi and Krupa
-Rebecca's monoclonal antibody data
-two antibodies mixed from Daniel
------
<notes on immune status without disease> specific peptides> {
I'd like to see how the immunosignature changes when I take away the disease specific peptides. I'll take a look at a chronic disease and an infectious disease on the 330k, and also a chronic and infectious disease on the 10k.
For the chronic disease 330k data I'll look at wafer 46 with breast cancer. For the infectious disease 330k data I'll look at the First Chip Disease Dataset. I'll just start with this for now.
}
-</notes on immune status without disease>
----------
Need to contact Juan Ramon Molina jrmolina@cnio.es
---------
10-28-13
Investigation of significant peptides to immune temperature

I would like to remove the following numbers of peptides (both random and unique to immunosignature) (there are a total of 330,173)
330173-y=100000
1 5 50 100 500 1000 5000 10000 100000 230173 320173 325173 329173 329673 330073 330123 330153 330168

X Axis goes from 0 to 6 by 1
Y axis goes from 0 to 3.5 by 0.5

removed   log10(removed)   remaining
1         0                330172
5         0.698970004      330168
50        1.698970004      330123
100       2                330073
500       2.698970004      329673
1000      3                329173
5000      3.698970004      325173
10000     4                320173
100000    5                230173
230173    5.362054378      100000
320173    5.505384705      10000
325173    5.512114478      5000
329173    5.517424206      1000
329673    5.51808338       500
330073    5.51861          100
330123    5.518675783      50

1 0.32 0.1 0.032 0.01 3.2E-03 1.0E-03 3.2E-04

What if I removed the 5,000 highest intensity peptides, removed the 50 lowest intensity peptides, kept the peptides ranked 100 to 5,000 by intensity, kept peptides with a p-value <0.30, and then used the entropy to compare the two groups?
^This would take some time to do so I don't think I will do this for now.
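The middle column of the removal table above appears to be log10 of the number of peptides removed, and the last column the number remaining out of 330,173. A minimal sketch that regenerates those triples, so extra removal counts can be added to the axis without retyping them. The names `TOTAL_PEPTIDES` and `removal_row` are made up for this illustration and are not part of the EntropyOfArray jar.

```python
import math

# Total number of peptides on the array, as stated in the notes.
TOTAL_PEPTIDES = 330173

# Removal counts taken from the table in the notes.
removed_counts = [1, 5, 50, 100, 500, 1000, 5000, 10000, 100000,
                  230173, 320173, 325173, 329173, 329673, 330073, 330123]


def removal_row(removed, total=TOTAL_PEPTIDES):
    """Return (removed, log10(removed), remaining) for one removal count."""
    if not 0 < removed < total:
        raise ValueError("removed must be between 1 and total - 1")
    return removed, math.log10(removed), total - removed


rows = [removal_row(n) for n in removed_counts]

for removed, log_removed, remaining in rows:
    print(f"{removed}\t{log_removed:.9f}\t{remaining}")
```

The log10 values bunch up between about 5.51 and 5.52 once more than roughly 325,000 peptides are removed, which is why a separate, finer scale is needed for that end of the axis.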
need a scale for 5.511 to 5.52

removed   log10(removed)   remaining
325173    5.512114         5000
329173    5.517424         1000
329673    5.518083         500
330073    5.51861          100
330123    5.518676         50
330153    5.518715         20

0.001 0.000316228
1.00E-03 3.16E-04

Flu vaccination records
"S:\Administration\PeptideArrayCore\2011 sample run\30-Days-Normals\Tetanus samples.xlsx"
------------
pre 10-31-13 notes on analyzing most important peptides

1 5 50 100 500 1000 5000 10000 100000 230173 320173 325173 329173 329673 330073 330123 330153 330168

command for getting all summary numbers
java -jar "C:\temp_sync\EntropyOfArray100913d0921.jar" find_summary_numbers_from_tabdelimitedtext_raw_data "C:\temp_sync\sig 330073r" "intensity values of all gprs 2 10-27-13d1318.txt" 0 8 36 8 36 4

new command for high intensities
java -jar "C:\temp_sync\EntropyOfArray100913d0921.jar" find_summary_numbers_from_tabdelimitedtext_raw_data "C:\temp_sync\sig 330073r" "intensity values of all gprs 2 10-27-13d1318.txt" 0 2 30 2 30 4

=TTEST(B2:B6,B7:B30,2,3)

What happens if we select the top 1000 peptides by p-value and then calculate the entropy?

for least significant peptides
java -jar "C:\temp_sync\EntropyOfArray100913d0921.jar" find_summary_numbers_from_tabdelimitedtext_raw_data "C:\temp_sync\least significant removed" "intensity values of all gprs 2 10-27-13d1318.txt" 0 9 37 9 37 4

r r5 to r104 K 100-5000

KLI = Keep low intensity
LI = Low intensity
HI = High intensity
KP = Keep P-value

max-101,834 peptides have a p-value less than 0.3

remove if LI, HI, not KLI or not KP

excel function used
=IF(OR(AZ330178="HI",BA330178="LI"),"REMOVE",IF(OR(AY330178="KLI",BB330178="KP"),"KEEP","REMOVE"))

excel function 2
IF(OR(AY330178="KLI",BB330178="KP"),"KEEP","REMOVE")
^Based on sorting from n1 this ends up being the same as keeping all peptides with p-value less than 0.3.
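The two Excel formulas above boil down to a small decision rule on the four flags (HI, LI, KLI, KP). A rough Python equivalent, written only to make the logic easier to check. The function names and the boolean arguments are my own for this sketch; in the spreadsheet the flags are text values in columns AY to BB.

```python
def classify_v1(is_hi, is_li, is_kli, is_kp):
    """Mirror of the first formula: HI or LI always removes the peptide;
    otherwise it is kept only if it is flagged KLI or KP."""
    if is_hi or is_li:
        return "REMOVE"
    if is_kli or is_kp:
        return "KEEP"
    return "REMOVE"


def classify_v2(is_kli, is_kp):
    """Mirror of the second formula: keep if flagged KLI or KP,
    ignoring the HI and LI flags entirely."""
    return "KEEP" if (is_kli or is_kp) else "REMOVE"


# A peptide flagged both HI and KP: the two formulas disagree.
print(classify_v1(is_hi=True, is_li=False, is_kli=False, is_kp=True))
print(classify_v2(is_kli=False, is_kp=True))
```

The only place the two rules differ is a peptide that is flagged HI or LI and also KLI or KP: the first formula removes it, the second keeps it.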
103469 peptides removed
java -jar "C:\temp_sync\EntropyOfArray100913d0921.jar" find_summary_numbers_from_tabdelimitedtext_raw_data "C:\temp_sync\custom selected peptides" "intensity values of all gprs 2 10-27-13d1318.txt" 0 11 39 11 39 4

java -jar "C:\temp_sync\EntropyOfArray100913d0921.jar" find_summary_numbers_from_tabdelimitedtext_raw_data "C:\temp_sync\bot 1000 without bot 100" "intensity values of all gprs 2 10-27-13d1318.txt" 0 2 30 2 30 4

I was able to get the best p-value by selecting the bottom 1000 values for each sample by intensity and removing the bottom 100. I obtained a p-value of 0.000409 (-log10 of that equals 3.39)

What if I just chop off the top 5,000, and the bottom 100, and then remove least significant peptides?
java -jar "C:\temp_sync\EntropyOfArray100913d0921.jar" find_summary_numbers_from_tabdelimitedtext_raw_data "C:\temp_sync\chop off top 5000 and bottom 100" "intensity values of all gprs 2 10-27-13d1318.txt" 0 11 39 11 39 4

p-value for entropy was

java -jar "C:\temp_sync\EntropyOfArray100913d0921.jar" find_summary_numbers_from_tabdelimitedtext_raw_data "C:\temp_sync\chop off top 5000 bottom 100 and pvalue greater than 0p7" "intensity values of all gprs 2 10-27-13d1318.txt" 0 11 39 11 39 4
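A minimal sketch of the "bottom 1000 without bottom 100" selection described above, together with the -log10 conversion of the p-value. It also includes one common definition of normalized Shannon entropy for comparison. This is an assumption on my part: the formula actually used inside EntropyOfArray100913d0921.jar is not recorded in these notes and may differ. All function names here are hypothetical.

```python
import math


def bottom_band(intensities, keep_lowest=1000, drop_lowest=100):
    """Sort one sample's intensities ascending, keep the lowest
    `keep_lowest` values, then drop the lowest `drop_lowest` of those."""
    ranked = sorted(intensities)
    return ranked[drop_lowest:keep_lowest]


def shannon_entropy(intensities):
    """Shannon entropy (base 2) of intensities treated as a distribution.
    Assumed definition: p_i = x_i / sum(x); zero values contribute nothing."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("intensities must sum to a positive value")
    probs = [x / total for x in intensities if x > 0]
    return -sum(p * math.log2(p) for p in probs)


def normalized_shannon_entropy(intensities):
    """Entropy divided by its maximum, log2(n), giving a value in [0, 1]."""
    n = len(intensities)
    if n < 2:
        return 0.0
    return shannon_entropy(intensities) / math.log2(n)


def neg_log10(p_value):
    """Convert a p-value to the -log10 scale used on the Y axis."""
    return -math.log10(p_value)


# Toy sample: 5000 made-up intensity values, not real array data.
sample = [float(v) for v in range(5000, 0, -1)]
band = bottom_band(sample)
print(len(band), normalized_shannon_entropy(band))
print(round(neg_log10(0.000409), 2))
```

With the defaults the band holds 900 values per sample (ranks 101 to 1000 from the bottom), and the last line reproduces the 3.39 quoted above.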