work analyzing many different datasets with entropy and other measures 10-26-13
2014-08-29
10-9-13
I would now like to get the summary numbers for the 2013 DTRA data. I'll copy the data I want to look at in case anything weird occurs (I don't want to damage the original data).
-copied gpr files from here
S:\Research\CIM-HealthTell\Experiments\20130724 HTChipV7P-128 Production Run - 100% Density (HT-128)\Good GPRs Slides 3 to 8 on 900
to here
S:\Research\Cancer_Eradication\Users\kwhittem\temp
-copied gpr files from here
S:\Research\CIM-HealthTell\Experiments\20130724 HTChipV7P-130 Production Run - 100% Density (HT-130)\Good GenePix GPR from Slides 3 to 8 from 900
to here
S:\Research\Cancer_Eradication\Users\kwhittem\temp2
It's taking some time to copy the files, so I could probably analyze some other data as well. I'll look at this data:
F:\kurt\storage\CIM Research Folder\DR\2013\9-26-13\old and young mice gpr files
Need to use the folder-of-gprs code.
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "F:\kurt\storage\CIM Research Folder\DR\2013\9-26-13\old and young mice gpr files" "F647 Median"
Files finished copying. Can now run the program.
(F532 Median)
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "S:\Research\Cancer_Eradication\Users\kwhittem\temp" "F532 Median"
and the other folder
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "S:\Research\Cancer_Eradication\Users\kwhittem\temp2" "F532 Median"
short names for summary measures
entropy norm_entropy max min cv stdev mean median 5th_perc 95th_perc max_norm min_norm stdev_norm mean_norm 5th_perc_norm 95th_perc_norm kurtosis skew dynamic_range
summary numbers for wafer 128
summary numbers for wafer 130
table_of_summary_numbers_clean_100913d1421 at
F:\kurt\storage\CIM Research Folder\DR\2013\9-26-13\old and young mice gpr files\summary
"F:\kurt\storage\CIM Research Folder\DR\2013\9-26-13\old and young mice gpr files\summary\table_of_summary_numbers_clean_100913d1421.xlsx"
parameters I like for a boxplot combined with a dot plot (scatterplot or strip chart) as of 9-26-13
--set group and type
--alpha at 0.5
--binwidth at 0.01
--dotsize at 3.0
how to save a ggplot() from the command line in R (use forward slashes in the path; a single backslash inside an R string is read as an escape character)
png("C:/Users/kwhittem/Desktop/temp/myplot.png")
....plot code here....
dev.off()
^when only a file name is given (as in the next example), the file will be placed in the current working directory, which can be found with getwd(). The directory can be set with setwd(dir)
png("myplot.png", height = 800, width = 600)
ggplot() + geom_boxplot(aes(y = entropy,x = Classification),data=table_of_summary_numbers_clean_100913d1421) + coord_flip() + geom_dotplot(aes(x = Classification,y = entropy),data=table_of_summary_numbers_clean_100913d1421,alpha = 0.5043,binaxis = 'y',binwidth = 0.01,stackdir = 'center')
dev.off()
I'll continue copying files over.
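A few of the summary measures named above can be sketched in Python to make their definitions concrete. This is not the code inside EntropyOfArray; it assumes that each feature's intensity is divided by the array total to form a probability, that entropy is measured in bits, and that the normalized entropy is the entropy divided by log2(N) for N features. Under those assumptions the normalized entropy ranges from 0 (all signal in one feature) to 1 (all features equal). The function names are hypothetical.

```python
import math
import statistics


def shannon_entropy(intensities):
    """Shannon entropy in bits, treating each intensity's share of the total as a probability."""
    if any(x < 0 for x in intensities):
        raise ValueError("intensities must be non-negative")
    total = sum(intensities)
    if total <= 0:
        raise ValueError("intensities must sum to a positive value")
    # Zero-intensity features contribute nothing (p * log p -> 0 as p -> 0).
    probs = [x / total for x in intensities if x > 0]
    return -sum(p * math.log2(p) for p in probs)


def normalized_entropy(intensities):
    """Entropy divided by its maximum possible value, log2(N)."""
    n = len(intensities)
    if n < 2:
        return 0.0
    return shannon_entropy(intensities) / math.log2(n)


def summary_numbers(intensities):
    """A subset of the summary measures; stdev is the sample standard deviation (an assumption)."""
    mean = statistics.fmean(intensities)
    stdev = statistics.stdev(intensities)
    return {
        "entropy": shannon_entropy(intensities),
        "norm_entropy": normalized_entropy(intensities),
        "max": max(intensities),
        "min": min(intensities),
        "cv": stdev / mean,
        "stdev": stdev,
        "mean": mean,
        "median": statistics.median(intensities),
    }
```

Measures such as dynamic_range, kurtosis, and skew are left out because their exact definitions are not stated in these notes.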
copied this
S:\Research\CIM-HealthTell\Experiments\20130807 HTChipV7P-135 Production Run - 100% Density (HT-135)\Mapix GPR from 900 First Pass
to here
S:\Research\Cancer_Eradication\Users\kwhittem\temp
copied this
S:\Research\CIM-HealthTell\Experiments\20130814 HTChipV7P-136 Production Run - 100% Density (HT-136)\Good GenePix GPRs slides 3 to 8 on the 900
to here
S:\Research\Cancer_Eradication\Users\kwhittem\temp2
wafer 135 results
wafer 136 results
command1
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "S:\Research\Cancer_Eradication\Users\kwhittem\temp" "F532 Median"
command2
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "S:\Research\Cancer_Eradication\Users\kwhittem\temp2" "F532 Median"
I would like to automate making graphs of all of the measures. Then I can just copy and paste the commands for specific situations.
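One way to automate this is to generate the R plotting commands from a template, one block per measure. The sketch below is only an illustration: the function name r_plot_commands, the default table name, and the choice to scale the dot plot binwidth to a fraction of each measure's range are assumptions, and the measure names must match the column names in the R data frame.

```python
def r_plot_commands(measures, table="table_of_summary_numbers", range_fraction=0.1):
    """Return R code that saves one boxplot + dot plot PNG per summary measure."""
    blocks = []
    for measure in measures:
        column = f"{table}${measure}"
        # Scale the bin width to the spread of this measure instead of a fixed value.
        binwidth = f"(max({column}) - min({column})) * {range_fraction}"
        plot = (
            "ggplot() + "
            f"geom_boxplot(aes(x = Classification, y = {measure}), data = {table}) + "
            "coord_flip() + "
            f"geom_dotplot(aes(x = Classification, y = {measure}), data = {table}, "
            f"alpha = 0.5, binaxis = 'y', binwidth = {binwidth}, stackdir = 'center')"
        )
        # print() makes the plot render even when the commands are run via source().
        blocks.append(
            f'png("{measure}.png", height = 800, width = 600)\n'
            f"print({plot})\n"
            "dev.off()\n"
        )
    return "\n".join(blocks)


# Example: print commands for a few measures, ready to paste into an R session.
print(r_plot_commands(["entropy", "norm_entropy", "cv", "kurtosis", "skew"]))
```

The printed text can be saved to a file and pasted into R after library(ggplot2) and the table have been loaded.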
tiff("myplot.tiff")
ggplot() + geom_boxplot(aes(y = entropy,x = Classification),data=table_of_summary_numbers_clean_100913d1421) + coord_flip() + geom_dotplot(aes(x = Classification,y = entropy),data=table_of_summary_numbers_clean_100913d1421,alpha = 0.5043,binaxis = 'y',binwidth = 0.01,stackdir = 'center')
dev.off()
tiff("myplot2.tiff")
ggplot() + geom_boxplot(aes(y = entropy,x = Classification),data=table_of_summary_numbers_clean_100913d1421) + coord_flip() + geom_dotplot(aes(x = Classification,y = entropy),data=table_of_summary_numbers_clean_100913d1421,alpha = 0.5043,binaxis = 'y',binwidth = 0.01,stackdir = 'center')
dev.off()
I also need to specify the binwidth like this
binwidth =(max(subset(table_of_summary_numbers, select=c("entropy")))-min(subset(table_of_summary_numbers, select=c("entropy"))))*0.1
All the commands can be found here
"F:\kurt\storage\CIM Research Folder\DR\2013\10-10-13\commands for automatically plotting summary numbers in r 10-10-13.txt"
"F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\wafer 128 results\table_of_summary_numbers_with_classification.txt"
Copied this
S:\Research\CIM-HealthTell\Experiments\20130814 HTChipV7P-137 Production Run - 100% Density (HT-137)\Good GPRs GENEPIX on 900 from slides 3 to 8
to here
S:\Research\Cancer_Eradication\Users\kwhittem\temp2
Copied this
S:\Research\CIM-HealthTell\Experiments\20130717 HTChipV7P-108 Production Run - 100% Density (HT-108)\Good GPR from Slides 3 to 8 on 900 Correct Labels
to here
S:\Research\Cancer_Eradication\Users\kwhittem\temp3
I moved the large wafer 135 results to my home desktop computer, and I'll try to run it using more RAM since these Mapix aligned files are larger.
Command
java -jar "C:\Users\Owner\Desktop\temp\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "C:\Users\Owner\BTSync\temp" "F532 Median" -Xmx6144
^There was some type of error. Note that -Xmx is a JVM option: it has to come before -jar and needs a unit (for example java -Xmx6144m -jar ...). Placed after the jar's arguments, it is passed to the program instead of raising the heap limit.
I'll try a new command on my work desktop computer in the hope that I don't run out of RAM this time.
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "S:\Research\Cancer_Eradication\Users\kwhittem\temp" "F532 Median" -Xmx2000
^started around 101113d1515
I'd like to analyze this 2013 DTRA data on the 10ks as well.
-found in file locations that look like this
S:\Administration\PeptideArrayCore\2013 sample run\DTRA-go-no go\Set-L-good-gpr-files
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "S:\Research\Cancer_Eradication\Users\kwhittem\temp2" "F539 Median"
and
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "S:\Research\Cancer_Eradication\Users\kwhittem\temp3" "F549 Median"
and
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "S:\Research\Cancer_Eradication\Users\kwhittem\temp2" "F649 Median"
^I won't do the F649 one since that is for IgM
I'll test out java on my laptop
java -jar "C:\Users\kurtw_000\Downloads\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "C:\Users\kurtw_000\BTSync\temp" "F532 Median"
Now I'd like to analyze wafer 46 again.
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "F:\kurt\storage\CIM Research Folder\DR\2013\9-26-13\46\all_gprs_in_one_folder" "F532 Median"
I'll store the results around here:
F:\kurt\storage\CIM Research Folder\DR\2013\9-10-13\entropy\summary of entropy values 9-10-13.xlsx
here
F:\kurt\storage\CIM Research Folder\DR\2013\9-10-13\entropy\new summary 10-12-13
Now I'll take a look at Josh's disease dataset.
/home/josh/CIM/Research/labdata/jaricher/newDecipher/Data for Database/Array Results/First Chip Disease Dataset/llnl.csv
-I will add a row to this data with a unique number identifier.
--new file here
"F:\kurt\storage\CIM Research Folder\DR\2013\10-13-13\disease data from josh\llnl.csv"
--id from row 1 column 1 to 128. Sample data starts from row 2 column 1 to column 128
--this is normalized data
command
java -jar "C:\Users\kurtw_000\Downloads\EntropyOfArray100913d0921.jar" find_summary_numbers_from_tabdelimitedtext_normalized_data "C:\Users\kurtw_000\BTSync\temp" "llnl_tab.txt" 1 1 128 1 128 2
q: what is the max and min value of normalized shannon entropy?
-------
10-13-13
I need to make a letter of recommendation draft for Yung Chang
saved here
"F:\kurt\storage\Documents\Career\Letters of Recommendation\Letter of Recommendation Yung Chang 10-13-13.doc"
----------
10-14-13
worked more with disease dataset from Josh
aov.ex1= aov(entropy~name2,data=table_of_summary_numbers)
performed analysis of variance in R software
fit <- aov(entropy_normalized_data ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
code to obtain the analysis of variance for all numbers
fit <- aov(entropy ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(norm_entropy ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(min ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(cv ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(stdev ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(mean ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(median ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(fifth_percentile ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(ninety_fifth_percentile ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(entropy_normalized_data ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(normalized_entropy_normalized_data ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(max_normalized ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(min_normalized ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(stdev_normalized ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(mean_normalized ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(fifth_percentile_normalized ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(ninety_fifth_percentile_normalized ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(kurtosis ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(skew ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
fit <- aov(dynamic_range ~ Classification, data=table_of_summary_numbers)
str(summary(fit))
Now I'll look at Muskan's human data
"F:\kurt\storage\CIM Research Folder\kwhittem\Records in CIM Folder\Categorical Records\Biodesign\Entropy of Immunosignature\human naive 2-27-12\human naive raw.txt"
The command will look something like this
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_tabdelimitedtext_raw_data "F:\kurt\storage\CIM Research Folder\kwhittem\Records in CIM Folder\Categorical Records\Biodesign\Entropy of Immunosignature\human naive 2-27-12" "human naive raw 2.txt" 2 1 323 1 323 8
It looks like the first time I looked at this data, I randomly picked 5 kids and 5 adults.
5 kids
c1_kid c2_kid c3_kid c4_kid c10_kid
5 adults
nc01t0 nc01t3 nc16t0 nc16t6 nc45t0
/home/josh/CIM/Research/labdata/jaricher/GFOD/results_processed.csv
command
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_tabdelimitedtext_raw_data "F:\kurt\storage\CIM Research Folder\DR\2013\10-14-13" "results_processed 10-14-13.txt" 0 1 14 1 14 2
"C:\Users\kwhittem\Desktop\temp\HT7-135 S3 B1 DUAL 09202013 S24 Sierra_b1.gpr"
cut -d : -f 5 /etc/passwd
-----------
10-15-13
Now I'll look at Krupa's valley fever data
S:\Administration\Biostatistics\Valley Fever\paper working directory\Random peptides paper\10K_v2\Original GPR's
command
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "F:\kurt\storage\CIM Research Folder\DR\2013\10-15-13\valley fever 10kv2" "F647 Median"
----------
10-16-13
10Kv2 disease data 10-12 to 3-13
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "F:\kurt\storage\CIM Research Folder\DR\2013\10-16-13\10kv2 disease data 10-16-13" "F647 Median"
Questions: What is PS202?
another command for more normals
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "F:\kurt\storage\CIM Research Folder\DR\2013\10-16-13\10kv2 disease data 10-16-13\extra normals" "F647 Median"
another command for dengue
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "F:\kurt\storage\CIM Research Folder\DR\2013\10-16-13\10kv2 disease data 10-16-13\dengue" "F647 Median"
another command for Rebecca's monoclonal data
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "F:\kurt\storage\CIM Research Folder\DR\2013\10-16-13\Rebecca Monoclonal Data\data" "F647 Median"
t-V5 t-P53Ab8 t-P53Ab1 <100 pM t-none t-LeuEnk 1.95 nM t-cMyc b-V5 b-p53Ab8 b-P53Ab1 b-none b-LeuEnk b-cMyc 80 nM
Why do some of the monoclonal antibody samples say t- and some say b-?
Valley fever longitudinal data.
In the tube# column in the Blinded set sheet, the study # indicates which person the tube # corresponds to, and there are often multiple datapoints per person along with titer information.
0_13-4-CNS00048.gpr
The first number matches the CF Titer, followed by the case number in the training set sheet. The case number can be matched up with an individual in the PTID column, and a date difference.
command
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "F:\kurt\storage\CIM Research Folder\DR\2013\10-16-13\valley fever longitudinal data\blinded samples" "F649 Median"
and
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "F:\kurt\storage\CIM Research Folder\DR\2013\10-16-13\valley fever longitudinal data\training samples" "F649 Median"
----------
10-17-13
I tried using Weka to use a support vector machine to classify disease and normal based on summary numbers. The classification was actually quite good (about 80% correctly classified).
SVM Classification with entropy only: 77.2%
SVM Classification with all summary numbers: 79.7%
I should start looking at the SVM data from Weka more frequently
Classification of summary number data with weka 10-17-13
Now I can finish taking a look at Krupa's data.
Now I'll take a look at the Alzheimer's data from Lucas.
command for analyzing the alzheimers data
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "F:\kurt\storage\CIM Research Folder\DR\2013\10-17-13\alzheimers data" "F647 Median"
Now I'll look at some infectious diseases (some on both arrays)
10k
S:\Administration\PeptideArrayCore\2013 sample run\LLNL_set-5
4-5 were 2013
S:\Administration\PeptideArrayCore\2012 sample run\LLNL Samples
1-3
330k LLNL samples
S:\Research\CIM-HealthTell\Experiments\06252012 HTChipV4-22 Production Run 3 (HT-22)
some more infectious diseases
S:\Research\CIM-HealthTell\Experiments\06182012 HTchipV4-20 (HT-20)
S:\Research\CIM-HealthTell\Experiments\06262012 HTChipV4-25 Production run 4 (HT-25)
I'll start getting the data for these samples.
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "F:\kurt\storage\CIM Research Folder\DR\2013\10-17-13\id some on both arrays\wafer 22" "F532 Median"
wafer 4 has already been copied
10-18-13
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "S:\Research\Cancer_Eradication\Users\kwhittem\temp2\wafer 20" "F532 Median"
start time: 0957
7 s * 316 = 36 m 52 s
estimated end time around 1040
started looking at wafer 25
some of the gprs are wavelength 635 and some are 532
Unknown sample
4-22 S7 G3 Hi P60 092812 M4_0532 5um K.gpr
4-22 S7 G1 Hi P60 092812 I5_0532 5um K.gpr
SVM correctly classified 56.1% of the samples.
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "S:\Research\Cancer_Eradication\Users\kwhittem\temp2\wafer 25" "F532 Median"
wafer 20
svm could correctly classify 81.46% of the samples
wafer 22
svm could correctly classify 97.1% of the samples
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "S:\Research\Cancer_Eradication\Users\kwhittem\temp2\10k llnl" "F647 Median"
some sample location information can be found in these spreadsheets
"S:\Administration\PeptideArrayCore\2012 sample run\LLNL Samples\Sample Run LLNL Samples-set1.xlsx"
"S:\Administration\PeptideArrayCore\2012 sample run\LLNL Samples\Sample Run LLNL Samples-set2.xlsx"
"S:\Administration\PeptideArrayCore\2012 sample run\LLNL Samples\Sample Run LLNL Samples-set3.xlsx"
"S:\Administration\PeptideArrayCore\2013 sample run\LLNL-Set-4\LLNL-Sample Run-Set-4.xlsx"
"S:\Administration\PeptideArrayCore\2013 sample run\LLNL_set-5\LLNL-Sample Run-Set-5.xlsx"
"S:\Administration\PeptideArrayCore\2013 sample run\LLNL-Set-6_06202013\LLNL Set-6 Sample Run.xlsx"
I don't know what these samples are
10010979_bot_r1_07112012.gpr
10010981_top_rm5_07112012.gpr
10010976_top_m4_07112012.gpr
SVM correctly classified 91.9% of the samples.
I'll try to run the remaining wafer 25 samples on my desktop computer
java -jar "C:\Users\Owner\Desktop\temp\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "C:\Users\Owner\Desktop\temp\wafer 25" "F635 Median" -Xmx6144
10-19-13
I would like to parse the large Mappix aligned files.
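A header-aware way to pull one column out of a large GPR file is sketched below. It assumes only that the file is tab-delimited and that the column header row contains the quoted column name (for example "F532 Median") as one of its fields, so the header length does not have to be known in advance. The function name extract_gpr_column is hypothetical, and the sketch reads line by line so the whole file is never held in memory.

```python
def extract_gpr_column(lines, column_name="F532 Median"):
    """Return the values of one named column from an iterable of GPR text lines."""
    column_index = None
    values = []
    for line in lines:
        fields = [f.strip().strip('"') for f in line.rstrip("\r\n").split("\t")]
        if column_index is None:
            # Still in the header block: look for the row that names the columns.
            if column_name in fields:
                column_index = fields.index(column_name)
            continue
        # Data rows: keep the requested field, skipping short or blank rows.
        if len(fields) > column_index and fields[column_index] != "":
            values.append(float(fields[column_index]))
    if column_index is None:
        raise ValueError(f"column {column_name!r} not found")
    return values


# Example with a tiny made-up file; a real call would pass an open file object,
# e.g. with open("test.gpr") as handle: extract_gpr_column(handle)
example = [
    "ATF\t1.0\n",
    "2\t3\n",
    '"Type=GenePix Results 3"\n',
    '"Scanner=demo"\n',
    '"Block"\t"F635 Median"\t"F532 Median"\n',
    "1\t100\t250\n",
    "1\t110\t300\n",
]
print(extract_gpr_column(example))
```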
need to remove header from rows 1 to 31
need to extract column 27 for F532 Median
UNIX commands will be like this
awk 'NR >= 32' "test.gpr">temp.txt ; cut -f27 temp.txt>parsed/"test.gpr"
Okay, now that everything is working I just need to get the commands together to parse all of the files.
Some of the files have headers that end at line 27 and some at 32. Maybe I should just integrate this code with my Java code so that I can first find where F532 Median is found.
I added cygwin to my environment path variable. Now I should try to make sure everything works with the windows command prompt.
"C:\cygwin\bin\gawk.exe" 'NR >= 32' "C:/cygwin/home/kwhittem/test.gpr">"C:/cygwin/home/kwhittem/temp.txt" & cut -f27 "C:/cygwin/home/kwhittem/temp.txt">"C:/cygwin/home/kwhittem/parsed/test.gpr"
"C:\cygwin\bin\gawk.exe" 'NR >= 32' "test.gpr">"temp.txt" & cut -f27 "C:/cygwin/home/kwhittem/temp.txt">"C:/cygwin/home/kwhittem/parsed/test.gpr"
^I was not able to get the awk commands to work (or work quickly anyway) from the windows command line. I think I will just need to prepare the data on linux and then use my Java program after that (or just manually prepare the data if the amount of data is not too great). Definitely not the ideal situation, but I couldn't find a nice solution.
SVMs for idiots
http://www.cs.ucf.edu/courses/cap6412/fall2009/papers/Berwick2003.pdf
S:\Research\CIM-HealthTell\Experiments\06262012 HTChipV4-25 Production run 4 (HT-25)
In wafer 25 there are some samples that I cannot find the ID of. While the computer is doing some searches, I'll start trying to analyze some other datasets as well.
time course data
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "S:\Administration\PeptideArrayCore\2013 sample run\Normals-2013\Normals 1 month and 6 year 2 people" "F647 Median"
For the next dataset I would like to take a look at Bart's lymphoma samples.
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "S:\Research\Cancer_Eradication\Users\kwhittem\temp2\bart_dog_lymphoma_1" "F649 Median"
Now I can take a look at Bart's other (2nd one I have) dog lymphoma dataset.
names at row 3 column 1 to column 25 and data at row 5 column 1 to column 25
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_tabdelimitedtext_raw_data "F:\kurt\storage\CIM Research Folder\DR\2013\10-20-13\bart_dog_lymphoma_2" "CIM10K Dogs.txt" 3 1 25 1 25 5
I finished extracting all of the auto-aligned wafer 135 data. Now I can run the program!
java -jar "C:\Users\kurtw_000\BTSync\temp\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "C:\Users\kurtw_000\BTSync\temp\temp2" "F532 Median"
^The program was done in less than 1 min once the columns had been extracted from those huge auto-aligned files.
Monoclonal ab mix experiment
They tested 8 different antibodies on the 330k
S:\Research\labdata\jaricher\newDecipher\Data for Database\Array Results\Monoclonals
They also mixed the 8 different antibodies together. Found here:
"S:\Research\labdata\jaricher\newDecipher\Data for Database\Array Results\VF and mAb mix\all.csv"
The epitopes for these 8 different antibodies are found here:
"S:\Research\labdata\jaricher\newDecipher\Data for Database\Array Results\VF and mAb mix\epitopes.xls"
Process for making a heatmap in JMP
-open some data
-use graph cell plot or follow the directions below
-analyze->multivariate methods->cluster
-choose all of the columns you want to be clustered and click ok
-click the arrow next to hierarchical clustering
-choose two way clustering
I would like to analyze the monoclonal antibody data now. I'll run a command for this spreadsheet.
"S:\Research\labdata\jaricher\newDecipher\Data for Database\Array Results\VF and mAb mix\all.csv"
I copied the spreadsheet to here
"S:\Research\Cancer_Eradication\Users\kwhittem\temp2\monoclonal\all.csv"
sample names go from column 1 to 285 and row 0. data goes from column 1 to 285 and row 1
command
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_tabdelimitedtext_raw_data "S:\Research\Cancer_Eradication\Users\kwhittem\temp2\monoclonal" "all.txt" 0 1 284 1 284 1
---------
while copying all my data from the F drive there was an error for 2 files
-runtime_info_find_summary_numbers_from_tabd... txt 40 bytes 10-14-13 2:42pm
--found in human naive 2-27-12 folder
-table of data non-normalized test 10-21-13 4:10pm
--found in
---------
10-22-13
10K_v1 valley fever, median normalized dataset
"F:\kurt\storage\CIM Research Folder\DR\2013\10-22-13\valley fever\10K_v1_Random_MedNorm(10,440).xlsx"
Info from Bart
The B and T refer to the top and bottom arrays of the slide. N and named dogs are normal. L and LSA are lymphoma.
info for timecourse
individual 84 caught a cold, around 20 days in, something like that.
---------
10-23-13
Fulbright Scholarship
Would this be the right type of Fulbright scholarship for me?
-The Fulbright U.S. Scholar Program sends American faculty members, scholars and professionals abroad to lecture and/or conduct research for up to a year.
or maybe this one
Fulbright-Hays Program
or maybe the Core Fulbright Scholar program
-http://www.cies.org/us_scholars/us_awards/
-eligibility info
http://www.cies.org/us_scholars/us_awards/Eligibility.htm
-------
10-23-13
I want to extract the columns from the table containing 330k data including some monoclonal ab mix data. I want to extract columns 1 to 284.
commands will be something like this
cut -f2 all.txt>out2.txt
spreadsheet for commands
"F:\kurt\storage\CIM Research Folder\DR\2013\10-23-13\commands for extracting columns 10-23-13d1118.xlsx"
program for extracting from monoclonal antibody data
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "F:\kurt\storage\CIM Research Folder\DR\2013\10-23-13\Monoclonal ab data" "F532 Median"
p53 located at columns 258, 259, 264, 265, 268, 269
cut -f258 test.txt>out1.txt
cut -f259 test.txt>out2.txt
cut -f264 test.txt>out3.txt
cut -f265 test.txt>out4.txt
cut -f268 test.txt>out5.txt
cut -f269 test.txt>out6.txt
java -jar "C:\temp_sync\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "C:\Users\kurtw_000\Dropbox\ab mix" "F532 Median"
java -jar "C:\temp_sync\EntropyOfArray100913d0921.jar" find_summary_numbers_from_tabdelimitedtext_raw_data directory filename sample_name_row sample_name_column_start sample_name_column_end data_column_start data_column_end data_starting_row
java -jar "F:\kurt\storage\CIM Research Folder\DR\2013\10-9-13\EntropyOfArray092013\EntropyOfArray100913d0921.jar" find_summary_numbers_one_gpr "F:\kurt\storage\CIM Research Folder\DR\2013\10-23-13\Monoclonal ab data\ab mix experiment" "HT4-25 S6 D2 GRN-P80 RED P50 062113 p53Ab1_0635 SM.gpr" "F532 Median"
p53Ab1 epitope RHSVV
p53Ab8 epitope DLWKLL
------------
10-25-13
Want to reserve room for defense November 18th 2-5pm
need to email biodreception@asu.edu
Two middle rooms on 2nd floor
B262
A204
large room at the end of A building on 2nd floor
A250
Large middle room on 3rd floor
B362
Large room at the end of the A building on 3rd floor
A350
Conference room in atrium towards building B (north)
AL1-10/14
Conference room in atrium towards south entrance (A building entrance)
AL1-50
-------
Need to contact committee members.
Bert Jacobs, Stephen Johnston, Phillip Stafford, Valerie Stout, Kathryn Sykes
bjacobs@asu.edu, Stephen.Johnston@asu.edu, phillip.stafford@asu.edu, vstout@asu.edu, kathryn.sykes@healthtell.com, Pattie.Madjidi@asu.edu

Hi committee members,
We can hold my oral defense on November 18th 2-5pm in the auditorium. Of course we will not need the full 3 hr time period. Note that in addition to presenting about screening the tumor cDNA library, a larger proportion of my defense will be oriented around the entropy "immune temperature" idea than you have seen in the past. I have introduced this concept several times in previous committee meetings, and we now have some interesting new data in this area. I will send out a copy of my dissertation closer to the time of the oral defense.
Thanks for all of your help and suggestions over the years!
Best regards,
Kurt Whittemore
Graduate Student
Arizona State University
BIODESIGN INSTITUTE
Center for Innovations in Medicine
1001 S McALLISTER AVE
TEMPE, AZ 85287
----------
10-26-13
I want to give a 330k disease and a 330k normal sample to Lu so he can see what minimum number of random peptides is necessary to distinguish between the groups.
A good chronic disease sample looks like sample 30, 28, 21, or 35. A good infectious disease sample looks like 39. A good normal sample looks like 81. The infectious disease samples are actually DTRA samples, so maybe I will just give Lu a chronic disease and a normal.
81: 4-46 S7 E1 Hi P20 03262013 ND43 L.gpr
35: 4-46 S2 E2 Hi P20 BC009 110512 S.gpr
F:\kurt\storage\CIM Research Folder\DR\2013\8-10-13\azim58 wikispaces download as of 8-10-13
example command for Lu
java -jar "F:\some_path_here\EntropyOfArray100913d0921.jar" find_summary_numbers_from_folder_of_gprs "F:\some_path_here" "F532 Median"
location of time series
S:\Administration\PeptideArrayCore\2013 sample run\Normals-2013\Normals 1 month and 6 year 2 people