Data analysis techniques portal 02-25-2014d1146
2015-04-14 Data analysis techniques portal 02-25-2014d1146
Interesting paper from Krupa
Statistically invalid classification of high throughput gene expression data
- "C:\Users\kurtw_000\Box Sync\DocDR\2014\02-25-2014d1144\Statistically invalid classification of high throughput gene expression data.pdf"
- The paper is about statistically invalid classification of high-throughput data. The authors argue that you should not first select the most important features from the data and then run classification or clustering on that same data. Instead, use a set of features known a priori (before the experiment), or simply use all of the features. Alternatively, use separate training and test sets.
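
A small sketch of the point above, not taken from the paper. It assumes pure-noise Gaussian data with arbitrary labels, a simple mean-difference feature score, a nearest-centroid classifier, and leave-one-out evaluation. All of these are illustrative choices, and every function name is hypothetical. It compares selecting features once on all samples against selecting them inside each fold from the training rows only.

```python
import random


def make_noise_data(n_samples, n_features, seed=0):
    """Pure-noise data: the features carry no information about the labels."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_features)]
         for _ in range(n_samples)]
    y = [i % 2 for i in range(n_samples)]  # balanced, arbitrary labels
    return X, y


def class_means(X, y, rows, j):
    """Mean of feature j within each class, using only the given rows."""
    sums, counts = [0.0, 0.0], [0, 0]
    for i in rows:
        sums[y[i]] += X[i][j]
        counts[y[i]] += 1
    return sums[0] / counts[0], sums[1] / counts[1]


def select_features(X, y, rows, k):
    """Indices of the k features with the largest class-mean gap on `rows`."""
    def score(j):
        m0, m1 = class_means(X, y, rows, j)
        return abs(m0 - m1)
    return sorted(range(len(X[0])), key=score, reverse=True)[:k]


def predict(X, y, train_rows, features, test_row):
    """Nearest-centroid prediction restricted to the chosen features."""
    d0 = d1 = 0.0
    for j in features:
        m0, m1 = class_means(X, y, train_rows, j)
        d0 += (X[test_row][j] - m0) ** 2
        d1 += (X[test_row][j] - m1) ** 2
    return 0 if d0 <= d1 else 1


def loo_accuracy(X, y, k, select_inside_fold):
    """Leave-one-out accuracy with selection inside or outside the folds."""
    all_rows = list(range(len(X)))
    # Selection on all rows has already seen every sample that is held out later.
    global_features = select_features(X, y, all_rows, k)
    correct = 0
    for held_out in all_rows:
        train_rows = [i for i in all_rows if i != held_out]
        if select_inside_fold:
            features = select_features(X, y, train_rows, k)
        else:
            features = global_features
        correct += predict(X, y, train_rows, features, held_out) == y[held_out]
    return correct / len(X)


X, y = make_noise_data(n_samples=30, n_features=1000)
leaky = loo_accuracy(X, y, k=10, select_inside_fold=False)
honest = loo_accuracy(X, y, k=10, select_inside_fold=True)
print("selection on all data:    ", leaky)
print("selection inside each fold:", honest)
```

Because the data are noise, there is nothing real to learn, so any accuracy well above 0.5 in the first number comes from the selection step having seen the held-out samples. The second number is the fairer estimate under these assumptions.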