How to quickly get a feel for a dataset with a heatmap in gitools 04-14-2015d0933
overall behavior of a dataset
This is certainly not the ideal method for quickly making a heatmap, clustering, and obtaining a dendrogram, but it is the easiest method I have found so far 04-14-2015d1000
- put the data in a text file
- -If you want the color bar scale to be unique for each column, then normalize each column. Normalization can be done by subtracting the column min from the current value and dividing by the max minus the min, i.e. (value-min)/(max-min). Then all of the values in that column will be between 0 and 1.
- --Values between 0 and 1 work well with the green to white to red colorbar in gitools. However, if you want a colorbar with more colors, you could transform all of the values so that they lie between -10 and 10 for use with the z colorbar in gitools. To do this, multiply the 0 to 1 values by 20 and subtract 10.
- -Note that if very few cells in a certain column change color, you could take the log10 of all of the values in that column before the normalization (this requires all of the values to be positive). This can spread the data out better for visualization.
- open the data using matrix view (unless you organized your data for the table layout)
- give the "data layer" a name like "data"
- Choose the colors that you want.
- -If you are using the linear scale with just three colors, then you probably want to change the min to 0, the mid to 0.5, and the max to 1.
- -If you are using the linear scale, you may want to change the middle color to black (or you may prefer the default white).
- -If you are using the z-score scale, you may want to change the non-significant middle color to green (or you may prefer the default gray color)
- -note that the color scale can also be changed later on by going to edit-layers->choose your layer->color scale
- cluster the rows and columns by going to analysis-clustering. This will reorder the rows or columns in the heatmap, and also output a separate visual dendrogram
- -For some reason the reordered rows in the heatmap do not have exactly the same order as the samples in the dendrogram, but perhaps this is not too much of a problem.
- --If you really want to line up the dendrogram with the heatmap rows, you could do it manually. Type the names of the samples in the order shown in the dendrogram (or, after converting the image to a pdf, use OCR or something similar). Then reorganize the data in the table into that order, and combine the dendrogram with the heatmap of the properly ordered samples in an image or diagram editing program. In many situations you just want a quick look at the data, so a dendrogram next to the rows or columns of the heatmap may not be necessary. For large heatmaps, the dendrogram may be too complex to read in the small space on the side anyway.
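
The per-column normalization from the "put the data in a text file" step can be done in a spreadsheet, but it could also be scripted. Below is a minimal Python sketch of it, not part of gitools. The function names (normalize_column, normalize_table) and the option names (use_log10, to_z_range) are made up for illustration. Treating a constant column as 0.5 is my own assumption, since the formula divides by zero in that case.

```python
import math


def normalize_column(values, use_log10=False, to_z_range=False):
    """Min-max normalize one column of numbers.

    use_log10: take log10 of each value first (all values must be > 0).
    to_z_range: map the 0..1 result onto -10..10 (multiply by 20, subtract 10).
    """
    if use_log10:
        values = [math.log10(v) for v in values]
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column has no spread, so use the midpoint.
        scaled = [0.5 for _ in values]
    else:
        # (value - min) / (max - min) puts every value between 0 and 1.
        scaled = [(v - lo) / (hi - lo) for v in values]
    if to_z_range:
        scaled = [s * 20 - 10 for s in scaled]
    return scaled


def normalize_table(rows, **options):
    """Normalize each column of a list of equal-length numeric rows independently."""
    columns = list(zip(*rows))
    normalized = [normalize_column(list(col), **options) for col in columns]
    # Transpose back so the result has the same row layout as the input.
    return [list(row) for row in zip(*normalized)]


table = [
    [1.0, 10.0],
    [2.0, 100.0],
    [3.0, 1000.0],
]
for row in normalize_table(table):
    print(*row, sep="\t")
```

Passing use_log10=True applies the log10 step before the normalization, and to_z_range=True gives values between -10 and 10 for the z colorbar. The example applies the same options to every column; to log-transform only one column, call normalize_column on that column directly.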
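
If you do go the manual route for lining up the dendrogram, the "reorganize the data in the table" part could be scripted once you have typed out the sample names. This is only a sketch under assumptions: the table is held as (sample name, values) pairs, the function name reorder_rows is hypothetical, and the sample names and numbers are placeholders, not real data.

```python
def reorder_rows(rows, dendrogram_order):
    """Return the rows arranged in the sample order read off the dendrogram.

    rows: list of (sample_name, values) pairs.
    dendrogram_order: sample names typed in the order shown in the dendrogram.
    """
    by_name = {name: values for name, values in rows}
    missing = [name for name in dendrogram_order if name not in by_name]
    if missing:
        # Typos are likely when names are typed by hand or come from OCR.
        raise KeyError(f"names not found in the table: {missing}")
    return [(name, by_name[name]) for name in dendrogram_order]


# Placeholder data for illustration only.
rows = [
    ("sampleA", [0.1, 0.9]),
    ("sampleB", [0.5, 0.5]),
    ("sampleC", [0.8, 0.2]),
]
order = ["sampleC", "sampleA", "sampleB"]

# Print the reordered table as tab-separated text, ready to save to a file.
for name, values in reorder_rows(rows, order):
    print(name, *values, sep="\t")
```

Raising an error on unknown names is a deliberate choice here, so that a mistyped sample name is caught instead of silently dropping a row.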