RNA 19. Unsupervised Clustering in SCI Papers (ConsensusClusterPlus)


Preface


Abstract




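1. Software installation


if (!requireNamespace("ConsensusClusterPlus", quietly = TRUE)) {
  # ConsensusClusterPlus is distributed through Bioconductor
  BiocManager::install("ConsensusClusterPlus")
}
library(ConsensusClusterPlus)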


2. Reading the data

DEG = read.table("DEG-resdata.xls", sep = "\t", check.names = FALSE, header = TRUE)
## Down   Up
## 1296 2832
group <- read.table("DEG-group.xls", sep = "\t", check.names = FALSE, header = TRUE)
table(group$group)
##  NT  TP
##  41 478
# Expression values start at column 8 of DEG
exp <- DEG[, 8:ncol(DEG)]
exp[1:3, 1:3]
## columns: TCGA-3L-AA1B-01A-11R-A37K-07 TCGA-4N-A93T-01A-11R-A37K-07 TCGA-4T-AA8H-01A-11R-A41B-07
dim(exp)
## [1] 4128  519
rownames(exp) = DEG[, 1]
# Keep only the tumor (TP) samples
TP <- group[group$group %in% "TP", ]$sample
tumorMat <- exp[, TP]
tumorMat[1:3, 1:3]
## rows: ENSG00000142959 ENSG00000163815 ENSG00000107611
dim(tumorMat)
## [1] 4128  478

3. Hands-on

Before running the analysis, here are the parameter settings used:

pItem (item resampling, proportion of items to sample): 80%
pFeature (gene resampling, proportion of features to sample): 80%
maxK (maximum cluster number to evaluate): 6
reps (resamplings, number of subsamples): 50
clusterAlg (agglomerative hierarchical clustering algorithm): 'hc' (hclust)
distance: 'pearson' (1 - Pearson correlation)
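To make the parameters concrete, here is a minimal base-R sketch of what a single resampling iteration could look like under these settings. The toy matrix, the variable names (`toy`, `p_item`, `p_feature`) and the use of `floor()` for the subsample size are assumptions for illustration only; this is not the package's internal code.

```r
# Toy expression matrix: 40 genes (rows) x 12 samples (columns); values are simulated.
set.seed(1)
toy <- matrix(rnorm(40 * 12), nrow = 40,
              dimnames = list(paste0("gene", 1:40), paste0("s", 1:12)))
toy[1:20, 1:6]   <- toy[1:20, 1:6] + 3    # group A: first 20 genes shifted up
toy[21:40, 7:12] <- toy[21:40, 7:12] + 3  # group B: last 20 genes shifted up

p_item    <- 0.8  # fraction of samples kept per iteration (pItem)
p_feature <- 0.8  # fraction of genes kept per iteration (pFeature)

# Draw one subsample of samples and genes, without replacement.
keep_items <- sample(colnames(toy), floor(p_item * ncol(toy)))
keep_genes <- sample(rownames(toy), floor(p_feature * nrow(toy)))
sub <- toy[keep_genes, keep_items]

# Distance between samples: 1 - Pearson correlation (cor() correlates columns).
d <- as.dist(1 - cor(sub, method = "pearson"))

# Agglomerative hierarchical clustering, cut into k = 2 groups.
hc <- hclust(d, method = "average")
cl <- cutree(hc, k = 2)
table(cl)
```

Repeating this step `reps` times and recording which samples end up together is what the consensus values summarize.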


# Plots are written to a temporary directory
title = tempdir()
results <- ConsensusClusterPlus(as.matrix(tumorMat),
                                maxK = 6,
                                reps = 50,
                                pItem = 0.8,
                                pFeature = 0.8,
                                clusterAlg = "hc",
                                distance = "pearson",
                                title = title,
                                plot = "png")


# consensusTree is an hclust object
consensusTree <- results[[3]][["consensusTree"]]
consensusTree
##
## Call:
## hclust(d = as.dist(1 - fm), method = finalLinkage)
##
## Cluster method   : average
## Number of objects: 478
# Consensus matrix
consensusMatrix <- results[[3]][["consensusMatrix"]]
consensusMatrix[1:5, 1:5]
# Sample classification
consensusClass <- results[[3]][["consensusClass"]]

4. Results

1. Consensus matrix

For each k, consensus matrix (CM) plots depict consensus values on a white-to-blue colour scale, are ordered by the consensus clustering (shown as a dendrogram), and mark items' consensus clusters with coloured rectangles between the dendrogram and the consensus values.
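As a rough illustration of where the plotted consensus values come from, the sketch below builds a consensus matrix by hand: for every pair of samples it divides the number of times they were clustered together by the number of times they were drawn in the same subsample. The data are simulated and all names (`toy`, `together`, `sampled`) are hypothetical; only samples are resampled here, which is a simplification.

```r
set.seed(42)
n_genes <- 40; n_samples <- 12
toy <- matrix(rnorm(n_genes * n_samples), nrow = n_genes)
toy[1:20, 1:6]   <- toy[1:20, 1:6] + 3    # simulated group A
toy[21:40, 7:12] <- toy[21:40, 7:12] + 3  # simulated group B

k <- 2; reps <- 50; p_item <- 0.8
together <- matrix(0, n_samples, n_samples)  # times i and j shared a cluster
sampled  <- matrix(0, n_samples, n_samples)  # times i and j were drawn together

for (r in seq_len(reps)) {
  idx <- sort(sample(n_samples, floor(p_item * n_samples)))
  d   <- as.dist(1 - cor(toy[, idx]))
  cl  <- cutree(hclust(d, method = "average"), k = k)
  sampled[idx, idx]  <- sampled[idx, idx] + 1
  together[idx, idx] <- together[idx, idx] + outer(cl, cl, "==")
}

# Consensus value: fraction of shared subsamples in which the pair co-clustered.
consensus <- together / sampled
round(consensus[1:3, c(1, 2, 7, 8)], 2)
```

Values near 1 mean a pair was almost always clustered together, values near 0 mean almost never; a clean matrix contains mostly those two extremes.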


2. Consensus cumulative distribution function (CDF) plot

Empirical cumulative distribution function (CDF) plots display consensus distributions for each k.
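A small sketch of how such a curve can be computed from a consensus matrix with base R's `ecdf()`. The two 4 x 4 matrices are made-up values chosen to contrast a clean partition with an ambiguous one, and `cdf_curve` is a hypothetical helper name.

```r
# Made-up consensus matrices: a clean two-cluster result and a noisy one.
clean <- matrix(c(1, 1, 0, 0,
                  1, 1, 0, 0,
                  0, 0, 1, 1,
                  0, 0, 1, 1), nrow = 4, byrow = TRUE)
noisy <- matrix(c(1.0, 0.6, 0.4, 0.5,
                  0.6, 1.0, 0.5, 0.4,
                  0.4, 0.5, 1.0, 0.6,
                  0.5, 0.4, 0.6, 1.0), nrow = 4, byrow = TRUE)

# Empirical CDF of the pairwise consensus values, evaluated on a grid in [0, 1].
cdf_curve <- function(cm, grid = seq(0, 1, by = 0.01)) {
  v <- cm[upper.tri(cm)]  # each sample pair once, diagonal excluded
  data.frame(x = grid, cdf = ecdf(v)(grid))
}

clean_cdf <- cdf_curve(clean)
noisy_cdf <- cdf_curve(noisy)
head(clean_cdf, 3)
head(noisy_cdf, 3)
```

For the clean matrix the curve jumps at 0 and at 1 and stays flat in between; for the noisy one it rises in the middle of the range, which is the pattern to look for when comparing k values.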


3. Scree plot

We covered this earlier, and it came up again in the machine learning series: MachineLearning 3. Cluster Analysis.
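For consensus clustering, the quantity usually shown in this kind of plot is the relative change in the area under the CDF curve as k increases, following Monti et al. (2003). The sketch below assumes that definition; the area values are invented, and `cdf_area` and `delta_area` are hypothetical helper names rather than package functions.

```r
# For consensus values v in [0, 1], the area under the empirical CDF on [0, 1]
# equals 1 - mean(v); the package may compute this area numerically instead.
cdf_area <- function(v) 1 - mean(v)

# Invented areas under the CDF for k = 2..6, for illustration only.
area <- c(k2 = 0.30, k3 = 0.45, k4 = 0.50, k5 = 0.52, k6 = 0.53)

# First entry is the area at k = 2; later entries are relative increases.
delta_area <- function(area) {
  c(area[1], diff(area) / head(area, -1))
}

delta <- delta_area(area)
round(delta, 3)
```

As with a scree plot, one looks for the k after which the relative increase becomes small.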


4. Tracking plot

The item tracking plot shows the consensus cluster of items (in columns) at each k (in rows). This allows a user to track an item’s cluster assignments across different k, to identify promiscuous items that are suggestive of weak class membership, and to visualize the distribution of cluster sizes across k.
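The same information can be inspected as a table. The sketch below uses a small stand-in list that mimics the shape of `results` (element k holding a `consensusClass` vector); the sample names and class labels are made up, so treat it as an outline of the idea rather than output from the real data.

```r
# Stand-in for `results`: element k holds a named consensusClass vector (made-up values).
samples <- paste0("s", 1:6)
res <- vector("list", 4)
res[[2]] <- list(consensusClass = setNames(c(1, 1, 1, 2, 2, 2), samples))
res[[3]] <- list(consensusClass = setNames(c(1, 1, 3, 2, 2, 2), samples))
res[[4]] <- list(consensusClass = setNames(c(1, 1, 3, 2, 2, 4), samples))

# One row per sample, one column per k.
track <- sapply(2:4, function(k) res[[k]]$consensusClass)
colnames(track) <- paste0("k", 2:4)
track

# Cluster sizes at each k.
sizes <- lapply(2:4, function(k) table(res[[k]]$consensusClass))
names(sizes) <- colnames(track)
sizes
```

With the real object, replacing `res` by `results` and `2:4` by `2:6` should give the corresponding table, assuming the list is indexed by k as in the code above.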


5. Cluster-consensus and item-consensus

# Compute cluster-consensus and item-consensus
icl <- calcICL(results,
               # title = title,
               plot = "png")
## Returns a list with two elements; inspect each in turn
dim(icl[["clusterConsensus"]])
## [1] 20  3
icl[["clusterConsensus"]]
##       k cluster clusterConsensus
##  [1,] 2       1        0.9979464
##  [2,] 2       2        0.9530334
##  [3,] 3       1        0.9701812
##  [4,] 3       2        0.9425426
##  [5,] 3       3        0.8425318
##  [6,] 4       1        0.9174626
##  [7,] 4       2        0.9312305
##  [8,] 4       3        0.7829472
##  [9,] 4       4              NaN
## [10,] 5       1        0.9724148
## [11,] 5       2        0.9584888
## [12,] 5       3        0.9302702
## [13,] 5       4
## [14,] 5       5
## [15,] 6       1        0.9450896
## [16,] 6       2        0.9548545
## [17,] 6       3        0.9244208
## [18,] 6       4        0.9655172
## [19,] 6       5
## [20,] 6       6        1.0000000
dim(icl[["itemConsensus"]])
## [1] 9560    4
icl[["itemConsensus"]][1:5, ]
##                           item itemConsensus
## TCGA-AA-A00O-01A-02R-A089-07     0.3334626
## TCGA-AA-3664-01A-01R-0905-07     0.2818590
## TCGA-A6-3810-01A-01R-A278-07     0.1793995
## TCGA-A6-2677-01B-02R-A277-07     0.1349282
## TCGA-A6-2674-01A-02R-A278-07     0.1842922


This plot is similar to colour maps (Hoffmann et al., 2007). Item-consensus (IC) is the average consensus value between an item and members of a consensus cluster, so that there are multiple IC values for an item at a k corresponding to the k clusters. IC plots display items as vertical bars of coloured rectangles whose height corresponds to IC values.
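The definition can be written out directly. The sketch below computes IC for every item against every cluster from a made-up 5 x 5 consensus matrix; `item_consensus` is a hypothetical helper, and excluding an item from the average over its own cluster is an assumption of this sketch.

```r
# Made-up consensus matrix and class assignment for five samples.
ids <- paste0("s", 1:5)
cm <- matrix(c(1.0, 0.9, 0.8, 0.1, 0.2,
               0.9, 1.0, 0.7, 0.2, 0.1,
               0.8, 0.7, 1.0, 0.3, 0.4,
               0.1, 0.2, 0.3, 1.0, 0.9,
               0.2, 0.1, 0.4, 0.9, 1.0),
             nrow = 5, byrow = TRUE, dimnames = list(ids, ids))
cls <- c(s1 = 1, s2 = 1, s3 = 1, s4 = 2, s5 = 2)

# IC of item i for cluster ci: mean consensus between i and the members of ci
# (the item itself is left out when it belongs to ci).
item_consensus <- function(cm, cls) {
  clusters <- sort(unique(cls))
  ic <- sapply(clusters, function(ci) {
    sapply(names(cls), function(i) {
      members <- setdiff(names(cls)[cls == ci], i)
      mean(cm[i, members])
    })
  })
  colnames(ic) <- paste0("cluster", clusters)
  ic
}

ic <- item_consensus(cm, cls)
round(ic, 2)
```

Each row holds one IC value per cluster, matching the description of multiple IC values per item at a given k; an item with a high value for a single cluster is a confident member of it.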



Cluster-consensus (CLC) is the average pairwise IC of items in a consensus cluster. The CLC plot displays these values as a bar plot, grouped at each k.
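A minimal sketch of this average, again on made-up values. `cluster_consensus` is a hypothetical helper that takes the mean of the pairwise consensus values among the members of each cluster; it is meant to illustrate the definition, not to reproduce `calcICL` exactly.

```r
# Made-up consensus matrix and class assignment; s6 forms a cluster on its own.
ids <- paste0("s", 1:6)
cm <- matrix(c(1.0, 0.9, 0.8, 0.1, 0.2, 0.0,
               0.9, 1.0, 0.7, 0.2, 0.1, 0.0,
               0.8, 0.7, 1.0, 0.3, 0.4, 0.1,
               0.1, 0.2, 0.3, 1.0, 0.9, 0.2,
               0.2, 0.1, 0.4, 0.9, 1.0, 0.1,
               0.0, 0.0, 0.1, 0.2, 0.1, 1.0),
             nrow = 6, byrow = TRUE, dimnames = list(ids, ids))
cls <- c(s1 = 1, s2 = 1, s3 = 1, s4 = 2, s5 = 2, s6 = 3)

# Mean consensus over all distinct pairs of items inside each cluster.
cluster_consensus <- function(cm, cls) {
  sapply(sort(unique(cls)), function(ci) {
    idx <- which(cls == ci)
    sub <- cm[idx, idx, drop = FALSE]
    mean(sub[upper.tri(sub)])  # NaN when the cluster has a single member
  })
}

clc <- cluster_consensus(cm, cls)
clc
```

A single-member cluster has no pairs to average, so this sketch returns NaN for it; that is one plausible reading of NaN entries in a clusterConsensus table.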



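RNA 1. Gene expression basics: based on GEO
RNA 2. Differentially expressed genes from GEO in SCI papers: limma
RNA 3. Differentially expressed genes from TCGA in SCI papers: DESeq
RNA 4. Differential expression from TCGA in SCI papers: edgeR
RNA 5. Differential gene expression in SCI papers: MA plot
RNA 6. Differential gene expression: volcano plot
RNA 7. Gene expression in SCI papers: principal component analysis (PCA)
RNA 8. Differential gene expression in SCI papers: heatmap
RNA 9. Gene expression in SCI papers: GO annotation
RNA 10. Gene expression enrichment in SCI papers: KEGG
RNA 11. Gene expression enrichment in SCI papers: GSE
RNA 12. Tumor immune infiltration estimation in SCI papers: CIBERSORT
RNA 13. Differentially expressed genes in SCI papers: WGCNA
RNA 14. Differentially expressed genes in SCI papers: protein-protein interaction network (PPI)
RNA 15. Fusion genes in SCI papers: FusionGDB2
RNA 16. Fusion genes in SCI papers: visualization
RNA 17. Screening hub genes in SCI papers
RNA 18. Gene set variation analysis (GSVA) in SCI papers
RNA 19. Unsupervised clustering in SCI papers (ConsensusClusterPlus)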



References:

  1. Wilkerson MD, Hayes DN. ConsensusClusterPlus: a class discovery tool with confidence assessments and item tracking. Bioinformatics. 2010;26(12):1572-1573.

  2. Monti, S., Tamayo, P., Mesirov, J., Golub, T. (2003) Consensus Clustering: A Resampling-Based Method for Class Discovery and Visualization of Gene Expression Microarray Data. Machine Learning, 52, 91-11

Original: https://blog.csdn.net/weixin_41368414/article/details/124429249
Author: 桓峰基因
Title: RNA 19. Unsupervised Clustering in SCI Papers (ConsensusClusterPlus)




