[Unlike ViT, which can only generate a single-resolution feature map, the goal of this module is, given an input image, to generate CNN-like multi-level features. These features provide high-resolution coarse features and low-resolution fine-grained features that usually boost the performance of semantic segmentation. More precisely, given an input image with a resolution of $H \times W \times 3$, we perform patch merging to obtain a hierarchical feature map $F_i$ with a resolution of $\frac{H}{2^{i+1}} \times \frac{W}{2^{i+1}} \times C_i$, where $i \in \{1, 2, 3, 4\}$ and $C_{i+1}$ is larger than $C_i$. Translate](https://wenku.csdn.net/answer/2831a68466844c818ceba59fdd834e56)
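
The resolution formula quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, not code from the quoted work: the function name `hierarchical_shapes`, the 512 × 512 input size, and the channel widths in `example_channels` are illustrative assumptions, chosen only so that each C_{i+1} is larger than C_i.

```python
def hierarchical_shapes(height, width, channels):
    """Return (H / 2^(i+1), W / 2^(i+1), C_i) for i = 1..len(channels)."""
    shapes = []
    for i, c in enumerate(channels, start=1):
        stride = 2 ** (i + 1)  # overall downsampling factor at stage i
        shapes.append((height // stride, width // stride, c))
    return shapes


# Assumed, illustrative channel widths; any increasing sequence fits the text.
example_channels = [32, 64, 160, 256]
shapes = hierarchical_shapes(512, 512, example_channels)

for i, shape in enumerate(shapes, start=1):
    print(f"F{i}: {shape}")
```

Under these assumptions the first stage keeps 1/4 of the input resolution and the last stage 1/32, which is the CNN-like pyramid the passage describes.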
Original: https://blog.csdn.net/qq_36618444/article/details/122819078
Author: 五月的echo
Title: Coarse2Fine: Fine-grained Text Classification on Coarsely-grained Annotated Data
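Original articles are protected by copyright. When reposting, please cite the source: https://www.johngo689.com/665855/
Reposted articles are protected by the original author's copyright. When reposting, please credit the original author and source.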