Installing Stanza (Fixing the ConnectionError When Stanza Cannot Download Language Models)

Installing Stanza

(Debug notes) Fixing the ConnectionError when Stanza cannot download language models

Start with the initial Stanza installation described in the official documentation:

pip install stanza
>>> import stanza
>>> stanza.download('en')
>>> nlp = stanza.Pipeline('en')

Running stanza.download('en') fails with:

ConnectionError: HTTPSConnectionPool(host='raw.githubusercontent.com', port=443): Max retries exceeded with url: /stanfordnlp/stanza-resources/main/resources_1.3.0.json (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 11004] getaddrinfo failed'))

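The error is caused by a network problem: "getaddrinfo failed" means the host name raw.githubusercontent.com could not be resolved. If you are able to reach GitHub through a proxy or VPN, that should be enough to fix it.
My network environment does not allow that, so I downloaded the files offline and placed them in the expected folders.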
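
Before going offline, it can be worth confirming that name resolution really is the problem. The following is a minimal sketch of such a check. `can_resolve` is a hypothetical helper (not part of Stanza), and its `resolver` parameter is an assumption added only so the function can be exercised without a network.

```python
import socket


def can_resolve(host, port=443, resolver=socket.getaddrinfo):
    """Return True if `host` resolves to at least one address.

    `resolver` defaults to socket.getaddrinfo; it is injectable only
    so the check can be tested without real network access.
    """
    try:
        return len(resolver(host, port)) > 0
    except socket.gaierror:
        # Raised when the lookup fails, which is the "getaddrinfo failed"
        # condition reported in the ConnectionError above.
        return False
```

In an environment like the one described here, `can_resolve("raw.githubusercontent.com")` would be expected to return False, which suggests that the offline route is the practical option.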

The resources file can be downloaded from GitHub; the repository is https://github.com/stanfordnlp/stanza-resources
I downloaded resources_1.3.0.json, renamed it to resources.json, and placed it in ~\stanza_resources\. If this file is missing, you may see an error like the following:

ResourcesFileNotFoundError: Resources file not found at: C:\Users\gz927\stanza_resources\resources.json  Try to download the model again.
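
The rename-and-place step can also be scripted. This is a minimal sketch, assuming the versioned file has already been downloaded by hand; `install_resources_file` is a hypothetical helper name, and the default target directory simply mirrors the ~\stanza_resources\ location used in this post.

```python
import shutil
from pathlib import Path


def install_resources_file(downloaded, resources_dir=None):
    """Copy a downloaded resources_x.y.z.json to <resources_dir>/resources.json."""
    if resources_dir is None:
        # Assumed default: the stanza_resources folder in the home directory.
        resources_dir = Path.home() / "stanza_resources"
    resources_dir = Path(resources_dir)
    resources_dir.mkdir(parents=True, exist_ok=True)
    target = resources_dir / "resources.json"
    shutil.copyfile(downloaded, target)  # overwrites any existing resources.json
    return target
```

For example, `install_resources_file("resources_1.3.0.json")` would copy a file of that name from the current directory into the default location.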

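Besides resources.json, you also need the language pack itself. The language packs are fairly large, so here I download them from Hugging Face: https://huggingface.co/stanfordnlp/stanza-en. After downloading, unzip the pack and place it in ~\stanza_resources\en\.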
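
The unzip step can be sketched as follows. `extract_language_pack` is a hypothetical helper, and the archive name and its internal layout are assumptions: the sketch only unpacks whatever zip file it is given and lists the `.pt` model files it finds, so the result can be compared against the paths Stanza asks for.

```python
import zipfile
from pathlib import Path


def extract_language_pack(archive, lang_dir):
    """Unzip `archive` into `lang_dir` and return the extracted .pt files.

    Paths are returned relative to `lang_dir`, using forward slashes.
    """
    lang_dir = Path(lang_dir)
    lang_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(lang_dir)
    return sorted(p.relative_to(lang_dir).as_posix() for p in lang_dir.rglob("*.pt"))
```

If the returned list contains entries such as `tokenize/combined.pt`, the files sit where the FileNotFoundError quoted later in this post expects them.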

============================

If, after installing resources.json and the language pack, Stanza reports that a specific model is missing, with an error like this:

FileNotFoundError: Could not find model file C:\Users\gz927\stanza_resources\en\tokenize\combined.pt, although there are other models downloaded for language en.  Perhaps you need to download a specific model.  Try: stanza.download(lang="en",package=None,processors={"tokenize":"combined"})

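then the resources.json you downloaded and the language models come from different versions. A blunt fix for the version mismatch is to download the latest version of both. The language models on Hugging Face are the latest, so that is what I did: I replaced resources.json with the latest version, 1.3.1. Once the download is complete, test it: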

>>> nlp = stanza.Pipeline('en')
2021-12-11 08:06:27 INFO: Loading these models for language: en (English):
============================
| Processor    | Package   |
| tokenize     | combined  |
| pos          | combined  |
| lemma        | combined  |
| depparse     | combined  |
| sentiment    | sstplus   |
| constituency | wsj       |
| ner          | ontonotes |
============================

2021-12-11 08:26:49 INFO: Use device: cpu
2021-12-11 08:26:49 INFO: Loading: tokenize
2021-12-11 08:26:49 INFO: Loading: pos
2021-12-11 08:26:49 INFO: Loading: lemma
2021-12-11 08:26:49 INFO: Loading: depparse
2021-12-11 08:26:49 INFO: Loading: sentiment
2021-12-11 08:26:49 INFO: Loading: constituency
2021-12-11 08:26:50 INFO: Loading: ner
2021-12-11 08:26:50 INFO: Done loading processors!
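
If the pipeline does not load and instead raises the FileNotFoundError shown earlier, listing every missing model file at once can save a few rounds of trial and error. The sketch below is a hypothetical helper, not a Stanza API. It assumes each processor's model lives at `<processor>/<package>.pt` under the language folder, which is extrapolated from the single `en\tokenize\combined.pt` path in the error message and may not hold for every processor.

```python
from pathlib import Path

# Processor -> package pairs, copied from the loading table above.
EXPECTED = {
    "tokenize": "combined",
    "pos": "combined",
    "lemma": "combined",
    "depparse": "combined",
    "sentiment": "sstplus",
    "constituency": "wsj",
    "ner": "ontonotes",
}


def missing_models(lang_dir, expected=EXPECTED):
    """Return the '<processor>/<package>.pt' entries not found under lang_dir."""
    lang_dir = Path(lang_dir)
    return [
        f"{proc}/{pkg}.pt"
        for proc, pkg in expected.items()
        if not (lang_dir / proc / f"{pkg}.pt").is_file()
    ]
```

Calling `missing_models(Path.home() / "stanza_resources" / "en")` returns an empty list when every file in the assumed layout is present.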

Original: https://blog.csdn.net/gz927cool/article/details/121868829
Author: gz927cool
Title: Installing Stanza (Fixing the ConnectionError When Stanza Cannot Download Language Models)
