Close urllib2 connection

This article describes how to force urllib2 to close its connections promptly, which matters when a server allows only one connection per IP. The question and solution below walk through the problem and a working fix.

Problem description


I'm using urllib2 to load files from FTP and HTTP servers.

Some of the servers support only one connection per IP. The problem is that urllib2 does not close the connection immediately. Look at the example program below.

from urllib2 import urlopen
from time import sleep

url = 'ftp://user:pass@host/big_file.ext'

def load_file(url):
    f = urlopen(url)
    loaded = 0
    while True:
        data = f.read(1024)
        if data == '':
            break
        loaded += len(data)
    f.close()
    #sleep(1)
    print('loaded {0}'.format(loaded))

load_file(url)
load_file(url)
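
As an aside, a rough way to confirm that the descriptor really stays open after f.close() is to count the process's open file descriptors around the download. The sketch below is Linux-only (it assumes /proc/self/fd is available) and reuses the placeholder URL from the example above.

import os
from urllib2 import urlopen

url = 'ftp://user:pass@host/big_file.ext'  # placeholder URL from the question

def open_fd_count():
    # Count this process's open file descriptors (Linux-only; reads /proc).
    return len(os.listdir('/proc/self/fd'))

before = open_fd_count()
f = urlopen(url)
f.read(1024)
f.close()
# If the underlying FTP socket were really released by f.close(),
# the two counts should match.
print('fds before: {0}, fds after close: {1}'.format(before, open_fd_count()))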

The code loads two files (here the two files are the same) from an FTP server that supports only one connection. This prints the following log:

loaded 463675266
Traceback (most recent call last):
  File "conection_test.py", line 20, in <module>
    load_file(url)
  File "conection_test.py", line 7, in load_file
    f = urlopen(url)
  File "/usr/lib/python2.6/urllib2.py", line 126, in urlopen
    return _opener.open(url, data, timeout)
  File "/usr/lib/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.6/urllib2.py", line 1331, in ftp_open
    fw = self.connect_ftp(user, passwd, host, port, dirs, req.timeout)
  File "/usr/lib/python2.6/urllib2.py", line 1352, in connect_ftp
    fw = ftpwrapper(user, passwd, host, port, dirs, timeout)
  File "/usr/lib/python2.6/urllib.py", line 854, in __init__
    self.init()
  File "/usr/lib/python2.6/urllib.py", line 860, in init
    self.ftp.connect(self.host, self.port, self.timeout)
  File "/usr/lib/python2.6/ftplib.py", line 134, in connect
    self.welcome = self.getresp()
  File "/usr/lib/python2.6/ftplib.py", line 216, in getresp
    raise error_temp, resp
urllib2.URLError: <urlopen error ftp error: 421 There are too many connections from your internet address.>

So the first file is loaded and the second fails because the first connection was not closed.

But when I use sleep(1) after f.close(), the error does not occur:

loaded 463675266
loaded 463675266

Is there any way to force the connection to close so that the second download does not fail?

Solution

The cause is indeed a file descriptor leak. We also found that with Jython the problem is much more obvious than with CPython. A colleague proposed this solution:

 

    req = urllib2.Request(url, header)
    try:
        fdurl = urllib2.urlopen(req, timeout=self.timeout)
        realsock = fdurl.fp._sock.fp._sock  # we want to close the "real" socket later
    except urllib2.URLError, e:
        print "urlopen exception", e
    realsock.close()
    fdurl.close()

The fix is ugly, but it does the job: no more "too many open connections" errors.
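
For readability, the same workaround can be wrapped in a small helper. This is only a sketch: the attribute chain fdurl.fp._sock.fp._sock is an undocumented implementation detail of urllib2 on Python 2.x (as used in the snippet above), and the timeout value is a placeholder.

import urllib2

def fetch_and_force_close(url, timeout=30):
    """Read a URL fully, then force-close the underlying socket.

    Relies on private attributes of urllib2's response object (Python 2.x),
    so treat this as a best-effort workaround, not a supported API.
    """
    fdurl = urllib2.urlopen(url, timeout=timeout)
    # Keep a reference to the "real" socket so it can be closed explicitly;
    # fdurl.close() alone may leave the connection open for a while.
    realsock = fdurl.fp._sock.fp._sock
    try:
        return fdurl.read()
    finally:
        realsock.close()
        fdurl.close()

Closing the real socket in a finally block ensures the connection is released even if the read fails partway through.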

This concludes the article on closing urllib2 connections; we hope the answer above is helpful.

