How to identify binary and text files using Python?

Question

I need to identify which files in a directory are binary and which are text.

I tried using mimetypes, but it isn't a good fit in my case because it can't identify the MIME type of every file, and I have some strange ones here. I just need to know one thing: binary or text. Simple? But I couldn't find a solution.

Thanks

Recommended answer

Thanks everybody, I found a solution that suited my problem. I found this code at http://code.activestate.com/recipes/173220/ and changed just a small piece to suit me.

It works fine.

# Python 2 code: string.maketrans and the two-argument str.translate
# used below are Python 2 APIs.
from __future__ import division
import string

def istext(filename):
    # Read only the first 512 bytes, in binary mode so the data is not altered
    with open(filename, "rb") as f:
        s = f.read(512)
    # Printable ASCII plus newline, carriage return, tab and backspace
    text_characters = "".join(map(chr, range(32, 127)) + list("\n\r\t\b"))
    _null_trans = string.maketrans("", "")
    if not s:
        # Empty files are considered text
        return True
    if "\0" in s:
        # Files with null bytes are likely binary
        return False
    # Get the non-text characters: map every character to itself, then
    # use the 'deletechars' argument to strip out the text characters.
    t = s.translate(_null_trans, text_characters)
    # If more than 30% non-text characters, then
    # this is considered a binary file
    if len(t) / len(s) > 0.30:
        return False
    return True
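
The recipe targets Python 2: string.maketrans and the two-argument form of str.translate do not exist in Python 3. Below is a minimal sketch of the same heuristic for Python 3, extended to sort every file in a directory into text or binary, as the question asks. The names is_text_file and classify_directory are hypothetical, not part of the recipe. The 512-byte sample and the 30% threshold are carried over from it as defaults.

```python
import os

# Printable ASCII plus newline, carriage return, tab and backspace:
# the same byte set the recipe treats as "text".
TEXT_BYTES = bytes(range(32, 127)) + b"\n\r\t\b"


def is_text_file(path, sample_size=512, threshold=0.30):
    """Heuristically decide whether the file at `path` looks like text."""
    # Binary mode: we inspect raw bytes and never try to decode them.
    with open(path, "rb") as f:
        sample = f.read(sample_size)
    if not sample:
        # Empty files are considered text.
        return True
    if b"\0" in sample:
        # A null byte is a strong hint that the file is binary.
        return False
    # bytes.translate(None, delete) removes every byte listed in `delete`,
    # so only the non-text bytes remain.
    non_text = sample.translate(None, TEXT_BYTES)
    return len(non_text) / len(sample) <= threshold


def classify_directory(directory):
    """Return (text_names, binary_names) for regular files directly in `directory`."""
    text_names, binary_names = [], []
    with os.scandir(directory) as entries:
        for entry in sorted(entries, key=lambda e: e.name):
            if not entry.is_file():
                continue  # skip subdirectories and other non-regular entries
            if is_text_file(entry.path):
                text_names.append(entry.name)
            else:
                binary_names.append(entry.name)
    return text_names, binary_names
```

Calling classify_directory(".") returns two lists of file names for the current directory. Because only printable ASCII counts as text here, a UTF-8 file made up mostly of non-ASCII characters (Chinese text, for example) would be reported as binary, so the threshold or the byte set may need adjusting for such data.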
