Delete / Insert Data in mmap'ed File

Question

I am working on a script in Python that maps a file for processing using mmap().

The task requires me to change the file's contents by

  1. replacing data
  2. adding data into the file at an offset
  3. removing data from within the file (actually removing it, not just blanking it out)

Replacing data works great as long as the old data and the new data have the same number of bytes:

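import mmap

f = open("data", "r+b")  # binary mode: mmap works on bytes
VDATA = mmap.mmap(f.fileno(), 0)
start = 10
end = 20
VDATA[start:end] = b"0123456789"  # 10 bytes into a 10-byte range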

However, when I try to remove data (replacing the range with "") or insert data (replacing the range with contents longer than the range), I receive the error message:

IndexError: mmap slice assignment is wrong size

This makes sense.

The question now is, how can I insert and delete data from the mmap'ed file? From reading the documentation, it seems I can move the file's entire contents back and forth using a chain of low-level actions but I'd rather avoid this if there is an easier solution.
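
For reference, the size restriction behind that error can be reproduced in isolation. The following is a minimal sketch, assuming Python 3 (where mmap contents are bytes) and a temporary scratch file standing in for the real data file; the variable names are illustrative only.

```python
import mmap
import os
import tempfile

# Hypothetical scratch file standing in for the question's data file.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "r+b") as f:
    f.write(b"0000111122223333444455556666777788889999")
    f.flush()
    mm = mmap.mmap(f.fileno(), 0)
    try:
        mm[10:20] = b"0123456789"  # 10 bytes into a 10-byte range: accepted
        same_size_ok = True
        try:
            mm[10:20] = b""        # 0 bytes into a 10-byte range: rejected
            error = None
        except IndexError as exc:
            error = str(exc)
    finally:
        mm.close()
os.remove(path)

print(error)
```

A mapping has a fixed length, so slice assignment can only overwrite bytes in place; it cannot grow or shrink the mapped region.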

Recommended answer

Lacking an alternative, I went ahead and wrote two helper functions, deleteFromMmap() and insertIntoMmap(), to handle the low-level file operations and ease development.

The mmap is closed and reopened instead of calling resize() because of a bug in Python on Unix derivatives that causes resize() to fail. (http://mail.python.org/pipermail/python-bugs-list/2003-May/017446.html)
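
The linked bug report dates from 2003, so on platforms where mmap.resize() is supported the close-and-reopen step may no longer be needed. The following is a minimal sketch, assuming Python 3 and a hypothetical helper named shrink_mapping (not part of the original answer): it tries resize() first and falls back to the close, truncate, and remap sequence if resize() raises.

```python
import mmap
import os
import tempfile

def shrink_mapping(f, mm, newsize):
    """Shrink the mapping and its file to newsize bytes.

    Returns the mapping to use afterwards (hypothetical helper).
    """
    try:
        mm.resize(newsize)  # resizes the map and the underlying file
        return mm
    except (SystemError, OSError):
        # resize() is not usable here: close, truncate, and remap instead.
        mm.flush()
        mm.close()
        f.truncate(newsize)
        return mmap.mmap(f.fileno(), 0)

# Usage on a temporary scratch file: delete bytes 4..8.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "r+b") as f:
    f.write(b"0000111122223333444455556666777788889999")
    f.flush()
    mm = mmap.mmap(f.fileno(), 0)
    size = len(mm)
    mm.move(4, 8, size - 8)               # shift the tail left over the gap
    mm = shrink_mapping(f, mm, size - 4)  # drop the now-unused last 4 bytes
    result = mm[:]
    mm.close()
final_size = os.path.getsize(path)
os.remove(path)
```

Because the fallback creates a new mmap object, callers should always use the returned mapping rather than the one they passed in.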

The functions are included in a complete example. The use of a global is due to the format of the main project but you can easily adapt it to match your coding standards.

import mmap

# The file "data" contains "0000111122223333444455556666777788889999"

f = open("data", "r+b")  # binary mode: mmap works on bytes
VDATA = mmap.mmap(f.fileno(), 0)

def deleteFromMmap(start, end):
    """Remove the bytes in [start, end) and shrink the file."""
    global VDATA
    length = end - start
    size = len(VDATA)
    newsize = size - length

    # Shift everything after the deleted range left, over the gap.
    VDATA.move(start, end, size - end)
    VDATA.flush()
    VDATA.close()
    f.truncate(newsize)
    VDATA = mmap.mmap(f.fileno(), 0)

def insertIntoMmap(offset, data):
    """Insert data (bytes) at offset, growing the file."""
    global VDATA
    length = len(data)
    size = len(VDATA)

    VDATA.flush()
    VDATA.close()
    # Grow the file by appending placeholder bytes, then remap it.
    f.seek(size)
    f.write(b"A" * length)
    f.flush()
    VDATA = mmap.mmap(f.fileno(), 0)

    # Shift the tail right to open a gap, then write the new data into it.
    VDATA.move(offset + length, offset, size - offset)
    VDATA.seek(offset)
    VDATA.write(data)
    VDATA.flush()

deleteFromMmap(4, 8)

# -> 000022223333444455556666777788889999

insertIntoMmap(4, b"AAAA")

# -> 0000AAAA22223333444455556666777788889999
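
As one way to adapt the helpers to code that avoids globals, the same steps can be written as functions that take the file and mapping as arguments and return the new mapping. This is a minimal sketch, assuming Python 3; the names delete_range and insert_bytes are hypothetical, and the file is grown with truncate() rather than by writing placeholder bytes.

```python
import mmap
import os
import tempfile

def delete_range(f, mm, start, end):
    """Remove bytes [start, end) from the file; return a fresh mapping.

    Note: mapping an empty file fails, so do not delete everything.
    """
    size = len(mm)
    mm.move(start, end, size - end)  # shift the tail left over the gap
    mm.flush()
    mm.close()
    f.truncate(size - (end - start))
    return mmap.mmap(f.fileno(), 0)

def insert_bytes(f, mm, offset, data):
    """Insert data (bytes) at offset; return a fresh mapping."""
    size = len(mm)
    mm.flush()
    mm.close()
    f.truncate(size + len(data))     # grow the file
    mm = mmap.mmap(f.fileno(), 0)
    # Shift the tail right to open a gap, then fill the gap.
    mm.move(offset + len(data), offset, size - offset)
    mm[offset:offset + len(data)] = data
    mm.flush()
    return mm

# Usage on a temporary scratch file, mirroring the example above.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "r+b") as f:
    f.write(b"0000111122223333444455556666777788889999")
    f.flush()
    mm = mmap.mmap(f.fileno(), 0)
    mm = delete_range(f, mm, 4, 8)
    after_delete = mm[:]
    mm = insert_bytes(f, mm, 4, b"AAAA")
    after_insert = mm[:]
    mm.close()
os.remove(path)
```

Returning the mapping makes the rebinding explicit at the call site, at the cost of the caller having to remember to reassign it after every call.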
