Issue
I want to get a byte array from a big file, but when I use bytearray(), I briefly use double the RAM, which is a problem when I don't have much RAM.
Here is an example that illustrates the issue. So my question is: how do I read a file directly into a bytearray?
0.5 GB buffers:
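from io import BytesIO

mb = 1024 * 1024
gb = 1024 * mb
size = 512 * mb

file = BytesIO(b"\0" * size)  # in-memory stand-in for a big file
memory = 0                    # running tally of the buffers kept alive
files = []

while True:  # runs until the process is killed for lack of RAM
    file.seek(0)
    data = file.read()
    memory += size
    print("RAM usage: %4.1f GB" % (memory / gb))
    data = bytearray(data)  # copies: bytes and bytearray coexist briefly
    print("RAM usage*: %4.1f GB" % (memory / gb))
    files.append(data)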
Output:
RAM usage: 0.5 GB
RAM usage*: 0.5 GB
RAM usage: 1.0 GB
RAM usage*: 1.0 GB
RAM usage: 1.5 GB
Killed
[Program finished]
1 GB buffers:
...
size = 1 * gb
...
Output:
RAM usage: 1.0 GB
Killed
[Program finished]
Solution
You can open the file in binary mode and call the readinto method of its underlying raw stream (file.raw) to read the contents directly into a pre-allocated bytearray, so no intermediate bytes object is created.
For example:
import os

buffer = bytearray(os.path.getsize(__file__))
with open(__file__, 'rb') as file:
	file.raw.readinto(buffer)

print(buffer)
outputs:
bytearray(b"import os\n\nbuffer = bytearray(os.path.getsize(__file__))\nwith open(__file__, \'rb\') as file:\n\tfile.raw.readinto(buffer)\n\nprint(buffer)")
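One caveat worth knowing: readinto on a raw stream reads at most len(buffer) bytes and returns how many it actually read, so a single call is not guaranteed to fill a very large buffer. Below is a minimal sketch of a loop that keeps reading until the buffer is full. The helper name read_into_bytearray is hypothetical (it is not part of the original answer), and the sketch assumes the file size does not change while it is being read.

```python
import os


def read_into_bytearray(path):
    """Hypothetical helper: read a whole file into one pre-allocated bytearray."""
    size = os.path.getsize(path)
    buffer = bytearray(size)
    filled = 0
    # buffering=0 makes open() return the raw, unbuffered stream directly.
    # Slicing a memoryview gives a window onto the bytearray without copying.
    with open(path, "rb", buffering=0) as raw, memoryview(buffer) as view:
        while filled < size:
            n = raw.readinto(view[filled:])
            if not n:  # 0 (or None): nothing more to read
                break
            filled += n
    # Trim the tail if the file turned out shorter than getsize() reported.
    del buffer[filled:]
    return buffer
```

Calling read_into_bytearray("big_file.bin") (the file name is only a placeholder) returns the bytearray. The memoryview slices matter here: slicing the bytearray itself, as in buffer[filled:], would create a copy and bring back the extra memory use.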
Demo: https://replit.com/@blhsing/ExternalAngryMenus
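To check the memory claim on a small scale, the two approaches can be compared with the standard tracemalloc module. This is only a sketch under assumptions: the names peak_bytes, via_copy and via_readinto are made up for illustration, the 8 MiB test file is arbitrary, and tracemalloc reports memory allocated through Python's allocators rather than the process's total RAM as seen by the operating system.

```python
import os
import tempfile
import tracemalloc

SIZE = 8 * 1024 * 1024  # 8 MiB test file; small enough to run anywhere


def peak_bytes(func, path):
    """Return the peak traced allocation (in bytes) while func(path) runs."""
    tracemalloc.start()
    try:
        result = func(path)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    del result
    return peak


def via_copy(path):
    # read() builds a bytes object, then bytearray() copies it.
    with open(path, "rb") as f:
        return bytearray(f.read())


def via_readinto(path):
    # Only the bytearray itself is allocated; the file is read straight into it.
    buffer = bytearray(os.path.getsize(path))
    with open(path, "rb", buffering=0) as f:
        f.readinto(buffer)
    return buffer


with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\0" * SIZE)
    path = tmp.name

try:
    peak_copy = peak_bytes(via_copy, path)
    peak_direct = peak_bytes(via_readinto, path)
finally:
    os.remove(path)

print("bytearray(f.read()): %.1f MiB peak" % (peak_copy / 2**20))
print("readinto:            %.1f MiB peak" % (peak_direct / 2**20))
```

The first figure should come out at roughly twice the file size and the second at roughly the file size, which matches the behaviour described in the question.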
Answered By - blhsing