Question or problem about Python programming:
I’m trying to read a BMP file in Python. I know the first two bytes
indicate the BMP signature. The next 4 bytes are the file size. When I execute:
fin = open("hi.bmp", "rb")
firm = fin.read(2)
file_size = int(fin.read(4))
I get:
What I want to do is read those four bytes as an integer, but it seems Python is reading them as characters and returning a string, which cannot be converted to an integer. How can I do this correctly?
How to solve the problem:
Solution 1:
The read method returns the raw bytes as a byte string (str in Python 2, bytes in Python 3). To convert that byte sequence into a number, use the built-in struct module: http://docs.python.org/library/struct.html.
import struct
print(struct.unpack('i', fin.read(4)))
Note that unpack always returns a tuple, so struct.unpack('i', fin.read(4))[0] gives the integer value that you are after.
You should probably use the format string '<i' (< is a modifier that indicates little-endian byte order with standard size and no alignment padding; the default is to use the platform’s native byte order, size and alignment). According to the BMP format spec, the values are stored in Intel/little-endian byte order.
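As a minimal sketch of the approach above, the snippet below builds a fake 6-byte header in memory so that it runs without a real file. The io.BytesIO stand-in and the size value 1234 are assumptions made for illustration. It uses '<I', the unsigned counterpart of '<i', since the BMP size field is an unsigned 32-bit value:

```python
import io
import struct

# Hypothetical stand-in for open("hi.bmp", "rb"): the signature "BM" followed
# by a 4-byte little-endian file size (1234 is an arbitrary example value).
fin = io.BytesIO(b"BM" + struct.pack("<I", 1234))

firm = fin.read(2)                               # the two signature bytes
(file_size,) = struct.unpack("<I", fin.read(4))  # unpack returns a 1-tuple

print(firm, file_size)
```

With a real file, replacing the BytesIO object with open("hi.bmp", "rb") should leave the rest unchanged.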
Solution 2:
An alternative method which does not make use of ‘struct.unpack()’ would be to use NumPy:
import numpy as np
f = open("file.bin", "rb")  # binary mode
a = np.fromfile(f, dtype=np.uint32)
‘dtype’ represents the datatype and can be int#, uint#, float#, complex# or a user-defined type. See numpy.fromfile.
Personally, I prefer using NumPy to work with array/matrix data, as it is a lot faster than using Python lists.
Solution 3:
As of Python 3.2, you can also accomplish this using the native int method from_bytes:
file_size = int.from_bytes(fin.read(4), byteorder='little')
Note that this function requires you to specify whether the number is encoded in big- or little-endian format, so you will have to determine the endianness to make sure it works correctly. BMP files store the size field as 4 little-endian bytes, hence byteorder='little' here.
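To illustrate why the byte order matters, here is a small sketch that decodes the same four bytes both ways. The byte values are made up for the example and do not come from a real file:

```python
# Example bytes, standing in for what fin.read(4) might return.
raw = bytes([0x36, 0x00, 0x0C, 0x00])

little = int.from_bytes(raw, byteorder="little")  # first byte is least significant
big = int.from_bytes(raw, byteorder="big")        # first byte is most significant

print(hex(little), hex(big))
```

The two results differ, so picking the wrong byte order silently yields a wrong file size rather than an error.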
Solution 4:
Besides struct, you can also use the array module:
import array
values = array.array('i')  # array of C ints (typically 4 bytes each)
values.fromfile(fin, 1)  # read 1 integer
file_size = values[0]
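One caveat with array is that it uses the platform's native item size and byte order. The sketch below works on in-memory example bytes (made up for illustration) via frombytes, assumes the 'I' type code is 4 bytes wide, and swaps bytes on big-endian machines so that the little-endian data is decoded correctly:

```python
import array
import sys

# Example little-endian bytes, standing in for what fin.read(4) might return.
raw = b"\x36\x00\x0c\x00"

values = array.array("I")    # unsigned C int; size is platform-dependent
assert values.itemsize == 4  # this sketch assumes a 4-byte item size
values.frombytes(raw)
if sys.byteorder == "big":
    values.byteswap()        # array is native-endian; the data is little-endian

file_size = values[0]
print(file_size)
```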
Solution 5:
Since you are reading a binary file, you need to unpack the bytes into an integer, so use the struct module for that:
import struct
fin = open("hi.bmp", "rb")
firm = fin.read(2)
file_size, = struct.unpack("i", fin.read(4))
Solution 6:
When you read from a binary file in Python 3, a data type called bytes is used. This is a bit like a list or tuple, except it can only store integers from 0 to 255.
Try:
file_size = fin.read(4)
file_size0 = file_size[0]
file_size1 = file_size[1]
file_size2 = file_size[2]
file_size3 = file_size[3]
Or:
file_size = list(fin.read(4))
Instead of:
file_size = int(fin.read(4))
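The individual byte values still have to be combined into a single number. A minimal sketch of that step, assuming Python 3 (where indexing bytes yields integers) and little-endian data as in BMP, with made-up example bytes:

```python
# Example bytes, standing in for what fin.read(4) might return.
raw = bytes([0x36, 0x00, 0x0C, 0x00])

# Little-endian: the first byte is the least significant, so each
# following byte is shifted 8 more bits to the left.
file_size = raw[0] | (raw[1] << 8) | (raw[2] << 16) | (raw[3] << 24)

print(file_size)
```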