Geographic Information Systems Asked on August 16, 2021
I realise a very similar question was asked as "Rasterio and OpenCV shows two different pixel arrays for same image". However, my situation is slightly different and I only wish to know why this difference occurs.
I converted a GeoTIFF (uint16) of mine to JPEG (uint8) via gdal_translate as follows:
gdal_translate -of JPEG -scale ./rgb.tif rgb.jpg
To confirm whether the image was read correctly, I ran the following script (named test.py):
import cv2
import rasterio as rio
from skimage import io
import numpy as np
def check(path):
    # All images should be in RGB mode
    im1 = rio.open(path)
    im1 = im1.read()
    im1 = im1.transpose(1, 2, 0)  # CHW to HWC
    im2 = cv2.imread(path)
    im2 = im2[:, :, ::-1]  # BGR to RGB
    im3 = io.imread(path)
    print(f"Rasterio and OpenCV = {np.all(im1[:,:,0] == im2[:,:,0])}")
    print(f"Rasterio and Skimage = {np.all(im1[:,:,0] == im3[:,:,0])}")

if __name__ == "__main__":
    check("rgb.jpg")
which returns
Rasterio and OpenCV = False
Rasterio and Skimage = True
Apparently OpenCV reads the image differently from rasterio and skimage. I can also confirm rgb.jpg is of type uint8 and is read as such by the three libraries.
So does anyone have an idea as to why this happens? Is this expected behaviour?
OS: Ubuntu 16.04
pip = 21.0
python = 3.7.3
conda create -n s1 -c conda-forge rasterio scikit-image python=3.7
conda activate s1
pip install opencv-python
Running test.py with s1 'fails' as rasterio and cv2 read the image differently.
Adding the output of gdalinfo rgb.jpg
Driver: JPEG/JPEG JFIF
Files: rgb.jpg
rgb.jpg.aux.xml
Size is 1000, 1000
Coordinate System is:
PROJCS["unknown",
GEOGCS["WGS 84",
DATUM["WGS_1984",
SPHEROID["WGS 84",6378137,298.257223563,
AUTHORITY["EPSG","7030"]],
AUTHORITY["EPSG","6326"]],
PRIMEM["Greenwich",0],
UNIT["degree",0.0174532925199433,
AUTHORITY["EPSG","9122"]],
AUTHORITY["EPSG","4326"]],
PROJECTION["Lambert_Azimuthal_Equal_Area"],
PARAMETER["latitude_of_center",-90],
PARAMETER["longitude_of_center",0],
PARAMETER["false_easting",0],
PARAMETER["false_northing",0],
UNIT["metre",1],
AXIS["Easting",NORTH],
AXIS["Northing",NORTH]]
Origin = (-2383139.439709701109678,1465870.867263491498306)
Pixel Size = (25.610955512797485,-25.610955512797485)
Metadata:
AREA_OR_POINT=Area
Image Structure Metadata:
COMPRESSION=JPEG
INTERLEAVE=PIXEL
SOURCE_COLOR_SPACE=YCbCr
Corner Coordinates:
Upper Left (-2383139.440, 1465870.867) ( 58d24'15.47"W, 64d43'49.98"S)
Lower Left (-2383139.440, 1440259.912) ( 58d51'11.39"W, 64d51'11.12"S)
Upper Right (-2357528.484, 1465870.867) ( 58d 7'38.49"W, 64d55'50.63"S)
Lower Right (-2357528.484, 1440259.912) ( 58d34'42.36"W, 65d 3'15.06"S)
Center (-2370333.962, 1453065.390) ( 58d29'26.93"W, 64d53'32.72"S)
Band 1 Block=1000x1 Type=Byte, ColorInterp=Red
Overviews: 500x500, 250x250
Image Structure Metadata:
COMPRESSION=JPEG
Band 2 Block=1000x1 Type=Byte, ColorInterp=Green
Overviews: 500x500, 250x250
Image Structure Metadata:
COMPRESSION=JPEG
Band 3 Block=1000x1 Type=Byte, ColorInterp=Blue
Overviews: 500x500, 250x250
Image Structure Metadata:
COMPRESSION=JPEG
OpenCV installed via pip uses libjpeg-turbo, whereas OpenCV from conda uses libjpeg, i.e. two different libraries, hence two different results.
The problem was due to how OpenCV was installed (conda vs pip): the two builds use different libraries for reading JPEG files. The conda version uses libjpeg while the pip version uses libjpeg-turbo. According to this issue, it seems rasterio has no plans to support libjpeg-turbo.
This can be verified by running the following:
python -c "import cv2; print(cv2.getBuildInformation())" | grep jpeg
If opencv was installed via pip
3rdparty dependencies: ittnotify libprotobuf libjpeg-turbo libwebp libtiff libopenjp2 IlmImf quirc ippiw ippicv
JPEG: libjpeg-turbo (ver 2.0.6-62)
If opencv was installed via conda,
JPEG: /home/ash/anaconda3/envs/s2/lib/libjpeg.so (ver 90)
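For a check that does not depend on grep (for instance on Windows), the same line can be pulled out in Python. This is only a sketch: jpeg_backend is a hypothetical helper name, and it assumes the build report contains a line starting with "JPEG:" as in the outputs above. The sample strings are shortened copies of those outputs.

```python
def jpeg_backend(build_info: str) -> str:
    """Return the text after 'JPEG:' in an OpenCV build report, or '' if absent."""
    for line in build_info.splitlines():
        stripped = line.strip()
        # Match only the 'JPEG:' entry, not e.g. 'JPEG 2000:'.
        if stripped.startswith("JPEG:"):
            return stripped[len("JPEG:"):].strip()
    return ""


# Sample report lines copied from the outputs above (shortened).
pip_report = (
    "    3rdparty dependencies: ittnotify libprotobuf libjpeg-turbo libwebp\n"
    "    JPEG:                  libjpeg-turbo (ver 2.0.6-62)\n"
)
conda_report = "    JPEG:  /home/ash/anaconda3/envs/s2/lib/libjpeg.so (ver 90)\n"

print(jpeg_backend(pip_report))
print(jpeg_backend(conda_report))
```

With OpenCV installed, the real report can be passed in as jpeg_backend(cv2.getBuildInformation()).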
A GitHub issue was raised here regarding this very matter. Simply put, due to the lossy nature of JPEG and the optimizations done in libjpeg-turbo, there can be no guarantee that the libraries will produce the same results.
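Since exact equality cannot be expected across decoders, it may be more informative to measure how far apart two decodes are than to test np.all(a == b). The following is a minimal sketch, not part of the original test script: compare_decoded and its tol parameter are hypothetical names, and the arrays are synthetic stand-ins for two reads of the same JPEG.

```python
import numpy as np


def compare_decoded(a: np.ndarray, b: np.ndarray, tol: int = 0) -> dict:
    """Summarise how two uint8 images of identical shape differ."""
    if a.shape != b.shape:
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    # Cast to a signed type first: uint8 subtraction would wrap around.
    diff = np.abs(a.astype(np.int16) - b.astype(np.int16))
    return {
        "identical": bool(np.all(diff == 0)),
        "within_tol": bool(np.all(diff <= tol)),
        "max_abs_diff": int(diff.max()),
        "fraction_differing": float(np.mean(diff > 0)),
    }


# Synthetic stand-ins for two decodes of the same JPEG.
a = np.full((2, 2, 3), 100, dtype=np.uint8)
b = a.copy()
b[0, 0, 0] = 103  # one channel value off by 3

result = compare_decoded(a, b, tol=3)
print(result)
```

In the test script, im1 and im2 could be passed in place of a and b; which tolerance is acceptable depends on the application.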
OpenCV officially moved from libjpeg to libjpeg-turbo post 3.3.0.10. So to verify whether this was indeed the issue, I simply needed to compare the two.
# For opencv-python<=3.3.0.10
>>python -c "import cv2; print(cv2.getBuildInformation())" | grep jpeg
3rdparty dependencies: ittnotify libprotobuf libjpeg libwebp libpng libtiff libjasper IlmImf
JPEG: libjpeg (ver 90)
This returns True for Rasterio vs OpenCV comparison
# For opencv-python>=3.3.1.11
>>python -c "import cv2; print(cv2.getBuildInformation())" | grep jpeg
3rdparty dependencies: ittnotify libprotobuf libjpeg-turbo libwebp libtiff libopenjp2 IlmImf quirc ippiw ippicv
JPEG: libjpeg-turbo (ver 2.0.6-62)
This returns False for Rasterio vs OpenCV comparison
Thanks to @user2856 and laurent.berger from the opencv thread for helping me narrow down the cause.
Correct answer by Ashwin Nair on August 16, 2021
Very interesting issue...
I also checked on my side (Python 3.6.9 on Ubuntu 18.04) and I can reproduce it:
rio.__version__: 1.1.8
cv2.__version__: 4.4.0
skimage.__version__: 0.17.2
Let me share my investigations.
First, two tiny practical differences:
First, I reshaped the rasterio image using its built-in helper, as explained here:
https://rasterio.readthedocs.io/en/latest/topics/image_processing.html#imageorder
from rasterio.plot import reshape_as_image
im1 = reshape_as_image(im1)
Second, I also used an OpenCV parameter to convert the image to RGB as stated here: https://docs.opencv.org/master/d8/d01/group__imgproc__color__conversions.html
im2 = cv2.cvtColor(im2, cv2.COLOR_BGR2RGB)
Please also note that using the cv2.IMREAD_UNCHANGED flag while reading the image with OpenCV doesn't change anything.
I finally double-checked the dtypes:
ims = [im1, im2, im3]
for i in range(len(ims)):
    print("shape: {} and dtype: {}".format(ims[i].shape, ims[i].dtype))
The output is consistent:
shape: (1000, 1000, 3) and dtype: uint8
shape: (1000, 1000, 3) and dtype: uint8
shape: (1000, 1000, 3) and dtype: uint8
From here, I dug into the image using a plotting function:
import matplotlib.pyplot as plt

def plotlist(lst, cmap=None):
    # Show each array of lst side by side, without interpolation
    figsize = (12, 12)
    fig, axs = plt.subplots(
        nrows=1,
        ncols=len(lst),
        figsize=figsize,
        dpi=100,
        sharex=True,
        sharey=True,
    )
    for i, ax in enumerate(axs):
        ax.imshow(lst[i], interpolation='none', cmap=cmap)
    plt.tight_layout()
    plt.show()
And used a super tiny subset of only 4 pixels (the four last on the right of the bottom row of the image):
subimgs = [im[-1:,-4:,:] for im in ims] # bottom right
plotlist(subimgs)
Which rendered:
(Image: rendering of the last 4 pixels of the last row.)
Pretty much the same right? Let's plot the differences:
deltas = []
deltas.append(np.subtract(subimgs[0],subimgs[1])) # rio - cv2
deltas.append(np.subtract(subimgs[0],subimgs[2])) # rio - skimage
deltas.append(np.subtract(subimgs[2],subimgs[1])) # skimage - cv2
plotlist(deltas)
Which rendered:
(Image: rendering of the differences between the images on the last 4 pixels of the last row.)
Whoops, something wrong with OpenCV? Yes, maybe, but from this alone you cannot be sure: it may also be that both rasterio and skimage are wrong by the same amount (slim chances, but...)
Let's make this clear by printing the actual values of these 4 last pixels:
>>> subimgs[0] # rio
array([[[58, 40, 38],
[57, 37, 36],
[63, 42, 39],
[54, 30, 28]]], dtype=uint8)
>>> subimgs[1] # cv2
array([[[58, 40, 38],
[56, 38, 36],
[62, 42, 41],
[51, 31, 30]]], dtype=uint8)
>>> subimgs[2] # skimage
array([[[58, 40, 38],
[57, 37, 36],
[63, 42, 39],
[54, 30, 28]]], dtype=uint8)
Again, they look pretty much the same, right? But wait... there actually are tiny differences (look carefully at each individual value).
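To make those differences explicit, here is a small sketch using the values printed above (the variable names are mine). Note that the difference plots earlier used np.subtract on uint8 arrays, and uint8 subtraction wraps around modulo 256, so a difference of -1 shows up as 255. Casting to a signed type first avoids that.

```python
import numpy as np

# Values copied from the printouts above (bottom row, last 4 pixels).
rio_px = np.array(
    [[58, 40, 38], [57, 37, 36], [63, 42, 39], [54, 30, 28]], dtype=np.uint8
)
cv2_px = np.array(
    [[58, 40, 38], [56, 38, 36], [62, 42, 41], [51, 31, 30]], dtype=np.uint8
)

# Stays uint8, so negative results wrap around modulo 256.
wrapped = rio_px - cv2_px

# Signed per-channel differences (rasterio minus OpenCV).
signed = rio_px.astype(np.int16) - cv2_px.astype(np.int16)

print(wrapped)
print(signed)
print(int(np.abs(signed).max()))
```

The largest per-channel difference on these four pixels is 3, on the red channel of the last pixel.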
So, what is the truth?
This may (or may not) help, but using a third party tool can help decide. So I opened the image using Gimp and used the pipette on these 4 pixels, and here are the results:
Gimp
pixel 1 RGB: [58, 40, 38]
pixel 2 RGB: [56, 38, 36]
pixel 3 RGB: [62, 42, 41]
pixel 4 RGB: [51, 31, 30]
Yes, these values are the same as the ones from OpenCV!
As a GIS person, I also decided to load the image into QGIS and query the values of each of these pixels; the results are the same as in Gimp and OpenCV.
As a last test, I also checked what happens with Python GDAL:
from osgeo import gdal
print(gdal.__version__) # 3.1.0
ds = gdal.Open(path)
for i in range(3):
    i += 1  # GDAL band indices start at 1
    print("{}".format(np.array(ds.GetRasterBand(i).ReadAsArray())[-1:,-4:]))
# Output (read pixel RGB values as columns here as each row represents a single band):
[[58 56 62 51]]
[[40 38 42 31]]
[[38 36 41 30]]
Again, the same as Gimp and OpenCV...
So now the readers fall into two groups:
OpenCV == Gimp == QGIS == GDAL
rasterio == skimage
As I said, there are two possibilities from here: either the first group is wrong, or rasterio and skimage are.
If someone knows the dependency tree of all the image libraries used here, it could help in fixing this.
I would intuitively say that the chances are higher that there is a glitch in skimage and that rasterio is, somehow, based on it.
My advice: consider also investigating in this direction instead of only OpenCV.
Answered by swiss_knight on August 16, 2021