Question
Asked 22nd Apr, 2021

How can I clip a raster with a shapefile so that the output raster covers exactly the same area as the shapefile?

I clipped a classified image (GeoTIFF) using a shapefile, and the output covers a larger area than the shapefile. Why is this happening? I classified a Landsat 7 image with KNN and obtained a classified map, then clipped that raster with my boundary shapefile. The boundary shapefile covers 208 sq. km, but when I summed the areas of all the land classes, the total land area came to ~232 sq. km. I have tried Extract by Mask and Clip, but both give the same result.

Most recent answer

Mizbah Ahmed Sresto
Khulna University of Engineering and Technology
A polygon is vector data, while satellite images are raster data. If you zoom in on a satellite image you will see that it is pixelated, and if you look closely along the polygon boundary the pixel pattern becomes even more noticeable.
For example, Landsat data has a 30 m resolution, meaning each pixel represents a 30 m x 30 m area on the ground. So when you extract an area with vector data (a line or polygon), pixels are kept whole, including pixels that extend slightly outside the boundary (this is noticeable when you zoom in). Using higher-resolution imagery, such as 10 m, might reduce the problem but still won't give exactly the same area. So in this case it is not possible to get exactly the same area with the Extract by Mask tool.
You can try another approach to extract the same area.
Convert the classified image to vector data with the Raster to Polygon tool, then use the Clip tool to extract it. Make sure the datasets are projected in the same coordinate system. Tafsirul Islam
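
The whole-pixel effect described in this answer can be illustrated with a small sketch. It is not any particular GIS tool's algorithm: it assumes a hypothetical axis-aligned rectangular boundary, a 30 m grid starting at (0, 0), and two common selection rules (keep every cell the boundary touches, or keep only cells whose centre falls inside). Tools differ in which rule they apply, so treat the numbers as an illustration only.

```python
import math

def cells_by_rule(xmin, ymin, xmax, ymax, cell):
    """Count grid cells selected for an axis-aligned rectangular boundary.

    Assumes the grid origin is (0, 0) and all coordinates are in metres.
    Returns (cells_touched, cells_with_centre_inside).
    """
    touched = centre_inside = 0
    n_cols = math.ceil(xmax / cell)
    n_rows = math.ceil(ymax / cell)
    for row in range(n_rows):
        for col in range(n_cols):
            x0, y0 = col * cell, row * cell
            x1, y1 = x0 + cell, y0 + cell
            # Rule 1: the cell overlaps the boundary rectangle at all.
            if x0 < xmax and x1 > xmin and y0 < ymax and y1 > ymin:
                touched += 1
            # Rule 2: the cell centre lies inside the boundary rectangle.
            cx, cy = x0 + cell / 2, y0 + cell / 2
            if xmin <= cx <= xmax and ymin <= cy <= ymax:
                centre_inside += 1
    return touched, centre_inside

CELL = 30.0                         # Landsat-like pixel size in metres
bounds = (10.0, 10.0, 100.0, 95.0)  # hypothetical boundary: xmin, ymin, xmax, ymax

exact_area = (bounds[2] - bounds[0]) * (bounds[3] - bounds[1])
touched, centre_inside = cells_by_rule(*bounds, CELL)
touched_area = touched * CELL ** 2
centre_area = centre_inside * CELL ** 2

print("exact polygon area (m2):", exact_area)
print("centre-inside rule (m2):", centre_area)
print("all-touched rule   (m2):", touched_area)
```

On a boundary this small relative to the pixel size the differences are exaggerated; on a real study area the same effect is confined to the band of pixels along the boundary.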

Popular answers (1)

Srini Vasan
University of New Mexico
I agree with the answer from @shar.
It will be more exact when one uses a vector file to clip another vector file.

All Answers (18)

Sher Bahadur Gurung
Tribhuvan University
I suggest converting the classified raster map into a vector shapefile and then clipping it to get exactly the same area as your boundary.
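
The vectorize-then-clip idea can be sketched in plain Python. This is a minimal illustration, not the ArcGIS workflow itself: every pixel is treated as a square polygon, clipped against the boundary with the Sutherland-Hodgman algorithm (which requires a convex, counter-clockwise boundary, an assumption made here for brevity), and the clipped areas are summed per class. The 3 x 3 grid, the class codes, and the triangular boundary are hypothetical, and row 0 is placed at y = 0 to keep the arithmetic simple.

```python
from collections import defaultdict

def shoelace_area(poly):
    """Absolute area of a simple polygon given as [(x, y), ...]."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(poly, poly[1:] + poly[:1]):
        total += x0 * y1 - x1 * y0
    return abs(total) / 2.0

def clip_convex(subject, clipper):
    """Sutherland-Hodgman: clip `subject` by a convex, counter-clockwise `clipper`."""
    def inside(p, a, b):
        # True if p lies on or to the left of the directed edge a -> b.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersection(p, q, a, b):
        # Point where segment p-q crosses the infinite line through a and b.
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / (dx1 * dy2 - dy1 * dx2)
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for a, b in zip(clipper, clipper[1:] + clipper[:1]):
        if not output:
            break  # nothing left inside the boundary
        previous, output = output, []
        for p, q in zip(previous, previous[1:] + previous[:1]):
            if inside(q, a, b):
                if not inside(p, a, b):
                    output.append(intersection(p, q, a, b))
                output.append(q)
            elif inside(p, a, b):
                output.append(intersection(p, q, a, b))
    return output

def class_areas(grid, cell, boundary):
    """Sum the clipped area of every pixel square, grouped by class code."""
    areas = defaultdict(float)
    for row, values in enumerate(grid):
        for col, cls in enumerate(values):
            x0, y0 = col * cell, row * cell
            square = [(x0, y0), (x0 + cell, y0), (x0 + cell, y0 + cell), (x0, y0 + cell)]
            areas[cls] += shoelace_area(clip_convex(square, boundary))
    return dict(areas)

# Hypothetical classified raster (class codes 1 and 2) with 30 m pixels.
grid = [[1, 1, 2],
        [1, 2, 2],
        [2, 2, 2]]
boundary = [(0.0, 0.0), (90.0, 0.0), (0.0, 90.0)]  # convex, counter-clockwise

areas = class_areas(grid, 30.0, boundary)
print(areas, "total:", sum(areas.values()), "boundary:", shoelace_area(boundary))
```

Because each pixel is cut at the boundary instead of being kept whole, the class areas sum to the boundary area. This only holds when the raster and the boundary are correctly aligned in the same coordinate system.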
Stephen Maack
REAP Change Consultants
I'm not sure, and I suggest you double-check ESRI's online help or call ESRI, but @Sher Bahadur Gurung's approach seems reasonable. I've only had one image-processing GIS class and am not an expert in this area, but here are some thoughts. The downloaded Landsat 7 file is a raster file, so any attempt to match it to the shapefile will pick up whole pixels from the image. The pixels that contain the shapefile boundary extend into the area outside the shapefile, and if that area is land, the sum of the land class values (which come from the Landsat 7 image) will be greater than the area calculated from the vector file. The Landsat 7 documentation gives the pixel sizes of the imaging instruments on the satellite that would be used to create the land class values. You could check my assumption by creating a GIS map with the vector boundary as one layer and the area that you pulled from the Landsat 7 GeoTIFF as another, turning both layers on, and zooming in until you can see the pixel edges in relation to the shapefile boundary. I'm also wondering whether some of the difference in area might be related to the process you used to classify the land classes. You could try different ways of creating the land use classification and see whether one approach gives a value closer to the "ground truth" of the vector file than another.
Abdullah Al Rakib
Sheltech Consultants (Pvt.) Ltd.
Why is this happening?
It's a common phenomenon, and I believe there is nothing to worry about unless you are doing a detailed analysis. Your shapefile is vector data, made up of points, lines, and polygons, so it is more precise for area calculation than raster data.
Raster data, on the other hand, is made up of pixels. It is therefore not as precise as vector data and can show more area than the actual extent.
How to remedy this situation?
To remedy this, you can convert your raster data to vector data using the Raster to Polygon tool. Then you can use the Intersect tool to find the exact area of the classified study area.
I hope it helps.
The following image was downloaded from Google. I do not have the authorship of the image.
Stephen Maack
REAP Change Consultants
I agree with Abdullah Al Rakib and with the solution that he and Sher Bahadur Gurung state.
Tafsirul Islam
Khulna University
Thank you for your answer. I tried Clip and Extract by Mask after converting the classified raster to polygons, but I still have a similar issue. The calculated land classes exceed the original land area by 7-15 sq. km, which is not acceptable.
Additionally, my classified image comes with no CRS. I shifted and registered the classified image to the base map (the original satellite image), but the four corners do not all match exactly; one corner still has a gap. How can I solve this coordinate issue?
Stephen Maack
REAP Change Consultants
Check if the base map (satellite image) and the vector file use the same coordinate system. Otherwise the boundaries won't match up if they are based on different coordinate systems.
Stephen Maack
REAP Change Consultants
Sorry, I meant to say PROJECTION system. (I just realized that CRS stands for "coordinate reference system"; is that the same as a "projection"?) If you are using ESRI GIS, make the vector map of points, lines, and polygons, which will have a coordinate system, the base map: add it to the GIS file first, and the other layers should then match it. You might need to check ESRI "Help" on how to ensure that. Otherwise, when I took a class with an earlier version of ESRI software, we did an exercise in which we matched a vector map point to a visible, easily identifiable photograph of an oil well with pollution, so that the vector map and the photograph had the same relative reference system. This was done by matching up at least three points on the two. Also, make sure that the satellite image you are using is orthorectified. "Using elevation to enable accurate image georeferencing. Imagery has an amazing amount of information, but raw aerial or satellite imagery cannot be used in a GIS until it has been processed such that all pixels are in an accurate (x,y) position on the ground. Photogrammetry is a discipline, developed over many decades, for processing imagery to generate accurately georeferenced images, referred to as orthorectified images (or sometimes simply orthoimages). Orthorectified images have been processed to apply corrections for optical distortions from the sensor system, and apparent changes in the position of ground objects caused by the perspective of the sensor view angle and ground terrain."
Tafsirul Islam
Khulna University
Stephen Maack Thanks for your time. I have double-checked the CRS. We downloaded the images from the Google Earth Engine platform, and the vectors used were correct in terms of coordinate system. The issue is that when I classify the images using KNN in a Jupyter notebook (I'm still a novice at classification in Google Earth Engine), the output raster comes with no CRS. I shifted the classified raster to match the extent of the satellite image, but there are still small gaps (I've also tried other forms of image registration), so the classified image is not fully aligned with the base raster. I hope I've made my issue clear. Thanks again.
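
One possible way to avoid manual shifting, offered as a sketch and not as a confirmed fix for this case: if the classifier output has the same number of rows and columns as the source image, it can reuse the source image's CRS and affine geotransform unchanged. The snippet below uses plain Python to show what a north-up geotransform encodes; the origin, pixel size, and array shape are hypothetical values, and the tuple layout is a simplification chosen for this example. Libraries such as rasterio and GDAL can read and write this metadata, but their calls are not shown here.

```python
def pixel_to_map(transform, row, col):
    """Map coordinates of the upper-left corner of pixel (row, col).

    `transform` is (origin_x, pixel_width, origin_y, pixel_height) for a
    north-up raster, so y decreases as the row index increases.
    """
    origin_x, pixel_w, origin_y, pixel_h = transform
    return origin_x + col * pixel_w, origin_y - row * pixel_h

def raster_extent(transform, n_rows, n_cols):
    """Return (xmin, ymin, xmax, ymax) of a raster with the given shape."""
    xmin, ymax = pixel_to_map(transform, 0, 0)
    xmax, ymin = pixel_to_map(transform, n_rows, n_cols)
    return xmin, ymin, xmax, ymax

# Hypothetical source image: 30 m pixels in a projected CRS (metres).
source_transform = (500000.0, 30.0, 2500000.0, 30.0)
source_shape = (400, 600)       # rows, cols of the satellite image
classified_shape = (400, 600)   # rows, cols of the classifier output

# Reusing the transform is only valid if the shapes match pixel for pixel.
assert classified_shape == source_shape

extent = raster_extent(source_transform, *classified_shape)
print("classified raster extent:", extent)
```

With the same transform and shape, every classified pixel lands on the source pixel it was computed from, so all four corners coincide by construction.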
Stephen Maack
REAP Change Consultants
I know nothing about Jupyter notebooks or Google Earth Engine, and at this point this is beyond my limited understanding of satellite imagery processing and analysis. Good luck.
Sher Bahadur Gurung
Tribhuvan University
Tafsirul Islam, these are two different data formats: one is raster and the other is a vector shapefile. So I suggest converting them to the same data format, either raster or vector shapefile, covering the same area.
Md. Nazmul Huda Naim
Chittagong University of Engineering & Technology
Abdullah Al Rakib The problem occurs because you are clipping raster data with vector data. Vector data cannot clip raster data exactly because of the larger cell size of the raster data. That is why the area differs between the two data types (vector and raster). There are two ways of solving this problem (as far as I know). First, you can convert the raster data into vector data before clipping. Second, you can adjust the calculated areas in proportion to the exact area (the area you get from the vector data).
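
The second option, proportional adjustment, amounts to multiplying each class area by (vector area / raster total). A minimal sketch follows; the 232 and 208 sq. km totals come from the question, while the class names and the per-class breakdown are hypothetical. It assumes the excess is spread across classes in proportion to their size, which may not hold if the error is concentrated along the boundary or caused by misalignment.

```python
def rescale_to_reference(class_areas_km2, reference_total_km2):
    """Scale class areas so that they sum to the reference (vector) area."""
    raster_total = sum(class_areas_km2.values())
    factor = reference_total_km2 / raster_total
    return {name: area * factor for name, area in class_areas_km2.items()}

# Hypothetical class breakdown that sums to the 232 sq. km raster total.
raster_areas = {"water": 23.2, "vegetation": 116.0, "built_up": 92.8}

# 208 sq. km is the boundary shapefile area reported in the question.
adjusted = rescale_to_reference(raster_areas, 208.0)

for name, area in adjusted.items():
    print(f"{name}: {area:.1f} sq. km")
```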
Abdullah Al Rakib
Sheltech Consultants (Pvt.) Ltd.
Md. Nazmul Huda Naim Yes, that's exactly what I said.
Abdullah-Al- Faisal
McGill University
You will not get exactly the same area from a raster dataset as from the polygon shapefile. The boundary of a polygon is smooth, whereas raster data is pixel based (and so is not smooth at the boundaries). There will always be a small discrepancy when calculating total area from raster data.
However, the error can be minimized when clipping the raster data with the shapefile. All you have to do is set the raster extent by selecting the shapefile in the environment settings in ArcGIS when using the "Extract by Mask" tool.
Abdulrazzaq Shaamala
Jordan University of Science and Technology
You can use Clip, or during the analysis you can choose the vector area as the mask in the environment settings.
Cheshini Malavipathirana
General Sir John Kotelawala Defence University
First, you need to project the whole dataset into the same coordinate system. Then you can clip the raster using Extract by Mask or raster Clip.
Alternatively, you can convert the raster data to vector using Raster to Polygon and then clip the data.
