Access to this full-text is provided by American Association for the Advancement of Science.
Content available from Plant Phenomics
Torres-Lomas et al. 2024 | https://doi.org/10.34133/plantphenomics.0202 1
RESEARCH ARTICLE
Segment Anything for Comprehensive
Analysis of Grapevine Cluster Architecture
and Berry Properties
Efrain Torres-Lomas1, Jimena Lado-Bega2, Guillermo Garcia-Zamora1,
and Luis Diaz-Garcia1*
1Department of Viticulture and Enology, University of California Davis, Davis, CA 95616, USA. 2Soil and
Water Department, Universidad de la Republica, Montevideo 11400, Uruguay.
*Address correspondence to: diazgarcia@ucdavis.edu
Grape cluster architecture and compactness are complex traits influencing disease susceptibility, fruit
quality, and yield. Evaluation methods for these traits include visual scoring, manual methodologies, and
computer vision, with the latter being the most scalable approach. Most existing computer vision
approaches for processing cluster images rely on conventional segmentation or on machine learning
that requires extensive training and generalizes poorly. The Segment Anything Model (SAM), a novel
foundation model trained on a massive image dataset, enables automated object segmentation without
additional training. This study demonstrates the high accuracy of out-of-the-box SAM in identifying
individual berries in 2-dimensional (2D) cluster images. Using this model, we segmented approximately
3,500 cluster images, generating over 150,000 berry masks, each linked with spatial coordinates within
its cluster. The correlation between human-identified berries and SAM predictions was very strong
(Pearson's r² = 0.96). Although the visible berry count in images typically underestimates the actual
cluster berry count due to visibility issues, we demonstrated that this discrepancy could be adjusted
using a linear regression model (adjusted R² = 0.87). We emphasized the critical importance of the angle at which the
cluster is imaged, noting its substantial effect on berry counts and architecture. We proposed different
approaches in which berry location information facilitated the calculation of complex features related to
cluster architecture and compactness. Finally, we discussed SAM’s potential integration into currently
available pipelines for image generation and processing in vineyard conditions.
Introduction
Grape cluster architecture and compactness are important fruit traits that influence yield, quality,
and susceptibility to pests and diseases [1]. Cluster architecture is directly related to cluster
compactness, which describes the ratio between the volume occupied by berries and the total cluster
volume [2]. In other words, cluster architecture determines the arrangement of berries in a cluster
and the distribution of free space. Cluster architecture is complex, difficult to measure
quantitatively, and determined by many factors such as berry number, size, shape, and spatial
location, which all relate to the rachis ramification patterns [3]. While certain features of cluster
architecture can be discerned by looking at the cluster contour, a more precise analysis requires
the identification and spatial localization of the individual berries within the cluster. Cluster
architecture and compactness are determined genetically, as many genomic regions have been
associated with trait variation [2–6]. However, environmental factors such as temperature, humidity,
nutrient availability, and vineyard management, among others, are known to alter cluster
architecture and compactness directly or indirectly [1,2,7,8]. Understanding the factors that
influence cluster architecture and compactness, and to what extent they do so, has implications
for vineyard management, breeding, and genetics research.
For example, high cluster compactness has been associated with increased susceptibility to Botrytis
bunch rot caused by Botrytis cinerea [9–11]. This, in turn, has implications in terms of vineyard
management and cultivar preference, since fungicide applications can better reach berries within
the cluster in the case of a more open, looser cluster. Furthermore, there is a greater temperature
variability between the inner and outer berries in densely compacted clusters, impacting the
maturation rate [8]. Additionally, restricted sun exposure to berries has been observed to intensify
powdery mildew infections [12], thereby influencing fungicide application scheduling.
Exploring cluster architecture and compactness has been the focus of several studies utilizing
qualitative and quantitative methods. Among qualitative approaches, researchers primarily rely on
the OIV descriptors, a set of definitions established by the International Organization of Vine and
Wine. For instance, the descriptor OIV 204, which addresses cluster density or compactness,
categorizes grape clusters into 5 classifications ranging from very loose to very dense. Similarly,
cluster architecture can be described using a combination of OIV 208 (bunch shape: cylindrical,
conical, and funnel-shaped) and OIV 209 (number of wings of the primary bunch, ranging from 1 to 6
or more). Classifying clusters based on OIV descriptors often involves considering multiple
characteristics simultaneously, which, while providing a comprehensive assessment, can be
challenging to replicate and scale. For example, Richter et al. [2] studied an F1 mapping population
derived from crossing GF.GA-47-42 and Villard Blanc. Their study involved manually recording
individual cluster and berry traits (e.g., berry number, cluster weight, rachis size and
architecture, and shoulder length, among others), which accounted for approximately half of the
observed variation compared to using the OIV 204 descriptor alone. This emphasizes the complexity
of cluster architecture and how it is influenced by various individual characteristics, including
cluster compactness.
Citation: Torres-Lomas E, Lado-Bega J, Garcia-Zamora G, Diaz-Garcia L. Segment Anything for
Comprehensive Analysis of Grapevine Cluster Architecture and Berry Properties. Plant Phenomics
2024;6:Article 0202. https://doi.org/10.34133/plantphenomics.0202
Submitted 16 February 2024
Accepted 24 May 2024
Published 27 June 2024
Copyright © 2024 Efrain Torres-Lomas et al. Exclusive licensee Nanjing Agricultural University.
No claim to original U.S. Government Works. Distributed under a Creative Commons Attribution
License 4.0 (CC BY 4.0).
Computer vision approaches can also be used to analyze cluster architecture and compactness. In
this case, available methods involve 2-dimensional (2D) image analysis and 3D modeling, which all
have the capability of producing quantitative traits. In many cases, the utilization of quantitative
traits derived from these imaging approaches has proven to be more effective in genetics research
and breeding compared to categorical traits [13]. Depending on the algorithm, some studies have
focused on berry detection while others have focused only on whole cluster analysis. For example,
conventional segmentation on cluster images generated in the lab has been used to assess berry color
and cluster architecture [4,14]. Cluster images generated directly in the vineyard have also been
used for cluster identification and yield estimation using a variety of methods; however, prediction
accuracy has varied because of challenging light conditions or occlusion [15–17]. Identifying and
localizing berries within the cluster is crucial for determining cluster architecture and
compactness. In this context, several approaches have been tested, including robotic laser scanning
systems to reconstruct 3D representations of clusters and generate precise data regarding the
3D location of berries in a cluster [18]. Likewise, x-ray tomography has been employed to scan
grapevine inflorescences, model berry growth, and infer phylogenetic relationships [19].
Partial 3D models of grape clusters have also been generated using stereo-vision, which, in turn,
allows berry counting [20]. Some other methodologies allow the estimation of berry numbers from
images taken directly in the field. For example, in the work of Luo et al. [21], the model developed
allowed for an accurate prediction of berry counts in Niagara grapes, which are generally larger
than most table and wine grapes. Neural networks have also been applied for berry segmentation and
counting, and although they produced very accurate estimates, they were only used on very immature
clusters with limited berry growth, low compactness, and sufficient contrast between berries [22].
Furthermore, other methods based on convolutional neural networks and semantic segmentation have
shown accurate estimations of berry numbers in field images, which might be of great utility for
yield prediction, among other uses. However, using this information to conduct cluster analysis is
difficult, as the identified berries are not assigned to clusters [23].
Many of the image analysis-based methods used to describe cluster architecture and compactness
relied on traditional segmentation methods. These methods often depend on labor-intensive,
customized functions, manually engineered features, and error-prone thresholding designed for
specific scenarios. As an alternative, deep learning models for image analysis, with their ability
to capture latent image features, have shown promise across various fields, including medicine,
surveillance and security, agriculture, biometrics, environmental sciences, and remote sensing,
among others. However, these models are typically designed and trained for specific segmentation
tasks, and unfortunately, their performance may substantially deteriorate when applied to new
tasks, different image types, or varying external conditions. Large-scale foundational models have
revolutionized artificial intelligence due to their remarkable zero-shot and few-shot generalization
capabilities across a broad spectrum of downstream tasks [24,25]. Foundation models are neural
networks trained on vast datasets using innovative learning methods and prompting objectives that
generally do not require conventional supervised training labels, which makes them adaptable to a
variety of external conditions [26]. The Segment Anything Model (SAM) is a new foundation model
that can be used as a zero-shot segmentation method [27]. SAM can be used out of the box to segment
a variety of objects in an image, or can be fine-tuned for a specific task, such as the very
recently developed MedSAM [28]. SAM was built on the largest segmentation dataset to date, with
over 1 billion segmentation masks [27]. To segment an object, SAM requires the user to provide a
prompt, which can take the form of a single point, a polygon (similar to a mask), a bounding box,
or just text [26].
In this study, we demonstrated the capabilities of SAM to segment grape berries from 2D cluster
images without additional model training or fine-tuning. Our research focused on 4 main aspects:
(1) measuring the accuracy of SAM in identifying visible berries within a cluster image;
(2) predicting hidden berries in a cluster image and assessing the impact of cluster imaging angle;
(3) developing new quantitative methods to describe cluster architecture based on berry
distributions within the clusters; and (4) assessing the repeatability of cluster architecture and
compactness traits in replicated experiments.
Materials and Methods
Plant material
Cluster images obtained from an F1 mapping population (n = 139 genotypes) derived from crossing
Cabernet Sauvignon and Riesling were used to test SAM. Cabernet Sauvignon and Riesling, both major
wine grape cultivars around the world, display contrasting cluster architectures. Cabernet
Sauvignon clusters are small to medium in size, conical, loose to well-filled, and with medium-long
peduncles. Its berries are small, round, and blue-black. Riesling has smaller clusters, which can
be cylindrical or globular, and sometimes winged; clusters are compact and with short peduncles.
Riesling berries are small and round and have a white-green skin coloration. This F1 progeny
segregates for the traits mentioned above, making it an ideal candidate to evaluate the proposed
pipeline. This population was planted at the UC Davis Experimental Station in Oakville, Napa
County, CA, USA (38°25′45.4′′N; 122°24′36.4′′W), in 2017. Vines were arranged using a randomized
complete block design with 3 blocks and 3 vines per experimental unit. For this study, one vine per
experimental unit was sampled (the one in the middle). For each vine, 5 representative clusters
were imaged as described below.
Image capture
Five representative clusters per vine were imaged using the setup shown in Fig. S1. The setup
included a reference circle to normalize measurements and account for potential variation in the
location of the camera relative to the cluster. The camera used was a Canon EOS 70D with a 24-mm
prime lens, an aperture of f/5, and an exposure time of 1/500 s. Images were 5,472 × 3,648 pixels
(~20 Mpx). All clusters were imaged from at least one angle. In addition, all the clusters from a
subset of 99 vines were imaged from 3 additional angles (90°, 180°, and 270°). The latter was used
to assess complex architectures that result from the presence of cluster ramifications or wings,
and that are visible only from specific view angles.
A second image dataset was generated to validate the SAM algorithm. This dataset consisted of
cluster images, each one accompanied by an image of all the individual berries detached and
individually placed on a white surface (Fig. S2).
Model and processing pipeline
The images described above, without editing their original brightness or contrast, were used as
input for SAM. To reduce the number of pixels to be processed, a region of interest (ROI) was
manually defined, as indicated in Fig. 1. The pretrained ViT-H (Huge Version) image encoder was
used for the segmentation phase (checkpoint available at
https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth). The mask prediction was
executed by applying the Automatic Mask Generator to the input, which was defined as the pixels
within the ROI and a prompt, described as an XY grid of points equally distributed across the ROI.
Different grid configurations, including 4 × 4, 6 × 6, 8 × 8, and so on, up to 62 × 62, were
explored and tested for efficiency. Both the number of masks and the area increased as the number
of points in the grid increased, until reaching a plateau at around 20 to 25 points; after that,
the increase was marginal. The number of masks still increased beyond this point (Fig. S3A and C),
as more berries, mainly those partially hidden, were found. These berries, discovered at higher
point densities, were of smaller sizes, as the increase in total area after reaching about 30
points was negligible (Fig. S3B). The marginal increase in the area or number of objects detected
at higher grid densities is also likely due to Segment Anything's reduction of image resolution,
making smaller objects undetectable. To test this hypothesis, a zoomed-in image of a cluster with
numerous skin features (e.g., spots, color variations, and damage) was processed using a 256 × 256
grid. This approach resulted in the detection of many smaller features (Fig. S4), emphasizing the
need to process a smaller set of photos to optimize conditions. After these preliminary tests, a
32 × 32 grid was chosen as it captured most of the grape objects without unnecessary computational
overhead. As a preliminary analysis, SAM was executed on graphics processing unit (GPU), Metal
Performance Shaders (MPS), and central processing unit (CPU) platforms to compare any potential
segmentation differences; however, only computation time was affected. The output produced by SAM
comprised bounding boxes in XYWH format, area, predicted intersection over union (IoU), stability
scores, and mask segments formatted as COCO Run Length Encoding (RLE). The SAM workflow, including
ROI identification and automatic mask generation, was implemented in Python 3.11. The hardware
tested was a g3.4xlarge AWS instance (single GPU, 16 GB RAM) and a System76 workstation (32 CPU,
256 GB RAM). Details on specific dependencies are available in the following GitHub repository:
https://github.com/diazgarcialab/SAM-cluster-segmentation.
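The relationship between the number of grid points per side and the pixel spacing of the prompt points can be sketched as follows. This is a minimal illustration in plain Python, not the authors' code: the function name `point_grid`, the cell-centered placement of points, and the ROI size of 2,816 × 5,472 pixels are all assumptions chosen only so that a 32 × 32 grid gives a spacing of 88 × 171 pixels.

```python
def point_grid(roi_width, roi_height, points_per_side):
    """Return evenly spaced (x, y) prompt points and the (x, y) spacing in pixels."""
    step_x = roi_width / points_per_side
    step_y = roi_height / points_per_side
    # Offset by half a step so each point sits at the center of its grid cell
    # (an assumption made for this sketch).
    points = [
        (step_x * (i + 0.5), step_y * (j + 0.5))
        for j in range(points_per_side)
        for i in range(points_per_side)
    ]
    return points, (step_x, step_y)


# Hypothetical ROI size, used only for illustration (not reported in the study).
points, spacing = point_grid(roi_width=2816, roi_height=5472, points_per_side=32)
print(len(points), spacing)
```

Doubling the number of points per side halves the spacing but quadruples the number of prompts, which is one reason computation time grows quickly with grid density.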
The RLE mask segments were decoded using pycocotools
(https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocotools/mask.py) to derive the x
and y coordinates of the mask contours and their position within the cluster. These coordinates
were analyzed using the R package Momocs [29] to compute various parameters such as berry area,
length, width, aspect ratio, perimeter, and color (represented as median red, green, and blue
values). SAM is a segmentation tool rather than a classifier. As such, the segmented masks it
produces may include, in addition to berries, other objects such as the clamp used to hold the
clusters or the reference circle for size normalization. These objects can be easily identified
and distinguished from berries due to their contrasting morphology and size, as described below.
More often, some masks may encompass 2 or more berries; these were addressed using the IoU
estimates. IoU is a metric used to evaluate the overlap between 2 bounding boxes or masks, commonly
employed when assessing the accuracy of image segmentation models. In this study, IoU was
calculated by determining the size of the overlapping region between 2 masks detected by SAM. For
example, in instances where an overlapping mask covers 2 berries, each with its own mask, the
overlapping mask will exhibit a larger size and IoU. Furthermore, filters based on criteria such as
area, perimeter-to-area ratio, and aspect ratio were implemented to exclude objects other than
berries. To refine the segmentation further, we employed a filtering approach using elliptical
Fourier descriptors (EFDs) and principal component analysis (PCA) to eliminate non-berry objects,
especially rachis parts. Initially, the x and y coordinates of objects were transformed into an
"Out" object using Momocs software, which facilitated the computation of EFD harmonic coefficients.
These coefficients were then analyzed using PCA for visualization purposes, and outliers were
identified through 5 rounds of outlier detection. Each round involved recalculating the harmonics
and principal components with a cleaner dataset, adopting a threshold of ±2 standard deviations
among the first 10 principal components.
[Fig. 1 panel labels: ROI; grid for object detection (prompt for SAM); raw masks; filter based on
basic descriptors (IoU, area, perimeter, aspect ratio); other complex shapes (EFD + PCA).]
Fig. 1. Summary of the pipeline employed for generating and processing SAM masks. The process for
each image is detailed below. First, the region of interest (ROI) housing the cluster is
identified. Subsequently, a grid of points separated by 88 × 171 pixels is utilized as input for
object identification in SAM. Following this, masks undergo analysis based on various parameters
including intersection over union (IoU), area, perimeter, length, width, aspect ratio, and
elliptical Fourier descriptors (EFDs) to discern non-berry objects.
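The overlap-based check for masks that span several berries can be sketched as follows. This is a hedged toy example, not the authors' filter: masks are represented as sets of pixel coordinates, and the rule "flag a mask that contains most of two or more smaller masks", the `min_contained` threshold of 0.8, and the helper names are assumptions made for illustration.

```python
def iou(mask_a, mask_b):
    """Intersection over union of two masks given as sets of (x, y) pixels."""
    union = len(mask_a | mask_b)
    return len(mask_a & mask_b) / union if union else 0.0


def flag_multi_berry(masks, min_contained=0.8):
    """Return indices of masks that cover most of two or more smaller masks."""
    flagged = []
    for i, big in enumerate(masks):
        covered = 0
        for j, small in enumerate(masks):
            if i == j or not small or len(small) >= len(big):
                continue
            # Fraction of the smaller mask that lies inside the larger one.
            if len(big & small) / len(small) >= min_contained:
                covered += 1
        if covered >= 2:
            flagged.append(i)
    return flagged


def rect(x0, y0, width, height):
    """Toy rectangular mask as a set of pixels (illustration only)."""
    return {(x, y) for x in range(x0, x0 + width) for y in range(y0, y0 + height)}


berry_1 = rect(0, 0, 4, 4)
berry_2 = rect(5, 0, 4, 4)
both = rect(0, 0, 9, 4)  # a single mask spanning both berries
masks = [berry_1, berry_2, both]
flagged = flag_multi_berry(masks)
print(flagged, round(iou(berry_1, both), 3))
```

In this toy case only the third mask is flagged, because it is the only one that contains two smaller masks; real berry masks would need thresholds tuned on a subset of images.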
Results
Characteristics of the model implementation and
implementation time
The implementation of SAM on a population of 387 vines and 1,935 different clusters resulted in
215,090 masks. For 99 of the 387 vines, all the clusters were imaged 4 times, each time at a
different angle (0°, 90°, 180°, and 270°), which resulted in 3,431 cluster images. The identified
masks included, among other things, individual berries, 2 or more berries, the clamp used to hold
the clusters in place, stains/discolorations in the background, the reference circle for size
normalization, and rachis segments. This outcome is expected as SAM utilizes an algorithm for
unsupervised object segmentation, and not classification, within an area of interest defined by
the user. As a result of the filtering, 32,425 masks containing 2 or more berries were removed
using IoU. Furthermore, since berries had an expected size and aspect ratio, 23,125 masks with
significantly larger areas or aspect ratios, or located far from the cluster (stains in the
background), were filtered out. Finally, the rest of the mask contours were analyzed with Momocs
[29] using a combination of EFD and PCA, leading to the identification of 5,601 objects other than
berries. After this filtering step, the number of true berry masks was 153,939 (61,151 masks
discarded). Each cluster had, on average, 44.87 berries (median = 42). Berry number varied between
5 and 130, and variation showed a normal distribution (Fig. S5).
Computation time per photo varied depending on the number of points in the grid used to initialize
the object search, as well as the characteristics of the machine. In this study, we used a
configuration of 32 points per side (32 × 32), resulting in a grid where points were horizontally
separated by ~88 pixels and vertically by ~171 pixels. On average, processing a photo took 55 s
using the CPU of the System76 workstation and 14 s with the GPU on the AWS g3.4xlarge instance.
Increasing the grid density improved the number of berries detected only marginally (Fig. S3).
However, when the grid density was increased to 62 × 62 points (resulting in 114 pixels of
horizontal separation and 59 pixels vertically), computation time increased to 4 min and 45 s on
the CPU and GPU, respectively.
2D cluster representations predict berry number
and cluster size
Berry counts from clusters imaged at 4 different angles were compared with the number of berries
determined manually. The "manual" determination of berries was conducted using 2 methods. The first
involved humans counting visible berries in a subset of 100 images, and then comparing these counts
with SAM predictions. The second involved processing additional images of 84 clusters from 17 vines
where all the berries were detached and placed individually on a surface. The analysis of these
images is straightforward since there is no touching among berries, and there exists good contrast
between the berry and surface colors (Fig. S2). In addition to being used to determine the true
number of berries, these images also allowed the comparison of berry size, assuming that the masks
generated from isolated, uncompressed berries imaged from the top approximate well the real size of
a berry.
As shown in Fig. 2A, the SAM algorithm does a very good job finding and segmenting all the berries
in the cluster, independently of the angle at which it is imaged. The berries identified were fully
visible, represented as circles, or partially visible (Fig. 2B). The correlation between the berry
number determined by humans and the SAM prediction was 0.96 (Fig. S6). There was also good
agreement between SAM berry number predictions and the number of berries calculated from images
with the individual berries (R² = 0.93, 5-fold cross-validation). However, there was a clear
underestimation, which varied depending on the imaging angle (Fig. 2C). Overall, the
underestimation was approximately 50% of the real number but linear. In symmetric clusters (e.g.,
cylindrical with no ramifications or wings), images from all 4 angles yielded similar berry counts.
Conversely, in clusters with wings, which were only visible from specific angles, the predicted
berry count increased at those angles. While the berry count was underestimated, a linear
regression model of the form y ~ β0 + β1x was sufficient to adjust the prediction considerably well
(adjusted R² = 0.8723), as long as the view with the maximum number of berries (from the 4 images
taken at different angles) was used in the model.
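The correction step can be sketched with an ordinary least squares fit in plain Python. This is a minimal illustration under stated assumptions: all counts below are made up, the helper `fit_line` is hypothetical, and the study's own model was fitted on its validation clusters rather than on these numbers.

```python
def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    return b0, b1


# Made-up visible counts per cluster at 0, 90, 180, and 270 degrees, and
# made-up true counts (as would be obtained from detached berries).
visible = [[21, 21, 21, 24], [27, 40, 32, 45], [60, 42, 65, 50], [40, 59, 43, 56]]
true_counts = [45, 90, 130, 120]

# Use the view with the most visible berries as the predictor.
x = [max(views) for views in visible]
b0, b1 = fit_line(x, true_counts)
corrected = [b0 + b1 * xi for xi in x]
print(round(b0, 2), round(b1, 2))
```

Taking the maximum across views is a simple way to reduce the influence of angles at which a wing is hidden behind the main cluster body.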
Berry size (measured as projected berry area) was more challenging to predict (Fig. 2D).
Predictions were mostly overestimations and varied significantly depending on the imaging angle.
Most berries were between 120 and 150 mm², with just a few having smaller sizes (<100 mm²).
Studying clusters with more variation in berry size might be required to better assess the
correlation for this trait. Similar to berry counts, a linear model was fitted using all cluster
views available for each cluster. Since the relationship appeared to be linear, the fitted values
were consistent with the real size estimations (adjusted R² = 0.8457).
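For reference, the adjusted R² values reported for these linear models follow the conventional definition below, where n is the number of observations and p the number of predictors (p = 1 for a model of the form y ~ β0 + β1x). This is the standard textbook formula, stated here as an assumption about how the statistic was computed rather than a formula quoted from the article.

```latex
R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
```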
Cluster angle matters
Not all the berries in a cluster can be seen from a given angle; therefore, berry counts from 2D
images were, as expected, underestimated (Fig. 2C). While cylindrical clusters are more common
among cultivars, the presence of ramifications or wings, or other asymmetries, can impact the
number of berries visible from a single view. To measure the effect of the image angle on the berry
counts, 490 clusters from 99 vines were imaged from 4 different angles (0°, 90°, 180°, and 270°),
and the berry counts and sizes were compared. In general, the berry count can vary by approximately
±50%, depending on the angle (Fig. 3A). As expected, opposing angles (0° and 180°, 90° and 270°)
tend to have more similar results (Fig. 3B). In other words, when the cluster ramification or wing
is fully visible from a given angle, it becomes invisible or hard to distinguish when the cluster
is rotated 90°, and becomes fully visible again after another 90° rotation. Berry size was less
dependent on the viewing angle (Fig. 3C). In general, berry size varied by ±30%. The extent of the
variation in berry count as a function of viewing angle is shown in Fig. 3D.
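The percentage change in berry count relative to the first view (0°) can be computed as in the sketch below. The counts are made-up values for a single hypothetical cluster, and `percent_change` is an illustrative helper rather than a function from the study's repository.

```python
def percent_change(counts):
    """Percent change of each view's berry count relative to the first view."""
    reference = counts[0]
    return [100.0 * (c - reference) / reference for c in counts]


# Made-up counts at 0, 90, 180, and 270 degrees for one cluster.
counts = [40, 60, 44, 56]
changes = percent_change(counts)
print(changes)  # [0.0, 50.0, 10.0, 40.0]
```

In this toy cluster the opposing views (0° and 180°, 90° and 270°) differ by 10 percentage points, while adjacent views differ by 40 to 50, the pattern expected when a wing is visible only from the side.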
Cluster architecture
A typical approach for measuring cluster architecture and compactness is based on whole cluster
segmentation instead of berry segmentation (e.g., [4]). While this method provides insightful
information and is easy to implement, it ignores the spatial distribution of berries within the
cluster. Moreover, in the setup used for photographing clusters, it is common to use clamps, hooks,
or clips to hang clusters, which can then be challenging to identify during image analysis or
post-processing. In those cases, a common strategy is to crop the top of the image to remove such
objects. When the peduncle is long, cropping the image does not affect the analysis; however, in
clusters with short peduncles or prominent shoulders, cropping the image results in cropped berries
as well. In this study, although the clamps (and other objects) were masked by SAM, they were easy
to identify and remove because of their different colors, sizes, and shapes.
[Fig. 2 panels: (A, B) cluster views at 0°, 90°, 180°, and 270°; (C) real vs. predicted number of
berries, fitted line y = −13.38 + 2.25x; (D) real vs. predicted berry area (mm²), fitted line
y = −12.63 + 0.93x; legend: 0°, 90°, 180°, 270°, corrected.]
Fig. 2. Prediction of berry number using SAM from cluster images. (A) Identification of individual
berries from 4 angles on the same cluster. (B) Berry masks from cluster images in panel A,
color-coded by angle view. (C) Correlation between real and predicted berry counts from SAM;
predicted counts for each angle view in panel A are displayed. Points marked with an X represent
corrected counts using the angle view with the maximum berries, adjusted with a linear model.
(D) Correlation between real and predicted berry area; color and shape patterns are similar to
panel C; corrected points were generated with a linear model of the form y ~ β0 + β1x. The diagonal
red line indicates a one-to-one relationship between variables.
[Fig. 3 panels: (A) % of change in berry number relative to angle 1 (0°), by angle (0°, 90°, 180°,
270°); (B) count vs. % of change in berry number relative to angle 1; (C) count vs. % of change in
max berry area relative to angle 1; (D) example cluster images. Panel D image labels, in extraction
order: n = 21, 21, 21, 24, 37, 33, 33, 33, 48, 48, 49, 48, 44, 43, 43, 35, 27, 40, 32, 45, 60, 42,
65, 50, 20, 31, 24, 31, 40, 59, 43, 56.]
Fig. 3. Impact of imaging angle on cluster analysis. (A) Change in berry number relative to angle 1
(0°, first image); each green line represents a cluster imaged at 4 different angles.
(B) Frequency plot of changes in berry number relative to angle 1, similar to panel A.
(C) Frequency plot of changes in max berry area relative to angle 1. (D) Examples illustrating the
effect of imaging angle on SAM-detected berry counts; each column represents a different cluster,
and each row represents a different angle (0°, 90°, 180°, and 270°). The number of detected berries
is indicated in each image. The first 4 clusters show little variation, while the last 4 exhibit
extreme berry count variation.
To illustrate the capabilities of cluster architecture analysis using berry locations, empirical
cumulative distribution functions were developed along the y-axis (from the top of the cluster, or
the peduncle, to the bottom, or cluster tip) and the x-axis (from left to right). The distribution
functions provided different levels of information. For example, they allowed the estimation of
symmetry along both the x- and y-axes. With these symmetry estimators, cylindrical or globular
clusters are expected to have a more uniform cumulative distribution. On the other hand, clusters
with wings or significant ramifications will show a cumulative distribution along the x-axis skewed
opposite to the main ramification.
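An empirical cumulative distribution of berry positions along the x-axis can be sketched in a few lines of plain Python. The coordinates are made-up berry centroids, and `left_fraction` is a hypothetical summary of left-right asymmetry introduced for this example; the article does not specify its symmetry estimator in this section.

```python
def ecdf(values):
    """Empirical CDF: sorted values and the fraction of points <= each value."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered, [(i + 1) / n for i in range(n)]


def left_fraction(xs):
    """Fraction of berries lying left of the horizontal midpoint of the cluster."""
    midpoint = (min(xs) + max(xs)) / 2
    return sum(1 for x in xs if x < midpoint) / len(xs)


# Made-up berry centroid x coordinates (pixels); a wing extends to the right.
xs = [100, 110, 120, 130, 140, 150, 160, 170, 300, 400]
shifted = [x - min(xs) for x in xs]  # shift so coordinates start at 0
ordered, fn = ecdf(shifted)
print(left_fraction(xs))  # 0.8
```

For a symmetric cluster the fraction would be close to 0.5 and the cumulative curve close to a straight diagonal; here most berries sit in the main body on the left, so the curve rises steeply and then flattens across the wing.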
A cluster with a prominent wing, photographed from different angles, is provided as an example in
Fig. 4A. At 0° and 180° views, the wing is not visible, as it is either in front and completely
aligned with the main cluster or in the back. In this case, the cluster appears more cylindrical
and symmetrical along both axes. The empirical cumulative distribution functions for these 2 views,
shown as red and green dots in Fig. 4B and E, were more uniform and appeared as straight diagonal
lines. Conversely, at 90° and 270° views, the wing becomes visible and produces a very skewed
distribution along the x-axis. Since the 90° and 270° views, and the 0° and 180° views, can be seen
as "mirror" images, the distribution functions in Fig. 4D and E also display this mirroring
feature.
Masks generated by SAM for each berry object were represented as x, y coordinates, and their
corresponding polygons were drawn, as shown in Figs. 2A and 3D. Combining all the berry polygons
produced a representation of entire clusters. When a cluster has a cylindrical or globular shape,
and no wings are present, representing its shape is simple. However, when other cluster features
are present, such as wings, shoulders, and conical forms, among others, the so-called cluster shape
descriptor can vary depending on how much detail of these complex features is represented.
For example, for a cluster with a prominent wing, as the one
shown in Fig. 4A, should the outline (or contour) dening the
cluster shape include the sinus formed by the 2 wings? If so, how
far inside the sinus? e opposite approach would be to simply
Fig. 4. Cumulative distributions of berry locations along the horizontal and vertical axes. (A) Example of berries identified in a cluster imaged from 4 different angles (0°, 90°, 180°, and
270°). (B) Empirical cumulative distributions along the x-axis for the 4 angle views; berry locations along the x-axis are shifted to start at 0. (C) Similar to panel B, but for the y-axis.
(D) Similar to panel B, but berry locations along the x-axis are scaled from 0 to 100 and sampled with n = 100. (E) Similar to panel D, but for berry locations along the y-axis.
Fig. 5. What is cluster architecture? Example of concave hull calculation for different clusters (in columns) at different cluster shape definition levels (from top to bottom,
higher to lower definition); concave hulls are calculated on the union of all berry masks in the cluster.
connect the tips of the wing and the main cluster formation,
which would produce a simpler polygon. The same applies to
the presence of shoulders and curvatures along the cluster.
Figure 5 illustrates the same 8 clusters from Fig. 3D, outlined
using concave hulls with varying degrees of detail, from top to
bottom. In the top panels, the cluster outlines preserved detailed
features such as shoulders, indentations, and separations between
wings. Toward the bottom of the figure, most of these
features were lost.
The approaches described above to measure cluster architecture
were applied to all 3,431 cluster representations analyzed
in this study. Cumulative distribution functions for the x- and
y-axes showed varying levels of asymmetry. Along the x-axis (Fig. 6A),
the asymmetry is due to having more berries on either the right
or the left side of the cluster (likely because of the presence of
a wing). For example, the green lines in Fig. 6A represent the
distribution functions of clusters in which more berries exist
on the left side of the cluster (as much as ~75%). Conversely,
the purple lines represent clusters with a larger accumulation
of berries on the right side of the cluster. Finally, gray lines
represent more symmetrical clusters, with an equal number of
berries on the left and right. Regarding the y-axis (Fig. 6B), most
of the asymmetry is toward the base of the cluster, which is
expected, as many clusters exhibit conical forms. Importantly,
the color assignments (i.e., categories) in Fig. 6A and B are subjective
and for illustrative purposes only.
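For illustration only, the color categories given in the Fig. 6 caption can be written as a small rule. The thresholds below are taken from that caption; the function name, the inclusive bounds for the gray category, and the "unassigned" fallback are assumptions.

```python
def classify_ecdf(f25, f50, f75):
    """Assign the illustrative Fig. 6 color category to one ECDF curve.

    f25, f50, f75 are ECDF values (0..1) at normalized coordinates 25, 50, 75.
    """
    # Green: berries accumulate early along the axis (e.g., left side in Fig. 6A).
    if f25 > 0.3 and f75 > 0.8:
        return "green"
    # Purple: berries accumulate late along the axis (e.g., right side in Fig. 6A).
    if f25 < 0.2 and f75 < 0.7:
        return "purple"
    # Gray: close to the diagonal, i.e., a roughly uniform distribution.
    if 0.2 <= f25 <= 0.3 and 0.45 <= f50 <= 0.55 and 0.7 <= f75 <= 0.8:
        return "gray"
    # Curves that match none of the three illustrative categories.
    return "unassigned"


print(classify_ecdf(0.40, 0.70, 0.90))  # green
print(classify_ecdf(0.25, 0.50, 0.75))  # gray
```

Since the categories are subjective, many curves fall outside all three rules; the fallback label makes that explicit rather than forcing an assignment.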
Then, cluster shape variation was studied using the polygons
generated with concave hulls. The concave hulls were generated
using a conservative level of cluster feature preservation (using
the R function sf::st_concave_hull(), with ratio=5) but
with enough resolution to capture major asymmetries, wings,
and shoulders. In general, cluster shape exhibited a continuous
gradient of variability with no clear group formation (Fig. 6C
and D). In other words, there were no groups formed only with,
for example, winged and non-winged clusters, or symmetric
and non-symmetric clusters. Instead, asymmetries can be small
and slightly visible, and increase gradually in size and separation
from the main cluster. To understand what cluster features were
associated with each PC, 100 clusters with extreme PC scores
(the 50 most negative and the 50 most positive) were plotted for PCs
1 to 4 (Fig. 6E). PC1, which explained 53.23% of the variation,
was associated with aspect ratio, with more circular/globular
clusters having more negative values, and very elongated clusters
having more positive values. PC2, which explained 18.29% of the
variation, was associated with the location of the asymmetries
along the x-axis (either to the left or the right). Finally, PCs
3 and 4, which together accounted for a little less than 18% of the
variation, explained other more complex features (wings and shoulders)
that are more difficult to discern.
Is the level of sensitivity to complex cluster
features meaningful?
The methodologies employed in this study for identifying berries
within a cluster, counting them, studying their spatial distribution
to generate cumulative distribution functions, and
[Figure 6 panels: (A) Fn(x) against normalized x coordinate, with curves labeled non-symmetric (more berries to the left), non-symmetric (more berries to the right), and symmetric; (B) Fn(y) against normalized y coordinate, with curves labeled non-symmetric (more berries at the base), non-symmetric (more berries at the tip), and symmetric (cylindrical); (C) axes PC1 (54.42%) and PC2 (17.11%); (D) axes PC3 (13.66%) and PC4 (3.57%); (E) cluster outlines.]
Fig. 6. Comprehensive analysis of cluster architecture using cumulative distribution function and PCA of concave hulls. Empirical cumulative distributions for 3,431 clusters
using berry locations along the (A) x- and (B) y-axes; berry locations along both x- and y-axes are scaled from 0 to 100 and sampled with n = 100, similar to Fig. 4D and E. In
both cases, the green lines correspond to distributions with a value at normalized coordinate 25 larger than 0.3 and a value at normalized coordinate 75 larger than 0.8; the purple lines have
a value at normalized coordinate 25 < 0.2 and a value at normalized coordinate 75 < 0.7; finally, the gray lines have a value at coordinate 25 between 0.2 and 0.3, at coordinate 50 between 0.45 and
0.55, and at coordinate 75 between 0.7 and 0.8. Variation in cluster architecture along principal components 1 and 2 (C) and 3 and 4 (D). In panel C, different colors and sizes
correspond to variations in principal components 3 and 4, respectively. Similarly, in panel D, point color and size correspond to variations in principal components 1 and 2,
respectively. (E) One hundred clusters sampled from the extremes of principal components 1 (green), 2 (gray), 3 (dark cyan), and 4 (salmon); the clusters in each color group
are ordered from left to right and by rows according to their corresponding principal component values.
applying PCA to examine cluster shape variation demonstrated
high sensitivity (Fig. 6). However, a critical question is: are these
features primarily driven by genetic variation, or are they sim-
ply a result of environmental and non-genetic factors?
The primary aim of this research was to implement SAM for
berry identification and propose methodologies for leveraging
this information in cluster architecture and compactness analysis.
Therefore, the focus was not on characterizing specific cultivars
or genotypes in the surveyed population but rather on sampling
diverse cluster variations. Nevertheless, as mentioned earlier, the
sampled vines are part of a mapping population between Riesling
and Cabernet Sauvignon, planted in a randomized complete
block design with 3 contiguous vines per genotype per block.
This design allowed the calculation of repeatability, expressed
as the percentage of genetic variance relative to the phenotypic
variance.
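As a rough illustration of this ratio, repeatability can be estimated from a balanced one-way ANOVA. This is a sketch under stated assumptions, not the procedure used in the study: the text does not specify how the variance components were estimated, block effects are ignored here, and the function name and trait values are hypothetical.

```python
from statistics import mean


def repeatability(groups):
    """Estimate genetic variance / phenotypic variance from balanced data.

    groups maps each genotype to a list of replicate measurements.
    """
    sizes = {len(v) for v in groups.values()}
    if len(sizes) != 1:
        raise ValueError("this sketch assumes a balanced design")
    r = sizes.pop()   # replicates per genotype
    g = len(groups)   # number of genotypes
    grand = mean(x for v in groups.values() for x in v)
    # Mean squares between and within genotypes.
    ms_between = r * sum((mean(v) - grand) ** 2 for v in groups.values()) / (g - 1)
    ms_within = sum((x - mean(v)) ** 2 for v in groups.values() for x in v) / (g * (r - 1))
    # Method-of-moments genetic variance, truncated at zero.
    var_g = max((ms_between - ms_within) / r, 0.0)
    total = var_g + ms_within
    return var_g / total if total > 0 else 0.0


# Hypothetical trait values: 3 genotypes, 3 replicates each.
data = {"G1": [10, 12, 11], "G2": [20, 19, 21], "G3": [15, 16, 14]}
print(round(repeatability(data), 3))
```

Values near 1 indicate that most of the variation lies between genotypes, whereas values near 0 indicate that replicates of the same genotype differ as much as different genotypes do.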
First, to assess the consistency of the phenotypes measured
in this study, boxplot graphs per genotype were examined for
18 variables. These variables included basic descriptors such
as berry count, area, length, and width, all computed from the
berry masks identified by SAM. Additionally, cluster compactness
was calculated as the ratio between the sum of all berry
areas and the concave hull area. Using the empirical cumulative
distribution functions, the predicted percentage of berries at
x or y = 25, 50, and 75 was also determined. In terms of cluster
architecture based on concave hulls, PCs 1 and 2 were included.
Finally, cluster length, width, perimeter, and aspect ratio were
computed using the concave hulls.
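The compactness ratio described above can be illustrated with polygon areas computed by the shoelace formula. This is a minimal sketch: the function names and toy polygons are hypothetical, and the concave hull is taken as given rather than computed (the study obtained it with sf::st_concave_hull()).

```python
def polygon_area(points):
    """Area of a simple polygon given as [(x, y), ...] (shoelace formula)."""
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]  # wrap around to close the polygon
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def compactness(berry_polygons, hull_polygon):
    """Sum of berry areas divided by the area of the cluster hull."""
    hull_area = polygon_area(hull_polygon)
    if hull_area == 0:
        raise ValueError("hull polygon has zero area")
    return sum(polygon_area(p) for p in berry_polygons) / hull_area


# Toy example: two 1 x 1 "berries" inside a 2 x 2 hull.
berries = [
    [(0, 0), (1, 0), (1, 1), (0, 1)],
    [(1, 1), (2, 1), (2, 2), (1, 2)],
]
hull = [(0, 0), (2, 0), (2, 2), (0, 2)]
print(compactness(berries, hull))  # 0.5
```

Because the ratio is dimensionless, it does not depend on image scale, although it does depend on how much detail the hull preserves.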
Overall, variables such as berry count, area, length, and width,
as well as cluster area, length, width, and perimeter, showed good
consistency (Fig. 7A), high correlation (Fig. 7B), and medium-to-high
repeatability (Fig. 7C). While descriptors derived from
cumulative distributions showed a correlation among themselves,
except for ECDF at x = 25 and y = 25, their variability
was higher, likely influenced by non-genetic sources given their
very low or zero repeatability. Cluster compactness demonstrated
little correlation with other traits but exhibited good consistency
with a repeatability of ~0.6. PC1 from the PCA conducted on
concave hulls, which is related to cluster aspect ratio, also showed
good consistency and medium-to-high repeatability. In summary,
these analyses revealed that many variables computed from the
berry masks identified by SAM, along with others describing
more complex features in the cluster, possess a genetic component.
Nevertheless, certain variables, particularly those originating
from empirical cumulative distribution functions, seem to
be strongly affected by variations in the environment.
Discussion
Several computational, image-based strategies have been imple-
mented to measure grapevine cluster architecture and compact-
ness. However, only a few have been utilized for identifying
[Figure 7 panels: (A) per-genotype boxplots for berry count, berry area, berry length, berry width, cluster compactness (berry area/concave hull area), ECDF at x = 25, 50, and 75, ECDF at y = 25, 50, and 75, PC1 and PC2 on concave hulls, cluster area, cluster length, cluster width, cluster perimeter, and cluster aspect ratio; (B) correlation matrix of the same 18 variables (Corr scale from −1.0 to 1.0); (C) repeatability of each variable (axis from 0.0 to 0.8).]
Fig. 7. Variability in berry and cluster characteristics. (A) Variation in berry characteristics and cluster architecture grouped by genotype; each genotype was replicated in 3
blocks; for each replicated vine, 5 clusters were sampled and imaged from 1 or 4 angles; genotypes are ordered by mean value, and names are omitted due to space constraints.
(B) Pearson’s correlation between traits. (C) Repeatability is calculated as the proportion of genetic variance relative to the phenotypic variance.
individual berries within clusters [18–21,23]. Most of these
strategies rely on non-generalizable mathematical and analytical
frameworks for analyzing color images. Humans can easily
discern individual berries in a cluster image, even when taken
in the field or under challenging light conditions. Therefore, it
is reasonable to assume that machine learning algorithms could
achieve similar capabilities. However, until now, these models
have primarily been applied to cluster identification rather than
berry identification. This does not rule out the potential use of
“conventional” deep learning approaches trained with human-segmented
berries, but they would require a substantial amount
of image labeling for training.
With the recent introduction of foundation models, particularly
SAM [27], objects of interest can be automatically
segmented without the need for additional training or fine-tuning,
at least for natural objects. In some specific cases, such
as in medical imaging, additional fine-tuning allows for more
accurate predictive models capable of analyzing many different
image types [28]. Here, we demonstrate that out-of-the-box
SAM can accurately segment berries in a 2D grape cluster
image with up to a 0.96 correlation (human berry counts vs.
SAM predictions on visible berries in 2D images; Fig. S5).
While one might argue that the segmented masks produced by
SAM in this study needed supervised classification to identify
berry objects exclusively, the implementation of filters (IoU,
size, area, EFD, and PCA) was straightforward. This approach
can be applied to hundreds, thousands, or even millions of
masks without any changes to the code. A continuation
of this work could be the development of an automatic
classifier based on, for example, YOLO, that can use cropped
images based on bounding boxes generated by SAM.
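A filtering step of this kind can be sketched as below. It covers only the IoU and area criteria; the EFD- and PCA-based shape filters are not shown. The records mimic the dictionaries returned by SAM's automatic mask generator, which include `area` and `predicted_iou` keys, while the function name, thresholds, and example values are hypothetical and would need tuning per image set.

```python
def filter_berry_masks(masks, min_iou=0.90, min_area=500, max_area=20000):
    """Keep masks whose predicted IoU and pixel area are plausible for a berry.

    All three thresholds are illustrative assumptions, not values from the study.
    """
    kept = []
    for m in masks:
        # Discard masks that SAM itself scores as low quality.
        if m["predicted_iou"] < min_iou:
            continue
        # Discard specks (too small) and whole-cluster or background masks (too large).
        if not (min_area <= m["area"] <= max_area):
            continue
        kept.append(m)
    return kept


# Hypothetical mask records (segmentation arrays omitted for brevity).
masks = [
    {"area": 3200, "predicted_iou": 0.97},    # plausible berry
    {"area": 150000, "predicted_iou": 0.99},  # probably the entire cluster
    {"area": 2800, "predicted_iou": 0.71},    # low-confidence mask
]
print(len(filter_berry_masks(masks)))  # 1
```

Because the rule is a pure function of each mask record, the same call applies unchanged to any number of masks.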
Applying SAM to photos of clusters still on the vine is possible,
but it would require further development, particularly in
image preprocessing. This preprocessing step would
first need to identify clusters within vine images, which is feasible
with methods already available [16], and second, to remove
the background in cropped images containing clusters. Failing
to do this last step will cause SAM to segment non-berry objects,
such as leaves, trunks, or shoots (Fig. S4). In Fig. 8, we show 2
examples of how object removal could be performed on pre-cropped
images of clusters to further process them with SAM.
The same algorithm for berry detection and non-berry object
removal was used. One of the 2 models presented, RMBG (BRIA
Background Removal), is a background-removal tool available
at https://huggingface.co/briaai/RMBG-1.4. Although RMBG is
simple to use, it is not very customizable. For example, it does
not allow for defining the object of interest. However, it does
perform well at removing the background in images. The second
[Figure 8 panels: raw image; background removal using RMBG v1.4; depth estimation using Depth Anything (yellower color = closer objects); berry segmentation using SAM.]
Fig. 8. Example of 2 machine learning tools (RMBG and Depth Anything) for the preprocessing required to remove background before implementing SAM. Raw image taken
from https://fps.ucdavis.edu with permission.
model, Depth Anything [30], is used for monocular depth
estimation, which can be employed to remove backgrounds/
foregrounds based on depth. This serves only as an example of
how a future end-to-end pipeline for vineyard applications
might look. Our study aimed to showcase the capabilities of
zero-shot machine learning models that, despite their generalization
capabilities, perform well in specific situations, such as
berry segmentation. One of the important takeaways is that
researchers will have to spend less time on model training as
these models become more widely available.
Another consideration when deploying machine learning
tools for real-world applications (e.g., processing images directly
in the vineyard using a mobile device) is processing time. In
our study, the processing time per image was as fast as 14 s on
a GPU-powered machine, which was sufficient for our needs.
However, for large-scale applications and edge computing, other
SAM-like implementations, such as EdgeSAM [31], FastSAM
[32], and EfficientSAM [33], could be adopted.
Although the number of berries visible in a cluster image
underestimates the actual number of berries in a cluster, this
underestimation can be corrected using a linear regression model
(Fig. 2C). Moreover, to compensate for the variability in berry
number caused by rachis ramifications visible only
from certain angles, an additional image (for example, one taken at
90°) can be used to correct the berry count underestimation. Notably,
the berry masks generated by SAM can be used for comprehensive
cluster architecture analysis, which is only possible if
berries are spatially located.
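The correction step can be illustrated with an ordinary least-squares fit of actual berry counts on visible berry counts. The counts below are made up for the example and the function names are hypothetical; the coefficients of the model in Fig. 2C are not reproduced here.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be equal")
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


def predict_actual(visible, intercept, slope):
    """Adjust a visible berry count to an estimate of the actual count."""
    return intercept + slope * visible


# Hypothetical calibration set: berries visible in the image vs. counted by hand.
visible_counts = [40, 50, 60, 70]
actual_counts = [85, 105, 125, 145]
b0, b1 = fit_line(visible_counts, actual_counts)
print(predict_actual(55, b0, b1))  # estimated actual count for 55 visible berries
```

In practice the calibration set would come from clusters that were both imaged and destructively counted, and the fitted line would apply only to clusters imaged under the same conditions.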
While some of the trait variation in cluster architecture and
compactness, particularly that captured by the analysis of
empirical cumulative distributions, was influenced by environmental
factors, these traits can still have applications in
determining vineyard management practices. For instance,
cluster thinning and/or tipping could be targeted toward asymmetric
or winged clusters, or those with specific architectures.
In table grape production, certain cluster architectures might
be more appealing to consumers [34]. For these types of applications
to be feasible under field conditions, SAM would have
to be integrated into an existing pipeline that processes field
images obtained with cameras mounted on rovers or tractors.
In [16], for example, images from vines were acquired using a
sensing kit equipped with RGB cameras, and further processed
using YOLO to identify clusters within the image for yield
prediction. In this case, SAM could be incorporated into this
pipeline to compute additional variables regarding berry count,
size, and cluster architecture.
The observation that the cumulative distribution functions
(Fig. 6A and B) explaining cluster architecture showed lower
or zero repeatability is specific to the Riesling by Cabernet
Sauvignon population analyzed in this study. However, this
does not rule out the possibility that other mapping or breeding
populations display heritable variation for these traits.
Consequently, these traits could still be valuable for genetics
research or selection purposes in other mapping or breeding
populations.
Acknowledgments
The authors would like to thank Veronica Nunez, Jose Munoz,
Sadikshya Sharma, Yaniv Lupo, Hollywood Banayad, and Dan
Ng for their support during vineyard management, harvest,
and image annotation. The authors would also like to thank
Dario Cantu for providing access to the F1 population used
in this study.
Funding: This project was partially supported by USDA-
NIFA Specialty Crop Research Initiative Award No. 2022-
51181-38240.
Author contributions: E.T.-L. developed the proof of concept
and set up the computational workflow to implement SAM.
E.T.-L. and L.D.-G. conceived and designed the field experiment.
J.L.-B. and G.G.-Z. supported fieldwork and cluster imaging.
E.T.-L. and L.D.-G. wrote the manuscript.
Competing interests: The authors declare that they have no
competing interests.
Data Availability
All the data and code to reproduce the results of this study are
available at https://github.com/diazgarcialab/SAM-cluster-
segmentation.
Supplementary Materials
Figs. S1 to S6
References
1. Tello J, Ibáñez J. What do we know about grapevine bunch
compactness? A state-of-the-art review. Aust J Grape Wine Res.
2018;24(1):6–23.
2. Richter R, Gabriel D, Rist F, Töpfer R, Zyprian E.
Identification of co-located QTLs and genomic regions
affecting grapevine cluster architecture. Theor Appl Genet.
2019;132(4):1159–1177.
3. Correa J, Mamani M, Muñoz-Espinoza C, Laborie D, Muñoz C,
Pinto M, Hinrichsen P. Heritability and identification of QTLs
and underlying candidate genes associated with the architecture
of the grapevine cluster (Vitis vinifera L.). Theor Appl Genet.
2014;127(5):1143–1162.
4. Underhill A, Hirsch C, Clark M. Image-based phenotyping
identifies quantitative trait loci for cluster compactness in
grape. J Am Soc Hortic Sci. 2020;145(6):363–373.
5. Fanizza G, Lamaj F, Costantini L, Chaabane R, Grando MS.
QTL analysis for fruit yield components in table grapes (Vitis
vinifera). Theor Appl Genet. 2005;111(4):658–664.
6. Richter R, Rossmann S, Töpfer R, Theres K, Zyprian E. Genetic
analysis of loose cluster architecture in grapevine. BIO Web
Conf. 2017;9:01016.
7. Li-Mallet A, Rabot A, Geny L. Factors controlling
inflorescence primordia formation of grapevine: Their role in
latent bud fruitfulness? A review. Botany. 2016;94:147–163.
8. Pieri P, Zott K, Gomès E, Hilbert G. Nested effects of
berry half, berry and bunch microclimate on biochemical
composition in grape. OENO One. 2016;50:23.
9. Hed B, Ngugi HK, Travis JW. Relationship between cluster
compactness and bunch rot in Vignoles grapes. Plant Dis.
2009;93:1195–1201.
10. Vail ME, Wolpert JA, Gubler WD, Rademacher MR. Effect
of cluster tightness on botrytis bunch rot in six chardonnay
clones. Plant Dis. 1998;82(1):107–109.
11. Vail ME, Marois JJ. Grape cluster architecture and the
susceptibility of berries to Botrytis cinerea. Phytopathology.
1991;81:188–191.
12. Austin CN, Wilcox WF. Effects of sunlight exposure on
grapevine powdery mildew development. Phytopathology.
2012;102(9):857–866.
13. Azevedo CF, Ferrão LFV, Benevenuto J, de Resende MDV,
Nascimento M, Nascimento ACC, Munoz PR. Using visual
scores for genomic prediction of complex traits in breeding
programs. Theor Appl Genet. 2023;137(1):9.
14. Underhill A, Hirsch CD, Clark MD. Evaluating and mapping
grape color using image-based phenotyping. Plant Phenomics.
2020;2020:8086309.
15. Font D, Tresanchez M, Martínez D, Moreno J, Clotet E,
Palacín J. Vineyard yield estimation based on the analysis of
high resolution images obtained with artificial illumination at
night. Sensors. 2015;15(4):8284–8301.
16. Olenskyj AG, Sams BS, Fei Z, Singh V, Raja PV, Bornhorst GM,
Earles JM. End-to-end deep learning for directly estimating
grape yield from ground-based imagery. Comput Electron
Agric. 2022;198:Article 107081.
17. Nuske S, Wilshusen K, Achar S, Yoder L, Narasimhan S,
Singh S. Automated visual yield estimation in vineyards. J Field
Robot. 2014;31(5):837–860.
18. Schöler F, Steinhage V. Automated 3D reconstruction of grape
cluster architecture from sensor data for efficient phenotyping.
Comput Electron Agric. 2015;114:163–177.
19. Li M, Klein LL, Duncan KE, Jiang N, Chitwood DH, Londo JP,
Miller AJ, Topp CN. Characterizing 3D inflorescence
architecture in grapevine using X-ray imaging and advanced
morphometrics: Implications for understanding cluster
density. J Exp Bot. 2019;70(21):6261–6276.
20. Ivorra E, Sánchez AJ, Camarasa JG, Diago MP, Tardaguila J.
Assessment of grape cluster yield components based on 3D
descriptors using stereo vision. Food Control. 2015;50:273–282.
21. Luo L, Liu W, Lu Q, Wang J, Wen W, Yan D, Tang Y. Grape
berry detection and size measurement based on edge
image processing and geometric morphology. Mach Des.
2021;9(10):233.
22. Aquino A, Diago MP, Millán B, Tardáguila J. A new
methodology for estimating the grapevine-berry number per
cluster using image analysis. Biosyst Eng. 2017;156:80–95.
23. Zabawa L, Kicherer A, Klingbeil L, Töpfer R, Kuhlmann H,
Roscher R. Counting of grapevine berries in images via
semantic segmentation using convolutional neural networks.
ISPRS J Photogramm Remote Sens. 2020;164:73–83.
24. Zhang Y, Jiao R. Towards segment anything model (SAM) for
medical image segmentation: A survey. arXiv. 2023. http://
arxiv.org/abs/2305.03678
25. Zhou C, Li Q, Li C, Yu J, Liu Y, Wang G, Zhang K, Ji C, Yan
Q, Peng H, et al. A comprehensive survey on pretrained
foundation models: A history from BERT to ChatGPT. arXiv.
2023. http://arxiv.org/abs/2302.09419
26. Mazurowski MA, Dong H, Gu H, Yang J, Konz N, Zhang Y.
Segment anything model for medical image analysis: An
experimental study. Med Image Anal. 2023;89:Article
102918.
27. Kirillov A, Mintun E, Ravi N, Mao H, Rolland C, Gustafson L,
Xiao T, Whitehead W, Berg AC, Lo W-Y, et al. Segment
anything. arXiv. 2023. http://arxiv.org/abs/2304.02643
28. Ma J, He Y, Li F, Han L, You C, Wang B. Segment anything in
medical images. Nat Commun. 2024;15:654.
29. Bonhomme V, Picq S, Gaucherel C, Claude J. Momocs:
Outline analysis using R. J Stat Softw. 2014;56(13): 10.18637/
jss.v056.i13.
30. Yang L, Kang B, Huang Z, Xu X, Feng J, Zhao H. Depth
anything: Unleashing the power of large-scale unlabeled data.
arXiv. 2024. http://arxiv.org/abs/2401.10891
31. Zhou C, Li X, Loy CC, Dai B. EdgeSAM: Prompt-in-the-loop
distillation for on-device deployment of SAM. arXiv. 2023.
http://arxiv.org/abs/2312.06660
32. Zhao X, Ding W, An Y, Du Y, Yu T, Li M, Tang M, Wang
J. Fast segment anything. arXiv. 2023. http://arxiv.org/
abs/2306.12156
33. Xiong Y, Varadarajan B, Wu L, Xiang X, Xiao F, Zhu C, Dai
X, Wang D, Sun F, Iandola F, et al. EfficientSAM: Leveraged
masked image pretraining for efficient segment anything.
arXiv. 2023. http://arxiv.org/abs/2312.00863
34. Zhou J, Cao L, Chen S, Perl A, Ma H. Consumer-assisted
selection: The preference for new tablegrape cultivars in China.
Aust J Grape Wine Res. 2015;21(3):351–360.