The onset of greening, the start of senescence, the timing of the growing season maximum and the growing season length are phenology metrics frequently calculated from satellite imagery. However, because land surface phenology is observed at a much coarser spatial resolution than ground phenology and is difficult to validate even with networks of ground observations, it is often unclear what the land surface phenology metrics actually quantify. For example, in northern biomes, the greatest increase in a satellite-derived vegetation index, which some methods label the "start of the season" (SOS), can often be due to snow melt. The end of greenness (EOS) metric, on the other hand, could measure an extended period of cloudiness instead of actual vegetation senescence. Because the relationship between satellite and ground observations of phenological events is ambiguous, many techniques have been developed to derive these metrics. Here I discuss and compare some of the more commonly applied methods: the delayed moving average method, the percent threshold method, quadratic models based on accumulated growing degree-days, and the MOD12 phenology product, which relies on piecewise sigmoidal models. I demonstrate the methods using both NDVI and EVI time series from 2001 and 2007 derived from MODIS/Terra+Aqua Nadir BRDF-Adjusted Reflectance 16-Day L3 Global 0.05Deg CMG V005 (MCD43C4) data for North America north of 30°N. To remove snow-covered pixels, I use the snow and ice QA flags. To compare the methods, I evaluate the spatio-temporal differences in the phenological metrics and discuss the bias-variance dilemma, that is, the trade-off between an over-fitted model that is too complicated and an over-smoothed model that is too simple.
Finally, I discuss the related question of whether land surface phenology metrics can be compared statistically between two years of data that represent extremes in the polarity of the North Atlantic Oscillation (2000 and 2007).
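
As an illustration of one of the methods named above, the following is a minimal sketch of a percent threshold calculation for a single pixel. It is not the implementation used in this work: the function name `percent_threshold`, the default fraction of 0.5, the linear interpolation between composites and the sample values are all assumptions made for the example. The threshold is taken as a fraction of the seasonal amplitude of the vegetation index; SOS is the first upward crossing before the seasonal maximum and EOS is the last downward crossing after it.

```python
def percent_threshold(days, vi, fraction=0.5):
    """Estimate (SOS, EOS) for one pixel with a percent threshold rule.

    days     -- composite dates as day of year, in increasing order
    vi       -- vegetation index (e.g. NDVI or EVI) for each date
    fraction -- assumed fraction of the seasonal amplitude (hypothetical default)
    """
    if len(days) != len(vi) or len(vi) < 3:
        raise ValueError("days and vi must have equal length of at least 3")

    vmin, vmax = min(vi), max(vi)
    if vmax == vmin:
        return None, None  # flat series: no seasonal cycle to detect

    threshold = vmin + fraction * (vmax - vmin)
    peak = vi.index(vmax)  # index of the seasonal maximum

    def crossing(i, j):
        # Linearly interpolate the day on which vi equals the threshold
        # between composites i and j.
        return days[i] + (threshold - vi[i]) * (days[j] - days[i]) / (vi[j] - vi[i])

    # SOS: first upward crossing of the threshold before the peak.
    sos = None
    for i in range(peak):
        if vi[i] < threshold <= vi[i + 1]:
            sos = crossing(i, i + 1)
            break

    # EOS: last downward crossing of the threshold after the peak.
    eos = None
    for i in range(len(vi) - 1, peak, -1):
        if vi[i] < threshold <= vi[i - 1]:
            eos = crossing(i - 1, i)
            break

    return sos, eos


# Made-up 16-day composite series for a single pixel (illustrative values only).
sample_days = [0, 16, 32, 48, 64, 80, 96]
sample_vi = [0.2, 0.2, 0.4, 0.8, 0.6, 0.3, 0.2]
sos, eos = percent_threshold(sample_days, sample_vi)
season_length = eos - sos
```

A rule of this kind reacts to any rise in the index, so a snow-melt signal left in the series would shift the estimated SOS in the same way as real greening, which is the ambiguity described above.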