Better Predictions

The prediction strategy we use depends on what data we already have about the block or vineyard of interest. There are three possibilities we deal with:
  • We have no prior data related to the variety grown on the block
  • We have no prior data related to THIS block, but we do know about similar blocks
  • We have a reasonable set of data about the variety and the region.
No history
Some blocks have no historical data to learn from; the variety may be “new” to the region in question, or even relatively rare in the country. In this case, we could use the common rule of thumb of One Baume Per Week (OPW for short). In statistical terms this is a poor prediction model: its residuals are not normally distributed, which points to a modelling bias (in other words, the model does not “really fit” the data). It becomes more useful as harvest approaches, but by then the planning horizon is short.
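As a minimal sketch of the OPW rule of thumb, the snippet below projects a harvest date from a single sample by assuming a constant ripening speed. The function name, the sample date and the Baume values are hypothetical, chosen only for illustration.

```python
import math
from datetime import date, timedelta


def opw_harvest_date(sample_date, current_baume, target_baume, baume_per_week=1.0):
    """Project a harvest date assuming a constant ripening speed (1 Baume/week by default)."""
    if baume_per_week <= 0:
        raise ValueError("ripening speed must be positive")
    # Never project into the past if the sample is already at or above target.
    weeks_left = max(0.0, (target_baume - current_baume) / baume_per_week)
    # Round up to whole days so the projected date is never too early.
    return sample_date + timedelta(days=math.ceil(weeks_left * 7))


# Hypothetical sample: 11.0 Baume measured on 1 Feb 2024, target of 13.0 Baume.
print(opw_harvest_date(date(2024, 2, 1), 11.0, 13.0))  # 2024-02-15
```

Because the speed is fixed, any error in the "one per week" assumption grows with the distance to target, which is why the rule only becomes trustworthy close to harvest.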
When there is only similar historical data to learn from

Some blocks have no historical data of their own, but have many similar neighbouring blocks to draw on. In this case, we can use regional/varietal linear regressions to determine the speed at which the grape ripens, expressed as X Baume Per Week (XPW for short). Sadras and Petrie conducted research resulting in a list of 9 region/variety speeds (“Predicting the time course of grape ripening”, Australian Journal of Grape and Wine Research, Volume 18, Issue 1, pages 48–56, February 2012; registration required).
This is still a sub-optimal prediction model, in the sense that it has a linearity bias that leaves the residuals non-normally distributed. It is nevertheless better than OPW: XPW captures the ripening similarities between blocks and draws on a large number of samples, which cancels out more of the noise. We have systematised this approach and now have a list of 140 regional/varietal speeds. While it’s an improvement, it is no panacea because, in addition to the linearity bias, it also suffers from a ‘constant weather’ bias. The data in our possession shows that the same block can be harvested at its “historical date” give or take 5 weeks, which again shortens the planning horizon.
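To illustrate the XPW idea, here is a rough sketch of estimating one shared ripening speed from several similar blocks with a least-squares line. It is not the Sadras and Petrie procedure, nor the method behind the list of 140 speeds: the block names, the sample values and the choice to centre each block on its own means are all assumptions made for the example.

```python
def pooled_baume_per_week(blocks):
    """Estimate one shared ripening speed from several blocks' (day, baume) samples.

    Each block's samples are centred on that block's own means first, so that
    blocks which started ripening earlier or later do not distort the shared slope.
    """
    sxx = sxy = 0.0
    for samples in blocks.values():
        n = len(samples)
        mean_day = sum(d for d, _ in samples) / n
        mean_baume = sum(b for _, b in samples) / n
        for d, b in samples:
            sxx += (d - mean_day) ** 2
            sxy += (d - mean_day) * (b - mean_baume)
    # Least-squares slope is in Baume per day; convert to Baume per week.
    return 7 * sxy / sxx


# Hypothetical samples: (day of season, Baume) for two neighbouring blocks.
similar_blocks = {
    "block_a": [(0, 8.0), (7, 9.2), (14, 10.4)],
    "block_b": [(3, 7.1), (10, 8.3), (17, 9.5)],
}
xpw = pooled_baume_per_week(similar_blocks)
print(round(xpw, 2))  # 1.2
```

The resulting speed can replace the fixed "one per week" in an OPW-style projection, but the prediction is still a straight line, so the linearity bias remains.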
When we have historical data
In this case, we can model ripening rates with a view to reducing model bias to insignificant levels. Achieving this leaves us with truly random noise (that is our ultimate modelling quality criterion). Truly random noise allows us to safely use the statistical tools that are designed for it and to innovate on the predictive software side. There are two approaches to achieve this, and both rely on predicting the speed rather than the value:
  • A statistical modelling approach
    Samples are considered “as a whole”, and each sample is decorated with likely influential factors (e.g. variety and region). Using specialist statistical software, it is then possible to analyse the effect of each factor on ripening speed.
    This is an unsupervised machine learning method CSIRO has perfected.
  • A physics/chemistry approach
    Samples are considered as time series, each vintage being a period of a block ripening signal. Each period is considered independent from the previous periods.
    Samples are then analysed to test candidate ripening equations. This is a supervised machine learning method Thoughtpool has developed.
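What "predicting the speed rather than the value" can look like is sketched below. This is not the CSIRO or Thoughtpool model: the candidate equation (speed falling linearly as Baume rises), the function names and the sample values are assumptions made purely for illustration.

```python
def sample_speeds(samples):
    """Turn consecutive (day, baume) samples into (mid_baume, baume_per_day) pairs."""
    return [((b0 + b1) / 2, (b1 - b0) / (d1 - d0))
            for (d0, b0), (d1, b1) in zip(samples, samples[1:])]


def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return my - slope * mx, slope


def days_to_target(current_baume, target_baume, intercept, slope, step=0.25, max_days=200):
    """Step the fitted speed forward in time until the target Baume is reached.

    Returns None if the modelled speed stalls before the target is reached.
    """
    baume, days = current_baume, 0.0
    while baume < target_baume and days < max_days:
        speed = intercept + slope * baume
        if speed <= 0:
            return None
        baume += speed * step
        days += step
    return days if baume >= target_baume else None


# Hypothetical vintage: ripening slows down as Baume rises.
samples = [(0, 6.0), (7, 8.0), (14, 9.5), (21, 10.6)]
intercept, slope = fit_line(sample_speeds(samples))
print(days_to_target(10.6, 13.0, intercept, slope))
```

Because the speed is modelled as a function of ripeness, the predicted Baume curve is no longer a straight line, and the projected date for this hypothetical block falls later than a constant one-Baume-per-week projection from the same last sample.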
Both approaches result in a “normalisation of residual variances” by using non-linear models, which now allows us to:
  • extend and stabilise predictions in typical weather conditions:
    The planning benefits are important: good samples in an easy vintage produce transient/unreliable dates under OPW (as a result of the linear bias), whereas they produce stable dates with non-linear models.
    As a result, less time is spent adapting a plan when there is no need to.
  • actually model the effects of atypical weather:
    The planning benefits are also important, as good samples in a difficult vintage MUST result in shifting dates to reflect the changing conditions. This additional level of modelling is impossible to conduct over a biased ripening model.
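The residual criterion used throughout can be illustrated with a crude check: fit a straight line to a ripening curve that slows down, then look at whether the residuals change sign at random or in a systematic pattern. The curve below is hypothetical and noise-free, and counting sign changes is only a rough diagnostic, not a formal normality test (a library routine such as `scipy.stats.shapiro` would be the usual choice for that).

```python
import math


def linear_residuals(samples):
    """Residuals of an ordinary least-squares straight line through (day, baume) samples."""
    n = len(samples)
    mean_day = sum(d for d, _ in samples) / n
    mean_baume = sum(b for _, b in samples) / n
    sxx = sum((d - mean_day) ** 2 for d, _ in samples)
    sxy = sum((d - mean_day) * (b - mean_baume) for d, b in samples)
    slope = sxy / sxx
    intercept = mean_baume - slope * mean_day
    return [b - (intercept + slope * d) for d, b in samples]


def sign_changes(residuals):
    """Count how often consecutive non-zero residuals switch sign."""
    signs = [r > 0 for r in residuals if r != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


# Hypothetical ripening curve that slows as it approaches 14 Baume, sampled weekly.
curve = [(day, 14.0 - 8.0 * math.exp(-day / 20.0)) for day in range(0, 56, 7)]
residuals = linear_residuals(curve)
print([round(r, 2) for r in residuals])
print(sign_changes(residuals))  # 2
```

The residuals are negative at both ends and positive in the middle, so they change sign only twice across eight samples. That arch shape is the signature of a linear model fitted to non-linear ripening; with an unbiased model the signs would be expected to alternate much more irregularly.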
To check out what these predictions look like, we’ve created a “prediction widget” you can play around with on the Harvest-Plan site.
Prediction after two samples
