In the last post, we considered the situations where we had little or no historical data for the vineyard of interest. In this post, we consider our preferred situation: we have data.
When we have historical data
In this case, we can model ripening rates with a view to reducing model bias to insignificant levels. Achieving this leaves us with truly random noise, which is our ultimate modelling quality criterion. Truly random noise allows us to safely use statistical tools that are designed for it (e.g. random walk correction) and to innovate on the predictive software side (e.g. conditional random walk correction). There are two approaches to achieving this, and both rely on predicting the ripening speed rather than the value itself:

  • A statistical modelling approach
    Samples are considered “as a whole”, and each sample is tagged with the factors likely to influence it (e.g. variety and region). Using specialist statistical software, it is then possible to analyse the effect of each factor on ripening speed.
    This is an unsupervised machine learning method CSIRO has perfected.
  • A physics/chemistry approach
    Samples are considered as time series, each vintage being one period of a block's ripening signal. Each period is considered independent of the previous periods.
    Samples are then analysed to test candidate ripening equations. This is a supervised machine learning method Thoughtpool has developed. We call this method MAM (short for Modulated Asymptotic Model).
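To make "predicting the speed rather than the value" concrete, here is a minimal Python sketch. It assumes a generic first-order asymptotic curve, in which ripening speed is proportional to the remaining gap to a plateau. That curve is an illustrative assumption only: it is not the MAM equations and not CSIRO's method. The function names, the Brix figures and the target are all hypothetical, and the samples are synthetic.

```python
import math


def fit_speed_model(days, brix):
    """Fit speed = k * (b_max - level) by least squares on finite differences.

    Assumed model (hypothetical, for illustration): the ripening speed falls
    linearly as the sugar level approaches a plateau b_max.
    """
    speeds, levels = [], []
    for t0, t1, b0, b1 in zip(days, days[1:], brix, brix[1:]):
        speeds.append((b1 - b0) / (t1 - t0))  # observed speed over the interval
        levels.append((b0 + b1) / 2)          # level at the interval midpoint
    n = len(speeds)
    mean_x = sum(levels) / n
    mean_y = sum(speeds) / n
    sxx = sum((x - mean_x) ** 2 for x in levels)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(levels, speeds))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    k = -slope             # speed = k*b_max - k*level, so the slope is -k
    b_max = intercept / k  # and the intercept is k*b_max
    return k, b_max


def days_to_target(k, b_max, current, target):
    """Integrate the fitted speed from the current level up to the target."""
    if target >= b_max:
        raise ValueError("target is at or above the fitted plateau")
    return math.log((b_max - current) / (b_max - target)) / k


# Synthetic weekly samples (not real vineyard data).
days = [0, 7, 14, 21, 28]
brix = [26 - 16 * math.exp(-0.05 * d) for d in days]

k, b_max = fit_speed_model(days, brix)
days_left = days_to_target(k, b_max, brix[-1], 24.0)
print(f"k={k:.4f} per day, plateau={b_max:.2f}, days to target={days_left:.1f}")
```

Fitting the speed first and integrating it afterwards is what lets the prediction slow down as the fruit approaches its plateau, which a straight line through the values cannot do.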
Both approaches use non-linear models to achieve a “normalisation of residual variances”. The distribution of variances is not always normal (in the statistical sense) for each block/vintage, because extreme short-term weather variations can affect predictions. It does remain normal when the weather is typical, however, which now allows us to:
  • extend and stabilise predictions in typical weather conditions:
    The planning benefits are significant: good samples in an easy vintage give transient, unreliable dates with OPW (a result of its linear bias), whereas they give stable dates with non-linear models.
    As a result, less time is spent adapting a plan when there is no need to.
  • actually model the effects of atypical weather:
    Here too the planning benefits are significant: good samples in a difficult vintage MUST result in shifting dates that reflect the changing conditions. This additional level of modelling is impossible to build on top of a biased ripening model.
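The stability argument can be illustrated with a small Python sketch. It compares a straight-line fit, used here only as a stand-in for a linear method (it is not claimed to be OPW's actual algorithm), with a generic asymptotic fit (not MAM) as weekly samples accumulate. All names and figures are hypothetical, and the data is synthetic and noise-free, standing in for "good samples in an easy vintage".

```python
import math


def least_squares(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def linear_date(days, brix, target):
    """Day on which a straight line through the samples reaches the target."""
    slope, intercept = least_squares(days, brix)
    return (target - intercept) / slope


def asymptotic_date(days, brix, target):
    """Day on which an assumed curve with speed = k * (b_max - level) reaches the target."""
    speeds = [(b1 - b0) / (t1 - t0)
              for t0, t1, b0, b1 in zip(days, days[1:], brix, brix[1:])]
    levels = [(b0 + b1) / 2 for b0, b1 in zip(brix, brix[1:])]
    slope, intercept = least_squares(levels, speeds)
    k = -slope
    b_max = intercept / k
    return days[-1] + math.log((b_max - brix[-1]) / (b_max - target)) / k


# Synthetic, noise-free weekly samples (not real vineyard data).
all_days = [0, 7, 14, 21, 28, 35]
all_brix = [26 - 16 * math.exp(-0.05 * d) for d in all_days]
TARGET = 24.0

# Re-predict the target date each time a new sample arrives.
linear_dates = [linear_date(all_days[:n], all_brix[:n], TARGET) for n in range(3, 7)]
asymptotic_dates = [asymptotic_date(all_days[:n], all_brix[:n], TARGET) for n in range(3, 7)]

linear_spread = max(linear_dates) - min(linear_dates)
asymptotic_spread = max(asymptotic_dates) - min(asymptotic_dates)
print("straight-line dates:", [round(d, 1) for d in linear_dates])
print("asymptotic dates:   ", [round(d, 1) for d in asymptotic_dates])
```

On this synthetic data the straight-line date moves by several days as samples arrive, even though nothing about the vintage has changed, while the asymptotic date moves by well under a day. Once the typical-weather date is stable like this, a date that does move can be read as a response to changing conditions rather than as model bias.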

To check out what these predictions look like, we’ve created a “prediction widget” you can play around with on the Harvest-plan site.