---
title: "Field-photo-only classification"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Field-photo-only classification}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment  = "#>",
  eval     = FALSE
)
```

`classify_from_photos()` answers a field-realistic question: *given only a
photograph of a soil profile and a GPS fix, what is this soil?* No augered
samples, no laboratory sheet -- just what a phone camera and a coordinate can
provide. It is a screening tool, not a replacement for a described and sampled
profile, and the result is graded accordingly.

# The pipeline

```
profile photo  --VLM-->  Munsell colour per horizon
field sheet    --VLM-->  site metadata (optional)
GPS fix        --SoilGrids-->  depth-resolved clay / sand / silt / pH / OC / CEC
                         |
                         v
              assembled PedonRecord
                         |
                         v
        WRB 2022  +  SiBCS 5  +  USDA ST 13
```

Three things never happen here: the taxonomic key is never delegated to the
model; quantitative non-colour attributes are never read off a photograph
(that is a hard prompt-level rule); and a SoilGrids prior never silently
displaces a measured value -- the `PedonRecord` authority order forbids it.
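
The third rule amounts to "fill gaps, never overwrite". The chunk below is a
minimal sketch of that idea in base R, not the `PedonRecord` implementation:
the `horizons` table, the `clay_source` column, the `"measured"` label and the
prior values are all hypothetical, chosen only to show a prior landing in a
missing cell while a measured cell stays untouched.

```{r authority-sketch}
# Hypothetical horizon table: one measured clay value, one missing.
horizons <- data.frame(
  clay_pct    = c(32, NA),
  clay_source = c("measured", NA),
  stringsAsFactors = FALSE
)
prior_clay <- c(25, 28)   # made-up prior values for the same two horizons

# Fill only the cells that are missing; measured cells are left as they are.
missing <- is.na(horizons$clay_pct)
horizons$clay_pct[missing]    <- prior_clay[missing]
horizons$clay_source[missing] <- "inferred_prior"

horizons   # horizon 1 keeps its measured 32; horizon 2 takes the prior 28
```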

# A minimal run

```{r run}
library(soilKey)

res <- classify_from_photos(
  images   = list(profile = "perfil.jpg", fieldsheet = "ficha.jpg"),
  lat      = -22.74,
  lon      = -43.68,
  country  = "BR",
  provider = ellmer::chat_anthropic()   # any ellmer chat object
)

res$wrb$name            # e.g. "Rhodic Ferralsol (Clayic, ...)"
res$wrb$evidence_grade  # "D" -- VLM-extracted; or "C" with a prior
res$summary             # one row per system
```

`images` can also be a plain character vector of profile photos when there is
no field sheet:

```{r images-vector}
res <- classify_from_photos("perfil.jpg", lat = -22.74, lon = -43.68,
                            provider = ellmer::chat_anthropic())
```

# The `provider` argument is required

There is deliberately no default provider. Silently returning a
plausible-looking classification built from canned data would be worse than an
error, so you must supply either a live [ellmer](https://ellmer.tidyverse.org/)
chat object or a `MockVLMProvider` for testing:

```{r provider}
# Testing / offline: a mock provider returning a canned, schema-valid response.
# `my_canned_munsell_json` is a placeholder for a response you supply yourself.
mock <- MockVLMProvider$new(responses = list(my_canned_munsell_json))
classify_from_photos("perfil.jpg", lat = -22.7, lon = -43.6, provider = mock)
```

# The SoilGrids depth prior

Munsell colour alone rarely keys a profile past the data-poor catch-all classes
(Regosols / Neossolos / Entisols). `classify_from_photos()` therefore
back-fills the missing horizon attributes from a SoilGrids depth prior, on by
default. `apply_soilgrids_depth_prior()` does the work and can be called on
its own:

```{r soilgrids}
# Example pedon with clay blanked out, so there is a gap to back-fill.
p <- make_cambisol_canonical()
p$horizons$clay_pct <- NA_real_

# Live: fetch the six SoilGrids 2.0 depth slices via the ISRIC REST API.
apply_soilgrids_depth_prior(p)

# Offline / reproducible: pass the six-slice profiles directly.
apply_soilgrids_depth_prior(
  p,
  depth_profiles = list(clay_pct = c(18, 20, 24, 28, 30, 30)))
```

For each horizon, each missing attribute is interpolated at the horizon's
mid-depth from the 0-5 / 5-15 / 15-30 / 30-60 / 60-100 / 100-200 cm slices and
recorded with `source = "inferred_prior"`. Pass `soilgrids = FALSE` to skip the
prior if you only want the Munsell-driven result.
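
The mid-depth lookup can be sketched in base R. This is an illustration under
stated assumptions rather than soilKey's own code: it treats each slice value
as sitting at the slice midpoint, joins the midpoints linearly, and holds the
nearest slice value outside that range. The helper `interp_at_mid_depth()` and
the three horizon depths are hypothetical.

```{r mid-depth-sketch}
# SoilGrids slice boundaries (cm), as listed above.
slice_top    <- c(0, 5, 15, 30, 60, 100)
slice_bottom <- c(5, 15, 30, 60, 100, 200)
slice_mid    <- (slice_top + slice_bottom) / 2   # 2.5, 10, 22.5, 45, 80, 150

# Hypothetical helper: linear interpolation between slice midpoints.
# rule = 2 holds the nearest slice value outside the 2.5-150 cm range.
interp_at_mid_depth <- function(top_cm, bottom_cm, slice_values) {
  horizon_mid <- (top_cm + bottom_cm) / 2
  stats::approx(x = slice_mid, y = slice_values,
                xout = horizon_mid, rule = 2)$y
}

clay_slices <- c(18, 20, 24, 28, 30, 30)

# Three illustrative horizons: 0-20, 20-60 and 60-120 cm.
interp_at_mid_depth(top_cm = c(0, 20, 60), bottom_cm = c(20, 60, 120),
                    slice_values = clay_slices)
# about 20.0, 27.1 and 30.0 % clay
```

The middle horizon shows the interpolation at work: its mid-depth of 40 cm
falls between the 15-30 and 30-60 cm slice midpoints, so the result lies
between 24 and 28.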

# Reading the evidence grade

Because every value came from a photograph or a prior, the classification's
evidence grade is low by construction -- `D` where a VLM-extracted attribute
was used, `C` where only a SoilGrids prior contributed. For a cell-by-cell
view use `compute_per_attribute_evidence_grade()`:

```{r grade}
grades <- compute_per_attribute_evidence_grade(res$pedon)
grades            # data.table(horizon_idx, attribute, grade)
```

The five-grade scale is: **A** measured, **B** spectra-predicted,
**C** prior-inferred, **D** VLM-extracted, **E** user-assumed. A photo-only
classification will never report grade A or B -- and that is the point: the
grade tells the user exactly how much trust the result has earned.
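
One way to read the scale is as an ordered factor in which the weakest cell
sets the headline grade. The chunk below assumes that worst-of rule, which the
text above implies but does not spell out, and the `cells` table is made-up
illustration data (including the attribute name `munsell_hue`), not output
from soilKey.

```{r grade-sketch}
# The five grades, ordered from most to least trusted.
grade_levels <- c("A", "B", "C", "D", "E")

# Made-up per-cell grades shaped like (horizon_idx, attribute, grade).
cells <- data.frame(
  horizon_idx = c(1, 1, 2, 2),
  attribute   = c("munsell_hue", "clay_pct", "munsell_hue", "clay_pct"),
  grade       = c("D", "C", "D", "C")
)
cells$grade <- factor(cells$grade, levels = grade_levels, ordered = TRUE)

# Assumed rule: the overall grade is the weakest (latest-lettered) grade used.
overall <- as.character(max(cells$grade))
overall                    # "D"

# A photo-only run should contain no A or B cells at all.
any(cells$grade <= "B")    # FALSE
```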

# When to use it

`classify_from_photos()` is built for reconnaissance: a quick, georeferenced,
multi-system first guess from a field visit, to be confirmed later by
description and sampling. The handful of horizons it can see, the colour it
can estimate, and the regional prior it can fetch are genuinely informative --
but the evidence grade is the honest headline, and it is never an A.