
cmip7: fall back to variable_id when CF detection finds zero geophysical variables #49

Open
JanStreffing wants to merge 1 commit into ESGF:master from JanStreffing:fix/geophysical-variable-id-fallback

@JanStreffing

Contributor

Closes #48.

CMIP7 region-selector fx files (basin, siline, and similar) carry their data variable as a CF flag-valued integer, with flag_values and flag_meanings mapping integer codes to region or section names. compliance_checker.cf.util.is_geophysical excludes any variable with flag_meanings from the geophysical-variable set, as a heuristic for status flags. The exclusion is right for QC flags but wrong for these region selectors, whose standard_name is region, not status_flag, and which are the file's data variable by CMIP7 design.
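To make the mismatch concrete, here is a minimal sketch using plain dicts in place of netCDF variables. The attribute values are illustrative, and `excluded_by_flag_heuristic` is a hypothetical, simplified stand-in for the behaviour described above, not the actual `is_geophysical` implementation.

```python
# Plain dicts standing in for netCDF variable attributes (illustrative values).
basin_attrs = {
    "standard_name": "region",
    "flag_values": [1, 2, 3],
    "flag_meanings": "southern_ocean atlantic_ocean pacific_ocean",
}
qc_attrs = {
    "standard_name": "status_flag",
    "flag_values": [0, 1],
    "flag_meanings": "good bad",
}


def excluded_by_flag_heuristic(attrs):
    """Hypothetical stand-in: treat any variable with flag_meanings as a flag."""
    return "flag_meanings" in attrs


# Both variables are excluded, although only one of them is a status flag.
for name, attrs in (("basin", basin_attrs), ("qc", qc_attrs)):
    print(name, attrs["standard_name"], excluded_by_flag_heuristic(attrs))
```

The heuristic cannot tell the two apart because it looks only at the presence of flag_meanings, not at standard_name.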

The existing variable_id disambiguation in _get_geo_var only fires when CF detection returns multiple candidates. When it returns zero (the basin/siline case), the function fails with "No geophysical variable detected in the file." at HIGH severity, which blocks ESGF publication.

This PR extends the variable_id fallback to the zero-candidate branch. If CF detection finds nothing and the global variable_id attribute names an existing variable, that variable is accepted as the file's geophysical variable. The strict CF heuristic stays as the primary path; the fallback only applies when both (a) CF detection found zero candidates and (b) the CMIP7-canonical variable_id attribute is present and names a variable in the file.
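The selection order described above can be sketched roughly as follows. This is not the code in this PR: `pick_geophysical_variable`, its arguments, and the ambiguity message are hypothetical, and the multiple-candidate branch assumes variable_id must match one of the CF candidates. Only the "No geophysical variable detected in the file." message is taken from the checker.

```python
NO_GEO_VAR_MSG = "No geophysical variable detected in the file."
AMBIGUOUS_MSG = "ambiguous geophysical variable"  # illustrative wording only


def pick_geophysical_variable(cf_candidates, variable_id, file_variables):
    """Return (variable_name, error); exactly one of the two is None."""
    if len(cf_candidates) == 1:
        # Primary path: CF detection found exactly one candidate.
        return cf_candidates[0], None
    if len(cf_candidates) > 1:
        # Existing behaviour: variable_id disambiguates between candidates.
        if variable_id in cf_candidates:
            return variable_id, None
        return None, AMBIGUOUS_MSG
    # Zero candidates: fall back to variable_id if it names a real variable.
    if variable_id is not None and variable_id in file_variables:
        return variable_id, None
    return None, NO_GEO_VAR_MSG


# basin-like file: CF detection finds nothing, variable_id rescues it.
print(pick_geophysical_variable([], "basin", ["basin", "lat", "lon"]))
# Same file without a usable variable_id still fails.
print(pick_geophysical_variable([], None, ["basin", "lat", "lon"]))
```

Keeping the fallback in the last branch means files that already pass CF detection are handled exactly as before.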

Verified locally against an AWI-ESM3-4-2-veg-HR basin fx file (CMIP7 basin.ti-u-hxy-u.fx.glb). Pre-patch: HIGH "No geophysical variable detected in the file." Post-patch: pass.


Development

Successfully merging this pull request may close these issues.

Geophysical Variable Detection rejects CMIP7 basin (and other flag-valued region selectors)
