Skip to content

Bugfix/non integer taxon keys#203

Closed
jhnwllr wants to merge 7 commits intodevfrom
bugfix/non-integer-taxon-keys
Closed

Bugfix/non integer taxon keys#203
jhnwllr wants to merge 7 commits intodevfrom
bugfix/non-integer-taxon-keys

Conversation

@jhnwllr
Copy link
Copy Markdown
Contributor

@jhnwllr jhnwllr commented Mar 25, 2026

Fix non-integer taxon keys in download request

Fixes #179

jhnwllr and others added 6 commits November 14, 2025 14:49
* Feature/grscicoll (#173)

Adding support for GRSciColl institution and collection search. Updating documentation and vcr fixtures.

* Adding support for species match v2 #176 (#177)

* preparing for new version 0.6.6

* updating ver

* fixing tests and fresh cassettes

* fixing build
- Add type conversion for integer keys (TAXON_KEY, YEAR, MONTH, etc.)
- Add type conversion for numeric keys (DEPTH, ELEVATION, coordinates, etc.)
- Modify _parse_args function to convert string values to int/float
- Handle both simple predicates and 'in' predicates with value lists
- Add comprehensive tests for integer and numeric conversions
- Update existing tests to expect numeric values
- Remove numeric_keys set and float conversion logic
- Only convert taxonomic backbone keys (TAXON_KEY, SPECIES_KEY, etc.) and YEAR/MONTH to integers
- UUID keys (DATASET_KEY, NETWORK_KEY, etc.) remain as strings
- Add tests for datasetKey to verify UUIDs are not converted
- Add test for mixed taxonomic and UUID keys
- 19 tests passing

This comment was marked as outdated.

- Add _convert_to_int() helper function to handle edge cases
- Handle float-like strings (e.g., '5785887.0') by converting via float first
- Raise clear ValueError for non-numeric values instead of silently failing
- Log warning when fractional parts are truncated
- Apply robust conversion to both scalar and 'in' predicates
- Add 3 new tests: float string conversion, in-predicate float conversion, invalid value error
- All 22 tests passing
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +26 to +30
# Taxonomic keys from GBIF's backbone taxonomy that should be integers
# Note: Other keys like DATASET_KEY, NETWORK_KEY are UUIDs and should remain strings
taxonomic_integer_keys = {
"TAXON_KEY", "SPECIES_KEY", "GENUS_KEY", "FAMILY_KEY",
"ORDER_KEY", "CLASS_KEY", "PHYLUM_KEY", "KINGDOM_KEY",
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comments/variable name imply these are strictly taxonomic backbone keys, but the set also includes YEAR and MONTH. To avoid confusion for future maintenance, either rename this set to something like integer_predicate_keys (or similar) or adjust the comment to match its actual contents.

Suggested change
# Taxonomic keys from GBIF's backbone taxonomy that should be integers
# Note: Other keys like DATASET_KEY, NETWORK_KEY are UUIDs and should remain strings
taxonomic_integer_keys = {
"TAXON_KEY", "SPECIES_KEY", "GENUS_KEY", "FAMILY_KEY",
"ORDER_KEY", "CLASS_KEY", "PHYLUM_KEY", "KINGDOM_KEY",
# Includes taxonomic backbone keys (e.g. TAXON_KEY, SPECIES_KEY) and other
# integer-valued predicates such as YEAR and MONTH. Other keys like
# DATASET_KEY, NETWORK_KEY are UUIDs and should remain strings.
taxonomic_integer_keys = {
"TAXON_KEY", "SPECIES_KEY", "GENUS_KEY", "FAMILY_KEY",
"ORDER_KEY", "CLASS_KEY", "PHYLUM_KEY", "KINGDOM_KEY",

Copilot uses AI. Check for mistakes.
try:
converted_values.append(_convert_to_int(v, key))
except ValueError as e:
raise ValueError(f"Error in 'in' predicate for {key}: {str(e)}")
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When re-raising conversion failures in the 'in' predicate branch, the original exception context is dropped. Using exception chaining (raise ... from e) would preserve the underlying cause/traceback and make debugging malformed lists easier.

Suggested change
raise ValueError(f"Error in 'in' predicate for {key}: {str(e)}")
raise ValueError(f"Error in 'in' predicate for {key}: {str(e)}") from e

Copilot uses AI. Check for mistakes.
)
self.assertIsInstance(payload["predicate"]["predicates"][0]["value"], int)

def test_depth_numeric_conversion(self):
Copy link

Copilot AI Mar 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test name "test_depth_numeric_conversion" conflicts with its docstring/behavior (it asserts depth stays a string and is not converted). Consider renaming the test to reflect the intent (e.g., that non-integer numeric-like fields remain strings) to keep the suite self-explanatory.

Suggested change
def test_depth_numeric_conversion(self):
def test_depth_numeric_like_value_remains_string(self):

Copilot uses AI. Check for mistakes.
@jhnwllr
Copy link
Copy Markdown
Contributor Author

jhnwllr commented Mar 25, 2026

I think we have to think about another approach to handle downloads with checklistKey parameter.

@jhnwllr jhnwllr closed this Mar 25, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants