fix: More exhaustive athena types #6544

Open
Marcus-Rosti wants to merge 2 commits into feast-dev:master from Marcus-Rosti:mrosti/athena-types

@Marcus-Rosti

What this PR does / why we need it:

Many Athena types are not covered by the current type map, so this PR tries to include them all.

Which issue(s) this PR fixes:

Checks

  • I've made sure the tests are passing.
  • My commits are signed off (git commit -s)
  • My PR title follows conventional commits format

Testing Strategy

  • Unit tests
  • Integration tests
  • Manual tests
  • Testing is not required for this change

Misc

@Marcus-Rosti Marcus-Rosti requested a review from a team as a code owner June 19, 2026 00:10
@Marcus-Rosti Marcus-Rosti force-pushed the mrosti/athena-types branch from 6e8d53c to c14ecff Compare June 19, 2026 03:15
Comment thread sdk/python/feast/type_map.py Outdated

type_map = {
"null": ValueType.UNKNOWN,
"null": ValueType.NULL,

Author

I wonder though if we should keep this for backwards compat?

Member

Yes, we should.

Signed-off-by: Marcus Rosti <marcus.rosti@baton.io>

fix: Double quotes and type

Signed-off-by: Marcus Rosti <marcus.rosti@baton.io>
@Marcus-Rosti Marcus-Rosti force-pushed the mrosti/athena-types branch from c14ecff to ead51c0 Compare June 22, 2026 16:56
@Marcus-Rosti Marcus-Rosti changed the title More exhaustive athena types fix: More exhaustive athena types Jun 22, 2026
@franciscojavierarceo

Member

@codex review

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 6c4dc1d85e

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

return list_mapping.get(inner_feast_type, ValueType.VALUE_LIST)
return ValueType.VALUE_LIST

base_type = re.split(r"[(<\[]", athena_type)[0].strip()

P2: Preserve non-string Athena map keys

When Athena returns a map type with a non-string key, e.g. MAP<INT, STRING> or MAP(INTEGER, VARCHAR), this split drops the key type and the existing map entry below infers ValueType.MAP. That path serializes through Feast's string-keyed Map proto, so materializing or inferring features with integer/boolean/etc. map keys will fail during conversion or be represented with the wrong schema instead of being rejected or handled with the non-string-key map path.
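
To make the concern concrete, here is a minimal sketch of how the split above discards the key type. `base_type_of` reproduces the reviewed line; `map_key_type`, `has_string_key`, and the `STRING_KEY_TYPES` set are hypothetical names introduced only for illustration and are not part of Feast or of this PR.

```python
import re

# Assumption: which Athena key types count as "string-like" is illustrative only.
STRING_KEY_TYPES = {"string", "varchar", "char"}


def base_type_of(athena_type: str) -> str:
    # Same expression as the reviewed line: cut at the first "(", "<" or "[".
    return re.split(r"[(<\[]", athena_type)[0].strip()


def map_key_type(athena_type: str):
    """Hypothetical helper: lower-cased key type of a map type string, else None.

    Does not handle key types whose own parameters contain commas,
    such as decimal(10,2).
    """
    match = re.match(r"\s*map\s*[(<]\s*([^,]+?)\s*,", athena_type, flags=re.IGNORECASE)
    return match.group(1).lower() if match else None


def has_string_key(athena_type: str) -> bool:
    """Hypothetical check a caller could run before inferring a string-keyed map."""
    key = map_key_type(athena_type)
    if key is None:
        return False
    # Strip a length parameter such as varchar(10) before comparing.
    return base_type_of(key) in STRING_KEY_TYPES


# Both spellings collapse to the same base type, so the key type is lost.
print(base_type_of("MAP<INT, STRING>"))
print(base_type_of("MAP(INTEGER, VARCHAR)"))
print(has_string_key("MAP<INT, STRING>"))
print(has_string_key("MAP<VARCHAR(10), INT>"))
```

Whether a non-string key should be rejected or routed to a different value type is a design decision for the PR author; the sketch only shows where the information is dropped.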

ValueType.UUID: ValueType.UUID_LIST,
ValueType.DECIMAL: ValueType.DECIMAL_LIST,
}
return list_mapping.get(inner_feast_type, ValueType.VALUE_LIST)

P2: Keep nested array element types

For Athena columns like ARRAY<ARRAY<INT>>, the recursive call returns ValueType.INT32_LIST, which is not in list_mapping, so this fallback classifies the column as generic VALUE_LIST. The generic path is later converted to placeholder string nested lists (Array(Array(String)) in from_value_type, and the Athena type string cannot be parsed by the PyArrow fallback), so feature inference/materialization will use the wrong schema for nested numeric, boolean, or timestamp arrays.
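
The fallback described here can be reproduced with a small self-contained sketch. `VT` is a stand-in enum with only the members needed, not Feast's real `ValueType`, and `to_value_type`, `SCALARS`, and `LIST_MAPPING` are simplified, hypothetical versions of the logic under review.

```python
import re
from enum import Enum


class VT(Enum):
    """Stand-in for feast's ValueType; only the members this sketch needs."""
    INT32 = 1
    INT32_LIST = 2
    VALUE_LIST = 3
    UNKNOWN = 4


# Simplified lookup tables (assumptions for illustration).
SCALARS = {"int": VT.INT32, "integer": VT.INT32}
LIST_MAPPING = {VT.INT32: VT.INT32_LIST}


def to_value_type(athena_type: str) -> VT:
    t = athena_type.strip().lower()
    # Match array<...> or array(...), capturing everything between the
    # outermost delimiters.
    m = re.fullmatch(r"array\s*[<(](.*)[>)]", t)
    if m:
        inner = to_value_type(m.group(1))
        # The mapping is keyed on scalar types only, so a list-typed inner
        # result (from a nested array) falls through to the generic default.
        return LIST_MAPPING.get(inner, VT.VALUE_LIST)
    return SCALARS.get(t, VT.UNKNOWN)


print(to_value_type("ARRAY<INT>"))
print(to_value_type("ARRAY<ARRAY<INT>>"))
```

In this sketch the single-level array keeps its element type, while the nested array loses it because `VT.INT32_LIST` is not a key in `LIST_MAPPING`.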

2 participants