Feat/feature store fp lf #5816

Open
BassemHalim wants to merge 12 commits into aws:master from BassemHalim:feat/feature-store-fp-lf
Conversation

@BassemHalim
Contributor

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Add configurable use_lake_formation_credentials parameter to the
@feature_processor decorator, defaulting to False. The value flows
through FeatureProcessorConfig to the Spark connector's ingest_data()
call, enabling Lake Formation credential vending when set to True.

---
X-AI-Prompt: make useLakeFormationCreds configurable, defaults to False, passed to feature_processor
X-AI-Tool: kiro-cli
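A minimal sketch of how a decorator parameter like this can flow into a config object. The names mirror the PR description, but the real `@feature_processor` implementation in the SageMaker SDK is more involved; this only illustrates the parameter defaulting to `False` and being threaded through:

```python
from dataclasses import dataclass

# Illustrative stand-in for the SDK's FeatureProcessorConfig; the real
# class carries many more fields (sources, output, target stores, ...).
@dataclass
class FeatureProcessorConfig:
    use_lake_formation_credentials: bool = False

def feature_processor(use_lake_formation_credentials: bool = False):
    """Decorator factory: capture the flag in a config attached to the function."""
    config = FeatureProcessorConfig(
        use_lake_formation_credentials=use_lake_formation_credentials
    )
    def decorator(func):
        # Downstream code (e.g. the Spark connector's ingest_data() call)
        # would read this config to enable Lake Formation credential vending.
        func.feature_processor_config = config
        return func
    return decorator

@feature_processor(use_lake_formation_credentials=True)
def transform(df):
    return df
```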
Generate ECDSA signing key in ConfigUploader and pass it to
StoredFunction for function payload signature verification. The
public key PEM is returned to callers for remote-side verification.

---
X-AI-Prompt: fix StoredFunction missing signing_key error in feature_processor pipeline
X-AI-Tool: kiro-cli
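The sign-then-verify flow can be sketched with the `cryptography` package: generate an ECDSA key, sign the payload, and hand the PEM-encoded public key to the remote side for verification. The helper names here are hypothetical; the actual ConfigUploader/StoredFunction wiring differs.

```python
from cryptography.hazmat.primitives import hashes, serialization
from cryptography.hazmat.primitives.asymmetric import ec

def generate_signing_key():
    """Return (private_key, public_key_pem) for payload signing."""
    private_key = ec.generate_private_key(ec.SECP256R1())
    public_key_pem = private_key.public_key().public_bytes(
        serialization.Encoding.PEM,
        serialization.PublicFormat.SubjectPublicKeyInfo,
    )
    return private_key, public_key_pem

def sign_payload(private_key, payload: bytes) -> bytes:
    return private_key.sign(payload, ec.ECDSA(hashes.SHA256()))

def verify_payload(public_key_pem: bytes, payload: bytes, signature: bytes) -> bool:
    """Remote-side check: load the PEM and verify; raises InvalidSignature on mismatch."""
    public_key = serialization.load_pem_public_key(public_key_pem)
    public_key.verify(signature, payload, ec.ECDSA(hashes.SHA256()))
    return True
```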
Add _image_resolver module that resolves the SageMaker Spark
processing container image URI based on installed PySpark and Python
versions. Supports Spark 3.1/3.2/3.3/3.5 with appropriate Python
version mapping. Uses container_version=v1 as a floating tag.

---
X-AI-Prompt: add image resolver with container_version v1 for spark processing image
X-AI-Tool: kiro-cli
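A hypothetical sketch of the resolver's core logic: map the installed PySpark major.minor version to the Python build of the Spark processing image, tagged with the floating `v1` container version. The account ID, repository name, and version mapping below are illustrative assumptions, not the real ECR layout.

```python
# Assumed Spark-to-Python mapping; the SDK's actual table may differ.
_SPARK_TO_PYTHON = {
    "3.1": "py37",
    "3.2": "py39",
    "3.3": "py39",
    "3.5": "py312",
}

def resolve_spark_image_uri(pyspark_version: str, region: str = "us-west-2") -> str:
    major_minor = ".".join(pyspark_version.split(".")[:2])
    python_tag = _SPARK_TO_PYTHON.get(major_minor)
    if python_tag is None:
        raise ValueError(f"Unsupported Spark version: {major_minor}")
    # container_version=v1 is a floating tag that tracks the latest patch.
    # The account ID here is a placeholder, not a real registry.
    return (
        f"123456789012.dkr.ecr.{region}.amazonaws.com/"
        f"sagemaker-spark-processing:{major_minor}-cpu-{python_tag}-v1"
    )
```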
…cheduler

Update feature_scheduler to use _get_spark_image_uri for dynamic
image resolution instead of _JobSettings._get_default_spark_image.
Thread public_key_pem from ConfigUploader through to ModelTrainer
environment as REMOTE_FUNCTION_SECRET_KEY. Allow user-provided
image_uri to take precedence over auto-resolved URI.

---
X-AI-Prompt: integrate image resolver and signing key into feature scheduler pipeline
X-AI-Tool: kiro-cli
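Two illustrative helpers for the scheduler changes: a user-supplied `image_uri` wins over the auto-resolved one, and the public key PEM is threaded into the job environment as `REMOTE_FUNCTION_SECRET_KEY`. The function names are hypothetical, not the SDK's internal API.

```python
from typing import Optional

def select_image_uri(user_image_uri: Optional[str], resolved_image_uri: str) -> str:
    # User-provided image_uri takes precedence over the auto-resolved URI.
    return user_image_uri or resolved_image_uri

def build_job_environment(public_key_pem: str, base_env: Optional[dict] = None) -> dict:
    # Copy the base environment and expose the verification key to the
    # remote function runtime under the variable named in the PR.
    env = dict(base_env or {})
    env["REMOTE_FUNCTION_SECRET_KEY"] = public_key_pem
    return env
```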
…Store JARs

Resolve Hadoop version dynamically based on installed PySpark version
instead of hardcoding 3.3.1. Move Feature Store JAR classpath setup
outside the non-training-job guard so spark.jars is always set,
fixing FeatureStoreManager class loading in training job mode.

---
X-AI-Prompt: fix spark factory hadoop version and jar classpath for spark 3.5
X-AI-Tool: kiro-cli
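The dynamic Hadoop lookup can be sketched as a table from PySpark major.minor to the bundled Hadoop release, falling back to the previously hardcoded value. The versions below are assumptions based on Spark's shipped Hadoop builds; the SDK's actual mapping may differ.

```python
# Assumed PySpark-to-Hadoop mapping; illustrative only.
_PYSPARK_TO_HADOOP = {
    "3.1": "3.2.0",
    "3.2": "3.3.1",
    "3.3": "3.3.2",
    "3.5": "3.3.4",
}

def resolve_hadoop_version(pyspark_version: str) -> str:
    major_minor = ".".join(pyspark_version.split(".")[:2])
    # Fall back to the previously hardcoded 3.3.1 for unknown releases.
    return _PYSPARK_TO_HADOOP.get(major_minor, "3.3.1")
```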
Update _get_default_spark_image to accept Python 3.12 in addition to
3.9. Auto-detect Spark version from installed pyspark instead of
hardcoding 3.3, falling back to the default if pyspark is not
installed. Also resolve correct Python binary in Spark bootstrap
script to avoid PATH conflicts with system python3.

---
X-AI-Prompt: fix job.py to select correct spark image for py312 and detect pyspark version
X-AI-Tool: kiro-cli
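Auto-detection with a fallback can be sketched as below; the helper name is hypothetical, and the default mirrors the previously hardcoded 3.3:

```python
def detect_spark_version(default: str = "3.3") -> str:
    """Return the installed PySpark major.minor version, or the default."""
    try:
        import pyspark
    except ImportError:
        # pyspark not installed: keep the default image version
        return default
    return ".".join(pyspark.__version__.split(".")[:2])
```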
…on 3.12

Update expected error message in remote function tests to reflect that
SageMaker Spark images now support Python versions 3.9 and 3.12.
…or deps

Pin pyspark==3.5.1 in both feature-processor and test optional
dependencies to ensure consistent Spark version across environments.
…ersions

SageMaker Spark image only supports Python 3.9 and 3.12. Add skipif
markers to three feature processor integ tests that fail on Python 3.10.
Inject sagemaker-feature-store-pyspark>=2,<3 via pre_execution_commands
in _get_remote_decorator_config_from_input so it gets installed on the
remote container automatically.

Update integ tests: add skipif for Python 3.10 Spark tests, remove
manual feature-store-pyspark install, use python3 instead of python3.12.
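The two changes above can be sketched as a supported-version predicate (usable in a `pytest.mark.skipif`) plus a helper that prepends the Feature Store connector install to the remote job's pre-execution commands. Names and the exact pip command are assumptions for illustration:

```python
import sys

# SageMaker Spark images ship only Python 3.9 and 3.12 interpreters.
SUPPORTED_PYTHONS = [(3, 9), (3, 12)]

def python_is_supported(version_info=None) -> bool:
    """True when the interpreter matches a Spark-image Python version."""
    info = version_info if version_info is not None else sys.version_info
    return tuple(info[:2]) in SUPPORTED_PYTHONS

def inject_feature_store_dependency(pre_execution_commands=None) -> list:
    # Prepend the connector install so it lands on the remote container
    # automatically; the real command in the SDK may differ.
    commands = list(pre_execution_commands or [])
    commands.insert(0, "pip install 'sagemaker-feature-store-pyspark>=2,<3'")
    return commands
```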
