
filter ingestor array on controlled_vocabulary list#1292

Open
mikesndrs wants to merge 1 commit into ElixirTeSS:master from DaanVanVugt:feature/filter_controlled_vocabulary

@mikesndrs mikesndrs commented Apr 24, 2026

Checklist

  • I have read and followed the CONTRIBUTING guide.
  • I confirm that I have the authority necessary to make this contribution on behalf of its copyright owner and agree to license it to the TeSS codebase under the BSD license.

Copilot AI left a comment


Pull request overview

This PR extends the ingestor pipeline to optionally normalize certain array fields (e.g., target_audience, keywords) against a controlled vocabulary mapping. The behavior is enabled through a new feature.controlled_vocabulary_vars setting.

Changes:

  • Refactors auto-parsing logic into handle_auto_parsing and reuses shared mapping-loading via get_mapping.
  • Adds handle_controlled_vocabulary and applies it to both event and material ingestion.
  • Adds unit tests validating controlled-vocabulary behavior when the feature flag is enabled/disabled.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| test/unit/ingestors/ingestor_test.rb | Adds coverage for controlled-vocabulary normalization behavior. |
| lib/ingestors/material_ingestion.rb | Routes material ingestion through the new helper methods. |
| lib/ingestors/event_ingestion.rb | Routes event ingestion through the new helper methods. |
| lib/ingestors/auto_parsing.rb | Introduces shared mapping loader + controlled vocabulary normalization helper. |


Comment on lines +19 to 25
def auto_parse(var, description)
  mapping = get_mapping(var)

  mapping
    .select{ |key, val| description&.downcase&.include?(key.to_s.downcase) }
    &.values
    &.uniq

Copilot AI Apr 28, 2026


auto_parse calls mapping.select even when get_mapping(var) can return nil (e.g., if the JSON mapping file doesn’t exist). That will raise NoMethodError and break ingestion for any misconfigured/typo’d auto_parse_vars. Consider using safe navigation on mapping (or early-return when mapping is nil) so missing mappings are handled gracefully.
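The early-return variant mentioned in this comment could look roughly like the sketch below. It is only an illustration: `MAPPINGS` and this `get_mapping` are hypothetical stand-ins for the PR's JSON-backed loader, and the sample vocabulary is invented.

```ruby
# Hypothetical in-memory mappings; the PR loads these from JSON files instead.
MAPPINGS = {
  'target_audience' => {
    'phd' => 'PhD students',
    'doctoral' => 'PhD students',
    'postdoc' => 'Postdocs'
  }
}.freeze

# Stand-in for the PR's get_mapping: returns a Hash, or nil when no mapping
# is configured for the given variable.
def get_mapping(var)
  MAPPINGS[var.to_s]
end

def auto_parse(var, description)
  mapping = get_mapping(var)
  # Early return: a missing mapping or description yields no matches
  # instead of raising NoMethodError on nil.
  return [] if mapping.nil? || description.nil?

  text = description.downcase
  mapping
    .select { |key, _val| text.include?(key.to_s.downcase) }
    .values
    .uniq
end

auto_parse('target_audience', 'A course for PhD and doctoral candidates')
# => ["PhD students"]
auto_parse('keywords', 'anything')
# => [] (no mapping configured, so nothing is parsed)
```

Returning an empty array keeps the return type consistent for callers; whether a missing mapping should also be logged is left open here.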

Comment on lines +46 to +49
current_val = obj.send(var).map{|x| x.to_s.downcase}
new_val = mapping
  .filter{ |key, val| current_val.include?(key.to_s.downcase) || current_val.include?(val.to_s.downcase) }
  .map{ |key, val| val }

Copilot AI Apr 28, 2026


handle_controlled_vocabulary assumes both a mapping exists and that obj.send(var) is an array. If get_mapping(var) returns nil (no JSON file) or the attribute is nil/non-enumerable, this will raise (e.g., nil.filter / nil.map). Consider skipping when mapping is nil and normalizing the current value via Array(obj.send(var)) (and possibly next when it’s blank) before filtering.

Suggested change
- current_val = obj.send(var).map{|x| x.to_s.downcase}
- new_val = mapping
-   .filter{ |key, val| current_val.include?(key.to_s.downcase) || current_val.include?(val.to_s.downcase) }
-   .map{ |key, val| val }
+ next if mapping.blank?
+ current_val = Array(obj.send(var)).map { |x| x.to_s.downcase }
+ next if current_val.blank?
+ new_val = mapping
+   .filter { |key, val| current_val.include?(key.to_s.downcase) || current_val.include?(val.to_s.downcase) }
+   .map { |key, val| val }
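
To see how the suggested guards behave in isolation, here is a plain-Ruby sketch of the same logic. It assumes a hypothetical helper, `normalize_with_vocabulary`, that takes the value and mapping directly (the PR does this inline on `obj`), and it uses `nil?`/`empty?` rather than ActiveSupport's `blank?` so it runs without Rails. The sample mapping is invented.

```ruby
# Hypothetical helper mirroring the suggested change outside of Rails.
# Returns the value unchanged when there is nothing to normalize against,
# which corresponds to the `next` guards in the suggestion.
def normalize_with_vocabulary(value, mapping)
  return value if mapping.nil? || mapping.empty?

  # Array() turns nil into [] and wraps a single string in an array.
  current = Array(value).map { |x| x.to_s.downcase }
  return value if current.empty?

  # Keep vocabulary terms whose key or canonical value appears in the input.
  mapping
    .select { |key, val| current.include?(key.to_s.downcase) || current.include?(val.to_s.downcase) }
    .values
end

vocab = { 'phd' => 'PhD students', 'researcher' => 'Researchers' }

normalize_with_vocabulary(['PhD', 'Managers'], vocab) # => ["PhD students"]
normalize_with_vocabulary(['researchers'], vocab)     # => ["Researchers"]
normalize_with_vocabulary(nil, vocab)                 # => nil (skipped)
normalize_with_vocabulary(['PhD'], nil)               # => ["PhD"] (skipped)
```

Entries with no match in the vocabulary (such as "Managers" above) are dropped. This sketch does not deduplicate, so two keys mapping to the same term would both contribute a copy.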
