Adds Markup module by pitangainnovare · Pull Request #35 · scieloorg/markapi

pitangainnovare · 2025-10-20T20:20:31Z

In production, the following environment variable must be set:

LLAMA_ENABLED=True
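
How the flag is consumed is not shown in this description. A minimal sketch, assuming it is read from the process environment in a Django-style settings module; the helper name `env_flag` and the accepted truthy spellings are assumptions, not code from markapi:

```python
import os


def env_flag(name: str, default: bool = False) -> bool:
    """Read a boolean flag from the environment (hypothetical helper)."""
    raw = os.environ.get(name)
    if raw is None:
        return default
    # Accept a few common truthy spellings; anything else counts as False.
    return raw.strip().lower() in {"1", "true", "yes", "on"}


# In a settings module this could become:
LLAMA_ENABLED = env_flag("LLAMA_ENABLED")
```

Parsing the string explicitly matters because any non-empty value, including `"False"`, is truthy when tested directly as a Python string.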

Copilot

Pull Request Overview

This PR implements a merge of markup functionality, introducing AI-powered document processing capabilities for converting DOCX files to structured XML format. The changes integrate LLM services (both Llama and Gemini) for metadata extraction, reference parsing, and content labeling.

Key changes:

- Adds new model_ai and markup_doc applications for AI model management and document processing
- Integrates Google Generative AI and python-docx libraries
- Refactors reference processing to use data_utils module instead of tasks
- Renames classes for consistency (e.g., ReferenceAdmin → ReferenceModelViewSet)

Reviewed Changes

Copilot reviewed 55 out of 66 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| requirements/base.txt | Adds dependencies for AI/ML (google-generativeai, langid) and document processing (python-docx) |
| reference/wagtail_hooks.py | Refactors import paths and renames admin class for consistency |
| reference/models.py | Adds ReferenceStatus enum and replaces raw integer field with structured status |
| reference/marker.py | Updates import path from llama3.generic_llama to model_ai.llama |
| reference/data_utils.py | Updates to use ReferenceStatus enum instead of raw integer |
| reference/api/v1/views.py | Updates to use ReferenceStatus enum and new import path |
| model_ai/* | New application for managing AI models (Llama/Gemini) with download functionality |
| markup_doc/* | New application for DOCX-to-XML conversion with AI-powered content extraction |
| markuplib/* | Adds DOCX processing utilities and OMML-to-MathML transformation |

Copilot · 2025-10-20T20:21:16Z

+                    ],
+                    "date": c2005,
+                    "source": "Advanced practice nursing: an integrative approach",
+                    "edition: "3rd ed",


Corrected missing closing quote in 'edition' key. Should be "edition": "3rd ed".

Copilot · 2025-10-20T20:21:16Z

+                    "date": 1995,
+                    "source": "Inflammatory bowel disease",
+                    "chapter": "The epidemiology of idiopathic inflammatory bowel disease.",
+                    "edition: "4th",


Corrected missing closing quote in 'edition' key. Should be "edition": "4th".
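
Both quoting slips are the kind of error a parse check catches mechanically. A minimal sketch, assuming the prompt examples are meant to be valid JSON; the helper `is_valid_json` is hypothetical, and the fragments are trimmed from the diff above and wrapped in braces for illustration:

```python
import json

# The key is missing its closing quote, as flagged in the review.
broken = '{"source": "Inflammatory bowel disease", "edition: "4th"}'
fixed = '{"source": "Inflammatory bowel disease", "edition": "4th"}'


def is_valid_json(text: str) -> bool:
    """Return True if text parses as JSON (hypothetical helper)."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Building the example from a dict avoids hand-written quoting altogether.
generated = json.dumps({"source": "Inflammatory bowel disease", "edition": "4th"})
```

Here `is_valid_json(broken)` is false while the fixed and generated strings parse. Generating examples with `json.dumps`, as one of the commits in this PR describes for `reference.config`, removes the need for such a check on hand-written literals.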

Copilot · 2025-10-20T20:21:16Z

-                'uri': {'type': 'string'},
-                'access_date': {'type': 'string'},
-                'version': {'type': 'string'},
+                "full_text": {"type": "integer"},


The 'full_text' field should have type 'string' not 'integer', as it contains citation text.

Suggested change

- "full_text": {"type": "integer"},
+ "full_text": {"type": "string"},

Copilot

Pull Request Overview

Copilot reviewed 55 out of 66 changed files in this pull request and generated 5 comments.

Copilot · 2025-10-21T11:59:41Z

               marked_xml=etree.tostring(get_xml(i), pretty_print=True, encoding='unicode')
           )
-   obj_reference.estatus = 2
+   obj_reference.estatus = ReferenceStatus.READY


The code uses obj_reference.estatus but the model field was renamed to status in reference/models.py. This will cause an AttributeError. Change estatus to status.

Suggested change

- obj_reference.estatus = ReferenceStatus.READY
+ obj_reference.status = ReferenceStatus.READY
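
Whether a stale attribute name raises depends on how it is used. In plain Python, assigning to an undeclared attribute creates it without any exception, so a missed rename can also fail silently. A minimal sketch with a stand-in class (an assumption for illustration, not the Django model from `reference/models.py`):

```python
class Reference:
    """Plain-Python stand-in for the model; only `status` is declared."""

    def __init__(self) -> None:
        self.status = "CREATING"


obj_reference = Reference()
obj_reference.estatus = "READY"  # stale name: creates a new attribute, no exception

# The declared field is untouched, so the intended update is lost.
print(obj_reference.status)
```

Either way, the suggested rename is the right fix: the value has to land on the field the model actually declares.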

Copilot · 2025-10-21T11:59:42Z

                new_reference = Reference.objects.create(
                    mixed_citation=post_reference,
-                    estatus=1,
+                    estatus=ReferenceStatus.CREATING,


The field name estatus should be status to match the renamed field in the Reference model.

Suggested change

- estatus=ReferenceStatus.CREATING,
+ status=ReferenceStatus.CREATING,
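
The two diffs above pair the old raw values with their named replacements: `1` becomes `ReferenceStatus.CREATING` and `2` becomes `ReferenceStatus.READY`. A minimal sketch of such an enum, assuming a standard-library `IntEnum`; the actual definition in `reference/models.py` is not shown here and may use Django's choices classes or define more members:

```python
from enum import IntEnum


class ReferenceStatus(IntEnum):
    # Values inferred from the diffs in this review; other members are not shown.
    CREATING = 1
    READY = 2


# IntEnum members still compare equal to the raw integers they replace,
# so rows stored with the old values remain comparable.
assert ReferenceStatus.READY == 2
```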

Copilot · 2025-10-21T11:59:42Z

-                'uri': {'type': 'string'},
-                'access_date': {'type': 'string'},
-                'version': {'type': 'string'},
+                "full_text": {"type": "integer"},


The full_text field is defined with type 'integer' but should be 'string' since it contains text content, not numeric values.

Suggested change

- "full_text": {"type": "integer"},
+ "full_text": {"type": "string"},

Copilot · 2025-10-21T11:59:42Z

+
+    def form_valid(self, form):
+        self.object = form.save_all(self.request.user)
+        self.object.estatus = ProcessStatus.PROCESSING


The field name estatus should be status based on the pattern of renaming fields from Spanish to English throughout the codebase.

Suggested change

- self.object.estatus = ProcessStatus.PROCESSING
+ self.object.status = ProcessStatus.PROCESSING

Copilot · 2025-10-21T11:59:43Z

+    instance.content_body = stream_data_body
+    # Guardar el XML en el campo `file_xml`
+    #archive_xml = ContentFile(xml)  # Crea un archivo temporal en memoria
+    instance.estatus = ProcessStatus.PROCESSED


The field name estatus should be status to maintain consistency with the model field naming convention.

Suggested change

- instance.estatus = ProcessStatus.PROCESSED
+ instance.status = ProcessStatus.PROCESSED

eduranm and others added 30 commits September 26, 2025 10:15

Integration of ArticleViewSet into the router and update of the main URLs

e34f8a5

Settings update: changed models directory, added new apps, and raised the field limit

ab31a5e

Expanded core.models: new models for gender, language, licenses, language-tagged texts, and flexible date handling

b2e9538

Reference update: use of ReferenceStatus and adjustment to reference creation

d7395d9

Refactor in reference.config: serialization of examples with json.dumps and expanded supported types (confproc, full_text, etc.)

02654a5

Refactor in marker: updated GenericLlama import to come from model_ai

e40b6be

Removed reference.tasks (responsibility moved to data_utils and model_ai)

d88c9c5

Refactor in wagtail_hooks: use of SnippetViewSet and support for model downloads

3159c3b

Dependency update: added langid, google-generativeai, and python-docx

8746d02

Created and expanded the core app: common models, forms, search views, utilities, and Wagtail hooks

a9305e6

New markup_doc app: ArticleDocx models, REST API, markup tasks, utilities, and Wagtail hooks

95a8153

New markuplib library: functions for processing DOCX and OMML-to-MathML transformations

58cc438

New model_ai app: LLaMA integration, generic functions, inference messages, tasks, and Wagtail hooks

4944e3b

Additional modules in reference: Gemini configuration and data-processing utilities

4ecb50a

Consolidated migrations in reference: updated 0001_initial.py and removed intermediate migrations

2ab9bd9

Updated initial migration in core_settings: Django version change and translation of verbose_name to English

3024568

Created initial migration in core

1e31de7

Removed the llama3 app: functionality migrated to model_ai

826800b

fix(markup_doc): catch ArticleDocxMarkup.DoesNotExist in generate_xml

76258cd

Fixes the exception type so that a 404 is returned when the record does not exist.

refactor(markuplib): remove debug prints in link extraction

aff3a39

Reduces log noise and keeps the function focused on its return value.

style(model_ai): replace bare except with Exception in download_model

6dd1ae2

Improves readability and follows good error-handling practice.

fix(reference): fix literals and separators in config_gemini for the references prompt

5842054

Adds quotes to text fields and fixes commas/keys to avoid prompt parsing errors.

i18n(reference): wrap labels and help_text with gettext_lazy

6a09911

Allows translation of 'Mixed Citation' and 'Rating from 1 to 10'.

model_ai: fix i18n label and enforce LlamaModel uniqueness in save()

95e6d31

reference: rename field 'estatus' to 'status' and set label to 'Reference status' (includes migrations)

192403d

reference: remove unused import in wagtail_hooks.py

22fe3d1

Remove old llama3 app and resolve conflict in base.py

abe54e2

Add instructions for building with Llama support

e000258

Improve imports

cbc8509

Delete legacy llama3 module

2c32138
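
Commit 6dd1ae2 in this batch swaps a bare `except` for `except Exception` in `download_model`. The practical difference is which exceptions get trapped. A minimal sketch of the contrast, using hypothetical wrapper functions rather than the project's actual code:

```python
def swallow_with_bare_except(action):
    """Old style: a bare except also traps SystemExit and KeyboardInterrupt."""
    try:
        action()
    except:  # noqa: E722 - shown only for contrast
        return "swallowed"
    return "ok"


def swallow_with_exception(action):
    """New style: only ordinary errors are caught; exit signals propagate."""
    try:
        action()
    except Exception:
        return "swallowed"
    return "ok"


def interrupt():
    """Simulate the user pressing Ctrl+C during a long download."""
    raise KeyboardInterrupt
```

With the bare form, `swallow_with_bare_except(interrupt)` returns `"swallowed"` and the interrupt is lost. With `except Exception`, the `KeyboardInterrupt` reaches the caller because it derives from `BaseException`, not `Exception`.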

pitangainnovare added 9 commits October 20, 2025 17:13

Resolve conflicts in generic_llama and apply structural fixes:

aa9c6a3

- function_llama became LlamaInputSettings in llama.py
- generic_llama became llama.py with LlamaService

Create input settings for references

dc61b24

Standardize deps

e2e696a

Standardize imports in general (and adapt them to the new naming)

3e712af

Add method missing from marker (because of the merge)

dd94a35

Remove commented-out lines

3ed8e58

Include AI Model in the interface, so models can be registered

1fbd7d1

Adapt use of the Llama service in tasks

e22bc7c

Add migrations

79245bd

Copilot AI review requested due to automatic review settings October 20, 2025 20:20

Copilot AI reviewed Oct 20, 2025

robertatakenaka changed the title from ~~Eduranm merge markup~~ to "Adds Markup module" (originally "Adiciona módulo Markup") on Oct 20, 2025

pitangainnovare added 5 commits October 20, 2025 19:26

Fix parameter name type (should be mode)

5740120

Make LlamaService attrs more flexible

a830fb5

Add some FIXMEs

3bf5c7c

Improve imports

f501379

Standardize the name of the method that gets the AI type (LLAMA or GEMINI)

cb739b4

robertatakenaka requested review from Rossi-Luciano and Copilot October 21, 2025 11:58

Copilot AI reviewed Oct 21, 2025

robertatakenaka mentioned this pull request Feb 9, 2026

Improve testability and modularity of the views in markup_doc/api/v1/views.py #47

Open

pitangainnovare wants to merge 44 commits into scieloorg:main from pitangainnovare:eduranm-merge_markup

3 participants