Skip to content

removing check in TextToVectorUpdateProcessorFactory which brakes update for new dynamic fields #3426

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

renatoh
Copy link
Contributor

@renatoh renatoh commented Jul 9, 2025

Using TextToVectorUpdateProcessorFactory with dynamics fields, the removed check causes an exception if no document with the input-field has been processes yet. The schema does not know the field yet.

TextToVectorUpdateProcessor#isNullOrEmpty is already checking for null and if the input field does not exist it will be skipped anyway, hence I think it is safe to remove the check.

The same could happen on the output field check , but it might be more likely, that the output field is a field and not a dymaicField, and with that will already exist on the schema

@renatoh renatoh changed the title removing check which brakes update with new dynamic field removing check in TextToVectorUpdateProcessorFactory which brakes update for new dynamic fields Jul 9, 2025
@alessandrobenedetti
Copy link
Contributor

hi @renatoh , can you add a test that covers the scenario? I think your proposed change makes sense, but with the additional test coverage we'll be more confident

@github-actions github-actions bot added the tests label Jul 10, 2025
@@ -40,6 +41,13 @@ public static void cleanup() throws Exception {
afterTest();
}

@After
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the deletion within the test method will not happen if a test fails, such a failing test was then impacting all following test since the deletion was not done and the setup is done only once for the test-class

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense, have you checked that all is fine? I remember I struggled a bit with the after/afterClass

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, now with the @after, even if I make one test fail on purpose, the result is consistent across 5 test runs. Without the @after one failing test has unwanted side effect and other test of the class are randomly failing due to the data which got not cleaned up.

.getMethod(paramName, String.class)
.invoke(builder, params.get(paramName).toString());
} catch (NoSuchMethodException e) {
log.error("Parameter {} not supported by model {}", paramName, className);
Copy link
Contributor Author

@renatoh renatoh Jul 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementing my own DimensionAwareEmbeddingModel, I notice that if the attribute is not supported by the model, the error was not logged at all and I had to debug to see what the issue was.
I also took the liberty tot clean up the duplicate code snippets.

@renatoh
Copy link
Contributor Author

renatoh commented Jul 10, 2025

Hi @alessandrobenedetti,

Thanks for your feedback! One of your test was actually failing, I adjusted it so that it no longer check for the exception.
The case where we the input fields it not present or empty is already covered well by TextToVectorUpdateProcessorTest.
Please also have a look at my comments on my changes

Copy link
Contributor

@alessandrobenedetti alessandrobenedetti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are nearly there, the contributions make sense but the tests should get some additional attention

@@ -40,6 +41,13 @@ public static void cleanup() throws Exception {
afterTest();
}

@After
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It makes sense, have you checked that all is fine? I remember I struggled a bit with the after/afterClass

@@ -132,7 +132,7 @@ public void init_notDenseVectorOutputField_shouldThrowExceptionWithDetailedMessa

/* Following test depends on a real solr schema and depends on BeforeClass-AfterClass methods */
@Test
public void init_notExistentInputField_shouldThrowExceptionWithDetailedMessage() {
public void init_notExistentInputField_shouldNotThrowException() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmmm I think this was ok, if the field is undefined I expect the exception.
You should keep this test and add a new one specific for a dynamic field (the scope of this issue).
Did I misunderstand anything?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But this test will fail with the changes I did to TextToVectorUpdateProcessorFactory.java, since the exception is not thrown anymore. If we want to keep this test we need to distinguish somehow between a dynamic field and a non-dynamic field within TextToVectorUpdateProcessorFactory.java, in order to throw the exception only for dynamic fields, not sure if this would make sense

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mmm, so what would happen if the user specifies in the input a non-existent field? For example, because of a typo?
Will it just silently fail?
If that's the case, we need to differentiate between non-existent and 'non explicitly defined but compatible with a dynamic field' I suspect

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I am checking now it the input or output field is a dynamic field and the exception is now still throw if the non-dynamic-field does not exists

Copy link
Contributor

@alessandrobenedetti alessandrobenedetti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The changes look good to me, I'll wait for a green build and add the changes.txt, merging afterwards. Thanks Renato!

@alessandrobenedetti
Copy link
Contributor

Hi Renato, wanted to merge, but a bit in a rush now, Can you solve the conflict and push it again? (the annoying CHANGES.txt).

I'll merge after lunch, so in case you can't do it in the meantime, I'll do it later.

Cheers

@renatoh
Copy link
Contributor Author

renatoh commented Jul 14, 2025

Hi Renato, wanted to merge, but a bit in a rush now, Can you solve the conflict and push it again? (the annoying CHANGES.txt).

I'll merge after lunch, so in case you can't do it in the meantime, I'll do it later.

Cheers

Hi Alessandro, only back at my computer on friday.
Ah, always the annoying CHANGES. txt

@alessandrobenedetti alessandrobenedetti force-pushed the error-handling-in-texttovecupdateprocessorfactory branch from 69156c6 to a6a45e4 Compare July 14, 2025 13:48
@alessandrobenedetti alessandrobenedetti merged commit 8a28b91 into apache:main Jul 14, 2025
3 checks passed
alessandrobenedetti added a commit that referenced this pull request Jul 14, 2025
…ate for new dynamic fields (#3426)

* distinguish between dynamic field and field, dynamic fields do not have to exist

---------

Co-authored-by: Alessandro Benedetti <[email protected]>
(cherry picked from commit 8a28b91)
alessandrobenedetti added a commit that referenced this pull request Jul 14, 2025
…ate for new dynamic fields (#3426)

* distinguish between dynamic field and field, dynamic fields do not have to exist

---------

Co-authored-by: Alessandro Benedetti <[email protected]>
(cherry picked from commit 8a28b91)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants