Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data loss if issues on TCP protocol layer or failures on network link. Retry policy is ignored #6962

Open
YuriyHolinko opened this issue Dec 16, 2024 · 4 comments · Fixed by #6991
Labels
Feature Request Suggest an idea for this project help wanted

Comments

@YuriyHolinko
Copy link
Contributor

YuriyHolinko commented Dec 16, 2024

Is your feature request related to a problem? Please describe.
I have noticed data loss in shipping of logs even if retry policy is used. the problem is that we do not have any strong logic around retryable exceptions and if there is some issue in network link we do don't re-send the data but just swallow the exception
the code which decides retry or not is here

but it does not handle all the issues with network link failures e.g. my use case

java.net.SocketTimeoutException: timeout
because the message does not match a condition
message == null || message.toLowerCase(Locale.ROOT).contains("connect timed out");

Describe the solution you'd like
I suggest to make this condition

  • to cover more case
  • to make configurable on programatic layer as everyone can adapt it to network failures that can happen

Describe alternatives you've considered
I cannot define

Additional context
I can bring a draft PR with my fix as I am not using OTEL java agents to ship logs but initiate it programmatically . Hope it will be useful for everyone

@YuriyHolinko YuriyHolinko added the Feature Request Suggest an idea for this project label Dec 16, 2024
@YuriyHolinko YuriyHolinko changed the title Data loss if issues on TCP protocol later or failures on network link. Retry policy is ignored Data loss if issues on TCP protocol layer or failures on network link. Retry policy is ignored Dec 16, 2024
@jack-berg
Copy link
Member

jack-berg commented Dec 16, 2024

Thanks for the report.

I recommend for now we adjust the logic to catch this case you've found. If it becomes a recurring problem, we can consider making it programmatically configurable, but let's opt for the simple solution for now.

PRs welcome!

@YuriyHolinko
Copy link
Contributor Author

YuriyHolinko commented Dec 16, 2024

thanks for reply

in my case it will be 3 exceptions when I want to retry. if we use SSL under the hood there will be more, because when servers (log receivers) are crashed/failed, ssl handshake exceptions might be thrown and we need to handle it as well. that's why I want to make it configurable

in any case I start following procedure how to become a contributor and get a member access to repo to create my first PR

@trask
Copy link
Member

trask commented Dec 16, 2024

how to become a contributor and get a member access to repo to create my first PR

hi @YuriyHolinko! no special access needed, we recommend to fork the repo and send PR from your fork

@jack-berg
Copy link
Member

Keeping open despite merging #6991 because I think we should update the default retry exception predicate to include some additional exceptions, as discussed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature Request Suggest an idea for this project help wanted
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants