Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove too long atomic fields #890

Merged
merged 1 commit into from
Apr 8, 2024
Merged

Conversation

oguzhanunlu
Copy link
Member

No description provided.

@oguzhanunlu oguzhanunlu self-assigned this Apr 8, 2024
AtomicField(name = "app_id", _.app_id, _.app_id = null),
AtomicField(name = "platform", _.platform, _.platform = null),
AtomicField(name = "event", _.event, _.event = null),
AtomicField(name = "event_id", _.event_id, _.event_id = null),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For some loaders event_id is a a required field in the table. For example, see here how we create the snowflake events table.

Same is true of v_collector and v_etl, although I cannot tell you why it is this way!!

I'm fine with setting fields to null, and I understand how this helps with the wider design of incomplete events. But be aware this makes it technically possible that enrich can emit events which cannot be loaded.

cc @colmsnowplow

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for the heads up @istreeter ! That makes me think if we should

  • keep a list of atomic field names to be excluded from this check or
  • take prefix of the field as much as its' limit allow or
  • let it be and accept the possibility

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In my opinion we should not set event_id, v_collector and v_etl to null, we should leave them untouched.

Contrary to atomic.events table, incomplete.events doesn't have lengths limits for the atomic fields.
Ideally we would like each field set in incomplete.events table to mean that it is valid and would be the same in atomic.events table.
But the very priority of incomplete events is to know what the problems are so that they can get fixed, and if an event can't get loaded because a required field is missing, we will never know about the problems.
If one of these three fields is too long, failure entity will tell us so that's fine.

BTW when we have removed all the lengths limits for all destinations, we will remove the checks for atomic fields lengths from Enrich.

@oguzhanunlu oguzhanunlu requested a review from benjben April 8, 2024 15:19
Co-Authored-By: Benjamin BENOIST <benjamin.benoist@snowplowanalytics.com>
@oguzhanunlu oguzhanunlu merged commit 3fd3690 into incomplete_events Apr 8, 2024
1 check passed
@oguzhanunlu oguzhanunlu deleted the too-long-atomic branch April 8, 2024 15:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants