-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove too long atomic fields #890
Conversation
AtomicField(name = "app_id", _.app_id, _.app_id = null), | ||
AtomicField(name = "platform", _.platform, _.platform = null), | ||
AtomicField(name = "event", _.event, _.event = null), | ||
AtomicField(name = "event_id", _.event_id, _.event_id = null), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some loaders event_id
is a a required field in the table. For example, see here how we create the snowflake events table.
Same is true of v_collector
and v_etl
, although I cannot tell you why it is this way!!
I'm fine with setting fields to null, and I understand how this helps with the wider design of incomplete events. But be aware this makes it technically possible that enrich can emit events which cannot be loaded.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the heads up @istreeter ! That makes me think if we should
- keep a list of atomic field names to be excluded from this check or
- take prefix of the field as much as its' limit allow or
- let it be and accept the possibility
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion we should not set event_id
, v_collector
and v_etl
to null
, we should leave them untouched.
Contrary to atomic.events
table, incomplete.events
doesn't have lengths limits for the atomic fields.
Ideally we would like each field set in incomplete.events
table to mean that it is valid and would be the same in atomic.events
table.
But the very priority of incomplete events is to know what the problems are so that they can get fixed, and if an event can't get loaded because a required field is missing, we will never know about the problems.
If one of these three fields is too long, failure
entity will tell us so that's fine.
BTW when we have removed all the lengths limits for all destinations, we will remove the checks for atomic fields lengths from Enrich.
...n/src/main/scala/com.snowplowanalytics.snowplow.enrich/common/enrichments/AtomicFields.scala
Outdated
Show resolved
Hide resolved
...n/src/main/scala/com.snowplowanalytics.snowplow.enrich/common/enrichments/AtomicFields.scala
Outdated
Show resolved
Hide resolved
Co-Authored-By: Benjamin BENOIST <benjamin.benoist@snowplowanalytics.com>
d50c5d2
to
9999d5e
Compare
No description provided.