From cea873cd245536a7a464d24bf3b24044719daca6 Mon Sep 17 00:00:00 2001 From: Randy Reddig Date: Fri, 27 Apr 2018 08:49:27 +0200 Subject: [PATCH] encoding/xml: fixes to enforce XML namespace standard MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit All issues of #13400 which are not new functionalities have fixes. There are minor incompatibilities between them due to the handling of prefixes. Duplicating a prefix or an namespace is invalid XML. This is now avoided. XML produced is always valid. Tests have been added for each fix and example and previous tests fixed as output is already more compact is some cases. encoding/xml: fix duplication of namespace tags by encoder (#7535) A tag prefix identifies the name space of the tag and not the default name space like xmlns="...". Writing the prefix is incorrect when it is bound to a name space using the standard xmlns:prefix="..." attribute. This fix skips this print. Consequences are - duplication is avoided in line with name space standard in reference. - the _xmlns declaration does not appear anymore. To keep the previous behaviour, the prefix is printed in all other cases. Token now always produces well-formed XML except when the explicit name space collides with the prefix. Made prefix handling "xmlns="space" xmlns:_xmlns="xmlns" _..." has be removed in all wants of tests. In some cases, useless declarations like xmlns:x="x" are still added in line with previous behavior. encoding/xml: fix unexpected behavior of encoder.Indent("", "") (#13185, #11431) MarshalIndent and Marshal share code. When prefix and indent are empty, the behavior is like Marshal when it should have a minimal indent, i.e. new line as in documentation. A boolean is added to the local printer struct which defaults to false. It is set to true only when MarshalIndent is used and prefix and indent are empty. encoding/xml: fix overriding by empty namespace (#7113) The namespace defined by xmlns="value" can be overridden in every included tag including by the empty namespace xmlns="". Empty namespace is not authorized with a prefix (xmlns:ns=""). Method to calculate indent of XML handles depth of tag and its associated namespace. The fix leaves the method active even when no indent is required. An XMLName field in a struct means that namespace must be enforced even if empty. This occurs only on an inner tag as an overwrite of any non-empty namespace of its outer tag. To obtain the xmlns="" required, an attribute is added. encoding/xml: fix panic on embedded unexported XMLName (#10538) By laws of reflection, unexported fields are unreachable by .Value. XMLName are allowed at any level of an inner struct but the struct may not be reachable. If XMLName field is found without name, it cannot be exported and the value is discarded like other fields. Some comments have been to underline where the issue arises. Another XMLName test was incorrectly set up and is fixed. Various cases added in a specific test. encoding/xml: fix panic of unmarshaling of anonymous structs (#16497) Encoding/xml is using type "typinfo" which provides its own "value" method using the reflection package. It fails to check for the documented possible panics of the reflection methods. It is impossible to fix the method as the parameter is also using reflection. The fix is to discard anonymous structs which have no value anyway. Encoder/xml documentation already mentions that fields are accessed using the reflection package. A relevant test is added. encoding/xml: fix closing tag failure (#20685) Push/pop of elements name must be done using the eventual prefix together with the tag name. The operation is moved before the translation of prefix into its URI. One end element of a test is fixed as expecting the last used namespace is incorrect. After closing a tag using a namespace, the valid namespace must be taken from the opening tag. encoding/xml: add check of namespaces to detect field names conflicts (#8535, #11724) Comparing namespaces of fields was missing and false conflicts were detected. encoding/xml: fix invalid empty namespace without prefix (#8068) Empty namespace is allowed only if it has no prefix. An error message is now returned if a prefix exists. A similar case when no prefix is provided, thus with syntax xmlns:="..." is also rejected. encoding/xml: fix normalization of attributes values (#20614) The attribute value was read as text. The existing attribute reader logic is fixed as an attribute may have a namespace or only a prefix. Other possibilities have been removed. To keep the behavior of raw token which allows many faults in attributes list, error handling is heavily using the Strict parameter of the decoder. Specific tests have been added including list of attributes. To keep the behavior or unmarshal, escaped characters handling has been added but it is not symmetrical to Marshal for quotes but follows XML specification. encoding/xml: fix absence of detection of another : in qualified names (#20396) The occurrence of second : in space and tag name is rejected. Fixes: #7113, #7535, #8068, #8535, #10538, #11431, #13185, #16497, #20396, #20614, #20685 Change-Id: Ib4a60347a47d23ff59b63307cebb83b71c7c9165 Commit originally authored by: Constantin Konstantinidis encoding/xml: fix printing of namespace prefix in tag names Prefix displays in XML when defined in tag names. The URL of the name space is also returned for an End Token as documentation requires to improve idempotency of Marshal/Unmarshal. Translating the prefix is popping the NS which was unavailable for translation. Translate for an End Token has been moved inside the pop element part. Fixes #9519 Change-Id: Id7a14d07c106a76a487b5c4e28d9d563fe061c60 encoding/xml: restore changes lost in merge See #43168. encoding/xml: gofmt encoding/xml: minimalIndent -> minIndent to shrink diff encoding/xml: gofmt encoding/xml: revert changes to (*Decoder).attrval from master encoding/xml: restore (*printer).writeIndent to master encoding/xml: remove comments that don’t change functionality of tests encoding/xml: use t.Errorf instead of fmt.Errorf encoding/xml: fix test case for #11496 encoding/xml: revert unrelated change in (*Decoder).readName encoding/xml: remove comments encoding/xml: remove comments encoding/xml: edit comments --- src/encoding/xml/marshal.go | 133 ++++-- src/encoding/xml/marshal_test.go | 41 +- src/encoding/xml/read.go | 20 +- src/encoding/xml/typeinfo.go | 28 +- src/encoding/xml/xml.go | 48 ++- src/encoding/xml/xml_test.go | 704 +++++++++++++++++++++++++++++++ 6 files changed, 904 insertions(+), 70 deletions(-) diff --git a/src/encoding/xml/marshal.go b/src/encoding/xml/marshal.go index a8c8f659caca59..8d2f8fbe099055 100644 --- a/src/encoding/xml/marshal.go +++ b/src/encoding/xml/marshal.go @@ -304,8 +304,8 @@ type printer struct { *bufio.Writer encoder *Encoder seq int - indent string - prefix string + indent string // line identation + prefix string // line prefix depth int indentedIn bool putNewline bool @@ -364,7 +364,7 @@ func (p *printer) createAttrPrefix(url string) string { p.attrPrefix[url] = prefix p.attrNS[prefix] = url - + /* prints a prefix definition for the URL which had no prefix */ p.WriteString(`xmlns:`) p.WriteString(prefix) p.WriteString(`="`) @@ -382,6 +382,8 @@ func (p *printer) deleteAttrPrefix(prefix string) { delete(p.attrNS, prefix) } +// p.prefixes contains prefixes separated by the empty string between tags +// as several prefixes can be defined at any level func (p *printer) markPrefix() { p.prefixes = append(p.prefixes, "") } @@ -391,12 +393,36 @@ func (p *printer) popPrefix() { prefix := p.prefixes[len(p.prefixes)-1] p.prefixes = p.prefixes[:len(p.prefixes)-1] if prefix == "" { - break + break // end of tag is reached } p.deleteAttrPrefix(prefix) } } +// No prefix is returned if the first prefix of the list for this tag is not assigned to a namespace +func (p *printer) tagPrefix() string { + prefix := p.prefixes[len(p.prefixes)-1] // last prefix relates to current tag + i := len(p.prefixes) - 1 + for i >= 0 && i < len(p.prefixes) { + if prefix == "" { // end of list of prefixes for this tag is reached + // check that previous prefix is the one of the tag + if i+1 < len(p.prefixes) { // list is not empty + if p.attrNS[p.prefixes[i+1]] != "" { + // prefix has been created, i.e. no prefix for the tag + return "" + } else { + return p.prefixes[i+1] + } + } else { + return "" + } + } + i-- + prefix = p.prefixes[i] + } + return "" +} + var ( marshalerType = reflect.TypeOf((*Marshaler)(nil)).Elem() marshalerAttrType = reflect.TypeOf((*MarshalerAttr)(nil)).Elem() @@ -482,17 +508,23 @@ func (p *printer) marshalValue(val reflect.Value, finfo *fieldInfo, startTemplat xmlname := tinfo.xmlname if xmlname.name != "" { start.Name.Space, start.Name.Local = xmlname.xmlns, xmlname.name + // .Space is equivalent to xmlns=".Space" so adding the attribute + if start.Name.Space != "" { + start.Attr = append(start.Attr, Attr{Name{"", xmlnsPrefix}, start.Name.Space}) + } } else { fv := xmlname.value(val, dontInitNilPointers) if v, ok := fv.Interface().(Name); ok && v.Local != "" { start.Name = v } } + } else { + // No enforced namespace, i.e. the outer tag namespace remains valid } - if start.Name.Local == "" && finfo != nil { + if start.Name.Local == "" && finfo != nil { // XMLName overrides tag name - anonymous struct start.Name.Space, start.Name.Local = finfo.xmlns, finfo.name } - if start.Name.Local == "" { + if start.Name.Local == "" { // No or empty XMLName and still no tag name name := typ.Name() if i := strings.IndexByte(name, '['); i >= 0 { // Truncate generic instantiation name. See issue 48318. @@ -526,6 +558,13 @@ func (p *printer) marshalValue(val reflect.Value, finfo *fieldInfo, startTemplat } } + /* If an xmlname was found, namespace must be overridden */ + if tinfo.xmlname != nil && start.Name.Space == "" && + len(p.tags) != 0 && p.tags[len(p.tags)-1].Space != "" { + //add attr xmlns="" to override the outer tag namespace + start.Attr = append(start.Attr, Attr{Name{"", xmlnsPrefix}, ""}) + } + /* */ if err := p.writeStart(&start); err != nil { return err } @@ -698,31 +737,70 @@ func (p *printer) writeStart(start *StartElement) error { return fmt.Errorf("xml: start tag with no name") } + // Pushes the value (url) of the namespace and not the eventual prefix p.tags = append(p.tags, start.Name) - p.markPrefix() + p.markPrefix() // pushes an empty prefix to allow pop to locate the end of any prefix - p.writeIndent(1) + p.writeIndent(1) // handling relative depth of a tag p.WriteByte('<') - p.WriteString(start.Name.Local) - - if start.Name.Space != "" { - p.WriteString(` xmlns="`) - p.EscapeString(start.Name.Space) - p.WriteByte('"') + /* The attribute was not added if no XMLName field existed. */ + var tagSpaceAttr Attr // the attribute will not be printed to avoid repetition + if start.Name.Space != "" { // tag starts with <.Space:.Local + // Locate an eventual prefix. If none, the XML is without start tag", name.Local) } - if top := p.tags[len(p.tags)-1]; top != name { - if top.Local != name.Local { + if top := p.tags[len(p.tags)-1]; top != name { // the end tag must check the prefix value when there is one + if top.Local != name.Local { // Tag names do not match return fmt.Errorf("xml: end tag does not match start tag <%s>", name.Local, top.Local) + } // Namespaces do not match + if top.Space != name.Space { // tag prefixes do not match + return fmt.Errorf("xml: end space does not match start space <%s>", name.Space, top.Space) } - return fmt.Errorf("xml: end tag in namespace %s does not match start tag <%s> in namespace %s", name.Local, name.Space, top.Local, top.Space) } - p.tags = p.tags[:len(p.tags)-1] p.writeIndent(-1) p.WriteByte('<') p.WriteByte('/') + endPrefix := p.tagPrefix() + if name.Space != "" && endPrefix != "" { // a prefix is available + p.WriteString(endPrefix) // print the prefix and not its value + p.WriteByte(':') + } // otherwise, xmlns=".Space" has no prefix and nothing should be printed + p.tags = p.tags[:len(p.tags)-1] p.WriteString(name.Local) p.WriteByte('>') p.popPrefix() diff --git a/src/encoding/xml/marshal_test.go b/src/encoding/xml/marshal_test.go index d2e5137afd7c05..457e5d4e86339f 100644 --- a/src/encoding/xml/marshal_test.go +++ b/src/encoding/xml/marshal_test.go @@ -629,6 +629,11 @@ var marshalTests = []struct { {Value: &Port{Type: "ssl", Number: "443"}, ExpectXML: `443`}, {Value: &Port{Number: "443"}, ExpectXML: `443`}, {Value: &Port{Type: ""}, ExpectXML: ``}, + // Marshal is not symetric to Unmarshal for these oddities because &apos is written as ' + {Value: &Port{Type: ""}, ExpectXML: ``, UnmarshalOnly: true}, + {Value: &Port{Type: ""}, ExpectXML: ``, UnmarshalOnly: true}, + {Value: &Port{Type: ""}, ExpectXML: ``}, + {Value: &Port{Type: ""}, ExpectXML: ``, UnmarshalOnly: true}, {Value: &Port{Number: "443", Comment: "https"}, ExpectXML: `443`}, {Value: &Port{Number: "443", Comment: "add space-"}, ExpectXML: `443`, MarshalOnly: true}, {Value: &Domain{Name: []byte("google.com&friends")}, ExpectXML: `google.com&friends`}, @@ -1107,10 +1112,10 @@ var marshalTests = []struct { Value: &AnyTest{Nested: "known", AnyField: AnyHolder{ XML: "", - XMLName: Name{Local: "AnyField"}, + XMLName: Name{Local: "other"}, // Overriding the field name is the purpose of the test }, }, - ExpectXML: `known`, + ExpectXML: `known`, }, { ExpectXML: `b`, @@ -2058,7 +2063,7 @@ var encodeTokenTests = []struct { StartElement{Name{"space", "foo"}, nil}, EndElement{Name{"another", "foo"}}, }, - err: "xml: end tag in namespace another does not match start tag in namespace space", + err: "xml: end space does not match start space ", want: ``, }, { desc: "start element with explicit namespace", @@ -2068,7 +2073,7 @@ var encodeTokenTests = []struct { {Name{"space", "foo"}, "value"}, }}, }, - want: ``, + want: ``, }, { desc: "start element with explicit namespace and colliding prefix", toks: []Token{ @@ -2078,7 +2083,8 @@ var encodeTokenTests = []struct { {Name{"x", "bar"}, "other"}, }}, }, - want: ``, + // #17 Removed version was not well-formed as x is bound to "space" and to "x" + want: ``, }, { desc: "start element using previously defined namespace", toks: []Token{ @@ -2089,7 +2095,8 @@ var encodeTokenTests = []struct { {Name{"space", "x"}, "y"}, }}, }, - want: ``, + // #18 The well-formed prefix is the only one appearing and the prefix is not in the tag as .Space is empty + want: ``, }, { desc: "nested name space with same prefix", toks: []Token{ @@ -2110,7 +2117,7 @@ var encodeTokenTests = []struct { {Name{"space2", "b"}, "space2 value"}, }}, }, - want: ``, + want: ``, }, { desc: "start element defining several prefixes for the same name space", toks: []Token{ @@ -2120,7 +2127,7 @@ var encodeTokenTests = []struct { {Name{"space", "x"}, "value"}, }}, }, - want: ``, + want: ``, }, { desc: "nested element redefines name space", toks: []Token{ @@ -2132,7 +2139,7 @@ var encodeTokenTests = []struct { {Name{"space", "a"}, "value"}, }}, }, - want: ``, + want: ``, }, { desc: "nested element creates alias for default name space", toks: []Token{ @@ -2144,7 +2151,7 @@ var encodeTokenTests = []struct { {Name{"space", "a"}, "value"}, }}, }, - want: ``, + want: ``, }, { desc: "nested element defines default name space with existing prefix", toks: []Token{ @@ -2156,7 +2163,7 @@ var encodeTokenTests = []struct { {Name{"space", "a"}, "value"}, }}, }, - want: ``, + want: ``, }, { desc: "nested element uses empty attribute name space when default ns defined", toks: []Token{ @@ -2167,7 +2174,7 @@ var encodeTokenTests = []struct { {Name{"", "attr"}, "value"}, }}, }, - want: ``, + want: ``, }, { desc: "redefine xmlns", toks: []Token{ @@ -2199,7 +2206,7 @@ var encodeTokenTests = []struct { {Name{"xmlns", "foo"}, ""}, }}, }, - want: ``, + want: ``, }, { desc: "attribute with no name is ignored", toks: []Token{ @@ -2228,7 +2235,7 @@ var encodeTokenTests = []struct { {Name{"space", "x"}, "value"}, }}, }, - want: ``, + want: ``, }, { desc: "nested element requires empty default name space", toks: []Token{ @@ -2237,7 +2244,7 @@ var encodeTokenTests = []struct { }}, StartElement{Name{"", "foo"}, nil}, }, - want: ``, + want: ``, }, { desc: "attribute uses name space from xmlns", toks: []Token{ @@ -2259,7 +2266,7 @@ var encodeTokenTests = []struct { EndElement{Name{"space", "baz"}}, EndElement{Name{"space", "foo"}}, }, - want: ``, + want: ``, }, { desc: "default name space not used by attributes, not explicitly defined", toks: []Token{ @@ -2271,7 +2278,7 @@ var encodeTokenTests = []struct { EndElement{Name{"space", "baz"}}, EndElement{Name{"space", "foo"}}, }, - want: ``, + want: ``, }, { desc: "impossible xmlns declaration", toks: []Token{ diff --git a/src/encoding/xml/read.go b/src/encoding/xml/read.go index ef5df3f7f6aecc..0eaf94805d2414 100644 --- a/src/encoding/xml/read.go +++ b/src/encoding/xml/read.go @@ -435,9 +435,23 @@ func (d *Decoder) unmarshal(val reflect.Value, start *StartElement) error { } return UnmarshalError(e) } - fv := finfo.value(sv, initNilPointers) - if _, ok := fv.Interface().(Name); ok { - fv.Set(reflect.ValueOf(start.Name)) + // Anonymous struct with no field or anonymous fields cannot get a value using the reflection + // package and must be discarded. + noValue := true + if sv.Type().Name() == "" && sv.Type().Kind() == reflect.Struct { + i := 0 + for i < sv.Type().NumField() && noValue { + noValue = noValue && sv.Type().Field(i).Anonymous + i++ + } + } else { + noValue = false + } + if !noValue { + fv := finfo.value(sv, initNilPointers) + if _, ok := fv.Interface().(Name); ok { + fv.Set(reflect.ValueOf(start.Name)) + } } } diff --git a/src/encoding/xml/typeinfo.go b/src/encoding/xml/typeinfo.go index 162724ef1a58b0..42508de5e21deb 100644 --- a/src/encoding/xml/typeinfo.go +++ b/src/encoding/xml/typeinfo.go @@ -65,7 +65,7 @@ func getTypeInfo(typ reflect.Type) (*typeInfo, error) { } // For embedded structs, embed its fields. - if f.Anonymous { + if f.Anonymous { // i.e. reflect package will panic to get a Value t := f.Type if t.Kind() == reflect.Ptr { t = t.Elem() @@ -75,13 +75,14 @@ func getTypeInfo(typ reflect.Type) (*typeInfo, error) { if err != nil { return nil, err } - if tinfo.xmlname == nil { + if tinfo.xmlname == nil && inner.xmlname != nil && inner.xmlname.name != "" { + // to leave xmlname to nil and to avoid assigning unexported xmlname field tinfo.xmlname = inner.xmlname } for _, finfo := range inner.fields { finfo.idx = append([]int{i}, finfo.idx...) if err := addFieldInfo(typ, tinfo, &finfo); err != nil { - return nil, err + return nil, err // Any detected conflict returns an error } } continue @@ -93,19 +94,16 @@ func getTypeInfo(typ reflect.Type) (*typeInfo, error) { return nil, err } - if f.Name == xmlName { + if f.Name == xmlName { // copying the field for "easier" access when .FieldByName() is appropriate tinfo.xmlname = finfo - continue - } - - // Add the field if it doesn't conflict with other fields. - if err := addFieldInfo(typ, tinfo, finfo); err != nil { + } else if err := addFieldInfo(typ, tinfo, finfo); err != nil { + // Add the field if it doesn't conflict with other fields. return nil, err } } } - ti, _ := tinfoMap.LoadOrStore(typ, tinfo) + ti, _ := tinfoMap.LoadOrStore(typ, tinfo) // Returned bool is always false return ti.(*typeInfo), nil } @@ -122,7 +120,7 @@ func structFieldInfo(typ reflect.Type, f *reflect.StructField) (*fieldInfo, erro // Parse flags. tokens := strings.Split(tag, ",") if len(tokens) == 1 { - finfo.flags = fElement + finfo.flags = fElement // Nothing but the name of field } else { tag = tokens[0] for _, flag := range tokens[1:] { @@ -187,6 +185,7 @@ func structFieldInfo(typ reflect.Type, f *reflect.StructField) (*fieldInfo, erro // If the name part of the tag is completely empty, get // default from XMLName of underlying struct if feasible, // or field name otherwise. + // This is how an anonymous struct gets a value if xmlname := lookupXMLName(f.Type); xmlname != nil { finfo.xmlns, finfo.name = xmlname.xmlns, xmlname.name } else { @@ -240,7 +239,7 @@ func lookupXMLName(typ reflect.Type) (xmlname *fieldInfo) { if f.Name != xmlName { continue } - finfo, err := structFieldInfo(typ, &f) + finfo, err := structFieldInfo(typ, &f) // Recursive call if err == nil && finfo.name != "" { return finfo } @@ -292,7 +291,8 @@ Loop: conflicts = append(conflicts, i) } } else { - if newf.name == oldf.name { + if newf.name == oldf.name && newf.xmlns == oldf.xmlns { + // A conflict is only if the names are identical in the same namespace which might be "" conflicts = append(conflicts, i) } } @@ -314,7 +314,7 @@ Loop: // Otherwise, if any of them is at the same depth level, it's an error. for _, i := range conflicts { oldf := &tinfo.fields[i] - if len(oldf.idx) == len(newf.idx) { + if len(oldf.idx) == len(newf.idx) { // Same depth f1 := typ.FieldByIndex(oldf.idx) f2 := typ.FieldByIndex(newf.idx) return &TagPathError{typ, f1.Name, f1.Tag.Get("xml"), f2.Name, f2.Tag.Get("xml")} diff --git a/src/encoding/xml/xml.go b/src/encoding/xml/xml.go index c14954df155a64..59e15a11e9c63d 100644 --- a/src/encoding/xml/xml.go +++ b/src/encoding/xml/xml.go @@ -216,7 +216,7 @@ type Decoder struct { toClose Name nextToken Token nextByte int - ns map[string]string + ns map[string]string // url for a prefix ns[.Space]=.Value err error line int offset int64 @@ -309,29 +309,38 @@ func (d *Decoder) Token() (Token, error) { // to the other attribute names, so process // the translations first. for _, a := range t1.Attr { - if a.Name.Space == xmlnsPrefix { - v, ok := d.ns[a.Name.Local] - d.pushNs(a.Name.Local, v, ok) + if a.Name.Space == xmlnsPrefix { // name space attribute {.Space}xmlns:{.Local}={.Value} + if a.Value == "" { + d.err = d.syntaxError("empty namespace without prefix") + return nil, d.err + } + if a.Name.Local == "" { + d.err = d.syntaxError("empty prefix") + return nil, d.err + } + v, ok := d.ns[a.Name.Local] // Checking existence + // Recording the level of the name space by recording tag name + d.pushNs(a.Name.Local, v, ok) // Pushing tag, eventual value, and existence of namespace d.ns[a.Name.Local] = a.Value } - if a.Name.Space == "" && a.Name.Local == xmlnsPrefix { - // Default space for untagged names + if a.Name.Space == "" && a.Name.Local == xmlnsPrefix { // xmlns=".Value" + // Default space for non-prefixed names v, ok := d.ns[""] d.pushNs("", v, ok) d.ns[""] = a.Value } } + d.pushElement(t1.Name) // Pushing the element with its eventual prefix + // Assigning value to Space of the attributed using the NS bindings d.translate(&t1.Name, true) for i := range t1.Attr { d.translate(&t1.Attr[i].Name, false) } - d.pushElement(t1.Name) t = t1 case EndElement: - d.translate(&t1.Name, true) - if !d.popElement(&t1) { + if !d.popElement(&t1) { // Popping the element with its eventual prefix for appropriate comparison return nil, d.err } t = t1 @@ -499,10 +508,13 @@ func (d *Decoder) popElement(t *EndElement) bool { return false case s.name.Space != name.Space: d.err = d.syntaxError("element <" + s.name.Local + "> in space " + s.name.Space + - "closed by in space " + name.Space) + " closed by in space " + name.Space) return false } + // translating before removal of ns + d.translate(&t.Name, true) // returning a namespace and not its prefix as doc says + // Pop stack until a Start or EOF is on the top, undoing the // translations that were associated with the element we just closed. for d.stk != nil && d.stk.kind != stkStart && d.stk.kind != stkEOF { @@ -798,6 +810,9 @@ func (d *Decoder) rawToken() (Token, error) { for { d.space() if b, ok = d.mustgetc(); !ok { + if len(attr) > 0 && !d.Strict { + break // When not strict, an attribute might end with EOF + } return nil, d.err } if b == '/' { @@ -832,7 +847,6 @@ func (d *Decoder) rawToken() (Token, error) { d.err = d.syntaxError("attribute name without = in element") return nil, d.err } - d.ungetc(b) a.Value = a.Name.Local } else { d.space() @@ -1016,6 +1030,11 @@ Input: d.ungetc('<') break Input } + // This occurs only for an unquoted attr name. + if b == '>' && !cdata && quote < 0 { // Possible end of tag reached + d.ungetc('>') // Leaving end of tag available + break // returning text + } if quote >= 0 && b == byte(quote) { break Input } @@ -1112,6 +1131,7 @@ Input: } // We must rewrite unescaped \r and \r\n into \n. + // End of line handling https://www.w3.org/TR/xml/#sec-line-ends if b == '\r' { d.buf.WriteByte('\n') } else if b1 == '\r' && b == '\n' { @@ -1168,7 +1188,11 @@ func (d *Decoder) nsname() (name Name, ok bool) { name.Local = s } else { name.Space = s[0:i] - name.Local = s[i+1:] + if strings.Contains(s[i+1:], ":") { + return name, false + } else { + name.Local = s[i+1:] + } } return name, true } diff --git a/src/encoding/xml/xml_test.go b/src/encoding/xml/xml_test.go index 19152dbdb68951..02c77edb1bdd82 100644 --- a/src/encoding/xml/xml_test.go +++ b/src/encoding/xml/xml_test.go @@ -864,6 +864,345 @@ func TestIssue5880(t *testing.T) { } } +func TestIssue8535(t *testing.T) { + + type ExampleConflict struct { + XMLName Name `xml:"example"` + Link string `xml:"link"` + AtomLink string `xml:"http://www.w3.org/2005/Atom link"` // No conflict but no assignment + } + testCases := []string{ + ` + Example + http://example.com/default + http://example.com/home + http://example.com/ns + `, + } + + var dest ExampleConflict + d := NewDecoder(strings.NewReader(testCases[0])) + if err := d.Decode(&dest); err != nil { + t.Errorf("%s: Field conflicts : got error %v, want no fail", testCases[0], err) + } +} + +func TestIssue11431(t *testing.T) { // + + type Test struct { + XMLName Name `xml:"Test"` + Ns string `xml:"xmlns,attr"` + Body string + } + + s := &Test{Ns: "http://example.com/ns", Body: "hello world"} + b, err := Marshal(s) + if err != nil { + t.Errorf("namespace handling: expected no error, got %s", err) + } + + want := `hello world` + if string(b) != want { + t.Errorf("namespace handling: got %s, want %s \n", string(b), want) + } +} + +func TestIssue11431NsWoAttr(t *testing.T) { + + type Test struct { + Body string `xml:"http://example.com/ns body"` + } + + s := &Test{Body: "hello world"} + b, err := Marshal(s) + if err != nil { + t.Errorf("namespace handling: expected no error, got %s", err) + } + + want := `hello world` + if string(b) != want { + t.Errorf("namespace handling: got %s, want %s \n", string(b), want) + } +} + +func TestIssue11431XMLName(t *testing.T) { // + + type Test struct { + XMLName Name `xml:"http://example.com/ns Test"` + Body string + } + + //s := &Test{XMLName: Name{"http://example.com/ns",""}, Body: "hello world"} is unusable as the "-" is missing + // as documentation states + s := &Test{Body: "hello world"} + b, err := Marshal(s) + if err != nil { + t.Errorf("namespace handling: expected no error, got %s", err) + } + + want := `hello world` + if string(b) != want { + t.Errorf("namespace handling: got %s, want %s \n", string(b), want) + } +} + +func TestIssue11431UsingAttr(t *testing.T) { // + + type T struct { + Ns string `xml:"xmlns,attr"` + Body string + } + + //s := &Test{XMLName: Name{"http://example.com/ns",""}, Body: "hello world"} is unusable as the "-" is missing + // as documentation states + s := &T{Ns: "http://example.com/ns", Body: "hello world"} + b, err := Marshal(s) + if err != nil { + t.Errorf("namespace handling: expected no error, got %s", err) + } + + want := `hello world` + if string(b) != want { + t.Errorf("namespace handling: got %s, want %s \n", string(b), want) + } +} + +func TestIssue11496(t *testing.T) { // Issue answered + + type Person struct { + XMLName Name `xml:"ns1 person"` + Name string `xml:"name"` + Phone string `xml:"ns2 phone,omitempty"` + } + + p := &Person{ + Name: "Oliver", + Phone: "110", + } + + got, err := Marshal(p) + if err != nil { + t.Errorf("namespace assignment: marshal error returned is %s", err) + } + + want := `Oliver110` + if string(got) != want { + t.Errorf("namespace assignment:\ngot: %s\nwant: %s", string(got), want) + } + + // Output: + // + // Oliver + // 110 + // + // + // Want: + // + // Oliver + // 110 + // + +} + +func TestIssue8068(t *testing.T) { + + testCases := []struct { + s string + ok bool + }{ // Empty prefixed namespace is not allowed + {``, true}, + {``, false}, + {``, false}, + {``, false}, + {``, false}, + } + + var dest string // type does not matter as tested tags are empty + var err error + for _, tc := range testCases { + err = Unmarshal([]byte(tc.s), &dest) + + if err != nil && tc.ok { + t.Errorf("%s: Empty prefixed namespace : expected no error, got %s", tc.s, err) + continue + } + if err == nil && !tc.ok { + t.Errorf("%s: Empty prefixed namespace : expected error, got nil", tc.s) + } + } + +} + +func TestIssue10538(t *testing.T) { + // There is no restriction of the placement of XMLName in embedded structs + // If the field is unexported, reflect package will panic in the documented cases + // Purpose of the test is to show that no panic occurs with multiple set ups of embedded structs using XMLName + type elementNoXMLName struct { + Children []interface{} + } + + type element struct { + XMLName Name + Children []interface{} + } + + type Element struct { + XMLName Name + Children []interface{} + } + + type svgstrEmptyStruct struct { + elementNoXMLName //is not exported and empty + Height string `xml:"height,attr,omitempty"` + Width string `xml:"width,attr,omitempty"` + } + + type svgstr struct { + element // not exported and .Value panics + Height string `xml:"height,attr,omitempty"` + Width string `xml:"width,attr,omitempty"` + } + + type svgstrExp struct { + Element element // exported and .Value does not panic + Height string `xml:"height,attr,omitempty"` + Width string `xml:"width,attr,omitempty"` + } + + type svgstrExpType struct { + Element // exported and .Value does not panic + Height string `xml:"height,attr,omitempty"` + Width string `xml:"width,attr,omitempty"` + } + + type svgstr2 struct { + XMLName Name + Children []interface{} + Height string `xml:"height,attr,omitempty"` + Width string `xml:"width,attr,omitempty"` + } + + /* No embedded XMLName */ + result := `` + sE := svgstrEmptyStruct{ + Width: "400", + Height: "200", + } + a, err := Marshal(sE) + if err != nil { + t.Errorf("xmlname handling : marshaling failed with %s \n", err) + } + if string(a) != result { + t.Errorf("xmlname handling : got %s, want %s \n", string(a), result) + } + /* XMLName in a unexported field is not assigned */ + result = `` + s := svgstr{ + element: element{XMLName: Name{Local: "svg", Space: "www.etc"}, Children: nil}, + Width: "400", + Height: "200", + } + + f, err := Marshal(s) + if err != nil { + t.Errorf("xmlname handling : marshaling failed with %s \n", err) + } + if string(f) != result { + t.Errorf("xmlname handling : got %s, want %s \n", string(f), result) + } + /* Embedding the XMLName gets it assigned to the inner struct */ + result = `` + sExp := svgstrExp{ + Element: element{XMLName: Name{Local: "svg", Space: "www.etc"}, Children: nil}, + Width: "400", + Height: "200", + } + + b, err := Marshal(sExp) + if err != nil { + t.Errorf("xmlname handling : marshaling failed with %s \n", err) + } + if string(b) != result { + t.Errorf("xmlname handling : got %s, want %s \n", string(b), result) + } + /* XMLName is not assigned to outer tag but to inner tag. Not working due to other issues */ + result = `` + sExpType := svgstrExpType{ + Element: Element{XMLName: Name{Local: "svg", Space: "www.etc"}, Children: []interface{}{""}}, + Width: "400", + Height: "200", + } + + d, err := Marshal(sExpType) + if err != nil { + t.Errorf("xmlname handling : marshaling failed with %s \n", err) + } + if string(d) != result { + t.Errorf("xmlname handling : got %s, want %s \n", string(d), result) + } + /* No inner struct. XMLName is assigned as usual */ + result = `` + s2 := svgstr2{ + XMLName: Name{Local: "svg", Space: "www.etc"}, + Width: "400", + Height: "200", + } + + c, err := Marshal(s2) + if err != nil { + t.Errorf("xmlname handling : marshaling failed with %s \n", err) + } + if string(c) != result { + t.Errorf("xmlname handling : got %s, want %s \n", string(c), result) + } +} + +func TestIssue7535(t *testing.T) { + source := `` + result := `` + // A prefix is the namespace known from the tag where it is declared and not the default namespace. + // But in a well-formed xml, it is useless as the prefix is bound and recorded as an attribute + in := strings.NewReader(source) + var errl, err error + var token Token + + for i := 0; i < 4; i++ { + out := &bytes.Buffer{} + d := NewDecoder(in) + e := NewEncoder(out) + errl = nil + for errl == nil { + token, err = d.Token() + if err != nil { + if err == io.EOF { + errl = err + } else { + t.Errorf("read token failed:%s", err) + return + } + } else { // err is nil + // end token contains now the URL which can be encoded only if the NS if available + // from the start token + err = e.EncodeToken(token) + if err != nil { + t.Errorf("encode token failed : %s", err) + return + } + } + } + e.Flush() + if out.String() != result { + t.Errorf("duplicating namespace : got %s, want %s \n", out.String(), result) + return + } + in.Reset(out.String()) + } + + if errl != nil && errl != io.EOF { + t.Errorf("%s \n: duplicating namespace : got error %v, want no fail \n", source, errl) + } +} + func TestIssue11405(t *testing.T) { testCases := []string{ "", @@ -917,6 +1256,371 @@ func TestIssue12417(t *testing.T) { } } +func TestIssue20396(t *testing.T) { + testCases := []struct { + s string + ok bool + }{ // Should not allow to change namespace of opening tag + {``, false}, + {``, false}, + {``, false}, + {``, true}, + } + for _, tc := range testCases { + d := NewDecoder(strings.NewReader(tc.s)) + var err error + for { + _, err = d.Token() + if err != nil { + if err == io.EOF { //EOF indicates that process is complete + err = nil + } + break + } + } + if err != nil && tc.ok { + err = d.err + t.Errorf("%q: Multiple colons in tag : expected no error, got %s", tc.s, err) + continue + } + if err == nil && !tc.ok { + t.Errorf("%q: Multiple colons in tag : expected error, got nil", tc.s) + } + } +} + +func TestIssue20685(t *testing.T) { + testCases := []struct { + s string + ok bool + }{ + {`one`, false}, + {`one`, true}, + {`one`, false}, + {`one`, false}, + {`one`, false}, + {`one`, false}, + {`one`, false}, + } + for _, tc := range testCases { + d := NewDecoder(strings.NewReader(tc.s)) + var err error + for { + _, err = d.Token() + if err != nil { + if err == io.EOF { + err = nil + } + break + } + } + if err != nil && tc.ok { + t.Errorf("%q: Closing tag with namespace : expected no error, got %s", tc.s, err) + continue + } + if err == nil && !tc.ok { + t.Errorf("%q: Closing tag with namespace : expected error, got nil", tc.s) + } + } +} + +func TestIssue16497(t *testing.T) { + + type IQ struct { + Type string `xml:"type,attr"` + XMLName Name `xml:"iq"` + } + + type embedIQ struct { + IQ IQ + } + + /* Anonymous struct */ + resp := struct { + IQ + }{} /* */ + + var err error + err = Unmarshal([]byte(``), &resp) + if err != nil { + t.Errorf("unmarshal anonymous struct failed with %s", err) + return + } + // assigning values or not does not change anything + var respEmbed embedIQ + err = Unmarshal([]byte(``), &respEmbed) + if err != nil { + t.Errorf("unmarshal anonymous struct failed with %s", err) + return + } +} + +func TestIssue9519(t *testing.T) { + // Expects prefixed notation prefix:tag name iso xmlns: + type HouseType struct { + XMLName Name `xml:"prefix11 House"` + MessageId string `xml:"message_id,attr"` + } + + var tm HouseType + var err error + + tm.MessageId = "test1234" + + var data1 []byte + data1, err = Marshal(tm) + if err != nil { + t.Errorf("%s : handling namespace : got error %v, want no fail \n", data1, err) + } + + result := `` + if string(data1) != result { + t.Errorf("handling namespace : got %v, want %s \n", string(data1), result) + } + + var tm2 HouseType + err = Unmarshal([]byte(data1), &tm2) + if err != nil { + t.Errorf("%s : handling namespace : got error %v, want no fail \n", data1, err) + } + + if tm.MessageId != tm2.MessageId { + t.Errorf("handling namespace : got %s, want %s \n", tm.MessageId, tm2.MessageId) + } +} + +func TestUnmarshalXMLName(t *testing.T) { + + type InnerStruct struct { + XMLName Name `xml:"testns outer"` + } + + type OuterStruct struct { + InnerStruct + IntAttr int `xml:"int,attr"` + } + + type OuterNamedStruct struct { + InnerStruct + IntAttr int `xml:"int,attr"` + XMLName Name `xml:"outerns test"` + } + + type OuterNamedOrderedStruct struct { + XMLName Name `xml:"outerns test"` + InnerStruct + IntAttr int `xml:"int,attr"` + } + + var unMarshalTestsXMLName = []struct { + Value interface{} + ExpectXML string + MarshalOnly bool + MarshalError string + UnmarshalOnly bool + UnmarshalError string + }{ + { + ExpectXML: ``, + Value: &OuterStruct{IntAttr: 10}, + }, + { + ExpectXML: ``, + Value: &OuterStruct{IntAttr: 10}, + }, + { + ExpectXML: ``, + Value: &OuterNamedStruct{XMLName: Name{Space: "outerns", Local: "test"}, IntAttr: 10}, + }, + { + ExpectXML: ``, + Value: &OuterNamedOrderedStruct{XMLName: Name{Space: "outerns", Local: "test"}, IntAttr: 10}, + }, + } + for i, test := range unMarshalTestsXMLName { + if test.MarshalOnly { + continue + } + if _, ok := test.Value.(*Plain); ok { + continue + } + if test.ExpectXML == ``+ + `b`+ + `b1`+ + `` { + // TODO(rogpeppe): re-enable this test in + // https://go-review.googlesource.com/#/c/5910/ + continue + } + + vt := reflect.TypeOf(test.Value) + dest := reflect.New(vt.Elem()).Interface() + err := Unmarshal([]byte(test.ExpectXML), dest) + + t.Run(fmt.Sprintf("%d", i), func(t *testing.T) { + switch fix := dest.(type) { + case *Feed: + fix.Author.InnerXML = "" + for i := range fix.Entry { + fix.Entry[i].Author.InnerXML = "" + } + } + + if err != nil { + if test.UnmarshalError == "" { + t.Errorf("unmarshal(%#v): %s", test.ExpectXML, err) + return + } + if !strings.Contains(err.Error(), test.UnmarshalError) { + t.Errorf("unmarshal(%#v): %s, want %q", test.ExpectXML, err, test.UnmarshalError) + } + return + } + if got, want := dest, test.Value; !reflect.DeepEqual(got, want) { + t.Errorf("unmarshal(%q):\nhave %#v\nwant %#v", test.ExpectXML, got, want) + } + }) + } +} + +func TestMarshalXMLName(t *testing.T) { + + type InnerStruct struct { + XMLName Name `xml:"testns outer"` + } + + type OuterStruct struct { + InnerStruct + IntAttr int `xml:"int,attr"` + } + + type OuterNamedStruct struct { + InnerStruct + IntAttr int `xml:"int,attr"` + XMLName Name `xml:"outerns test"` + } + + type OuterNamedOrderedStruct struct { + XMLName Name `xml:"outerns test"` + InnerStruct + IntAttr int `xml:"int,attr"` + } + + var marshalTestsXMLName = []struct { + Value interface{} + ExpectXML string + MarshalOnly bool + MarshalError string + UnmarshalOnly bool + UnmarshalError string + }{ + { + ExpectXML: ``, + Value: &OuterStruct{IntAttr: 10}, + }, + { + ExpectXML: ``, + Value: &OuterStruct{IntAttr: 10}, + }, + { + ExpectXML: ``, + Value: &OuterNamedStruct{XMLName: Name{Space: "outerns", Local: "test"}, IntAttr: 10}, + }, + { + ExpectXML: ``, + Value: &OuterNamedOrderedStruct{XMLName: Name{Space: "outerns", Local: "test"}, IntAttr: 10}, + }, + } + + for idx, test := range marshalTestsXMLName { + if test.UnmarshalOnly { + continue + } + + t.Run(fmt.Sprintf("%d", idx), func(t *testing.T) { + data, err := Marshal(test.Value) + if err != nil { + if test.MarshalError == "" { + t.Errorf("marshal(%#v): %s", test.Value, err) + return + } + if !strings.Contains(err.Error(), test.MarshalError) { + t.Errorf("marshal(%#v): %s, want %q", test.Value, err, test.MarshalError) + } + return + } + if test.MarshalError != "" { + t.Errorf("Marshal succeeded, want error %q", test.MarshalError) + return + } + if got, want := string(data), test.ExpectXML; got != want { + if strings.Contains(want, "\n") { + t.Errorf("marshal(%#v):\nHAVE:\n%s\nWANT:\n%s", test.Value, got, want) + } else { + t.Errorf("marshal(%#v):\nhave %#q\nwant %#q", test.Value, got, want) + } + } + }) + } +} + +func TestIssue7113(t *testing.T) { + type C struct { + XMLName Name `xml:""` // To reset namespace to "" + } + + type A struct { + XMLName Name `xml:""` + C C `xml:""` + } + + var a A + structSpace := "b" + fieldSpace := "" + xmlTests := []string{ + ``, + // ``, + } + for _, xmltest := range xmlTests { + + err := Unmarshal([]byte(xmltest), &a) + if err != nil { + t.Errorf("overidding with empty namespace: expected no error, got %s", err) + } + if a.XMLName.Space != structSpace { + t.Errorf("overidding with empty namespace: before marshaling, got %s != %s, want == \n", a.XMLName.Space, structSpace) + } + if a.C.XMLName.Space != fieldSpace { + t.Errorf("overidding with empty namespace: before marshaling, got %s != %s, want == \n", a.C.XMLName.Space, fieldSpace) + } + + var b []byte + /* Because of unmarshaling, namespaces are already assigned */ + b, err = Marshal(&a) + if string(b) != xmltest { + t.Errorf("overidding with empty namespace: after marshaling, got %s != %s, want == \n", string(b), xmltest) + return + } + // Unmarshaling has no interest if the previous test succeed as the structs are initially empty unless + if a.C.XMLName.Local != "C" { + t.Errorf("overidding with empty namespace: after marshaling, unmarshaling will fail, got %s as C tag space which should be tag name C \n", a.C.XMLName.Local) + } + if a.C.XMLName.Space != "" { + t.Errorf("overidding with empty namespace: after marshaling, unmarshaling will fail, got %s in C tag which should be empty \n", a.C.XMLName.Space) + } + err = Unmarshal(b, &a) + if err != nil { + t.Errorf("overidding with empty namespace: expected no error, got %s", err) + } + if a.XMLName.Space != "b" { + t.Errorf("overidding with empty namespace: after marshaling & unmarshaling, got %s in XMLName != %s, want == \n", a.XMLName.Space, "b") + } + if a.C.XMLName.Space != "" { + t.Errorf("overidding with empty namespace: after marshaling & unmarshaling, got %q in C tag != %q, want == \n", a.C.XMLName.Space, "") + } + } +} + func tokenMap(mapping func(t Token) Token) func(TokenReader) TokenReader { return func(src TokenReader) TokenReader { return mapper{