Word takes page breaks into account when creating a tagged PDF: it splits blocks that span two or more pages. This may be acceptable for simple paragraphs, but when headings, lists, or tables are split, the document's semantics suffer. Continue reading "Divided block elements"
A PDF converted from Word contains the <Part> element as its top container. However, the semantically more suitable <Document> tag should be used as the root element. Continue reading "<Document> isn't the top-level tag"
A table of contents is created with the tags <TOC> and <TOCI>. <TOC> is the container and <TOCI> is used for each entry. In this error scenario, some or all of the <TOC> entries are tagged with a heading tag, e.g. <H1>, instead of <TOCI>. Continue reading "Heading tags instead of
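The error scenario above can be illustrated with a small check. This is only a sketch: it does not read a real PDF, and the `Node` class, the `headings_inside_toc` function, and the sample tree are hypothetical stand-ins for a structure tree that would have to be extracted with a PDF library first.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for one structure element of a tagged PDF."""
    tag: str
    children: list = field(default_factory=list)


# Heading structure types defined for tagged PDF.
HEADING_TAGS = {"H", "H1", "H2", "H3", "H4", "H5", "H6"}


def headings_inside_toc(node, inside_toc=False):
    """Return the heading tags found anywhere below a <TOC> element."""
    found = []
    if inside_toc and node.tag in HEADING_TAGS:
        found.append(node.tag)
    for child in node.children:
        found.extend(headings_inside_toc(child, inside_toc or node.tag == "TOC"))
    return found


# Sample tree (assumed for illustration): one TOC entry is wrongly tagged <H1>.
toc = Node("TOC", [Node("TOCI"), Node("H1"), Node("TOCI")])
document = Node("Document", [Node("H1"), toc])

print(headings_inside_toc(document))  # ['H1']
```

The regular heading outside the <TOC> container is not reported; only headings nested inside <TOC> are flagged.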
Everything you put into the header or footer of a Word document becomes an artifact and is not tagged. Therefore, images that convey content must not be placed there. Images within the header or footer of a .docx (Word 2013 document without compatibility mode) are incorrectly tagged if they are used with the “Behind Text” layout option. Continue reading "Tagged image in header or footer"
When a table is exported from Word 2013, its lines are placed within <Span> tags. <Span> tags are not allowed inside a <TR> (table row) on the same level as the <TD> tags (table cells). In addition, table lines must be marked as artifacts and must not be tagged. Continue reading "Tagged table lines"
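The rule about table rows can be expressed as a simple tree check. The following is a minimal sketch, not a PDF validator: `Element`, `invalid_row_children`, and the sample table are hypothetical, and the tree is built by hand instead of being read from a file. Besides <TD>, the sketch also accepts <TH> (header cell) as a child of <TR>, as the PDF specification permits.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Hypothetical stand-in for one structure element of a tagged PDF."""
    tag: str
    children: list = field(default_factory=list)


# A table row may only contain header cells and data cells.
ALLOWED_IN_TR = {"TH", "TD"}


def invalid_row_children(element):
    """Collect the tags of <TR> children that are not table cells."""
    bad = []
    if element.tag == "TR":
        bad.extend(c.tag for c in element.children if c.tag not in ALLOWED_IN_TR)
    for child in element.children:
        bad.extend(invalid_row_children(child))
    return bad


# Sample row (assumed for illustration): table lines tagged as <Span>
# sit next to the cells.
row = Element("TR", [Element("TD"), Element("Span"), Element("TD"), Element("Span")])
table = Element("Table", [row])

print(invalid_row_children(table))  # ['Span', 'Span']
```

A <Span> nested inside a cell is not reported, because the check only looks at the direct children of each row.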
An image is placed with the “In Line with Text” layout option in a Word document in “Compatibility Mode” (.doc file). After export to PDF, the image won’t be within a <Figure> tag. Continue reading "No <Figure> tag in compatibility mode"