    Hi there,

    We have been testing pd4ml for exporting to RTF and PDF and are quite impressed. However, we see some serious issues when exporting Arabic script to RTF. For example, when text and numbers are mixed in Arabic, the numbers generally run from left to right, which is correct.

    However, when the numbers are accompanied by, for example, an equals sign, the equals sign is moved to the beginning of the line. Likewise, a full stop that ends a sentence is rendered at the right end of the line rather than at the left end, where it belongs in right-to-left script.

    Here is a simple Arabic line copied and pasted from our content base:

    الولايات المتحدة تواجه لحظات ملحمية في الصين خلال بكين 2008 =

    We have verified that the text is rendered correctly if each Arabic paragraph is enclosed in the Unicode sequences u8235? and u8236?, which encode the RIGHT-TO-LEFT EMBEDDING (U+202B) and POP DIRECTIONAL FORMATTING (U+202C) characters respectively.

    Here is a snippet from the RTF paragraph that renders correctly, representing the line above:

    u8235?u1575?u1604?u1608?u1604?u1575?u1610?u1575?u1578? u1575?u1604?u1605?u1578?u1581?u1583?u1577? u1578?u1608?u1575?u1580?u1607? u1604?u1581?u1592?u1575?u1578? u1605?u1604?u1581?u1605?u1610?u1577? u1601?u1610? u1575?u1604?u1589?u1610?u1606? u1582?u1604?u1575?u1604? u1576?u1603?u1610?u1606? 2008 =u8236?
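
    The workaround above can be sketched in Java roughly as follows. This is only an illustration under assumptions, not pd4ml code: the class `RtfBidiWrap` and its methods are hypothetical names, and the sketch assumes the standard RTF escape form (a backslash, the letter u, a signed 16-bit decimal code, and a `?` fallback character), whereas the snippet above shows the same tokens without the backslash.

```java
// Hypothetical helper for illustration only; not part of pd4ml.
final class RtfBidiWrap {
    // U+202B RIGHT-TO-LEFT EMBEDDING, decimal 8235
    static final char RLE = '\u202B';
    // U+202C POP DIRECTIONAL FORMATTING, decimal 8236
    static final char PDF = '\u202C';

    // Encloses one paragraph in an explicit right-to-left embedding.
    static String wrapRtl(String paragraph) {
        return RLE + paragraph + PDF;
    }

    // Encodes text for an RTF body: ASCII is copied (with RTF special
    // characters escaped), everything else becomes a decimal Unicode escape.
    static String toRtfUnicode(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '\\' || c == '{' || c == '}') {
                out.append('\\').append(c);
            } else if (c < 128) {
                out.append(c);
            } else {
                // RTF expects a signed 16-bit decimal, so UTF-16 units
                // above 32767 are written as negative numbers.
                out.append("\\u").append((int) (short) c).append('?');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Short mixed sample: one Arabic letter, a number and an equals sign.
        System.out.println(toRtfUnicode(wrapRtl("\u0627 2008 =")));
    }
}
```

    The text itself stays in logical order; only the two embedding characters are added around the paragraph, so the word processor still does the shaping and reordering.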


    Hmm… We’ll address the issue this week.

    The issue may come from a difference in how text is prepared for PDF output and for RTF output. For RTF, Arabic or mixed-script text should be written “as is”, letting MS Word apply the ligatures and render it correctly. For PDF output, the text has to be pre-processed so that it appears as a correct sequence of glyphs.

    I suspect that in some situations the PDF pre-processing is applied to RTF output, which is wrong. But that is only a guess; we need to analyze it in detail.
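
    To make the distinction concrete, here is a minimal sketch of the two preparation paths. It is not how pd4ml is implemented: `BidiOrderSketch`, `forRtf` and `forPdf` are hypothetical names, and the PDF path only reorders runs with the JDK's `java.text.Bidi`, leaving out Arabic shaping, mirroring and combining marks.

```java
import java.text.Bidi;

// Hypothetical illustration only; not pd4ml's actual code.
final class BidiOrderSketch {
    // RTF path: keep logical order and let the word processor do the layout.
    static String forRtf(String logical) {
        return logical;
    }

    // PDF path (simplified): convert logical order to visual order.
    static String forPdf(String logical) {
        Bidi bidi = new Bidi(logical, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        int runs = bidi.getRunCount();
        byte[] levels = new byte[runs];
        String[] pieces = new String[runs];
        for (int i = 0; i < runs; i++) {
            levels[i] = (byte) bidi.getRunLevel(i);
            String run = logical.substring(bidi.getRunStart(i), bidi.getRunLimit(i));
            // Odd embedding level means a right-to-left run: reverse its characters.
            pieces[i] = (levels[i] & 1) == 1
                    ? new StringBuilder(run).reverse().toString()
                    : run;
        }
        // Put the runs themselves into visual order.
        Bidi.reorderVisually(levels, 0, pieces, 0, runs);
        return String.join("", pieces);
    }

    public static void main(String[] args) {
        String sample = "abc \u05D0\u05D1";
        // Print code points so the result is readable on any console.
        forPdf(sample).chars().forEach(c -> System.out.printf("U+%04X ", c));
        System.out.println();
    }
}
```

    If output like that of `forPdf` were written into an RTF file, the word processor would apply its own bidirectional layout on top of text that had already been reordered, which is the kind of misplacement suspected here.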

