Hi Alan,
The answer will depend on your scenario. What are you using to read the XPS
file? In other words, are you manipulating it programmatically with .NET
code, such as C#? Or are you opening the unzipped XPS in a text editor (for
example) and trying to read the contents?
The UnicodeString property of the Glyphs object is allowed to contain spaces
at word breaks, but this is not strictly required, because UnicodeString is
just the list of Unicode chars. The spacing, kerning, etc. of those chars can
be controlled by the Indices property of the Glyphs object (not all word
breaks in all docs will be formed by space chars). I'd guess that whatever
app produced the XPS output you're working with decided not to bother
inserting raw 0x0020 chars in the UnicodeString, preferring to control
spacing via the glyph positioning encoded in the Indices property.
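To make the role of Indices a bit more concrete, here is a rough Python
sketch that splits an Indices string into per-glyph entries. It assumes the
simplified layout "GlyphIndex,AdvanceWidth" with entries separated by
semicolons and an optional "(n:m)" cluster prefix, which is how I read the
XPS spec; the uOffset/vOffset fields are ignored. The names GlyphEntry and
parse_indices are made up for illustration, not part of any library.

```python
from typing import List, NamedTuple, Optional


class GlyphEntry(NamedTuple):
    # None means "fall back to the font's default for this character".
    glyph_index: Optional[int]
    advance: Optional[float]


def parse_indices(indices: str) -> List[GlyphEntry]:
    """Split an Indices string into (glyph_index, advance) entries.

    Simplified sketch: cluster prefixes are skipped and any uOffset or
    vOffset fields after the advance width are ignored.
    """
    if not indices:
        return []
    entries = []
    for part in indices.split(";"):
        # Drop an optional cluster prefix such as "(2:1)".
        if part.startswith("(") and ")" in part:
            part = part[part.index(")") + 1:]
        fields = part.split(",")
        glyph = int(fields[0]) if fields[0].strip() else None
        advance = None
        if len(fields) > 1 and fields[1].strip():
            advance = float(fields[1])
        entries.append(GlyphEntry(glyph, advance))
    return entries
```

An entry with an unusually large advance and no space char in the matching
UnicodeString is one plausible sign of a word break made purely by
positioning, though telling that apart from ordinary kerning would need the
font's own metrics.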
However, if you use the right combination of .NET APIs, you can convert the
Glyphs object back into text; see, for example,
http://www.microsoft.com/whdc/xps/xps-read.mspx.
That said, .NET APIs might not help if you're looking for, say, a scripting
or interactive solution. So, like I said, we may need to know more about
what you have planned.
XPS is oriented more towards final-form presentation of documents than
active processing of text data. If you can intercept the data at an earlier
stage of production, such as an XML or plain text form, that might be easier
to deal with than final XPS. I think you could certainly find a solution for
reading the XPS data, but without some .NET programming (i.e., taking
advantage of the smarts already built in to the .NET Framework) you'd have
some pretty complex work.
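For a feel of what the non-.NET route involves, here is a minimal Python
sketch of just the first step: opening the XPS package as a ZIP archive and
pulling the raw UnicodeString and Indices values out of each page. It
assumes the pages are stored as ".fpage" parts and matches the Glyphs
element by local name rather than by a specific namespace. The function
name iter_glyph_runs is hypothetical. Parts are visited in name order,
which is not necessarily page order, and nothing here rebuilds word breaks.

```python
import xml.etree.ElementTree as ET
import zipfile
from typing import Iterator, Tuple


def iter_glyph_runs(xps_source) -> Iterator[Tuple[str, str, str]]:
    """Yield (part_name, unicode_string, indices) for each Glyphs element.

    xps_source is a path or a file-like object holding the XPS package.
    """
    with zipfile.ZipFile(xps_source) as package:
        # Name order only; true page order is defined elsewhere in the package.
        for name in sorted(package.namelist()):
            if not name.lower().endswith(".fpage"):
                continue
            root = ET.fromstring(package.read(name))
            for element in root.iter():
                # Compare the local name so the namespace URI is not hard-coded.
                if element.tag.rsplit("}", 1)[-1] != "Glyphs":
                    continue
                text = element.get("UnicodeString", "")
                # A leading "{}" escapes a string that starts with a brace.
                if text.startswith("{}"):
                    text = text[2:]
                yield name, text, element.get("Indices", "")
```

Something like "for part, text, indices in iter_glyph_runs('report.xps'):
print(part, repr(text))" (the file name is a placeholder) would show
whether the producing app wrote any space chars at all. Turning those runs
back into properly spaced text is the complex part.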
(As an aside: the .NET world is just catching up with IBM mainframes, which
have made a clear distinction between FFT ("Final form text") and RFT
("Revisable form text") documents for several decades now. In the Microsoft
world pre-XPS, nearly all documents were in a revisable format, such as MS
Word *.DOC files.)
XPS generally falls into the realm of WPF (Windows Presentation Foundation)
programming, so any WPF forums you can find might have ideas too.
Hope it helps a bit,
Andrew
--
amclar at optusnet dot com dot au