[Urwid] Bidirectional text support
Alejandro Gómez
2012-11-17 12:19:05 UTC
Hello everybody,

I'm using urwid to build an application, and recently a user
submitted a bug about the incorrect display of Arabic characters[1].

Another user pointed out a 6-year-old issue[2] about implementing
support for bidirectional text. Its priority is set to "minor", so I
guess it won't be addressed in the near future.

I would like to ask if someone has any clues about how to include this
in urwid. I wouldn't mind spending some time on it myself, but I'm
clueless about how to even get started.


Cheers,

Alejandro

[1]: https://github.com/alejandrogomez/turses/issues/120
[2]: http://excess.org/urwid/ticket/14
Ian Ward
2012-11-17 14:58:59 UTC
Hello Alejandro, and thank you for Turses!
Post by Alejandro Gómez
Hello everybody,
I'm using urwid to build an application, and recently a user
submitted a bug about the incorrect display of Arabic characters[1].
Unfortunately some of the links in the bug report are broken. I'd
love to see real world examples of text to be displayed and how it
should look on the terminal.
Post by Alejandro Gómez
Another user pointed out a 6-year-old issue[2] about implementing
support for bidirectional text. Its priority is set to "minor", so I
guess it won't be addressed in the near future.
It's been on the back-burner because I don't know any RTL languages or
how they should look when mixed in with LTR text, and in Arabic in
particular how to handle ligatures with neighbouring characters. I've
switched the priority back to 'major', as it should be.
Post by Alejandro Gómez
I would like to ask if someone has any clues about how to include this
in urwid. I wouldn't mind spending some time on it myself, but I'm
clueless about how to even get started.
I would love to get this issue fixed. I see it as a number of related changes:

1. BiDi support in Unicode text displayed with Text widgets
2. alignment swapping for RTL text in Text widgets (text should be
right-aligned by default)
3. reordering columns globally when an application is started in an RTL
locale (the first column should be on the right)

For Turses you might only need #1.

The best way I see to handle #1 is with a BiDi-aware text layout class

http://excess.org/urwid/docs/manual/textlayout.html

StandardTextLayout does left/center/right alignment and space/any/clip
wrapping but doesn't reorder RTL characters. A new BiDiTextLayout
could reorder characters based on their Unicode direction. The data
structure that text layout objects would have to return will be
unwieldy if the whole string is RTL, however. Maybe now is the time
to extend that structure to support ranges of RTL text.
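
As a rough illustration of reordering by Unicode direction, the sketch
below splits a string into runs with the standard library's
unicodedata.bidirectional() and reverses the right-to-left runs. The
names split_runs and visual_order are hypothetical and not part of
urwid. It assumes a left-to-right base direction and is far simpler
than the full Unicode Bidirectional Algorithm: neutral characters
simply stay with the run they follow, and no Arabic shaping is
attempted.

```python
import unicodedata

# Strong right-to-left bidi classes (Hebrew-like and Arabic-like letters).
RTL_CLASSES = {"R", "AL"}


def split_runs(text):
    """Return (is_rtl, start, end) runs based on strong characters only.

    Simplification (an assumption of this sketch): spaces, digits and
    punctuation never start a new run; they join the current one.
    """
    runs = []
    current_rtl = None  # direction of the run being built
    start = 0
    for i, ch in enumerate(text):
        cls = unicodedata.bidirectional(ch)
        if cls in RTL_CLASSES:
            rtl = True
        elif cls == "L":
            rtl = False
        else:
            continue  # weak or neutral character: keep the current run
        if current_rtl is None:
            current_rtl = rtl
        elif rtl != current_rtl:
            runs.append((current_rtl, start, i))
            start = i
            current_rtl = rtl
    if text:
        runs.append((bool(current_rtl), start, len(text)))
    return runs


def visual_order(text):
    """Reverse each RTL run so a left-to-right terminal shows it correctly."""
    parts = []
    for is_rtl, start, end in split_runs(text):
        chunk = text[start:end]
        parts.append(chunk[::-1] if is_rtl else chunk)
    return "".join(parts)
```

A BiDi-aware layout class could use runs like these to decide which
ranges of a line need reversing, instead of reversing whole strings.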

Please join the IRC channel if you need help working with that code.

Ian
Post by Alejandro Gómez
[1]: https://github.com/alejandrogomez/turses/issues/120
[2]: http://excess.org/urwid/ticket/14
Ian Ward
2012-11-19 19:58:26 UTC
Here's a quick example of a text layout class that displays all text
in a right-to-left manner. It almost, sort of works with the Edit
widget too:

https://gist.github.com/4113436
Ian Ward
2012-11-19 20:10:42 UTC
Post by Ian Ward
Here's a quick example of a text layout class that displays all text
in a right-to-left manner. It almost, sort of works with the Edit
widget too:
https://gist.github.com/4113436
This works by the terrible method of laying out each character
one by one in reverse order. StandardTextLayout gives ranges for
each line:

(column width of text segment, start offset, end offset)

This terrible method generates one of these per character, e.g. [(1,
5, 6), (1, 4, 5), (1, 3, 4), (1, 2, 3)] instead of one per line.
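
The per-character workaround can be sketched in a few lines. The
helper name below is hypothetical, and it assumes every character
occupies exactly one screen column, which does not hold for wide or
combining characters.

```python
def reversed_char_segments(start, end):
    """Build one (screen columns, start offset, end offset) segment per
    character of text[start:end], last character first.

    Assumes each character is one column wide (sketch only).
    """
    return [(1, i, i + 1) for i in range(end - 1, start - 1, -1)]


print(reversed_char_segments(2, 6))
# [(1, 5, 6), (1, 4, 5), (1, 3, 4), (1, 2, 3)]
```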

Real support for right-to-left ranges in text layouts would mean
allowing end_offset < start_offset and handling that case in the part
of the Text and Edit widgets that consume text layout structures. The
example above would become just [(4, 6, 2)].
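
To make the proposal concrete, here is a sketch of how a consumer of
layout segments might treat end_offset < start_offset. This is an
assumption about the semantics, not urwid code: a reversed segment
(columns, start, end) is taken to cover text[end:start], drawn last
character first. render_segments is a hypothetical helper.

```python
def render_segments(text, segments):
    """Turn (columns, start, end) segments into the string to display.

    Assumed convention for this sketch: when end < start, the segment
    covers text[end:start] and is drawn in reverse.
    """
    out = []
    for _columns, start, end in segments:
        if end < start:
            out.append(text[end:start][::-1])  # right-to-left range
        else:
            out.append(text[start:end])  # ordinary left-to-right range
    return "".join(out)


text = "xxabcd"
per_char = [(1, 5, 6), (1, 4, 5), (1, 3, 4), (1, 2, 3)]
one_range = [(4, 6, 2)]
# Both layouts describe the same four characters in the same visual order.
assert render_segments(text, per_char) == render_segments(text, one_range)
```

Under that convention a single reversed range replaces the whole list
of one-character segments, which keeps the structure small even when
an entire line is RTL.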

I apologise for the text layout code; it's some of the older code in
the library and hasn't seen much love. Cryptic single-letter variable
names are everywhere. At least there *are* some comments. If it's
any help, "sc" stands for "screen columns".

Ian
Alejandro Gómez
2012-11-19 22:50:41 UTC
Post by Ian Ward
Unfortunately some of the links in the bug report are broken. I'd
love to see real world examples of text to be displayed and how it
should look on the terminal.
I've asked the reporter if he can kindly send me the images that were
posted on the bug report. As soon as I get them I'll share them with
you.
Post by Ian Ward
It's been on the back-burner because I don't know any RTL languages or
how they should look when mixed in with LTR text, and in Arabic in
particular how to handle ligatures with neighbouring characters. I've
switched the priority back to 'major', as it should be.
Unfortunately I don't know any RTL languages myself. A quick read
through this article[1] taught me that in Unicode, text direction is
inferred from the characters used, and that it can be set explicitly
using special characters.
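
A small sketch of that idea, using only the standard library: the base
direction is guessed from the first strongly directional character,
and an invisible RIGHT-TO-LEFT MARK (U+200F) can force it. The
function name base_direction is hypothetical, and this is a simplified
take on the paragraph-level rule; it ignores embeddings, overrides and
isolates.

```python
import unicodedata

RLM = "\u200f"  # RIGHT-TO-LEFT MARK: invisible, but strongly RTL


def base_direction(text):
    """Guess "ltr" or "rtl" from the first strong character in text.

    Falls back to "ltr" when no strong character is found (an
    assumption of this sketch).
    """
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            return "ltr"
        if cls in ("R", "AL"):
            return "rtl"
    return "ltr"


print(base_direction("hello"))        # ltr
print(base_direction(RLM + "hello"))  # rtl
```

A Text widget could use a check like this to pick right alignment by
default for RTL content.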
Post by Ian Ward
StandardTextLayout does left/center/right alignment and space/any/clip
wrapping but doesn't reorder RTL characters. A new BiDiTextLayout
could reorder characters based on their Unicode direction. The data
structure that text layout objects would have to return will be
unwieldy if the whole string is RTL, however. Maybe now is the time
to extend that structure to support ranges of RTL text.
I'm going to have to familiarize myself with the source. I already took
a look at the gist you posted. It's clearly not the ideal implementation,
but it helped me understand how text layout works in urwid.

I'll ping you on IRC if I have any trouble understanding the code, and
maybe I can clean up some of the hairy parts in the process.

Thank you for your time. And kudos for urwid; turses (and many others)
wouldn't be possible without it.

[1]: http://www.iamcal.com/understanding-bidirectional-text/