7d.nz

-*- coding: utf-8

[2018-03-02 ven.]

Extending the keymap

Having recoded in Python 3 yet another model/view/camera with OpenGL, along with the associated transform spaces and projections needed to set up a skycube, lights, geometries, textures, etc., I ended up, zen, with a nifty engine that was both practical and reasonably pure. Whatever programming background you come from, you have certainly noticed how great Python is for factoring code, though this often comes at a performance cost.

At this stage, and since the program was still performing acceptably, I was about to make more extensive use of numpy in the linear algebra module I had acquired and composed, moving toward a model which allows many visual experiments.

The stock of formulas to code started to grow, and so did their possible combinations.

This is where I realized that a character map closer to the math would provide superior readability, with more symbols and less code. It is also where I realized how underused the third and fourth levels of my keymap were.

If you type the whole set of AltGr + [Shift] + key combinations on a standard Linux distribution, you notice that most of the displayed characters can be considered garbage.

Things start to get interesting when, instead of, say, alpha or omega, you allow yourself to type α or Ω. In my case, this is simply AltGr + a and AltGr + Shift + o on an alternative keymap.
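As a small illustration of the difference this makes in code, here is a sketch of a 2D rotation written with Greek identifiers. The function name rotate and the example itself are mine and are not taken from the engine described above; the only assumption is a Python 3 interpreter, which accepts such identifiers.

```python
import math

π = math.pi  # Greek small letter pi is a valid Python 3 identifier


def rotate(x, y, θ):
    """Rotate the point (x, y) by the angle θ (in radians) around the origin."""
    cosθ, sinθ = math.cos(θ), math.sin(θ)
    return (x * cosθ - y * sinθ, x * sinθ + y * cosθ)


# Rotating the unit vector on the x axis by a quarter turn
x, y = rotate(1.0, 0.0, π / 2)
print(x, y)
```

Compared with theta, cos_theta and sin_theta, the body of the function reads much closer to the formula it implements.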

Setting up a keymap

The keymap I use is a French latin9 one. Its definition is to be found here:

/usr/share/X11/xkb/symbols/fr

It took a few rounds of trial and error to get it set up through the system > preferences > keyboard menu. This gave the following steps:

  • edit the symbols file
  • dpkg-reconfigure keyboard-configuration
  • edit /usr/share/X11/xkb/rules/evdev.xml
  • udevadm trigger --subsystem-match=input --action=change
  • add/remove/visualize the keyboard through the system > preferences > keyboard menu

I read somewhere that one has to remove some files under /var/lib/xkb/, but on Jessie I had none.

Apparently, editing /usr/share/X11/xkb/rules/base.xml could also be necessary.

In any case, something wasn't working well: although the keyboard layout would appear in the list, XKB would complain that it couldn't load it.

So I ended up with something much simpler.

  • edit the symbols file
  • activate the mapping with

setxkbmap -rules evdev -model evdev -layout fr -variant oss_special -v 10

  • activate the Greek map with

setxkbmap -rules evdev -model evdev -layout gr -v 10

  • and visualize it in the keyboard menu
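If you switch layouts often, the two setxkbmap calls above can be wrapped in a few lines of Python. This is only a sketch: the helper names setxkbmap_command and activate are hypothetical, the flags are exactly those used above, and actually running the command assumes setxkbmap is installed and an X session is active.

```python
import subprocess


def setxkbmap_command(layout, variant=None, verbosity=10):
    """Build the setxkbmap argument list used in the steps above."""
    cmd = ["setxkbmap", "-rules", "evdev", "-model", "evdev", "-layout", layout]
    if variant:
        cmd += ["-variant", variant]
    cmd += ["-v", str(verbosity)]
    return cmd


def activate(layout, variant=None):
    """Run setxkbmap; needs a running X session and raises on failure."""
    subprocess.run(setxkbmap_command(layout, variant), check=True)


# Show the two command lines without executing them
print(" ".join(setxkbmap_command("fr", "oss_special")))
print(" ".join(setxkbmap_command("gr")))
```

Calling activate("fr", "oss_special") would then load the special map, and activate("gr") the Greek one.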

Here's an example. It is perhaps not the best fit, as it is a visual mapping rather than a phonetic one.

// ┌─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┬─────┲━━━━━━━━━┓
// │ ³ ¸ │ 1 ̨ │ 2 É │ 3 ˘ │ 4 — │ 5 – │ 6 ‑ │ 7 È │ 8 ™ │ 9 Ç │ 0 À │ ° ≠ │ + ± ┃ ⌫ Retour┃
// │ ² ¹ │ & ˇ │ é ~ │ " # │ ' { │ ( [ │ - | │ è ` │ _ \ │ ç ^ │ à @ │ ) ] │ = } ┃  arrière┃
// ┢━━━━━┷━┱───┴─┬───┴─┬───┴─┬───┴─┬───┴─┬───┴─┬───┴─┬───┴─┬───┴─┬───┴─┬───┴─┬───┺━┳━━━━━━━┫
// ┃       ┃ A Α │ Z Ϟ │ E Σ │ R ξ │ T ᵀ │ Y Ϡ │ U ϴ │ I Ψ │ O Ω │ P Π │ ¨ ˚ │ £ Ø ┃Entrée ┃
// ┃Tab ↹  ┃ a α │ z ϟ │ e € │ r ε │ t τ │ y ϡ │ u ϑ │ i ϝ │ o ω │ p π │ ^ ~ │ $ ø ┃   ⏎   ┃
// ┣━━━━━━━┻┱────┴┬────┴┬────┴┬────┴┬────┴┬────┴┬────┴┬────┴┬────┴┬────┴┬────┴┬────┺┓      ┃
// ┃        ┃ Q Ä │ S Σ │ D Δ │ F ϕ │ G Γ │ H Ψ │ J Ü │ K Ï │ L Λ │ M Ö │ % Ù │ µ ¯ ┃      ┃
// ┃Maj ⇬   ┃ q ä │ s σ │ d δ │ f φ │ g γ │ h ψ │ j ü │ k ï │ l λ │ m ϻ │ ù ' │ * ` ┃      ┃
// ┣━━━━━━━┳┹────┬┴────┬┴────┬┴────┬┴────┬┴────┬┴────┬┴────┬┴────┬┴────┬┴────┲┷━━━━━┻━━━━━━┫
// ┃       ┃ > ≥ │ W “ │ X χ │ C Ϛ │ V ← │ B ↑ │ N → │ ? … │ . . │ / ∕ │ § ’ ┃             ┃
// ┃Shift ⇧┃ < ≤ │ w ϖ │ x ϰ │ c ς │ v ᜯ │ b β │ n ν │ , … │ ; × │ : ÷ │ ! ‘ ┃Shift ⇧      ┃
// ┣━━━━━━━╋━━━━━┷━┳━━━┷━━━┱─┴─────┴─────┴─────┴─────┴─────┴───┲━┷━━━━━╈━━━━━┻━┳━━━━━━━┳━━━┛
// ┃       ┃       ┃       ┃ ␣         Espace fine insécable ⍽ ┃       ┃       ┃       ┃
// ┃Ctrl   ┃Meta   ┃Alt    ┃ ␣ Espace       Espace insécable ⍽ ┃AltGr ⇮┃Menu   ┃Ctrl   ┃
// ┗━━━━━━━┻━━━━━━━┻━━━━━━━┹───────────────────────────────────┺━━━━━━━┻━━━━━━━┻━━━━━━━┛

partial alphanumeric_keys
xkb_symbols "oss_special" {

   // my special keymap
   include "fr(oss)"
   name[Group1]="French (alternative, special)";

   // First row
   key <TLDE>  { [      twosuperior,    threesuperior,     onesuperior,    dead_cedilla ] }; // ² ³ ¹ ¸
   key <AE01>  { [        ampersand,                1,      dead_caron,     dead_ogonek ] }; // & 1 ˇ ̨
   key <AE02>  { [           eacute,                2,      asciitilde,          Eacute ] }; // é 2 ~ É
   key <AE03>  { [         quotedbl,                3,      numbersign,      dead_breve ] }; // " 3 # ˘
   key <AE04>  { [       apostrophe,                4,       braceleft,       0x1002014 ] }; // ' 4 { — (tiret cadratin)
   key <AE05>  { [        parenleft,                5,     bracketleft,       0x1002013 ] }; // ( 5 [ – (tiret demi-cadratin)
   key <AE06>  { [            minus,                6,             bar,       0x1002011 ] }; // - 6 | ‑ (tiret insécable)
   key <AE07>  { [           egrave,                7,           grave,          Egrave ] }; // è 7 ` È
   key <AE08>  { [       underscore,                8,       backslash,       trademark ] }; // _ 8 \ ™
   key <AE09>  { [         ccedilla,                9,     asciicircum,        Ccedilla ] }; // ç 9 ^ Ç
   key <AE10>  { [           agrave,                0,              at,          Agrave ] }; // à 0 @ À
   key <AE11>  { [       parenright,           degree,    bracketright,        notequal ] }; // ) ° ] ≠
   key <AE12>  { [            equal,             plus,      braceright,        Greek_XI ] }; // = + } Ξ

   // Second row
   key <AD01>  { [                a,                A,     Greek_alpha,     Greek_ALPHA ] }; // a A α Α
   key <AD02>  { [                z,                Z,           U03DF,           U03DE ] }; // z Z ϟ Ϟ (koppa)
   key <AD03>  { [                e,                E,        EuroSign,     Greek_SIGMA ] }; // e E € Σ
   key <AD04>  { [                r,                R,   Greek_epsilon,        Greek_xi ] }; // r R ε ξ
   key <AD05>  { [                t,                T,       Greek_tau,           THORN ] }; // t T τ Þ
   key <AD06>  { [                y,                Y,           U03E1,          U03E0  ] }; // y Y ϡ Ϡ (sampi)
   key <AD07>  { [                u,                U,           U03D1,           U03F4 ] }; // u U ϑ ϴ
   key <AD08>  { [                i,                I,           U03DD,       Greek_PSI ] }; // i I ϝ Ψ
   key <AD09>  { [                o,                O,     Greek_omega,     Greek_OMEGA ] }; // o O ω Ω
   key <AD10>  { [                p,                P,        Greek_pi,        Greek_PI ] }; // p P π Π
   key <AD11>  { [  dead_circumflex,   dead_diaeresis,      dead_tilde,  dead_abovering ] }; // ^ ̈ ̃ ˚
   key <AD12>  { [           dollar,         sterling,          oslash,        Ooblique ] }; // $ £ ø Ø

   // Third row
   key <AC01>  { [                q,                Q,      adiaeresis,      Adiaeresis ] }; // q Q ä Ä
   key <AC02>  { [                s,                S,     Greek_sigma,     Greek_SIGMA ] }; // s S σ Σ
   key <AC03>  { [                d,                D,     Greek_delta,     Greek_DELTA ] }; // d D δ Δ
   key <AC04>  { [                f,                F,       Greek_phi,           U03D5 ] }; // f F φ ϕ
   key <AC05>  { [                g,                G,     Greek_gamma,     Greek_GAMMA ] }; // g G γ Γ
   key <AC06>  { [                h,                H,       Greek_psi,       Greek_PSI ] }; // h H ψ Ψ
   key <AC07>  { [                j,                J,      udiaeresis,      Udiaeresis ] }; // j J ü Ü
   key <AC08>  { [                k,                K,      idiaeresis,      Idiaeresis ] }; // k K ï Ï
   key <AC09>  { [                l,                L,     Greek_lamda,    Greek_LAMBDA ] }; // l L λ Λ
   key <AC10>  { [                m,                M,           U03FB,      Odiaeresis ] }; // m M ϻ Ö
   key <AC11>  { [           ugrave,          percent,      dead_acute,          Ugrave ] }; // ù % ' Ù
   key <BKSL>  { [         asterisk,               mu,      dead_grave,     dead_macron ] }; // * µ ` ̄

   // Fourth row
   key <LSGT>  { [             less,          greater,        lessthanequal,      greaterthanequal ] }; // < > ≤ ≥
   key <AB01>  { [                w,                W,                U03D6,   leftdoublequotemark ] }; // w W ϖ “
   key <AB02>  { [                x,                X,                U03F0,             Greek_chi ] }; // x X ϰ χ
   key <AB03>  { [                c,                C, Greek_finalsmallsigma,                U03DA ] }; // c C ς Ϛ
   key <AB04>  { [                v,                V,            0x100202F,             leftarrow ] }; // v V ⍽ ← (espace fine insécable)
   key <AB05>  { [                b,                B,           Greek_beta,               uparrow ] }; // b B β ↑
   key <AB06>  { [                n,                N,              notsign,            rightarrow ] }; // n N ¬ →
   key <AB07>  { [            comma,         question,         questiondown,             0x1002026 ] }; // , ? ¿ …
   key <AB08>  { [        semicolon,           period,             multiply,             0x10022C5 ] }; // ; . × ⋅
   key <AB09>  { [            colon,            slash,             division,             0x1002215 ] }; // : / ÷ ∕
   key <AB10>  { [           exclam,          section,                U2018,                 U2019 ] }; // ! § ‘ ’
};
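The symbols file mixes named keysyms (Greek_alpha), "U" plus hexadecimal code point (U03DF) and raw numeric values (0x1002014). The last two both encode a Unicode code point directly; the numeric form is the code point plus 0x1000000. The sketch below decodes them so a comment can be checked against its keysym. The function name keysym_to_char is hypothetical, and named keysyms are deliberately left out because they need a lookup table.

```python
import string
import unicodedata


def keysym_to_char(keysym):
    """Decode a Unicode-style keysym ('0x1002014' or 'U03DF') into a character."""
    if keysym.startswith("0x"):
        value = int(keysym, 16)
        if value < 0x1000000:
            raise ValueError("not a Unicode keysym: " + keysym)
        # Unicode keysyms are the code point offset by 0x1000000
        return chr(value - 0x1000000)
    digits = keysym[1:]
    if keysym.startswith("U") and len(digits) >= 4 and all(
            d in string.hexdigits for d in digits):
        return chr(int(digits, 16))
    raise ValueError("named keysym, needs a lookup table: " + keysym)


# A few of the keysyms used in the map above
for ks in ("0x1002014", "0x1002026", "U03DF", "U03D1", "U03F4"):
    c = keysym_to_char(ks)
    print(ks, c, unicodedata.name(c))
```

Names such as Udiaeresis or Ugrave also start with a "U", which is why the sketch checks that everything after it is a hexadecimal digit.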

Prime and transpose

Today they are in and … or maybe Bₜ

These questions and answers on Stack Overflow gave a few insights into how to generate the full set of valid identifiers.

import keyword
import tokenize

def isidentifier(ident):
  """Determines, if string is valid Python identifier."""

  # Smoke test if it's not string, then it's not identifier, but we don't
  # want to just silence exception. It's better to fail fast.
  if not isinstance(ident, str):
     raise TypeError('expected str, but got {!r}'.format(type(ident)))

  # Quick test if string is in keyword list, it's definitely not an ident.
  if keyword.iskeyword(ident): return False

  readline = (lambda: (yield ident.encode('utf-8-sig')))().__next__
  tokens = list(tokenize.tokenize(readline))

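  # Expect ENCODING and NAME, followed only by NEWLINE/ENDMARKER tokens.
  # The exact count varies: recent Python versions add an implicit NEWLINE.
  if len(tokens) < 3: return False
  trailing = (tokenize.NEWLINE, tokenize.ENDMARKER)
  if any(t.type not in trailing for t in tokens[2:]): return False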

  # First one is ENCODING, it's always utf-8 because we explicitly passed in
  # UTF-8 BOM with ident.
  if tokens[0].type != tokenize.ENCODING: return False

  # Second is NAME, identifier.
  if tokens[1].type != tokenize.NAME: return False

  # Name should span all the string, so there would be no whitespace.
  if ident != tokens[1].string: return False

  return True


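if __name__ == '__main__':
  import sys
  # Walk every code point, sys.maxunicode included.
  for i in range(sys.maxunicode + 1):
    try:
      c = chr(i)  # Python 3 built-in, so six.unichr is not needed
      if isidentifier(c):
        print(hex(i), c)
    except Exception as e:
      # e.g. surrogates cannot be encoded to UTF-8
      print(e)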

The full list comprises Japanese, Farsi, Chinese and Hindi characters, bold and italic Greek, various Latin display styles such as gothic, roman and serif and, well, a lot of characters you wouldn't want to see in a line of code, plus others you could hardly give a name to.

Since a lot of these characters won't display on most screens, a good start is to keep only a selection of potentially interesting characters in order to build keymaps. They might also prove handy when writing articles and math formulas in LaTeX.
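One way to narrow things down is to scan a single Unicode block and keep the characters that can stand alone as an identifier, together with their names. The sketch below relies on the built-in str.isidentifier() rather than the tokenizer; the function name identifier_chars and the choice of the Greek and Coptic block are assumptions made for the example.

```python
import keyword
import unicodedata


def identifier_chars(start, stop):
    """Map each one-character valid identifier in [start, stop) to its Unicode name."""
    found = {}
    for code in range(start, stop):
        c = chr(code)
        if c.isidentifier() and not keyword.iskeyword(c):
            found[c] = unicodedata.name(c, "<unnamed>")
    return found


# Greek and Coptic block, U+0370 to U+03FF
greek = identifier_chars(0x0370, 0x0400)
for c, name in greek.items():
    print(c, name)
```

Note that a single character only passes this test if it can start an identifier, so combining marks and digits, which are valid later in a name, are left out.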

Finding the proper LaTeX declaration to render a character sometimes takes a while, whereas the character can be written directly in the source.

Moreover, since identifiers can stand in for operators, having A.ㄨ(B) is more than just eye candy. Consider that even a temporary variable named AㄨB is far more expressive than, say, tmp or tmp_a_mul_b.
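To make the A.ㄨ(B) idea concrete, here is a minimal sketch of a matrix class whose product is a method named ㄨ (U+3128, a letter, hence a valid identifier). The class Mat is hypothetical and written in plain Python for the example; it is not the linear algebra module mentioned earlier.

```python
class Mat:
    """Minimal dense matrix stored as a list of rows."""

    def __init__(self, rows):
        self.rows = [list(row) for row in rows]

    def __eq__(self, other):
        return isinstance(other, Mat) and self.rows == other.rows

    def __repr__(self):
        return "Mat({!r})".format(self.rows)

    def ㄨ(self, other):
        """Matrix product, spelled with the letter ㄨ (U+3128)."""
        columns = list(zip(*other.rows))
        return Mat([[sum(a * b for a, b in zip(row, col)) for col in columns]
                    for row in self.rows])


A = Mat([[1, 2], [3, 4]])
B = Mat([[0, 1], [1, 0]])
AㄨB = A.ㄨ(B)  # the temporary carries its own meaning
print(AㄨB)
```

Multiplying by B on the right swaps the two columns of A, which is easy to check by eye in the printed result.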

Drawback

Omicron (ο) can be mistaken for an "o". Finding out that two identifiers don't match is easy, but comparing two strings that look alike and discovering that they differ introduces a high dose of uncertainty in a coder's mind.
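When in doubt, the standard unicodedata module settles the question. The sketch below names each character of a suspicious string; the helper describe is hypothetical. It also shows a related subtlety: Python normalizes identifiers to NFKC, which merges some compatibility variants (ϕ into φ, Aᵀ into AT) but keeps omicron and the Latin o apart.

```python
import unicodedata


def describe(text):
    """Return (character, code point, Unicode name) for each character."""
    return [(c, "U+{:04X}".format(ord(c)), unicodedata.name(c, "<unnamed>"))
            for c in text]


latin_o, omicron = "o", "\u03bf"
for entry in describe(latin_o + omicron):
    print(entry)

# NFKC does not turn an omicron into a Latin o: they stay two different names.
same_after_nfkc = unicodedata.normalize("NFKC", omicron) == latin_o

# It does merge compatibility variants, so these pairs are one identifier each.
phi_merged = unicodedata.normalize("NFKC", "\u03d5") == "\u03c6"
transpose_merged = unicodedata.normalize("NFKC", "A\u1d40") == "AT"
print(same_after_nfkc, phi_merged, transpose_merged)
```

In practice this means a variable typed as Aᵀ can silently collide with one named AT, while ο and o never will.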

Conclusion

Expressiveness has always been key in programming, and when switching character sets, compilers built properly around Unicode code points (runes) will digest this just fine.

Looking beyond the standard character map and extending the keymap is a natural way to gain expressiveness and to keep code closer to its mathematical expression. People accustomed to Mathematica and symbolic computation will be on familiar ground.

Consider for instance ϰⁿ, Ω(x) or Aᵀ.

Now that I'm using helm-unicode, however, the keymap is only a necessity for frequently used characters.

𝓕a𝒊𝖘 ʙɨ∈𝛮 Ɠɐʄʃе Ἁ ᴜᥒі℅ⅆ℮ ⊤╰╯ Ꮲe⨆⨉ aᏙοίɼ ⓓ⒠∫ Ρℜ◯♭∟ѐᵐℯ𝘴 ❰<𝖴𝚗 Аⅿⅈ ℚuℹ т⁅ ⅴ∈⋃Ꭲ ∂ሀ Ƅⅰⓔ∩┅❯❯