I have RSI problems and have tried 30 different computer keyboards which all caused me pain. Playing piano does not cause me pain. I have played piano for around 20 years without any pain issues. I would like to know if there is a way to capture MIDI from a MIDI keyboard and output keyboard strokes. I know nothing at all about MIDI but I would like some guidance on how to convert this signal into a keystroke.
I haven't done any MIDI programming in years, but your fundamental idea is very sound (no pun).
MIDI is a stream of "events" (or "messages"), two of the most fundamental being "note on" and "note off" which carry with them the note number (0 = C five octaves below middle C, through 127 = G five octaves above the G above middle C, in semi-tones). These events carry a "velocity" number on keyboards that are velocity sensitive ("touch sensitive"), with a force of (you guessed it) between 0 and 127.
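To make the "note on" / "note off" idea a bit more concrete, here's a minimal Java sketch using the JDK's `javax.sound.midi` package. The class and method names (`NoteEventSketch`, `describe`) are made up for illustration. It also treats a "note on" with velocity 0 as a "note off", which is a common MIDI convention rather than something I covered above.

```java
import javax.sound.midi.MidiMessage;
import javax.sound.midi.ShortMessage;

// Hypothetical helper: turns note on/off messages into readable text.
final class NoteEventSketch {
    private static final String[] NAMES =
        {"C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"};

    /** Describes a note event, or returns null for any other command. */
    static String describe(int command, int note, int velocity) {
        String name = NAMES[note % 12] + " (note " + note + ")";
        // Assumption: "note on" with velocity 0 is handled as "note off".
        if (command == ShortMessage.NOTE_ON && velocity > 0) {
            return "on " + name + " velocity " + velocity;
        }
        if (command == ShortMessage.NOTE_ON || command == ShortMessage.NOTE_OFF) {
            return "off " + name;
        }
        return null; // pedals, pitch bend, and so on
    }

    /** Same as above, for a message object as delivered by the Java MIDI API. */
    static String describe(MidiMessage message) {
        if (!(message instanceof ShortMessage)) {
            return null; // sysex and meta messages carry no note data
        }
        ShortMessage sm = (ShortMessage) message;
        return describe(sm.getCommand(), sm.getData1(), sm.getData2());
    }

    public static void main(String[] args) {
        // Hand-built events, so this runs without a keyboard attached.
        System.out.println(describe(ShortMessage.NOTE_ON, 60, 100)); // on C (note 60) velocity 100
        System.out.println(describe(ShortMessage.NOTE_OFF, 60, 0));  // off C (note 60)
    }
}
```

With a real keyboard attached, you'd get these messages by giving a `Receiver` to `MidiSystem.getTransmitter().setReceiver(...)` and calling `describe` from its `send` method. I've left that out because it needs hardware to run.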
Between velocity, chording, and the pedals, I'd think you could come up with quite a good "typing" interface for the piano keyboard. Chording in particular could be a very powerful technique — as I mentioned in the comments, it's why rank-and-file stenographers can use a stenotype machine to keep up with people talking for hours in a row, when even top-flight typists wouldn't be able to for any length of time via normal typewriter-style keyboards. As with machine stenography, you'd need a "dictionary" of the meanings of chords and sequences of chords. (Can you tell I used to work in the software side of machine stenography?)
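As a rough idea of how chording could be detected, here's a small Java sketch. It assumes a chord is "finished" once every key that went down has come back up, the way a steno stroke works. That's one possible rule, not the only one. `ChordCollector` and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical helper: groups key presses into chords.
final class ChordCollector {
    private final SortedSet<Integer> held = new TreeSet<>();  // keys currently down
    private final SortedSet<Integer> chord = new TreeSet<>(); // every key seen in this chord

    /** Records a key press. */
    void noteOn(int note) {
        held.add(note);
        chord.add(note);
    }

    /**
     * Records a key release. Returns the finished chord (lowest note first)
     * once no key is held any more, otherwise null.
     */
    List<Integer> noteOff(int note) {
        held.remove(note);
        if (!held.isEmpty() || chord.isEmpty()) {
            return null;
        }
        List<Integer> finished = new ArrayList<>(chord);
        chord.clear();
        return finished;
    }

    public static void main(String[] args) {
        ChordCollector collector = new ChordCollector();
        collector.noteOn(60); // C
        collector.noteOn(64); // E
        collector.noteOn(67); // G
        collector.noteOff(64);
        collector.noteOff(60);
        System.out.println(collector.noteOff(67)); // [60, 64, 67]
    }
}
```

Sorting the notes means the order in which your fingers land or lift doesn't matter, only which keys were part of the chord.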
To do this, the fundamental pieces are:
To be most broadly-compatible with software, you'd have to write this as a keyboard device driver. This is a plug-in to the operating system that serves as a source for keyboard events, talking to the underlying hardware (in your case, the piano keyboard). For Windows and Linux, you're probably going to want to use C for that.
However, since you're just generating keystrokes (not trying to intercept them, which I was trying to do years ago), you may be able to use whatever features the operating system has for sending artificial keystrokes. Windows has an interface for doing that (probably several; the one I'm thinking of is SendInput, but I know there's some "journal" interface that does something similar), and I'm sure other operating systems do as well. That may well be sufficient for your purposes. It's where I'd start, because the device driver route is going to be awkward and you'd probably have to use a different language for it than Java. (I'm a big fan of Java, but the interfaces that operating systems use to talk to device drivers tend to be more easily consumed via C and similar.)
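If you'd rather stay in Java for the "artificial keystrokes" route, the JDK's `java.awt.Robot` class can generate native key events on whichever window has focus. Here's a minimal sketch of that idea. `KeystrokeSketch`, `keyCodeFor`, and `type` are hypothetical names, and the mapping only covers unaccented letters, digits, and space. Punctuation and other layouts would need more work.

```java
import java.awt.Robot;
import java.awt.event.KeyEvent;

// Hypothetical helper: types plain text through java.awt.Robot.
final class KeystrokeSketch {
    /** Maps a-z, A-Z, 0-9, and space to AWT key codes; anything else is VK_UNDEFINED. */
    static int keyCodeFor(char c) {
        if (c < 128 && Character.isLetterOrDigit(c)) {
            // VK_A..VK_Z and VK_0..VK_9 have the same values as ASCII 'A'..'Z' and '0'..'9'.
            return Character.toUpperCase(c);
        }
        return c == ' ' ? KeyEvent.VK_SPACE : KeyEvent.VK_UNDEFINED;
    }

    /** Types the text into the focused window, skipping characters it can't map. */
    static void type(Robot robot, String text) {
        for (char c : text.toCharArray()) {
            int code = keyCodeFor(c);
            if (code == KeyEvent.VK_UNDEFINED) {
                continue;
            }
            boolean shift = Character.isUpperCase(c);
            if (shift) {
                robot.keyPress(KeyEvent.VK_SHIFT);
            }
            robot.keyPress(code);
            robot.keyRelease(code);
            if (shift) {
                robot.keyRelease(KeyEvent.VK_SHIFT);
            }
        }
    }
}
```

To try it, create a `Robot` with `new Robot()` (it throws `AWTException` on a headless system), give yourself a couple of seconds with `robot.delay(...)` to click into a text editor, and then call `KeystrokeSketch.type(robot, "air conditioning")`.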
Update: More about the "dictionary" of chords to keystrokes:
Basically, the dictionary is a trie (thanks, @Adam) that we search with longest-prefix matching. Details:
In machine stenography, the stenographer writes by pressing multiple keys on the stenotype machine at the same time, then releasing them all. They call this a "stroke" of the keyboard; it's like playing a chord on the piano. Strokes frequently (but not always) correspond to a syllable of spoken language. Like syllables, sometimes one stroke (chord) has meaning all on its own, other times it only has meaning combined with following strokes. (Think "good" vs. "good" followed by "bye"). Although they'll be heavily influenced by the school at which they studied, each stenographer will have their own "dictionary" of what strokes they use to mean what, a dictionary they will continuously hone over the course of their working lives. The dictionary will have entries where the stenographic part ("steno", for short) is one stroke long, or multiple strokes long. Frequently, there will be several entries with the same starting stroke which are differentiated by their length and by the subsequent strokes. For instance (and I won't use real steno here, just placeholders), there may be these entries:
A = alpha
A/B = alphabet
A/B/C = alphabetic
A/C = air conditioning
B = bee
B/C = because
C = sea
D = dog
D/D = Dee Dee
(Those letters aren't meant to be musical notes, just abstract markers.)
Note that A starts multiple entries, and also note that how you translate a C stroke depends on whether you've previously seen an A, a B, or you're starting fresh.
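Here's one way the dictionary could be held as a trie in Java, with one level per stroke. It's only a sketch: `StrokeTrie` and its methods are hypothetical names, and writing the steno as a "/"-separated string is just the placeholder notation from the entries above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical trie node: one child per possible next stroke.
final class StrokeTrie {
    private final Map<String, StrokeTrie> children = new HashMap<>();
    private String text; // translation if an entry ends here, else null

    /** Adds an entry such as put("A/B/C", "alphabetic"). */
    void put(String steno, String translation) {
        StrokeTrie node = this;
        for (String stroke : steno.split("/")) {
            node = node.children.computeIfAbsent(stroke, k -> new StrokeTrie());
        }
        node.text = translation;
    }

    /** Returns the node reached by one more stroke, or null if nothing continues that way. */
    StrokeTrie next(String stroke) {
        return children.get(stroke);
    }

    /** Returns the translation ending at this node, or null. */
    String text() {
        return text;
    }

    /** Counts the entries that start with the strokes leading to this node. */
    int entryCount() {
        int count = (text == null) ? 0 : 1;
        for (StrokeTrie child : children.values()) {
            count += child.entryCount();
        }
        return count;
    }

    /** Builds the placeholder dictionary from the entries above. */
    static StrokeTrie sample() {
        StrokeTrie root = new StrokeTrie();
        root.put("A", "alpha");
        root.put("A/B", "alphabet");
        root.put("A/B/C", "alphabetic");
        root.put("A/C", "air conditioning");
        root.put("B", "bee");
        root.put("B/C", "because");
        root.put("C", "sea");
        root.put("D", "dog");
        root.put("D/D", "Dee Dee");
        return root;
    }
}
```

Walking down with `next` answers both questions the translator needs at each stroke: does an entry end here (`text()` is not null), and could a longer entry still follow (`next(...)` is not null)?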
Also note that (although not shown in the very small sample above), there may be multiple ways to "play" the same word or phrase, rather than just one. Stenographers do that to make it easier to flow from a preceding word to the next depending on hand position. There's an obvious analogy to music there, and you could use that to make your typing flow more akin to playing music, in order to both prevent this from negatively affecting your piano playing and to maximize the likelihood of this actually helping with the RSI.
When translating steno into standard text, again we use a "longest-prefix match" search: The translation algorithm starts with the first stroke ever written, and looks for entries starting with that stroke. If there is only one entry, and it's one stroke long, then we can reliably say "that's the entry to use", output the corresponding text, and then start fresh with the next stroke. But more likely, that stroke starts multiple entries of varying lengths. So we look at the next stroke and see if there are entries that start with those two strokes in order; and so on until we get a match.
So with the dictionary above, suppose we saw this sequence:
A C B B C A B C A B D
Here's how we'd translate it:
- A is the start of four entries of varying lengths; look at the next stroke: C
- A/C matches only one entry; output "air conditioning" and start fresh with the next stroke: B
- B starts two entries; look at the next stroke: B
- B/B doesn't start anything; take the longest previous match (B) and output that ("bee")
- After B = "bee", we still have a B stroke in our buffer. It starts two entries, so look at the next stroke: C
- B/C matches one entry; output "because" and start fresh with the next stroke: A
- A starts four entries; look at the next stroke: B
- A/B starts two entries; look at the next stroke: C
- A/B/C only matches one entry; output "alphabetic" and start fresh with the next stroke: A
- A starts four entries; look at the next stroke: B
- A/B starts two entries; look at the next stroke: D
- A/B/D doesn't match anything, so take the longest previous match (A/B) and use it to output "alphabet". That leaves us with D still in the buffer.
- D starts two entries, so we would normally look at the next stroke, but we've processed all the strokes, so consider it in isolation. In isolation, it translates as "dog", so output that.

Aspects of the above to note:

- A/B should be translated as "alphabet", not "alpha" and "bee".
- E would be an "untranslate".)
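The longest-match translation walked through above can be sketched in Java roughly like this. It works on a finished list of strokes rather than a live stream, which is a simplification. `StenoTranslator` and its methods are hypothetical names, and echoing an unmatched stroke in square brackets is purely my placeholder for this sketch, not how any real steno software shows it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical translator: greedy longest-prefix match over a stroke trie.
final class StenoTranslator {
    private static final class Node {
        final Map<String, Node> children = new HashMap<>();
        String text; // non-null when a dictionary entry ends at this node
    }

    private final Node root = new Node();

    /** Adds an entry such as put("A/B", "alphabet"). */
    void put(String steno, String translation) {
        Node node = root;
        for (String stroke : steno.split("/")) {
            node = node.children.computeIfAbsent(stroke, k -> new Node());
        }
        node.text = translation;
    }

    /** Translates strokes, always preferring the longest matching entry. */
    List<String> translate(List<String> strokes) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < strokes.size()) {
            Node node = root;
            String bestText = null;
            int bestLength = 0;
            // Walk as deep as the trie allows, remembering the last complete entry.
            for (int j = i; j < strokes.size(); j++) {
                node = node.children.get(strokes.get(j));
                if (node == null) {
                    break;
                }
                if (node.text != null) {
                    bestText = node.text;
                    bestLength = j - i + 1;
                }
            }
            if (bestText == null) {
                out.add("[" + strokes.get(i) + "]"); // placeholder for an unmatched stroke
                i++;
            } else {
                out.add(bestText);
                i += bestLength; // leftover strokes stay in the "buffer"
            }
        }
        return out;
    }

    /** Builds the placeholder dictionary used in the walkthrough. */
    static StenoTranslator sample() {
        StenoTranslator t = new StenoTranslator();
        t.put("A", "alpha");
        t.put("A/B", "alphabet");
        t.put("A/B/C", "alphabetic");
        t.put("A/C", "air conditioning");
        t.put("B", "bee");
        t.put("B/C", "because");
        t.put("C", "sea");
        t.put("D", "dog");
        t.put("D/D", "Dee Dee");
        return t;
    }

    public static void main(String[] args) {
        List<String> strokes = Arrays.asList("A C B B C A B C A B D".split(" "));
        // Prints [air conditioning, bee, because, alphabetic, alphabet, dog]
        System.out.println(sample().translate(strokes));
    }
}
```

A live version would have to hold back output while a longer entry is still possible, for example waiting after an A stroke to see whether a B or C follows.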