smuecke
Concept for a constructed language based on bits, without inherent phonetics.
Large image of the Mealy machine:
https://drive.google.com/file/d/0B8WTKk_kvsEoTTBXNW9pU0VXY3M/view?usp=sharing
This is actually not the minimal Mealy machine for the given language: if two states have the exact same transitions and outputs, you can merge them, which I haven’t done for this machine.
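The merging step mentioned here can be sketched in a few lines. This is only a sketch: the machine is assumed to be stored as a dictionary from state to `{input: (next_state, output)}`, and the three-state `toy` machine is invented for illustration, not taken from the linked image.

```python
def merge_identical_states(machine):
    """Merge states whose rows (next state and output per input) are identical.

    `machine` maps state -> {input_symbol: (next_state, output)}.
    The loop repeats because one merge can make further rows identical.
    """
    machine = {state: dict(row) for state, row in machine.items()}
    while True:
        seen = {}    # row signature -> state kept as representative
        rename = {}  # duplicate state -> representative
        for state in sorted(machine):
            signature = tuple(sorted(machine[state].items()))
            if signature in seen:
                rename[state] = seen[signature]
            else:
                seen[signature] = state
        if not rename:
            return machine
        # Drop the duplicates and redirect every transition that pointed at them.
        machine = {
            state: {sym: (rename.get(nxt, nxt), out) for sym, (nxt, out) in row.items()}
            for state, row in machine.items()
            if state not in rename
        }


# Hypothetical toy machine: states B and C have identical rows.
toy = {
    "A": {"0": ("B", "t"), "1": ("C", "n")},
    "B": {"0": ("A", "a"), "1": ("A", "o")},
    "C": {"0": ("A", "a"), "1": ("A", "o")},
}
merged = merge_identical_states(toy)
print(sorted(merged))  # ['A', 'B']
```

Note that this only removes states with literally identical rows. A fully minimal machine may need a proper minimization algorithm, since two states can be equivalent without their rows matching exactly.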
Thanks for the idea, I found it interesting. I have a couple of questions, if you don't mind:
If you are using two digits for the number of follow-up blocks, then the maximum number of follow-up blocks would be two or three. Is this by design? Also, how do you distinguish between sentences?
Also, is there a 1 to 1 correspondence between the binary format and the product of the Mealy machine? I ask because if there is not then you wouldn't be able to follow the meaning when you transform from binary to letters, and even if you knew the original meaning of the binary you wouldn't be able to read it from the letters.
As for a vocabulary, you could use the ASCII representation of selected nouns and verbs in English. They don't have to mean exactly the same thing, but it could be a start. Once you have a vocabulary you can make up the grammar rules you want to make it logical.
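As a rough illustration of the ASCII suggestion, here is how an English word turns into bits. The helper name `word_to_bits` is made up for this sketch, and it ignores whatever block structure the language itself requires.

```python
def word_to_bits(word):
    """Return the 8-bit ASCII code of each letter, separated by spaces."""
    return " ".join(format(byte, "08b") for byte in word.encode("ascii"))


print(word_to_bits("cat"))  # 01100011 01100001 01110100
```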
I don't know if you could make it into a full-blown language, but your solutions are ingenious. The part with the poetic meter was so unexpected that it blew me away, and it sounds really cool. I wouldn't mind just using your method to make any old binary file pronounceable.
This is amazing!
Are you familiar with Lojban?
Binary, by the way, is the least compact of the usual ways to write data out. It takes the most digits for the same information, e.g. 1111 1010 0101 1110 as compared with FA5E.
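The comparison can be checked directly. Both spellings denote the same 16-bit value, so the information is identical; only the number of written digits differs.

```python
bits = "1111 1010 0101 1110".replace(" ", "")
value = int(bits, 2)             # parse the digits as a base-2 number
hex_digits = format(value, "X")  # the same value written in base 16

print(hex_digits, len(bits), len(hex_digits))  # FA5E 16 4
```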
Nêhruf spalfe! NQŞ!!! Şéhcam, viru çe esg atâ nuli góqqâ, em eru qrè ôzuf yze visi fó hetvy? Nù çiveru…
Good idea!
I had a slightly similar concept for an alien conlang; inspired by Reverse Polish Notation. The language(s) would revolve around phonemes belonging to one of four categories: semantical (combined and defined by their neighbors to create meaning), grammatical (to express subject, object, action, etc.), mathematical (to count things and perform calculations) and logical (used to parse sentences). Grammatical, mathematical and logical phonemes would take arguments in a precise order, whereas semantical ones would just agglutinate to create infinitely complex units of meaning.
For the pronunciation, couldn't you simply pronounce every 0 as a/t, and every 1 as o/n, starting with consonant mode and switching modes for every bit? Like 01001110 would be totanona or 11101001110 would be nonanatonot. This way you never get final consonant clusters (no clusters at all, in fact) or two vowels following each other, and it's more intuitive than the Mealy machine algorithm.
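The alternating scheme is small enough to write out. This sketch follows the comment's rule (0 is t/a, 1 is n/o, start on a consonant, switch on every bit); the function names are just for illustration.

```python
CONSONANTS = {"0": "t", "1": "n"}
VOWELS = {"0": "a", "1": "o"}


def pronounce(bits):
    """Spell a bit string, alternating consonant and vowel, consonant first."""
    letters = []
    for position, bit in enumerate(bits):
        table = CONSONANTS if position % 2 == 0 else VOWELS
        letters.append(table[bit])
    return "".join(letters)


def unpronounce(word):
    """Recover the bits; each letter stands for exactly one bit."""
    back = {"t": "0", "a": "0", "n": "1", "o": "1"}
    return "".join(back[letter] for letter in word)


print(pronounce("01001110"))     # totanona
print(pronounce("11101001110"))  # nonanatonot
```

Because every letter maps back to a single bit, the spelling is reversible: `unpronounce(pronounce(b)) == b` for any bit string `b`.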
Interesting, it's sort of a direct implementation of a grammatical parse tree. Very nice, though can't the Mealy machine just implement every possible parsing of binary bits into language? (especially if you restrict yourself to finitely many rules)
Overall very interesting. I'm curious how you will come up with vocabulary.
so basically utf8 but a language
regarding phonetics, you could make it much simpler by taking advantage of the fact that every word is some multiple of eight digits long
you could assign every possible eight digit string to a syllable
then you could make simple phonotactics (and phonology too).
I think technically this must be a logography like written Chinese, as all the dialects of Chinese have wildly different pronunciations under one unified alphabet-thing
It actually might work out nicely to have 16 different consonants (4 bits) and 16 vowels, with the CV structure to read off bytes quickly. For a more English friendly variant, experiment with 32 consonants and 8 vowels which also makes 8 bits total
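A sketch of the byte-to-CV split, with the consonant/vowel balance as a parameter. The comment does not list actual sounds, so this returns indices into whatever consonant and vowel inventories one would pick; putting the consonant in the high bits is an assumption.

```python
def split_byte(byte, consonant_bits=4):
    """Split a byte into (consonant_index, vowel_index).

    consonant_bits=4 gives 16 consonants x 16 vowels,
    consonant_bits=5 gives 32 consonants x 8 vowels.
    """
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value from 0 to 255")
    vowel_bits = 8 - consonant_bits
    consonant_index = byte >> vowel_bits         # high bits pick the consonant
    vowel_index = byte & ((1 << vowel_bits) - 1)  # low bits pick the vowel
    return consonant_index, vowel_index


print(split_byte(0b01001110))     # (4, 14)
print(split_byte(0b01001110, 5))  # (9, 6)
```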
actually, i made an even better system. consider: a word of length 8N bits will always include N many ones and a spacer zero. In terms of actual lexicographical meaning, this is 7N-1 bits of information. given this, why not pronounce it in chunks of 7? consonants in order of binary value: [P B F V, T D th TH, K G H R, S Z sh zh] which is 4 bits. vowels in order of value: [a ɛ, e i, o ə, ɪ u] IPA standard. thusly, after working out the initial string of ones, the zero and first six bits make a syllable, followed by chunks of 7. the initial consonant can be much denser encoding the length of ones instead of just sets of 4 ones.
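A rough sketch of the 7-bit chunking, assuming the word layout described in this comment (N ones, a spacer zero, then 7N-1 free bits) and assuming the consonant takes the first four bits of each chunk and the vowel the last three. How the leading ones are pronounced is left out.

```python
CONSONANTS = ["P", "B", "F", "V", "T", "D", "th", "TH",
              "K", "G", "H", "R", "S", "Z", "sh", "zh"]
VOWELS = ["a", "ɛ", "e", "i", "o", "ə", "ɪ", "u"]


def syllables(word_bits):
    """Turn an 8N-bit word into N consonant+vowel syllables of 7 bits each."""
    if not word_bits or len(word_bits) % 8 != 0:
        raise ValueError("word length must be a positive multiple of 8 bits")
    n = len(word_bits) // 8
    if word_bits[:n] != "1" * n or word_bits[n] != "0":
        raise ValueError("expected N ones followed by a spacer zero")
    body = word_bits[n:]  # 7N bits; the first chunk starts with the spacer zero
    result = []
    for start in range(0, len(body), 7):
        chunk = body[start:start + 7]
        consonant = CONSONANTS[int(chunk[:4], 2)]
        vowel = VOWELS[int(chunk[4:], 2)]
        result.append(consonant + vowel)
    return result


print(syllables("10010110"))          # ['Fɪ']
print(syllables("1101001011100011"))  # ['Tə', 'Si']
```

Since the first chunk always begins with the spacer zero, its consonant can only come from the first eight entries of the list.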
after here is where things stray from your vision of the system in ways you may reject, but here it is:
instead, the first bit of the six in question are encoded with the vowel x=[a u] that follows the base consonant.
the other five bits are encoded as in the 7-bit method above in the form [0abcde0] which maximizes clarity.
the values for N from 1 to 8 are: [L N, W M, KL GL, KW GW]; this repeats cyclically. every full chunk of 8 ones skipped by that cycle is counted up and converted to binary (if 2 blocks of 8 ones are skipped, the quantity returned is 2) and encoded by an affricate [CH, J] followed by a vowel (as in the main encoding), making base 16. thus a word starting with jugwa is 15*8 + 8 = 128 bytes long: JU = 1111b = 15 octets of ones, each one signifying a byte, plus GW = 8 more bytes. that gives 7*128 - 1 = 895 bits of freedom, which can name all the moons, planets, stars, and asteroids in 10^200 universes. this language is very speakable in human terms, and scientists could do something like name all the stars in the galaxy with words starting with GL-
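The jugwa arithmetic can be sketched as follows. The bit order inside the affricate+vowel digit (affricate as the high bit, vowel as the low three bits) is an assumption, chosen because it reproduces JU = 1111b = 15; the function names are hypothetical.

```python
ONSETS = ["L", "N", "W", "M", "KL", "GL", "KW", "GW"]  # N = 1..8
AFFRICATES = ["CH", "J"]                                # high bit: 0, 1
VOWELS = ["a", "ɛ", "e", "i", "o", "ə", "ɪ", "u"]       # low three bits: 0..7


def word_length_bytes(onset, affricate=None, vowel=None):
    """Decode the length prefix into a word length in bytes."""
    length = ONSETS.index(onset) + 1
    if affricate is not None:
        # One base-16 digit: how many full blocks of 8 ones were skipped.
        blocks = AFFRICATES.index(affricate) * 8 + VOWELS.index(vowel)
        length += 8 * blocks
    return length


def free_bits(length_bytes):
    """Bits left for meaning once the N ones and the spacer zero are removed."""
    return 7 * length_bytes - 1


n = word_length_bytes("GW", "J", "u")
print(n, free_bits(n))  # 128 895
```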
Great job, love it! I am working on a structure-first language too. I got a lot of great new concepts and had to do a major overhaul after I actually started making vocab and trying to translate something I might say in English. It is worth trying to translate:
I am bob. I am going to the store today. at the store I will buy 5 gallons of milk, for the committee meeting tomorrow, do you want to come with me too? it is a very nice store, last week they had a rock band playing.