Constructed Languages Asked on August 20, 2021
I’m from South Florida and noticed that my Haitian neighbors speak a language called Creole. That got me fantasizing about a possibility that may be within our reach in no more than a decade.
Using AI to create languages by mixing other languages.
Does anyone know of any specific technologies that already produce practical languages this way?
If I ever get around to constructing a conlang, I'll probably generate its vocabulary algorithmically from Wiktionary entries. Wiktionary includes pronunciations in IPA format (and audio files), spellings, and translations between hundreds of languages.
(I'd build up an inventory of easily distinguished phonemes, match words of like meaning, and then use machine learning to synthesize a vocabulary out of like-sounding synonyms.)
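To make the idea concrete, here is a minimal sketch of that pipeline. Everything in it is an assumption for illustration: the `MERGE` table is a made-up reduced phoneme inventory, the word forms are toy stand-ins rather than real Wiktionary data, and a simple "pick the most central form" rule (the medoid under edit distance) stands in for the machine learning step.

```python
from typing import Dict

# Hypothetical reduced inventory: fold similar phonemes onto one
# easily distinguished representative (assumption, not a real proposal).
MERGE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s",
         "r": "l", "o": "u", "e": "i"}


def simplify(ipa: str) -> str:
    """Map each symbol of a transcription onto the reduced inventory."""
    return "".join(MERGE.get(ch, ch) for ch in ipa)


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def pick_word(forms: Dict[str, str]) -> str:
    """Return the simplified form closest, in total, to all other forms."""
    simplified = [simplify(f) for f in forms.values()]
    candidates = sorted(set(simplified))  # sorted for deterministic ties
    return min(candidates,
               key=lambda c: sum(edit_distance(c, s) for s in simplified))


# Toy synonyms for one concept in three unnamed languages.
toy_forms = {"lang_a": "mama", "lang_b": "mame", "lang_c": "nana"}
chosen = pick_word(toy_forms)
print(chosen)
```

A real version would need a proper phonetic distance (IPA symbols are not all equally different) and some weighting across languages, but the overall shape would be similar.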
While some prominent conlangs have attempted something like this in the past, they were forced to cherry-pick and use subjective judgements. Today modern technology enables us to do it with a thoroughness and at a scale previously unfeasible. Lojban spliced together vocabulary from 6 languages; with machine learning and Wiktionary, we can algorithmically splice together hundreds.
I'd find it interesting to see what kind of language you would come up with if you algorithmically synthesized your vocabulary and pronunciations from Wiktionary's database of worldwide languages and developed your syntax from Galactic Dependencies' most representative grammars. Would it resemble Esperanto or Lojban, or would it look like something else entirely?
Answered by Robert Larkins on August 20, 2021
This is something different from my first answer: Mixing syntax
There is a project named Galactic Dependencies where the authors started with dependency treebanks of some known languages and then created new artificial languages at a large scale by altering the typological features of the original language using features from other languages. This leads to new "languages" with a different syntax. Incidentally, the main motivation of that project was the lack of available treebanks for typologically interesting languages, so they created them artificially.
I am not aware of any conlanger tapping that resource yet.
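As a rough illustration of what "altering the typological features" of a treebank sentence can mean, the sketch below takes a tiny hand-built dependency tree and moves all dependents of verbs in front of their head, as in a verb-final language. The `Node` class and `make_head_final` function are hypothetical names invented for this example. As far as I understand, the actual project chooses new dependent orders using a model of another language, so this deterministic rule is only a simplification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    word: str
    pos: str
    left: List["Node"] = field(default_factory=list)   # dependents before the head
    right: List["Node"] = field(default_factory=list)  # dependents after the head


def linearize(node: Node) -> List[str]:
    """Read the words off the tree in surface order."""
    words: List[str] = []
    for child in node.left:
        words += linearize(child)
    words.append(node.word)
    for child in node.right:
        words += linearize(child)
    return words


def make_head_final(node: Node, pos: str) -> Node:
    """Return a copy in which heads tagged `pos` follow all their dependents."""
    left = [make_head_final(c, pos) for c in node.left]
    right = [make_head_final(c, pos) for c in node.right]
    if node.pos == pos:
        left, right = left + right, []
    return Node(node.word, node.pos, left, right)


# "the dog chased the cat", with the verb as root.
tree = Node("chased", "VERB",
            left=[Node("dog", "NOUN", left=[Node("the", "DET")])],
            right=[Node("cat", "NOUN", left=[Node("the", "DET")])])

original = " ".join(linearize(tree))
reordered = " ".join(linearize(make_head_final(tree, "VERB")))
print(original)
print(reordered)
```

The vocabulary stays English while the word order changes, which is the sense in which the result is a new "language" with a different syntax.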
Answered by jk - Reinstate Monica on August 20, 2021
The problem I can see with trying to do this is that you would first need to be able to encode the urlanguages in a way that a computer could understand. One could imagine giving it corpora from the urlanguages and having a machine learning algorithm produce a new corpus in some pidgin or creole, but it's not very useful unless the computer can then explain what the new texts mean.
The problem of encoding meaning is definitely unsolved - Google Translate has come a long way in the past decade, but it's still far off from a human translator.
So I think in the end, it depends on the criteria you want to use to evaluate it. Does a language whose syntax and vocabulary can only be inferred count?
Answered by Andrew Ray on August 20, 2021
Not really AI, but the vocabulary of the two logical languages Loglan and Lojban was created algorithmically from the vocabulary of pre-defined source languages. In this case, the phonemes of the words were picked by the algorithm, which differs from the formation of a creole, where whole words are picked by the human speakers.
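A simplified sketch of this style of word building follows. It is not the actual Loglan or Lojban procedure: the CVCV word shape, the longest-common-subsequence score, the toy source words and the weights are all assumptions chosen to keep the example short. The idea it illustrates is only that every candidate phoneme string is scored against each source-language word, weighted by that language's importance, and the best-scoring candidate wins.

```python
from itertools import product
from typing import Dict


def lcs_len(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ca == cb
                       else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def score(candidate: str, sources: Dict[str, str],
          weights: Dict[str, float]) -> float:
    """Weighted share of each source word that the candidate preserves."""
    return sum(weights[lang] * lcs_len(candidate, word) / len(word)
               for lang, word in sources.items())


def best_candidate(sources: Dict[str, str], weights: Dict[str, float],
                   consonants: str, vowels: str) -> str:
    """Try every CVCV string over the inventory and keep the best scorer."""
    candidates = ("".join(p)
                  for p in product(consonants, vowels, consonants, vowels))
    return max(candidates, key=lambda c: score(c, sources, weights))


# Toy source words and illustrative weights (placeholders, not real data).
toy_sources = {"lang_a": "mata", "lang_b": "mato", "lang_c": "pata"}
toy_weights = {"lang_a": 0.5, "lang_b": 0.3, "lang_c": 0.2}

winner = best_candidate(toy_sources, toy_weights, consonants="mpt", vowels="ao")
print(winner)
```

Because the candidates are assembled phoneme by phoneme, the winning word need not match any single source word exactly, unlike the borrowing of whole words seen in a creole.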
Answered by jk - Reinstate Monica on August 20, 2021