A tool to generate word generators based on phonemes.

Phonagen

Phonemic word generation tools. Phonagen provides several tools for building word generators based on the pronunciation and transcriptions of phonemes.

The tools are built around a JSON representation of phonemes and word generators.

Web interface

The web directory contains a sample web interface to generate words from the JSON description included in the web/data.json file. The implementation of generators is located in the script web/phonagen.js. To use it on any webpage:

include the script on your page (<script src='phonagen.js'></script> in the head of the page)
add a div (or another block element) with the id phonagen
call the phonagen.load() function with the JSON file to use as its argument, either in the onload handler of the body or in a script tag placed after the phonagen block, e.g. <script>phonagen.load('data.json')</script>

Python scripts

These are located in the py-phonagen directory.

phonagen.py

The main module containing all the abstractions on which phonagen is based. Imported by the other tools.

phonology-csv2json.py

Convert a csv file listing the phonemes and their transcriptions into the corresponding JSON phonology representation.

The input csv file should have a header row giving the names of the columns. A phoneme column is mandatory. The id and description columns are optional. All other columns are treated as different transcriptions of the phonemes. If no main transcription is given on the command line, the first column that is not phoneme, id, or description is taken as the main transcription. The id column serves to identify a phoneme, notably in example lists. The description column may provide information about a phoneme. Phonemes can be tagged in the description to guide some generators.
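The column rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described, not the code of phonology-csv2json.py: the function name classify_columns, the sample data, and the choice to return None when no transcription column exists are all assumptions.

```python
import csv
import io

# Column names with a reserved meaning, as described above.
RESERVED = ("phoneme", "id", "description")


def classify_columns(header, main=None):
    """Return (transcription columns, main transcription) for a CSV header.

    Hypothetical helper: it mirrors the documented rules, not the real tool.
    """
    if "phoneme" not in header:
        raise ValueError("the 'phoneme' column is mandatory")
    # Every non-reserved column is treated as a transcription.
    transcriptions = [name for name in header if name not in RESERVED]
    if main is None:
        # Default: the first column that is not phoneme, id, or description.
        # Returning None when there is no such column is an assumption.
        main = transcriptions[0] if transcriptions else None
    elif main not in transcriptions:
        raise ValueError(f"unknown transcription column: {main}")
    return transcriptions, main


# Made-up sample data, only for illustration.
SAMPLE = (
    "id,phoneme,latin,description\n"
    "p,p,p,voiceless bilabial plosive\n"
    "a,a,a,open front vowel\n"
)

reader = csv.DictReader(io.StringIO(SAMPLE))
transcriptions, main = classify_columns(reader.fieldnames)
rows = list(reader)
print(transcriptions, main, len(rows))
```

Passing main explicitly plays the role of the command-line option; the actual name of that option is not shown here because the source does not give it.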

Examples of csv files are present in the examples directory.

generator-list2chain.py

Convert a list of examples into a chain-based generator (Markov chains).

The list of examples can be checked against a phonology, by giving the corresponding JSON file in the arguments. If a JSON file of the phonology is given, the phonology is included in the output. The output can be used as the input file of the web interface to generate words.
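For readers unfamiliar with chain-based generation, here is a minimal sketch of the general technique. It assumes a first-order chain (each phoneme depends only on the previous one) with hypothetical start and end markers; the order and data layout used by generator-list2chain.py may differ.

```python
import random
from collections import defaultdict

# Hypothetical markers for word boundaries; not taken from phonagen.
START, END = "<start>", "<end>"


def build_chain(examples):
    """Count, for each phoneme id, how often each id follows it."""
    chain = defaultdict(lambda: defaultdict(int))
    for word in examples:
        previous = START
        for phoneme_id in word:
            chain[previous][phoneme_id] += 1
            previous = phoneme_id
        chain[previous][END] += 1
    return chain


def generate(chain, rng=random, max_length=20):
    """Walk the chain from START until END (or max_length) is reached.

    Requires a chain built from at least one example.
    """
    word = []
    current = START
    while len(word) < max_length:
        followers = chain[current]
        ids = list(followers)
        weights = [followers[i] for i in ids]
        # Pick the next id with probability proportional to its count.
        current = rng.choices(ids, weights=weights)[0]
        if current == END:
            break
        word.append(current)
    return word


# Made-up examples: each word is a list of phoneme ids.
examples = [["k", "a", "t"], ["t", "a", "k", "a"], ["a", "k"]]
chain = build_chain(examples)
print(" ".join(generate(chain, random.Random(0))))
```

Every transition in a generated word was seen in at least one example, which is why the output tends to resemble the example list.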

The file containing the list of examples should be formatted as follows:

one example per line
each phoneme is indicated by its corresponding id
the phoneme ids are separated by spaces
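A file in this format can be read with a few lines of Python. The sketch below is an illustration, not the parser used by generator-list2chain.py: the name parse_examples, the optional known_ids check, and the decision to skip blank lines are assumptions.

```python
def parse_examples(text, known_ids=None):
    """Parse a list of examples: one word per line, ids separated by spaces.

    If known_ids is given (for instance the ids of a phonology), every id
    is checked against it. Hypothetical helper, not part of phonagen.
    """
    examples = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        ids = line.split()
        if not ids:
            continue  # skipping blank lines is an assumption
        if known_ids is not None:
            unknown = [i for i in ids if i not in known_ids]
            if unknown:
                raise ValueError(
                    f"line {line_number}: unknown phoneme ids {unknown}"
                )
        examples.append(ids)
    return examples


# Made-up content of a .list file.
SAMPLE = "k a t\nt a k a\n"
print(parse_examples(SAMPLE, known_ids={"k", "a", "t"}))
```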

Lists of examples can be found in the examples directory (.list files).

phonagen-merge.py

Merge several phonagen JSON files into a single JSON file.
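As a rough idea of what such a merge involves, the sketch below concatenates the top-level lists of several JSON documents. The layout it assumes (a top-level object whose values are lists, with the hypothetical keys "phonologies" and "generators") is not confirmed by this README, and phonagen-merge.py may handle duplicates or conflicts differently.

```python
import json


def merge_documents(documents):
    """Merge JSON objects by concatenating their top-level lists.

    Assumed layout, not confirmed by the source: each document is an
    object whose values are lists. For any other value, the document
    that comes last wins.
    """
    merged = {}
    for document in documents:
        for key, value in document.items():
            existing = merged.get(key)
            if isinstance(value, list) and isinstance(existing, list):
                existing.extend(value)
            elif isinstance(value, list):
                merged[key] = list(value)  # copy so inputs stay untouched
            else:
                merged[key] = value
    return merged


# Made-up inputs; the key names and entries are hypothetical.
first = json.loads('{"phonologies": [{"id": "a"}], "generators": []}')
second = json.loads('{"phonologies": [{"id": "b"}], "generators": [{"id": "g"}]}')
print(json.dumps(merge_documents([first, second]), indent=2))
```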