
Simulate your language. ish.


PyData Amsterdam 2017

John will present a simple character-level Markov model for simulating language in Python. The goal is to generate text that demonstrates how English looks to non-English readers. The model generates text that is simultaneously totally foreign and yet weirdly familiar, using logic simple enough that anyone could replicate it.

engl_ish is a Python model for text generation. It is based on a character-level Markov model augmented with some additional logic, with the aim of capturing the "feel" of a language. The goal is to generate text, for example in English, that carries no actual English meaning but nonetheless looks like English to someone who doesn't speak the language.
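To make the idea of a character-level Markov model concrete, here is a minimal sketch. It is not the engl_ish code and does not reproduce the additional logic mentioned above; the function names `train` and `generate` and the `order` parameter are assumptions chosen for illustration. The model counts which character follows each run of `order` characters in the training text, then samples new text from those counts.

```python
import random
from collections import Counter, defaultdict


def train(text, order=3):
    """Count which character follows each `order`-length context.

    Hypothetical helper for illustration; not the engl_ish API.
    """
    model = defaultdict(Counter)
    for i in range(len(text) - order):
        context = text[i:i + order]
        model[context][text[i + order]] += 1
    return model


def generate(model, length=200, rng=None):
    """Sample up to `length` characters from the trained counts."""
    rng = rng or random.Random()
    # Start from a context that was seen during training.
    out = rng.choice(sorted(model))
    order = len(out)
    while len(out) < length:
        counts = model.get(out[-order:])
        if not counts:
            # This context only occurred at the very end of the
            # training text, so there is nothing to sample from.
            break
        chars, weights = zip(*counts.items())
        out += rng.choices(chars, weights=weights)[0]
    return out


if __name__ == "__main__":
    corpus = "the quick brown fox jumps over the lazy dog and the quiet cat naps "
    model = train(corpus, order=2)
    print(generate(model, length=60, rng=random.Random(0)))
```

A larger `order` tends to copy longer stretches of the training text verbatim, while a smaller one drifts further from recognisable words, so this parameter is the main knob for how "foreign yet familiar" the output looks in a sketch like this.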

In this talk, John will share his inspiration for creating the model, go into detail about its logic and how it was implemented in Python, share results from a variety of training sets and settings, and talk about issues and opportunities for improvement.

