Jon Rawski

Assistant Professor

San Jose State University


I teach courses on general and computational linguistics at San Jose State University. My work generally concerns the mathematics of human language and learning. These cognitive feats emerge from humans' unique neuronal structure and computing power, allowing linguistic insight to contribute to the broader cognitive sciences and artificial intelligence.


  • Computational/Mathematical Linguistics
  • Cognitive Science
  • Artificial Intelligence


  • PhD in Linguistics, 2021

    Stony Brook University

  • MSci in Cognitive Science, 2016

    Higher School of Economics

  • BA in Linguistics, 2013

    University of Minnesota

News and Recent Lectures

See my CV for a full list of my talks.

Dagstuhl Seminar on Regular Transformations

Rethinking Poverty of the Stimulus

Rethinking Poverty of the Stimulus

Abductive Inference of Phonotactic Constraints

Projects & Publications

See full list of publications.

I try to make all of my published work available for free as Green open access online, on this website and/or in a centralized, well-archived repository. You can check whether your own work meets these criteria via the Dissemin Project.


Regular and Polyregular Theories of Reduplication

We show a mismatch between the generative capacity of reduplication and the theories which model it.

Benchmarking Compositionality with Formal Languages

A novel method for sampling a class of subsequential string transductions encoding homomorphisms allows rigorous testing of learning models' capacity for compositionality.

History of Phonology: Learnability

This chapter examines the brief but vibrant history of learnability in phonology.

Phonological Abstractness in the Mental Lexicon

We overview the notion of phonological abstractness, various types of evidence for it, and consequences for linguistics and psychology.

Mathematical Linguistics & Cognitive Complexity

A book chapter on mathematical theories of language and learning, and their consequences for linguistic cognition studies.

Structure and Learning in Natural Language

My doctoral dissertation, examining the relationship between abductive inference and algebraically structured hypothesis spaces, giving a general form for grammar learning over arbitrary linguistic structure.

Typology Emerges from Simplicity in Representations and Learning

We derive the well-studied subregular classes of formal languages, which computationally characterize natural language typology, purely from the perspective of algorithmic learning problems.

Talk isn’t so Cheap

Invited response commenting on substantive and computational comparisons of spoken and signed languages.

Comment on Nonadjacent Dependency Processing in Monkeys, Apes, and Humans

We comment on non-human animals' ability to learn syntactic vs. phonological dependencies in pattern-learning experiments.

Computational Restrictions on Iterative Prosodic Processes

We formalize various iterative prosodic processes including stress, syllabification and epenthesis using logical graph transductions, showing that the necessary use of fixed point operators without quantification restricts them to a structured subclass of subsequential functions.

Computational Locality in Nonlinear Morphophonology

We present an automata-theoretic analysis of locality in nonlinear phonology and morphology.

Strong Generative Capacity of Morphological Processes

We analyze divergences in strong generative capacity between morphological processes that are equivalent in weak generative capacity.

The Computational Power of Harmony

We overview vowel harmony computationally, describing necessary and sufficient conditions on phonotactics, processes, and learning.

The Logical Nature of Phonology Across Speech and Sign

This article examines whether the computational properties of phonology hold across spoken and signed languages, using model theory and logical transductions.

What can formal language theory do for animal cognition studies?

We comment on mathematical fallacies present in artificial grammar learning experiments and suggest how to integrate psycholinguistic and mathematical results.

Probing RNN Encoder-Decoder Generalization of Subregular Functions using Reduplication

We analyze the expressivity of a variety of recurrent encoder-decoder networks, showing they are limited to learning subsequential functions, and connecting RNNs with attention mechanisms to a class of deterministic 2-way transducers.

Multi-Input Strictly Local Functions for Templatic Morphology

We provide an automata-theoretic characterization of templatic morphology, extending strict locality to consider n-ary functions.

Multi-Input Strictly Local Functions for Tonal Phonology

We provide an automata-theoretic characterization of tonal phonology, extending strict locality to consider n-ary functions.

Tensor Product Representations of Subregular Formal Languages

I provide a vector space characterization of the Star-Free and Locally Threshold Testable classes of formal languages, over arbitrary data structures.

Learning with Partially Ordered Representations

We describe a partial order on the space of model-theoretic constraints and a learning algorithm for constraint inference.

Finite-State Locality in Semitic Root-and-Pattern Morphology

We describe the finite-state nature of root-and-pattern morphology using Semitic as a case study, and discuss issues of finiteness vs. infinity, and template emergence.

No Free Lunch in Linguistics or Machine Learning: Reply to Pater

We caution about confusing ignorance of biases with absence of biases in machine learning and linguistics, especially for neural networks.

Quantified Sentences as a Window into Prediction and Priming: An ERP Study

We used event-related potentials (ERPs) to examine the processing of quantified sentences in an auditory/visual truth-value judgment task, specifically to probe the influence of truth value and quantifier type on the N400 and on ERP markers of quantifier complexity.

Phonological Complexity is Subregular: Evidence from Sign Language

Using string representations, I show that the complexity of several signed processes is subregular across speech and sign.