I teach courses on general and computational linguistics at San Jose State University. In 2023/2024, I am on leave to visit and teach at MIT Linguistics. During summers and winters I am in Paris, transiently at the IJN/ENS DEC. I study the mathematics of human language and learning. My work places mathematical boundary conditions on the grammars underlying human language and on how they can be learned. These conditions reflect humans' unique neuronal structure and computational power, contributing foundational principles to the cognitive and computer sciences.
PhD in Linguistics, 2021
Stony Brook University
MS in Cognitive Science, 2016
Higher School of Economics
BA in Linguistics, 2013
University of Minnesota
See full list of publications.
I try to make all of my work accessible and Green open-access, on this website and/or other repositories. You can check your own work via the Dissemin Project.
A short response rejecting the scientific contribution of language models as theories.
We show a mismatch between the generative capacity of reduplication and the theories which model it.
A novel method for sampling a class of subsequential string transductions encoding homomorphisms allows rigorous testing of learning models' capacity for compositionality.
This chapter examines the brief but vibrant history of learnability in phonology.
We overview the notion of phonological abstractness, various types of evidence for it, and consequences for linguistics and psychology.
A book chapter on mathematical theories of language and learning, and their consequences for linguistic cognition studies.
My doctoral dissertation examines the relationship between abductive inference and algebraically structured hypothesis spaces, giving a general form for grammar learning over arbitrary linguistic structure.
We derive the well-studied subregular classes of formal languages, which computationally characterize natural language typology, purely from the perspective of algorithmic learning problems.
Invited response commenting on substantive and computational comparisons of spoken and signed languages.
We comment on non-human animals' ability to learn syntactic vs phonological dependencies in pattern-learning experiments.
We formalize various iterative prosodic processes, including stress, syllabification, and epenthesis, using logical graph transductions, showing that the necessary use of fixed-point operators without quantification restricts them to a structured subclass of subsequential functions.
We present an automata-theoretic analysis of locality in nonlinear phonology and morphology.
We analyze divergences in strong generative capacity between morphological processes that are equivalent in weak generative capacity.
We overview vowel harmony computationally, describing necessary and sufficient conditions on phonotactics, processes, and learning.
This article examines whether the computational properties of phonology hold across spoken and signed languages, using model theory and logical transductions.
We comment on mathematical fallacies present in artificial grammar learning experiments and suggest how to integrate psycholinguistic and mathematical results.
We analyze the expressivity of a variety of recurrent encoder-decoder networks, showing they are limited to learning subsequential functions, and connecting RNNs with attention mechanisms to a class of deterministic 2-way transducers.
We provide an automata-theoretic characterization of templatic morphology, extending strict locality to consider n-ary functions.
We provide an automata-theoretic characterization of tonal phonology, extending strict locality to consider n-ary functions.
I provide a vector space characterization of the Star-Free and Locally Threshold Testable classes of formal languages over arbitrary data structures.
We describe a partial order on the space of model-theoretic constraints and a learning algorithm for constraint inference.
We describe the finite-state nature of root-and-pattern morphology using Semitic as a case study, and discuss issues of finiteness vs. infinity, and template emergence.
We caution about confusing ignorance of biases with absence of biases in machine learning and linguistics, especially for neural networks.
We used event-related potentials (ERPs) to examine the processing of quantified sentences in an auditory/visual truth-value judgment task, specifically probing the influence of truth value and quantifier type on the N400 and on ERP markers of quantifier complexity.
I show that the complexity of several signed processes is subregular across speech and sign, using string representations.