- Location: Stanford University
- Email: thashim [at] stanford.edu

I am currently a post-doc at Stanford working for John C. Duchi and Percy Liang on nonparametric statistics applied to optimization and natural language processing.

Before that, I was a graduate student at MIT co-advised by Tommi Jaakkola and David Gifford, and an undergraduate student at Harvard in statistics and math advised by Edoardo Airoldi.

Improves distributionally robust estimators for subgroup losses (such as those in Hashimoto 2018) for prediction tasks where the conditional distribution remains the same across groups (i.e. covariate shifts). (Link)

(Preprints not available due to ACL preprint policy. Email if interested).

Proposes a retrieval mechanism based on noisy embeddings into the surface of a hypersphere, which guarantees that retrieved examples can be edited into the desired outputs under an information-theoretically optimal edit model. (Link)

A new generative model for text based upon learning an edit model which can generate lexically similar sentences. (Link)

Identifies unfairness in machine learning arising from the use of empirical loss minimization, and the amplification of this unfairness in iterated settings. Suggests a distributionally robust optimization procedure for bounding losses incurred by minority groups without access to explicit demographic labels. (Link)

An approach for reducing derivative-free optimization to repeated classification problems over sublevel sets. (Link)

Proposes a new approach to learning continuous structure in data by learning transformations between nearby data points. (Link)

Demonstrates an equivalence between learning word vectors and recovering a metric space consistent with semantic similarity judgements. (Link)

Describes several applications of the random walk diffusion limit developed in our AISTATS paper for learning metrics over graphs and word embeddings. (Link)

Derives a weighted hitting time metric over graphs which consistently and robustly recovers the underlying metric structure of a graph. (Link)

Derives a new limiting expression for random walks on unweighted graphs that allows for closed-form calculation of metric quantities such as density and similarity. (Link)

Proposes a new dimensionality reduction algorithm which maintains cluster structure. (Link)

A graph layout algorithm which allocates computation time based on centrality measures. (Link)

Demonstrates the effectiveness of a convolutional log-linear model for chromatin accessibility. (Link)

Derives recovery conditions for learning differentiation dynamics from single-cell population data. (Link)

(Link)

(Link)

(Link)

(Link)

Proposes a new preprocessing technique to mitigate the effects of overdispersed count data using log-concave distributions. (Link)

A new model for estimating transcription factor binding using DNase-I accessibility data, and observations on the behavior of pioneer transcription factors. (Link)

(Link)

A matrix factorization based approach for extracting interpretable sets of gene modules which differ over differentiation. (Link)

A technique for deriving interpretable decision rules for drug discovery. (Link)

2011 → 2016

2007 → 2011

2010

2007 → 2011