Tatsu Hashimoto

Tatsunori B. Hashimoto Post-Doc @ Stanford

  • Location: Stanford University
  • Email: thashim [at] stanford.edu


I am currently a post-doc at Stanford working with John C. Duchi and Percy Liang on nonparametric statistics applied to optimization and natural language processing.

Before that, I was a graduate student at MIT co-advised by Tommi Jaakkola and David Gifford, and an undergraduate student at Harvard studying statistics and math, advised by Edoardo Airoldi.


Unpublished preprints and manuscripts

Distributionally Robust Losses Against Mixture Covariate Shifts John C. Duchi, Tatsunori B Hashimoto, Hongseok Namkoong

Improves distributionally robust estimators for subgroup losses (such as those in Hashimoto et al., 2018) for prediction tasks where the conditional distribution remains the same across groups (i.e., covariate shift). (Link)

NAACL papers on evaluation and language modeling.

(Preprints not available due to ACL preprint policy. Email if interested).

Statistics and Machine Learning

A Retrieve-and-Edit Framework for Predicting Structured Outputs Advances in Neural Information Processing Systems 31 (NeurIPS 2018) Tatsunori B Hashimoto, Kelvin Guu, Yonatan Oren, Percy Liang

Proposes a retrieval mechanism, based on noisy embeddings onto the surface of a hypersphere, which guarantees that retrieved examples can be edited into the desired outputs under an information-theoretically optimal edit model. (Link)

Generating Sentences by Editing Prototypes Transactions of the Association for Computational Linguistics (TACL/ACL 2018) Kelvin Guu*, Tatsunori B Hashimoto*, Yonatan Oren, Percy Liang * - Equal contribution

A new generative model for text based upon learning an edit model which can generate lexically similar sentences. (Link)

Fairness Without Demographics in Repeated Loss Minimization Proceedings of the 35th International Conference on Machine Learning (ICML 2018, Best Paper runner-up) Tatsunori B Hashimoto, Megha Srivastava, Hongseok Namkoong, Percy Liang

Shows that unfairness in machine learning can arise from empirical loss minimization and is amplified in repeated settings. Proposes a distributionally robust optimization procedure that bounds the losses incurred by minority groups without access to explicit demographic labels. (Link)

Derivative free optimization via repeated classification 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018) Tatsunori B Hashimoto, Steve Yadlowsky, John C Duchi

An approach that reduces derivative-free optimization to a sequence of classification problems over sublevel sets. (Link)

Unsupervised Transformation Learning via Convex Relaxations Advances in Neural Information Processing Systems 30 (NeurIPS 2017) Tatsunori B Hashimoto, Percy S Liang, John C Duchi

Proposes a new approach to learning continuous structure in data by learning transformations between nearby data points. (Link)

Word embeddings as metric recovery in semantic spaces Transactions of the Association for Computational Linguistics 4 (TACL 2016) Tatsunori B Hashimoto, David Alvarez-Melis, Tommi S Jaakkola

Demonstrates an equivalence between learning word vectors and recovering a metric space consistent with semantic similarity judgements. (Link)

Continuous representations and models from random walk diffusion limits PhD Thesis, MIT CSAIL, 2016

Describes several applications of the random walk diffusion limit developed in our AISTATS paper to learning metrics over graphs and word embeddings. (Link)

From random walks to distances on unweighted graphs Advances in Neural Information Processing Systems (NeurIPS 2015) Tatsunori Hashimoto, Yi Sun, Tommi Jaakkola

Derives a weighted hitting time metric over graphs which consistently and robustly recovers the underlying metric structure of a graph. (Link)

Metric recovery from directed unweighted graphs Artificial Intelligence and Statistics (AISTATS 2015), NeurIPS 2014 networks workshop (Best poster) Tatsunori Hashimoto, Yi Sun, Tommi Jaakkola

Derives a new limiting expression for random walks on unweighted graphs that allows for closed-form calculation of metric quantities such as density and similarity. (Link)

Tree preserving embedding Proceedings of the 28th International Conference on Machine Learning (ICML 2011) Albert D Shieh, Tatsunori B Hashimoto, Edoardo M Airoldi

Proposes a new dimensionality reduction algorithm which maintains cluster structure. (Link)

BFL: a node and edge betweenness based fast layout algorithm for large scale networks. BMC Bioinformatics (2009) Tatsunori B Hashimoto, Masao Nagasaki, Kaname Kojima, Satoru Miyano

A graph layout algorithm which allocates computation time based on centrality measures. (Link)

Computational Biology and Chemistry

A synergistic DNA logic predicts genome-wide chromatin accessibility Genome Research (2016) Tatsunori Hashimoto*, Richard I Sherwood*, Daniel D Kang*, Nisha Rajagopal, Amira A Barkal, Haoyang Zeng, Bart JM Emons, Sharanya Srinivasan, Tommi Jaakkola, David K Gifford * - Equal contribution

Demonstrates the effectiveness of a convolutional log-linear model for predicting chromatin accessibility. (Link)

Learning Population-Level Diffusions with Generative RNNs Proceedings of the 33rd International Conference on Machine Learning (ICML 2016) Tatsunori Hashimoto, David Gifford, Tommi Jaakkola

Derives recovery conditions for learning differentiation dynamics from single-cell population data. (Link)

Cas9 Functionally Opens Chromatin PLoS ONE (2016) Amira A Barkal, Sharanya Srinivasan, Tatsunori Hashimoto, David K Gifford, Richard I Sherwood


Cloning-free CRISPR Stem Cell Reports (2015) Mandana Arbab, Sharanya Srinivasan, Tatsunori Hashimoto, Niels Geijsen, Richard I Sherwood


GERV: a statistical method for generative evaluation of regulatory variants for transcription factor binding Bioinformatics (2015) Haoyang Zeng, Tatsunori Hashimoto, Daniel D Kang, David K Gifford


Long-term persistence and development of induced pancreatic beta cells generated by lineage conversion of acinar cells Nature Biotechnology (2014) Weida Li, Claudia Cavelti-Weder, Yinying Zhang, Kendell Clement, Scott Donovan, Gabriel Gonzalez, Jiang Zhu, Marianne Stemann, Ke Xu, Tatsunori Hashimoto, Takatsugu Yamada, Mio Nakanishi, Yuemei Zhang, Samuel Zeng, David Gifford, Alexander Meissner, Gordon Weir, Qiao Zhou


Universal count correction for high-throughput sequencing PLoS Computational Biology (2014) Tatsunori Hashimoto, Matthew D Edwards, David K Gifford

Proposes a new preprocessing technique that mitigates the effects of overdispersed count data using log-concave distributions. (Link)

Discovery of directional and nondirectional pioneer transcription factors by modeling DNase profile magnitude and shape Nature Biotechnology (2014) Richard I Sherwood*, Tatsunori Hashimoto*, Charles W O'Donnell*, Sophia Lewis, Amira A Barkal, John Peter Van Hoff, Vivek Karun, Tommi Jaakkola, David K Gifford * - Equal contribution

A new model for estimating transcription factor binding using DNase-I accessibility data, and observations on the behavior of pioneer transcription factors. (Link)

Quantifying condition-dependent intracellular protein levels enables high-precision fitness estimates PLoS ONE (2013) Kerry A Geiler-Samerotte, Tatsunori Hashimoto, Michael F Dion, Bogdan A Budnik, Edoardo M Airoldi, D Allan Drummond


Lineage-based identification of cellular states and expression programs Bioinformatics (2012) Tatsunori Hashimoto, Tommi Jaakkola, Richard Sherwood, Esteban O. Mazzoni, Hynek Wichterle, David Gifford

A matrix factorization based approach for extracting interpretable sets of gene modules that vary over the course of differentiation. (Link)

Finding drug discovery rules of thumb with bump hunting Proceedings of the ACS (2010) Tatsunori Hashimoto, Matthew Segall

A technique for deriving interpretable decision rules for drug discovery. (Link)


MIT – PhD student.

2011 → 2016

RIKEN BSI – Intern.

2007 → 2011

Optibrium Ltd – Intern.


Harvard University – Undergraduate Student.

2007 → 2011