Lingvo Notes and Tutorial

I had known about the Lingvo project at Google for a while, ever since hearing Patrick Nguyen speak at the Uber AI Conference in the fall, but I finally got around to playing with it.

I tried to make my own Tensorflow-based ASR library, galvASR, but I ran into a fundamental confusion about how to drive Tensorflow sessions in an online (streaming) fashion for models like time delay neural networks (aka 1-D convolutional neural networks with dilated filters). So why not see how a team with direct access to the Tensorflow developers did it?


Installation

You need nvidia-docker for the Docker-based install.

CentOS 7 install instructions:

distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
curl -s -L https://nvidia.github.io/nvidia-container-runtime/$distribution/nvidia-container-runtime.repo | \
    sudo tee /etc/yum.repos.d/nvidia-container-runtime.repo
sudo yum install -y nvidia-container-runtime-hook nvidia-container-runtime

sudo tee /etc/docker/daemon.json <<EOF
{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF
sudo pkill -SIGHUP dockerd

The test cases can crash if you don’t have enough GPUs. Limit the number of concurrently running tests like this:

bazel test -j $(nvidia-smi --list-gpus | wc -l) -c opt //lingvo:trainer_test //lingvo:models_test
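The -j value above is just the number of GPUs that nvidia-smi reports. A small hedge for machines where nvidia-smi is missing or reports nothing (the fallback to 1 is my addition, not something from Lingvo's docs):

```shell
# Count GPUs for bazel's -j flag; fall back to 1 if nvidia-smi is
# absent or lists nothing (e.g. a CPU-only box), so bazel still runs.
NUM_GPUS=$(nvidia-smi --list-gpus 2>/dev/null | wc -l)
[ "$NUM_GPUS" -ge 1 ] 2>/dev/null || NUM_GPUS=1
echo "$NUM_GPUS"
```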

The development environment for Lingvo works like this:

  • Write code on your host machine, in the repo located at $LINGVO_DIR.
  • Build, test, and run within the Docker container, which mounts the host’s $LINGVO_DIR directory at /tmp/lingvo inside the container.
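The container invocation I use looks roughly like this. The image tag and port list are assumptions based on my setup, so check Lingvo's README for the canonical command; the snippet just builds and prints the command for inspection:

```shell
# Sketch of the docker run invocation: bind-mount the host checkout at
# /tmp/lingvo and publish the TensorBoard (6006) and Jupyter (8888) ports.
LINGVO_DIR=${LINGVO_DIR:-$HOME/lingvo}
DOCKER_CMD="docker run --runtime=nvidia --rm -it \
  -v ${LINGVO_DIR}:/tmp/lingvo -p 6006:6006 -p 8888:8888 \
  --name lingvo tensorflow:lingvo bash"
echo "$DOCKER_CMD"
```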

This means that Bazel, running inside the container, will spew this warning unless you loosen your default directory and file permissions:

WARNING: failed to create one or more convenience symlinks for prefix 'bazel-':
  cannot create symbolic link bazel-bin -> /root/.cache/bazel/_bazel_root/17eb95f0bc03547f4f1319e61997e114/execroot/__main__/bazel-out/k8-opt/bin:  /tmp/lingvo/bazel-bin (Permission denied)
  cannot create symbolic link bazel-testlogs -> /root/.cache/bazel/_bazel_root/17eb95f0bc03547f4f1319e61997e114/execroot/__main__/bazel-out/k8-opt/testlogs:  /tmp/lingvo/bazel-testlogs (Permission denied)
  cannot create symbolic link bazel-genfiles -> /root/.cache/bazel/_bazel_root/17eb95f0bc03547f4f1319e61997e114/execroot/__main__/bazel-out/k8-opt/genfiles:  /tmp/lingvo/bazel-genfiles (Permission denied)
  cannot create symbolic link bazel-out -> /root/.cache/bazel/_bazel_root/17eb95f0bc03547f4f1319e61997e114/execroot/__main__/bazel-out:  /tmp/lingvo/bazel-out (Permission denied)
  cannot create symbolic link bazel-lingvo -> /root/.cache/bazel/_bazel_root/17eb95f0bc03547f4f1319e61997e114/execroot/__main__:  /tmp/lingvo/bazel-lingvo (Permission denied)

Since the bazel-* directories provide the binaries you’ve built, you really do want these directories available to you. You can fix this by running this command on the host machine:

chmod -R 757 $LINGVO_DIR

This gives write permission on your directory to all other users. There are more sophisticated ways to do this, but they are involved; you can read about a more sophisticated fix here. That fix involves modifying Lingvo’s Dockerfile with configuration specific to your setup, which I dislike doing.
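To see concretely what mode 757 grants, you can try it on a scratch directory rather than your checkout:

```shell
# 7 = rwx for the owner, 5 = r-x for the group, 7 = rwx for everyone
# else; that last digit is what gives the build user inside the
# container write access to the mounted repo.
mkdir -p /tmp/perm_demo
chmod 757 /tmp/perm_demo
stat -c '%a %A' /tmp/perm_demo   # prints the octal mode and rwx string
```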

IPython Notebook

There is an example IPython notebook. You can start the IPython server like this:

bazel run -c opt //lingvo:ipython_kernel

Note that plain jupyter notebook --ip= --port=8888 doesn’t work: it doesn’t add lingvo to the PYTHONPATH.
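The bazel target works because it puts the repo on PYTHONPATH before launching the kernel. If you want to run jupyter by hand, you can do that part yourself; a sketch, assuming $LINGVO_DIR points at your checkout:

```shell
# Prepend the checkout to PYTHONPATH so `import lingvo` resolves.
LINGVO_DIR=${LINGVO_DIR:-/tmp/lingvo}
export PYTHONPATH="${LINGVO_DIR}${PYTHONPATH:+:${PYTHONPATH}}"
echo "${PYTHONPATH%%:*}"   # first entry should be the checkout
```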

This isn’t working yet: I need to copy the introduction.ipynb file from $(git rev-parse --show-toplevel)/codelab into the $(git rev-parse --show-toplevel)/lingvo/ directory for it to be visible.

Running the Librispeech Example


Note that the data gets downloaded to /tmp/librispeech by default. If, like me, you need to put the data on a different disk, change the value in both lingvo/tasks/asr/tools/ and lingvo/tasks/asr/params/ (the DATADIR static variable).
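A quick way to make that edit is with sed. Since I've elided the exact filenames, the sketch below runs on a scratch file; run the equivalent over the files in those two directories. /mnt/ebs/librispeech is a made-up destination, so substitute your own disk:

```shell
# Rewrite a DATADIR-style constant in place with sed.
mkdir -p /tmp/datadir_demo
echo 'DATADIR = "/tmp/librispeech"' > /tmp/datadir_demo/params.py
NEW_DATA=/mnt/ebs/librispeech
sed -i "s#/tmp/librispeech#${NEW_DATA}#g" /tmp/datadir_demo/params.py
cat /tmp/datadir_demo/params.py
```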

After downloading the data as above, run the grapheme prediction model for Librispeech:

bazel run -c opt --config=cuda //lingvo:trainer -- --logtostderr \
      --model=asr.librispeech.Librispeech960Grapheme --mode=sync \
      --logdir=/tmp/ebs/lingvo/librispeech --saver_max_to_keep=2 \
      --run_locally=gpu 2>&1 |& tee run.log


You can monitor training with TensorBoard:

tensorboard --logdir /tmp/ebs/lingvo/librispeech/train

TensorBoard listens on port 6006, which the docker run command has already published to port 6006 on your host computer.


First, Lingvo is not really extensible. Adding a new model basically requires modifying the repo itself. This is fine if you’re on the development team, but not fine if you are an external user. Alas. There is also no way to install it without Bazel.

Secondly, Lingvo does not provide, as far as I can tell, an online decoder. All decoders are offline. This is disappointing because, as far as I can tell, Tensorflow’s Session-based execution model is not good for structured machine learning tasks like streaming speech recognition. I was hoping that Lingvo would provide an example of using Tensorflow for online inference, but I guess not. Alas.


There is a paper describing the design of Lingvo. It’s a little hidden on the Github page, so I thought I would also link to it here.