Faster, Slimmer Word Embeddings


A (not new) way of compressing word embeddings and their representation for fast parallel parsing.


Overview / Usage

Enable 10-20x faster loading of word embeddings by:
• compressing vector values into 3 bytes (see the sketch after this list)
• providing ~4MB[1] chunk boundaries for concurrent reads
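The 3-byte encoding is where most of the size reduction comes from. As a minimal sketch (the actual encoding used in the compacted files may differ), one way to go from 4 bytes to 3 is to keep only the three high-order bytes of each IEEE-754 float, dropping the least significant mantissa byte at a small cost in precision:

```java
// Illustrative only: packs a float into 3 bytes by truncating the low
// mantissa byte of the IEEE-754 bit pattern; the real file format may differ.
public final class ThreeByteFloat {

    // Encode: keep the top three bytes, drop the lowest mantissa byte.
    static void encode(float value, byte[] out, int offset) {
        int bits = Float.floatToIntBits(value);
        out[offset]     = (byte) (bits >>> 24);
        out[offset + 1] = (byte) (bits >>> 16);
        out[offset + 2] = (byte) (bits >>> 8);
    }

    // Decode: restore the three stored bytes and zero-fill the dropped byte.
    static float decode(byte[] in, int offset) {
        int bits = ((in[offset]     & 0xFF) << 24)
                 | ((in[offset + 1] & 0xFF) << 16)
                 | ((in[offset + 2] & 0xFF) << 8);
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        byte[] buf = new byte[3];
        encode(0.123456f, buf, 0);
        System.out.println(decode(buf, 0)); // prints a value very close to 0.123456
    }
}
```

Truncating the mantissa keeps the exponent and sign intact, so the relative error stays tiny while every vector shrinks by 25%.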

Loading word embeddings is a bottleneck. For exploratory NLP tasks on a personal machine, parsing word embeddings again and again becomes a significant time cost. Plus, embeddings are RAM-hungry -- forget about loading multiple embeddings into memory at once!
By exploring how to parse embeddings as fast as possible, I've developed a new byte layout for word embeddings that is smaller and allows loaders to leverage multiple cores in a simple way. But before we dive into all of the fun engineering, here are some results and links to these compacted embeddings***:
| Original | Original Size (GB)* | Compressed | Compressed Size (GB) | Original Parse Time (s)** | Compressed Parse Time (s)**** |
| --- | --- | --- | --- | --- | --- |
| GloVe.840B.300 | 5.3 | glove.bin | 1.9 | 156.42 | 6.13 (2.15) |
| GoogleNews-vectors-negative300 | 3.4 | googl.bin | 2.6 | 82.34 | 8.49 (3.43) |
*: These are non-gzipped file sizes, but the download links point to gzipped files.

**: Elapsed wall-clock time for parsing the original embeddings with a single-threaded program; compare to the compressed parse time, which uses a parallel parse.

***: Results collected on 4-core MacBook Pro, 2.7 GHz Intel Core i7, 16 GB 1600 MHz DDR3, 256 KB L2 Cache (per Core), 6MB L3 Cache, APPLE SSD SD512E, 64-Bit Java 1.8 HotSpot(TM)

****: Times in parentheses are averages of repeated runs, excluding the first run (which benefits from the page cache).
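The ~4MB chunk boundaries are what make the parallel parse simple: no word record spans a boundary, so each chunk can be read and decoded independently on its own core. Below is a rough sketch of such a loader; the Chunk type, the load/parseChunk names, and the assumption that chunk offsets are known up front are illustrative placeholders, not the actual layout of glove.bin or googl.bin.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public final class ParallelLoader {

    /** One independently parseable region of the file, roughly 4 MB each. */
    static final class Chunk {
        final long offset;
        final int length;
        Chunk(long offset, int length) { this.offset = offset; this.length = length; }
    }

    static Map<String, float[]> load(Path file, List<Chunk> chunks) throws IOException {
        Map<String, float[]> vectors = new ConcurrentHashMap<>();
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // Chunks never split a word record, so each one can be read and
            // decoded concurrently using positional (thread-safe) reads.
            chunks.parallelStream().forEach(chunk -> {
                try {
                    ByteBuffer buf = ByteBuffer.allocate(chunk.length);
                    while (buf.hasRemaining()) {
                        int n = channel.read(buf, chunk.offset + buf.position());
                        if (n < 0) break; // end of file
                    }
                    buf.flip();
                    parseChunk(buf, vectors);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
        return vectors;
    }

    // Placeholder: decode (word, 3-bytes-per-dimension vector) records from a chunk.
    static void parseChunk(ByteBuffer buf, Map<String, float[]> out) {
        // format-specific decoding would go here
    }
}
```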
