Disk Space Requirements
Customers with a self-managed installation of LILT need to provision disk space according to their own translation volumes. This article provides a rule of thumb for estimating disk space and shows a typical distribution of file types.
Stored artifacts
When a file is initially imported into LILT, LILT stores the following artifacts:
source file
intermediate Okapi artifact
intermediate extracted XLIFF
translated target file
This results in an upper-bound disk space requirement of approximately 4x the total source file size. That is, customers planning to translate 10 GB of documents using LILT should provision at least 40 GB of hard drive space.
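The arithmetic behind this rule of thumb can be sketched in a few lines of Python. This is an illustrative helper only, not part of LILT: the function name `estimate_disk_space_gb` and the constant `ARTIFACTS_PER_FILE` are hypothetical, and the sketch assumes each of the four stored artifacts is roughly the size of the source file.

```python
# Rule-of-thumb multiplier: the source file plus three derived artifacts
# (Okapi artifact, extracted XLIFF, translated target). Assumption: each
# artifact is about the size of the source file, so this is an upper-bound
# estimate rather than a measurement.
ARTIFACTS_PER_FILE = 4


def estimate_disk_space_gb(source_gb, multiplier=ARTIFACTS_PER_FILE):
    """Return the upper-bound disk space (in GB) for a volume of source files."""
    if source_gb < 0:
        raise ValueError("source_gb must be non-negative")
    return source_gb * multiplier


if __name__ == "__main__":
    # 10 GB of source documents -> provision at least 40 GB
    print(estimate_disk_space_gb(10))
```

Keeping the multiplier as a parameter makes it easy to adjust the estimate if measurements on a given installation suggest a different ratio.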
Typical file size distribution
Customers may expect to translate predominantly text documents (TXT, CSV, HTML, XML, JSON) and desktop publishing formats (Word, Excel, PDF, etc.), but they may not know how those file types are distributed on their own systems.
A random sample of LILT data shows that text-based file formats are the most common, but multimedia-heavy formats such as PPTX files use the most disk space:
File type | Share of files | Share of data used |
PPTX | <1% | 46% |
SDLXLIFF | 8% | 14% |
DOCX | 7% | 14% |
MIF | <1% | 5% |
(not specified) | <1% | 5% |
XLSX | 2% | 4% |
XML | 37% | 2% |
XLIFF | 3% | 2% |
JSON | 2% | 2% |
XLF | 4% | 2% |
IDML | <1% | 1% |
MQXLIFF | <1% | 1% |
HTML | 6% | <1% |
JSON+HTML | 22% | <1% |
TXT | <1% | <1% |
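Customers who want to compare their own content against this sample could survey a directory of source files before provisioning. The following is a minimal sketch using only the Python standard library; the function names `summarize_by_extension` and `print_distribution` are hypothetical, and it assumes that file extensions reliably indicate file types.

```python
from collections import defaultdict
from pathlib import Path


def summarize_by_extension(root):
    """Map each file extension under `root` to (file count, total bytes)."""
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Normalize ".PPTX" and ".pptx" to the same key
            ext = path.suffix.lower().lstrip(".") or "(none)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return {ext: (counts[ext], sizes[ext]) for ext in counts}


def print_distribution(root):
    """Print one row per extension: share of files and share of bytes."""
    summary = summarize_by_extension(root)
    total_files = sum(count for count, _ in summary.values())
    total_bytes = sum(size for _, size in summary.values())
    # Largest consumers of disk space first
    rows = sorted(summary.items(), key=lambda item: item[1][1], reverse=True)
    for ext, (count, size) in rows:
        file_share = 100 * count / total_files
        data_share = 100 * size / total_bytes if total_bytes else 0.0
        print(f"{ext.upper()} | {file_share:.0f}% | {data_share:.0f}% |")
```

Calling `print_distribution` with the path to a folder of source documents prints rows in the same layout as the table above, which makes it easier to see whether a few large formats such as PPTX dominate the total.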