Stored artifacts
When a file is initially imported into LILT, LILT stores the following artifacts:
- source file
- intermediate Okapi artifact
- intermediate extracted XLIFF
- translated target file
Typical file size distribution
Customers often expect to translate mostly text documents (TXT, CSV, HTML, XML, JSON) and desktop publishing formats (Word, Excel, PDF, etc.), but may not know the actual distribution of files on their system. A random sample of LILT data shows that text-based file formats are the most common by file count, while multimedia-heavy formats such as PPTX use the most disk space:
Filetype | Number of files (% of total) | Data used (% of total) |
PPTX | <1% | 46% |
SDLXLIFF | 8% | 14% |
DOCX | 7% | 14% |
MIF | <1% | 5% |
 | <1% | 5% |
XLSX | 2% | 4% |
XML | 37% | 2% |
XLIFF | 3% | 2% |
JSON | 2% | 2% |
XLF | 4% | 2% |
IDML | <1% | 1% |
MQXLIFF | <1% | 1% |
HTML | 6% | <1% |
JSON+HTML | 22% | <1% |
TXT | <1% | <1% |
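To compare your own content against the sample above, a similar breakdown can be computed locally. The following is a minimal sketch, not a LILT tool: it assumes the files to be translated sit under a single directory, groups them by file extension, and reports each extension's share of the file count and of the bytes used. The function names `summarize_by_extension` and `as_percentages` are hypothetical and chosen for illustration only.

```python
from collections import defaultdict
from pathlib import Path


def summarize_by_extension(root):
    """Return {extension: (file_count, total_bytes)} for all files under root."""
    counts = defaultdict(int)
    sizes = defaultdict(int)
    for path in Path(root).rglob("*"):
        if path.is_file():
            # Normalize ".PPTX" and ".pptx" to the same key; files without
            # an extension are grouped under "(none)".
            ext = path.suffix.lower().lstrip(".") or "(none)"
            counts[ext] += 1
            sizes[ext] += path.stat().st_size
    return {ext: (counts[ext], sizes[ext]) for ext in counts}


def as_percentages(summary):
    """Convert a summary into (FILETYPE, % of files, % of bytes) rows,
    sorted by share of bytes, largest first."""
    total_files = sum(count for count, _ in summary.values())
    total_bytes = sum(size for _, size in summary.values())
    rows = []
    for ext, (count, size) in summary.items():
        file_pct = 100.0 * count / total_files if total_files else 0.0
        byte_pct = 100.0 * size / total_bytes if total_bytes else 0.0
        rows.append((ext.upper(), file_pct, byte_pct))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows
```

Calling `as_percentages(summarize_by_extension(some_directory))` yields rows in the same shape as the table above. Note that this measures source files only; it does not account for the intermediate and target artifacts that LILT stores alongside each imported file.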