# Infrastructure Release Notes

This article details the infrastructure changes required when installing this version of LILT.

## 2025

### Patch Updates

GPU requirements have changed. Starting with this release, the translate pod requires 24 GB of VRAM to run successfully.

For customers running on T4 GPUs, which provide 16 GB of VRAM each, this means that the GPU node **must** have at least two (2) T4s attached to it. Put another way, it is **not sufficient** to have two (2) nodes, each with one (1) T4 GPU attached.
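
One way to check a GPU node against this requirement is to sum the memory reported for the GPUs attached to that single node. The sketch below assumes `nvidia-smi` is installed on the node; the helper name `sum_vram_mib` is hypothetical and not part of LILT. Reported memory is usually somewhat below the nominal card size, so treat the total as an approximation.

```shell
# Sum per-GPU memory values (MiB, one value per line) read from stdin.
# Hypothetical helper for illustration only.
sum_vram_mib() {
  awk '{ total += $1 } END { printf "%d\n", total }'
}

# On the GPU node (assumes nvidia-smi is available):
#   nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits | sum_vram_mib
```

Run it on each GPU node separately, since memory on different nodes cannot be combined for one pod.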

## 2024 Q4

### Deprecation Notice

The `analytics-api` application, part of the old analytics implementation, has been officially deprecated and removed.

### Node Labeling

To support better cluster utilization, we have changed the way we recommend labeling nodes. [Node Labels](/kb/node-labels) describes the expected labeling of nodes.

### Troubleshooting

Updated the [Troubleshooting](/kb/deleting) guide to include the newly added CLI command to reset the AI models.

### Default Values

Updated resource defaults for services to optimize performance. These changes are documented within [Resource Metrics](/kb/resource-metrics).

### Flannel CNI

Flannel is now packaged as a Helm chart and installed by the overall `install-lilt.sh` script rather than as a separate deployment. If upgrading LILT from a previous version and `flannel` is already installed, comment out the flannel section of the install script:

```bash theme={null}
# install-lilt.sh

# Install flannel, on-prem customers only
kubectl label --overwrite ns kube-flannel pod-security.kubernetes.io/enforce=privileged
sh install_scripts/install-flannel.sh
# wait until pod ready
kubectl wait --namespace kube-flannel --for=condition=ready pod -l app=flannel --timeout=180s
```

If installing `flannel` for the first time via the Helm chart, ensure that the `podCidr` is consistent with the Kubernetes cluster settings:

```yaml theme={null}
# flannel/on-prem-values.yaml

flannel:
  podCidr: "192.168.100.0/19"
```
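
As a rough consistency check, the chart value can be read back and compared with the pod CIDR the cluster was created with. This is a sketch only: `chart_pod_cidr` is a hypothetical helper, and the `kubectl` line in the comments assumes the controller manager exposes a `--cluster-cidr` flag in the cluster dump, which may not hold for every distribution.

```shell
# Print the first podCidr value found in a Helm values file.
# Hypothetical helper for illustration only.
chart_pod_cidr() {
  sed -n 's/^[[:space:]]*podCidr:[[:space:]]*"\{0,1\}\([^"[:space:]]*\)"\{0,1\}.*$/\1/p' "$1" | head -n 1
}

# Compare with the cluster setting (assumption: the flag appears in the dump):
#   chart_pod_cidr flannel/on-prem-values.yaml
#   kubectl cluster-info dump | grep -m 1 -o -- '--cluster-cidr=[^", ]*'
```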

### Redis

Memory limits have been implemented to prevent pod restarts and crashes by ensuring that the memory Redis consumes does not exceed the pod's resource limits. The `maxmemory` setting must be slightly below the pod's memory limit. Both values can be increased if required:

```yaml theme={null}
# redis/on-prem-values.yaml

global:
  redis:
    # maxmem needs to be just below resource limit
    maxmemory: "5.8gb"
    maxmemoryPolicy: "allkeys-lru"  # evict least recently used keys
#
master:
  resources:
    limits:
      memory: 6G
```
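
Because Redis and Kubernetes use different unit suffixes, it can help to compare the two values in bytes. Redis reads `gb` as 1024^3 bytes, while Kubernetes reads `G` as 10^9 bytes and `Gi` as 2^30 bytes. The sketch below is illustrative only: `to_bytes` and `fits_below_limit` are hypothetical helper names, and only the suffixes listed in the code are handled.

```shell
# Convert a quantity such as 5.8gb (Redis) or 6G / 6Gi (Kubernetes) to bytes.
# Hypothetical helper; prints nothing and fails on an unknown suffix.
to_bytes() {
  awk -v q="$1" 'BEGIN {
    n = q; sub(/[A-Za-z]+$/, "", n)                       # numeric part
    u = q; sub(/^[0-9.]+/, "", u)                         # unit suffix
    m["gb"] = 1024 * 1024 * 1024; m["mb"] = 1024 * 1024   # Redis binary units
    m["Gi"] = 1024 * 1024 * 1024; m["Mi"] = 1024 * 1024   # Kubernetes binary units
    m["G"]  = 1000 * 1000 * 1000; m["M"]  = 1000 * 1000   # Kubernetes decimal units
    if (!(u in m)) exit 1
    printf "%.0f\n", n * m[u]
  }'
}

# Succeed when maxmemory ($1) is strictly below the pod memory limit ($2).
fits_below_limit() {
  [ "$(to_bytes "$1")" -lt "$(to_bytes "$2")" ]
}
```

With these conversions, `fits_below_limit 5.8gb 6G` fails, because `5.8gb` is about 6.23 x 10^9 bytes, whereas `fits_below_limit 5.8gb 6Gi` succeeds. How the chart passes these values through to Redis and Kubernetes is an assumption here, so confirm against the rendered manifests.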

Persistent storage is now disabled by default. This increases performance and reduces storage requirements.

```yaml theme={null}
# redis/on-prem-values.yaml

master:
  persistence:
    enabled: false
```

If persistence is required for cache security logging or auditing, re-enable it and ensure that the storage size is sufficient for the estimated usage:

```yaml theme={null}
# redis/on-prem-values.yaml

master:
  persistence:
    enabled: true
    size: 20Gi   # up to 100Gi for heavy usage
```

### Istio

Additional `kernel` parameters (open-file limits) are required to prevent `ztunnel` pod restarts:

```bash theme={null}
# avoid ztunnel container restarts due to load
# append to end of file
cat <<EOF >> /etc/security/limits.conf
* soft nofile 131072
* hard nofile 131072
EOF

cat <<EOF >> /etc/systemd/system.conf
DefaultLimitNOFILE=131072
EOF
```
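
After the change is in effect (typically after a new login session for `limits.conf` and a reboot for `system.conf`), the resulting limit can be checked. This is a minimal sketch; `nofile_ok` is a hypothetical helper, and the default of 131072 simply mirrors the value used above.

```shell
# Succeed when a reported open-file limit meets the required minimum.
# $1: reported limit, $2: required minimum (defaults to 131072)
nofile_ok() {
  limit="$1"; min="${2:-131072}"
  [ "$limit" = "unlimited" ] || [ "$limit" = "infinity" ] || [ "$limit" -ge "$min" ]
}

# Example checks on a node:
#   nofile_ok "$(ulimit -n)" && echo "shell limit ok"
#   systemctl show --property=DefaultLimitNOFILE
```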

### Firewall Ports

Additional ports are required for `Istio`, `api`, `Clickhouse`, and `Flannel`. Please ensure that the following are enabled:

```bash theme={null}
firewall-cmd --permanent --add-port={22,80,443,2379,2380,5000,6443,10250,10251,10252,10255}/tcp
# api
firewall-cmd --permanent --add-port={5005,8011,8080}/tcp
# istio
firewall-cmd --permanent --add-port={15000,15001,15006,15008,15009,15010,15012,15014,15017,15020,15021,15090,15443,20001}/tcp
# flannel
firewall-cmd --permanent --add-port=8472/udp
# clickhouse
firewall-cmd --permanent --add-port=8123/tcp
# reload so the permanent rules take effect in the running firewall
firewall-cmd --reload
```
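
To confirm that the ports are open, the output of `firewall-cmd --list-ports` can be compared with the required list. The sketch below assumes firewalld is the active firewall and that ports were added individually rather than as ranges; `missing_ports` is a hypothetical helper.

```shell
# Print each required port (for example 8472/udp) that is absent from the open list.
# $1: space-separated open ports, $2: space-separated required ports
missing_ports() {
  for port in $2; do
    case " $1 " in
      *" $port "*) ;;          # already open
      *) echo "$port" ;;       # not open yet
    esac
  done
}

# Example use on a node:
#   missing_ports "$(firewall-cmd --list-ports)" "8472/udp 8123/tcp 15008/tcp"
```

An empty result means every port in the required list is already open in the default zone.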

### Containerd

Additional workloads now run on the `GPU` node in parallel with the `Worker` node. If NOT using a centralized repository for all images, ensure that the following are loaded via `containerd` on the `GPU` node:

```text theme={null}
pilot*
proxyv2*
install-cni*
ztunnel*
kiali*
flannel*     # (only if new install)
k8s-device-plugin*
metrics-server*
neural*   # (all neural from docker_images master/node)
llm*
batch*
```
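
A quick way to spot images that are still missing on the `GPU` node is to compare the list above with the images `containerd` already holds. This sketch assumes the images live in the `k8s.io` containerd namespace, which is what Kubernetes normally uses; `missing_images` is a hypothetical helper that matches on the image name prefix.

```shell
# Read image references on stdin; print each name prefix that matches none of them.
missing_images() {
  images="$(cat)"
  for prefix in "$@"; do
    printf '%s\n' "$images" | grep -q -E "(^|/)${prefix}[^/]*\$" || echo "$prefix"
  done
}

# Example use on the GPU node:
#   ctr --namespace k8s.io images ls -q | missing_images pilot proxyv2 install-cni ztunnel kiali
```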

## 2024 Q2

### Hardware Requirements

Due to the inclusion of newer models by default, the hardware requirements have changed. [Resource Metrics](/kb/resource-metrics) reflects the additional deployments that need to be considered, and the following recommendations have been updated:

* Master node disk requirement has increased from 200 GB to 500 GB to accommodate additional container images, configuration, and logging.

* GPU node instance type updated from `g4dn.2xlarge (8 vCPUs, 32 GB RAM)` to `g4dn.8xlarge (32 vCPUs, 128 GB RAM)` to run the V4 neural services.

* Due to the added models, the hard disk space requirements have increased. See [Installation Requirements](/kb/install-system-amazon-linux-2023-or-rocky-8-9) for more information.

#### V4 Language Model Updates

As we introduce newer, more accurate language models into LILT, we’ve continually updated our hardware requirements. See [Language Models](/kb/v2-language-models) for the latest V4 model information. More information can be found in the Knowledge Base under Resource Requirements.

### Operating System Requirements

#### CentOS 7 → Rocky Linux 8

New software features of the LILT platform are incompatible with CentOS 7\[1]. All installations still on CentOS 7 should migrate before adopting this release.

Some previous LILT installations were done using CentOS 7, which reached End of Life (EOL) on June 30, 2024. The recommended base operating system, and the one tested in our QA environment, is Rocky Linux 8. Rocky Linux 8 provides a secure environment similar to CentOS 7, with an End of Life (EOL) date of May 2029.

#### Istio Module Support

##### Modules

The kernel modules required to support Istio have been updated for all installations. See the section on [kernel modules](/kb/install-system-amazon-linux-2023-or-rocky-8-9), which now includes the following modules to install:

```
overlay
br_netfilter
nf_nat
xt_REDIRECT
xt_owner
iptable_nat
iptable_mangle
iptable_filter
```

These will need to be installed on all existing nodes.
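
To see which of these modules are not yet loaded on a node, the list can be compared with the output of `lsmod`. This is a sketch under the assumption that the modules are built as loadable modules; anything compiled into the kernel will not appear in `lsmod` and would be reported as missing. `missing_modules` is a hypothetical helper.

```shell
# Read `lsmod` output on stdin; print each requested module that is not listed.
missing_modules() {
  loaded="$(awk 'NR > 1 { print $1 }')"   # skip the header row, keep module names
  for mod in "$@"; do
    printf '%s\n' "$loaded" | grep -q -x -- "$mod" || echo "$mod"
  done
}

# Example use on a node:
#   lsmod | missing_modules overlay br_netfilter nf_nat xt_REDIRECT xt_owner \
#     iptable_nat iptable_mangle iptable_filter
# Load a missing module with: modprobe <module>
```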

##### Ports

The firewall ports required to support Istio have been updated for all installations. See the section on [Firewall Settings](/kb/install-system-amazon-linux-2023-or-rocky-8-9), which now includes opening TCP ports `15000,15001,15006,15008,15009,15010,15012,15014,15017,15020,15021,15090,15443,20001` for Istio.

### Configuration Updates

#### Custom Domains

Configure new installations as described in [Set custom Domain and Certificates](/kb/set-custom-domain-and-certificates) and [Set Connectors Domain](/kb/connector-login-credentials).

#### Upgrading Process

The upgrading process, which involves Helm values files, has been updated for this release to make upgrading simpler in the future. See [Q2 2024 Updates](/kb/upgrade-system#UpgradeSystem-Q22024Updates) for more details.

#### MinIO Resize

In previous releases, the PersistentVolume for MinIO was set to 200GB. With the release of newer models, this is no longer large enough to support them. The default size has been updated from 200GB to 400GB; however, this will not automatically resize existing installations.

If your backing MinIO PersistentVolume is resizable, please resize it to 400GB. If it cannot be resized, the recommended procedure is as follows:

* Back up the MinIO data (if necessary)

* Delete the MinIO PersistentVolumeClaim and PersistentVolume

* Restart the new MinIO deployment

* Restore MinIO data (if necessary)
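
For the resizable case, the resize can be sketched with standard `kubectl` commands. Everything named here is an assumption: `pvc_resize_patch` is a hypothetical helper, `<namespace>` and `<minio-pvc>` are placeholders for your installation's values, and `400Gi` is one possible way to express the 400GB target as a Kubernetes quantity.

```shell
# Build the JSON patch body that requests a new PersistentVolumeClaim size.
pvc_resize_patch() {
  printf '{"spec":{"resources":{"requests":{"storage":"%s"}}}}\n' "$1"
}

# Check whether the storage class allows expansion, then request the new size:
#   kubectl get storageclass -o custom-columns=NAME:.metadata.name,EXPANDABLE:.allowVolumeExpansion
#   kubectl -n <namespace> patch pvc <minio-pvc> -p "$(pvc_resize_patch 400Gi)"
```

If the storage class does not allow expansion, the patch is rejected and the backup, delete, and restore procedure above applies instead.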

#### WPA Metrics

WPA metrics, as described in [Generate Evaluation Metrics (WPA, BLEU)](/kb/generate-evaluation-metrics-wpa-bleu), are now enabled by default.

#### SMTP Notifications

See the new page on configuring SMTP notifications: [SMTP Email Notifications](/kb/smtp-email-notifications)

#### Guide on how to handle GPU worker counts

As GPU processing has become increasingly critical in LILT’s models, we’ve added [Configuring GPU Worker Counts in LILT](/kb/configuring-gpu-worker-counts-in-lilt) to assist system administrators in configuring the LILT application for GPU use.

### Vulnerability (CVE) Scan Results

LILT has conducted thorough scans of all services and components to confirm that no components have High or Critical CVEs\[2]. Self-Hosted customers can find further details in the CVE Scan PDF provided with the release.

### Known Issues

#### MongoDB Upgrade

The latest MongoDB version has a known [issue](https://github.com/bitnami/charts/issues/27604) that may cause it to fall into a CrashLoop upon upgrading. If this occurs, the recommended fix is as follows:

* Back up the MongoDB data (if necessary)

* Delete the MongoDB PersistentVolumeClaim and PersistentVolume

* Restart the new MongoDB deployment

* Restore MongoDB data (if necessary)

#### Known CVE issues

| **Vulnerability Reference** | **Application**     | **Mitigation / Notes**                                                                          |
| --------------------------- | ------------------- | ----------------------------------------------------------------------------------------------- |
| CVE-2024-31580              | neural              | To be fixed in next release, requires upgrade of package that is used by internal dependencies. |
| CVE-2024-31583              | neural              | To be fixed in next release, requires upgrade of package that is used by internal dependencies. |
| CVE-2023-6378               | core-api            | To be fixed in next release, requires upgrade of package that is used by internal dependencies. |
| CVE-2023-6481               | core-api            | To be fixed in next release, requires upgrade of package that is used by internal dependencies. |
| CVE-2024-22257              | core-api            | To be fixed in next release, requires upgrade of package that is used by internal dependencies. |
| CVE-2016-1000027            | core-api            | To be fixed in next release, requires upgrade of package that is used by internal dependencies. |
| CVE-2024-22243              | core-api            | To be fixed in next release, requires upgrade of package that is used by internal dependencies. |
| CVE-2024-22259              | core-api            | To be fixed in next release, requires upgrade of package that is used by internal dependencies. |
| CVE-2024-22262              | core-api            | To be fixed in next release, requires upgrade of package that is used by internal dependencies. |
| CVE-2023-32697              | core-api            | To be fixed in next release, requires upgrade of package that is used by internal dependencies. |
| CVE-2022-1471               | core-api            | To be fixed in next release, requires upgrade of package that is used by internal dependencies. |
| CVE-2022-25857              | core-api            | To be fixed in next release, requires upgrade of package that is used by internal dependencies. |
| CVE-2024-21634              | core-api            | To be fixed in next release, requires upgrade of package that is used by internal dependencies. |
| GHSA-m425-mq94-257g         | localpv-provisioner | Vulnerability exists in latest version of this application. Waiting on a newer release to fix.  |
| CVE-2024-24790              | localpv-provisioner | Vulnerability exists in latest version of this application. Waiting on a newer release to fix.  |
| CVE-2023-39325              | localpv-provisioner | Vulnerability exists in latest version of this application. Waiting on a newer release to fix.  |
| CVE-2023-45283              | localpv-provisioner | Vulnerability exists in latest version of this application. Waiting on a newer release to fix.  |
| CVE-2023-45287              | localpv-provisioner | Vulnerability exists in latest version of this application. Waiting on a newer release to fix.  |
| CVE-2023-45288              | localpv-provisioner | Vulnerability exists in latest version of this application. Waiting on a newer release to fix.  |
| CVE-2023-1370               | elasticsearch       | Vulnerability exists in latest version of this application. Waiting on a newer release to fix.  |
| CVE-2021-40690              | elasticsearch       | Vulnerability exists in latest version of this application. Waiting on a newer release to fix.  |
| CVE-2022-1471               | elasticsearch       | Vulnerability exists in latest version of this application. Waiting on a newer release to fix.  |
| CVE-2024-41110              | istiod              | Vulnerability exists in latest version of this application. Waiting on a newer release to fix.  |

### Appendix

\[1] The Analytics dashboard relies on Istio, which uses some underlying code libraries in the operating system kernel that are not present in CentOS 7.

\[2] Fixed High and Critical CVEs refer to vulnerabilities for which fixes are available and do not break the integration of the application and its dependencies. It should be noted that vulnerabilities are continuously discovered, and new CVEs may have been identified but not yet addressed.
