Deployment via Rancher image fails to start when Volcano virtualized GPU (vGPU) is configured #10564

### Reminder

- [x] I have read the above rules and searched the existing issues.

### System Info

The container log shows: `sleep: error while loading shared libraries: /usr/lib/x86_64-linux-gnu/libcuda.so.1: file too short`
My YAML configuration is as follows; the key settings are marked in bold:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llamafactory
  namespace: xxx
  labels:
    workload.user.cattle.io/workloadselector: apps.deployment-jmai-llamafactory
spec:
  replicas: 1
  selector:
    matchLabels:
      workload.user.cattle.io/workloadselector: apps.deployment-jmai-llamafactory
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0       # ensure zero-downtime updates
    type: RollingUpdate
  template:
    metadata:
      labels:
        xxx
    spec:
      **schedulerName: volcano
      runtimeClassName: nvidia**
      terminationGracePeriodSeconds: 30
      containers:
        - name: llamafactory
          image: goharbor.jomoo.cn/llmos-ai/llamafactory:0.9.5
          imagePullPolicy: IfNotPresent
          command:
            - llamafactory-cli
            - webui
            - '--host'
            - '0.0.0.0'
            - '--port'
            - '7860'
          ports:
            - containerPort: 7860
              name: http
              protocol: TCP
          env:
            - name: NVIDIA_VISIBLE_DEVICES
              value: all
            - name: NVIDIA_DRIVER_CAPABILITIES
              value: compute,utility
            - name: NVIDIA_DISABLE_REQUIRE
              value: 'true'
            - name: LD_LIBRARY_PATH
              value: >-
                /usr/lib/wsl/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu
          resources:
            limits:
              cpu: '16'
              memory: 16000Mi
              **volcano.sh/vgpu-memory: '12288' 
              volcano.sh/vgpu-number: '1'**
            requests:
              cpu: '4'
              memory: 8000Mi
              **volcano.sh/vgpu-memory: '12288' 
              volcano.sh/vgpu-number: '1'**
          securityContext:
            privileged: false
          volumeMounts:
            - mountPath: /dev/shm
              name: dshm
      volumes:
        - name: dshm
          emptyDir:
            medium: Memory
            sizeLimit: 8Gi
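
For context on the error above: the dynamic loader reports `file too short` when the file it opens is smaller than an ELF header, which usually means `libcuda.so.1` inside the container is an empty or truncated placeholder rather than the real driver library. The sketch below is one way to check this from inside the pod. The function name `check_shared_lib` and its status strings are illustrative only, and it assumes Python is available in the image and a 64-byte ELF64 header (x86_64).

```python
import os
import sys

ELF_MAGIC = b"\x7fELF"
ELF64_HEADER_SIZE = 64  # size of Elf64_Ehdr on x86_64 (assumption: 64-bit image)


def check_shared_lib(path):
    """Classify a library path as 'missing', 'too short', 'not ELF', or 'ok'."""
    # isfile() follows symlinks, so a dangling libcuda.so.1 link counts as missing.
    if not os.path.isfile(path):
        return "missing"
    real = os.path.realpath(path)  # resolve e.g. libcuda.so.1 -> libcuda.so.<version>
    if os.path.getsize(real) < ELF64_HEADER_SIZE:
        # This is the condition the loader reports as "file too short".
        return "too short"
    with open(real, "rb") as f:
        if f.read(4) != ELF_MAGIC:
            return "not ELF"
    return "ok"


if __name__ == "__main__":
    default = "/usr/lib/x86_64-linux-gnu/libcuda.so.1"  # path taken from the log line
    target = sys.argv[1] if len(sys.argv) > 1 else default
    print(target, "->", check_shared_lib(target))
```

Running this in the failing container against the path from the log would show whether the library file is empty, which helps separate a driver-mount problem from a problem in the image itself.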


### Reproduction

```text
Put your message here.
```


### Others

_No response_
