Flashing Jetson Orin Nano in Virtualized Environments Link to heading

Introduction Link to heading

Flashing NVIDIA Jetson devices remotely presents unique challenges when the host machine is virtualized. This article documents the technical challenges, failures, and eventual success of flashing a Jetson Orin Nano Super developer kit using NVIDIA SDK Manager in various virtualized environments, specifically focusing on QEMU/KVM virtual machines and LXC containers on Proxmox VE.


The Constraint: Hypervisor-Only Infrastructure Link to heading

This project operated under a specific constraint: the only available x86_64 machines were homelab servers running Proxmox VE as bare-metal hypervisors. There was no x86 laptop available, and the primary workstation was an Apple M4 Mac, whose ARM64 architecture SDK Manager does not support.

Installing SDK Manager directly on the Proxmox host OS was explicitly ruled out for several reasons:

  1. Hypervisor Stability: The Proxmox hosts run critical infrastructure (Kubernetes clusters, Ceph storage, network services). Installing development tools and potentially conflicting dependencies directly on the hypervisor risks system stability.

  2. Dependency Conflicts: SDK Manager requires numerous dependencies (QEMU, specific Python versions, USB libraries) that could conflict with Proxmox’s carefully managed package versions.

  3. Clean Separation: Best practices dictate keeping hypervisor hosts minimal, with all workloads running in VMs or containers. This separation simplifies maintenance, updates, and disaster recovery.

  4. Repeatability: A solution confined to a VM or container can be easily replicated, backed up, and destroyed without affecting the host system.

This constraint made the flashing process significantly more complex, as it required finding a virtualization method that could reliably handle the Jetson’s USB communication requirements without installing anything on the Proxmox host beyond standard virtualization features.

Background: Jetson Flashing Requirements Link to heading

NVIDIA Jetson devices use a specialized flashing process that requires:

  1. USB Connection: The device must be connected in Force Recovery Mode (APX mode, USB ID 0955:7523)
  2. Initrd Flash Method: Modern Jetson devices boot a temporary Linux kernel over USB (0955:7035) that establishes USB networking
  3. USB Network Communication: The host system must establish network connectivity (typically 192.168.55.1 or IPv6 fc00:1:1::1) with the Jetson during the flash process
  4. SDK Manager: NVIDIA’s SDK Manager orchestrates the entire process, requiring specific kernel modules and capabilities

The initrd flash method is particularly sensitive to timing and USB device handling, making it challenging in virtualized environments.
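Before launching SDK Manager, it is worth confirming the board is actually enumerating in Force Recovery Mode. A minimal pre-flight sketch (the helper name is ours; it assumes `lsusb` from the usbutils package):

```shell
#!/bin/bash
# Hypothetical pre-flight check: succeed if a given USB vendor:product
# ID is currently enumerated on the host.
usb_id_present() {
    lsusb 2>/dev/null | grep -q "ID $1"
}

if usb_id_present "0955:7523"; then
    echo "Jetson detected in Force Recovery (APX) mode"
else
    echo "No Jetson in recovery mode; re-check the recovery jumper and cable"
fi
```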

| Method | USB Passthrough | Network Namespace | Timing Sensitivity | Result |
| --- | --- | --- | --- | --- |
| QEMU VM (device-level) | Emulated | VM-isolated | High latency | ❌ Failed (USB timeout) |
| LXC Container | Host devices | Host namespace | Near-native | ❌ Failed (network isolation) |
| QEMU VM (PCI-level) | Direct hardware | VM-isolated | Native | ✅ Success |

First Attempt: QEMU/KVM Virtual Machine with USB Passthrough Link to heading

Configuration Link to heading

Given the constraint of not having an x86 laptop, initial attempts used a QEMU/KVM virtual machine running Ubuntu 22.04 x86_64 on an Apple M4 Mac via UTM (a QEMU frontend for macOS). This approach allowed running SDK Manager on an emulated x86_64 system while connecting the Jetson device via USB passthrough configured through UTM’s USB settings.

While this satisfied the requirement of having an x86_64 environment without using the Proxmox hosts, it introduced additional virtualization overhead as the entire x86_64 instruction set was being emulated on ARM64 hardware.

Issues Encountered Link to heading

The flash process consistently failed during the USB communication phase with the error:

ERROR: might be timeout in USB write.

Root Cause Analysis Link to heading

QEMU/KVM’s USB passthrough implementation has known reliability issues with complex USB protocols. The Jetson’s initrd flash process requires:

  1. Rapid USB re-enumeration when switching between recovery mode and initrd mode
  2. High-throughput data transfer for writing the root filesystem
  3. Bidirectional USB network communication with strict timing requirements

Individual USB device passthrough in QEMU emulates USB at the device level, introducing latency and potential timing issues. The Jetson’s USB networking during initrd boot is particularly sensitive to these delays, causing the timeout errors.

Conclusion Link to heading

This approach was abandoned due to fundamental limitations in QEMU’s USB emulation layer. USB passthrough at the device level is insufficient for the Jetson flash process.

Second Attempt: LXC Container on Proxmox Link to heading

Rationale Link to heading

After the Mac-based VM approach failed, attention shifted to the Proxmox infrastructure. LXC containers provide near-native performance with minimal virtualization overhead compared to full VMs. Unlike running SDK Manager directly on the Proxmox host (which was ruled out for stability reasons), an LXC container offers:

  1. Isolation: Complete separation from the host OS with its own filesystem and process space
  2. Near-Native Performance: Containers share the host kernel, eliminating instruction emulation overhead
  3. Easy Management: Containers can be created, destroyed, and backed up without affecting the host
  4. USB Access: Proxmox supports passing USB devices to containers via cgroup device permissions

The hypothesis was that an LXC container with proper USB device access would provide the necessary USB timing characteristics while maintaining the clean separation requirement.

Configuration Progression Link to heading

The LXC container (ID 106, Ubuntu 22.04) required extensive configuration on the Proxmox host (/etc/pve/lxc/106.conf):

# Enable mknod capability for creating device nodes
features: nesting=1,mknod=1

# USB device passthrough (Bus 003)
lxc.cgroup2.devices.allow: c 189:* rwm
lxc.mount.entry: /dev/bus/usb/003 dev/bus/usb/003 none bind,optional,create=dir 0 0

# Loop device access for mounting disk images
lxc.cgroup2.devices.allow: b 7:* rwm
lxc.mount.entry: /dev/loop0 dev/loop0 none bind,optional,create=file 0 0
lxc.mount.entry: /dev/loop1 dev/loop1 none bind,optional,create=file 0 0
# ... (loop2-7)
lxc.mount.entry: /dev/loop-control dev/loop-control none bind,optional,create=file 0 0
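With the container restarted, the bind mounts can be sanity-checked from inside it before attempting a flash. A small sketch (run inside the container; the paths mirror the config above, and the helper name is ours):

```shell
#!/bin/bash
# Report whether the device nodes bound into the container are visible.
check_devices() {
    local path
    for path in "$@"; do
        if [ -e "$path" ]; then
            echo "ok: $path"
        else
            echo "missing: $path (re-check /etc/pve/lxc/106.conf, then restart the CT)"
        fi
    done
}

check_devices /dev/bus/usb/003 /dev/loop-control /dev/loop0
```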

Issues Encountered and Resolutions Link to heading

1. mknod Permission Errors Link to heading

Error: mknod: .../rootfs/dev/random: Operation not permitted

Cause: LXC containers lack CAP_MKNOD capability by default, required by L4T flash scripts to create device nodes in the rootfs.

Solution: Enable mknod=1 feature on the Proxmox host:

pct set 106 -features nesting=1,mknod=1

2. ARM64 Binary Execution Link to heading

Error: chroot: failed to run command 'dpkg': Exec format error

Cause: The L4T rootfs contains ARM64 binaries that cannot execute on x86_64 without emulation.

Solution: Install and enable qemu-user-static and binfmt-support on the Proxmox host (not the container):

apt-get install qemu-user-static binfmt-support
update-binfmts --enable qemu-aarch64
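Whether the handler actually registered can be verified through binfmt_misc before kicking off a flash. A sketch (the helper name is ours; it assumes the kernel exposes /proc/sys/fs/binfmt_misc):

```shell
#!/bin/bash
# Succeed if the named binfmt handler file exists and reports "enabled".
binfmt_active() {
    [ -f "$1" ] && grep -q '^enabled' "$1"
}

if binfmt_active /proc/sys/fs/binfmt_misc/qemu-aarch64; then
    echo "qemu-aarch64 handler active; ARM64 chroot steps should work"
else
    echo "handler missing; install qemu-user-static and binfmt-support on the host"
fi
```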

3. Loop Device Access Link to heading

Error: losetup: cannot find an unused loop device

Cause: The L4T flash scripts use loop devices to mount disk images. LXC containers don’t have loop device access by default.

Solution: Add loop device permissions and mount entries to the container configuration.

4. USB Networking Failure Link to heading

Error: Device failed to boot to the initrd flash kernel

Cause: This was the most complex issue. When the Jetson boots into initrd mode (0955:7035), it creates a USB network interface (enx* or usb0). However, in LXC containers, this interface appeared in the host’s network namespace, not the container’s namespace.

Attempted Solution:

  1. Loaded USB networking kernel modules on the Proxmox host:
modprobe rndis_host cdc_ether cdc_ncm cdc_subset
echo "rndis_host" >> /etc/modules
echo "cdc_ether" >> /etc/modules
echo "cdc_ncm" >> /etc/modules
echo "cdc_subset" >> /etc/modules
  2. Created udev rules to automatically move USB network interfaces to the container:
# /etc/udev/rules.d/99-jetson-usb-network.rules
ACTION=="add", SUBSYSTEM=="net", KERNEL=="enx*", RUN+="/usr/local/bin/handle-jetson-usb-network.sh %k"
  3. Created handler script to move interfaces into container namespace:
#!/bin/bash
# Move a hotplugged Jetson USB network interface into the container's
# network namespace, addressed by the host-side PID of the CT's init.
INTERFACE="$1"
CONTAINER_ID=106
CONTAINER_PID=$(lxc-info -n "$CONTAINER_ID" -p -H)
ip link set "$INTERFACE" netns "$CONTAINER_PID"
pct exec "$CONTAINER_ID" -- ip link set dev "$INTERFACE" up
pct exec "$CONTAINER_ID" -- dhclient "$INTERFACE"

Fundamental LXC Limitation Link to heading

Despite all configuration efforts, the LXC container could not properly handle USB network interfaces due to network namespace isolation. LXC containers have separate network namespaces from the host, and moving USB network interfaces between namespaces proved unreliable and often failed to establish proper connectivity.

The initrd flash process has strict timing requirements:

  1. Jetson boots into initrd mode
  2. USB network interface must appear and be configured within seconds
  3. SSH connection must establish for flash commands

Even when the interface was successfully moved to the container’s namespace, DHCP configuration often failed, causing the flash process to timeout.

Conclusion Link to heading

LXC containers, despite their near-native performance, have fundamental limitations for this use case due to network namespace isolation. USB networking devices created dynamically during the flash process cannot be reliably handled.

Final Solution: Proxmox VM with PCI USB Controller Passthrough Link to heading

Architecture Change Link to heading

With both the Mac-based VM (due to QEMU USB emulation issues) and the LXC container (due to network namespace isolation) ruled out, the final approach combined the best aspects of both previous attempts while working within the Proxmox infrastructure constraint:

  1. Use a VM (not a container) to provide proper network namespace isolation for USB networking
  2. Pass through the entire USB controller at the PCI level (not individual USB devices) to eliminate emulation overhead and any potential timing issues
  3. Keep the host OS clean by running SDK Manager only within the disposable VM

This approach leverages Proxmox’s PCI passthrough capability—a feature designed exactly for scenarios where VMs need direct hardware access without installing drivers or tools on the hypervisor host.

Implementation Link to heading

1. Identify USB Controller Link to heading

# Find which USB bus the Jetson (recovery mode, 0955:7523) is on
lsusb -d 0955:7523
# Inspect the bus/port topology
lsusb -t

# Map USB buses to PCI addresses
for bus in {1..8}; do
    pci=$(readlink /sys/bus/usb/devices/usb$bus 2>/dev/null | grep -oE '[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9]')
    echo "USB Bus $bus → PCI $pci"
done

Result: Jetson on Bus 4, controlled by PCI device 0000:22:00.3

Verification that no other critical devices shared this controller:

lsusb | grep "Bus 003"  # Empty except root hub
lsusb | grep "Bus 004"  # Only Jetson device

2. Create VM with PCI Passthrough Link to heading

# Create VM
qm create 200 --name jetson-flash --memory 4096 --cores 4 \
    --net0 virtio,bridge=vmbr0 --scsihw virtio-scsi-pci

# Set machine type to q35 (required for PCIe passthrough)
qm set 200 --machine q35

# Import Ubuntu cloud image
qm importdisk 200 ubuntu-22.04-server-cloudimg-amd64.img local-lvm

# Configure disk and cloud-init
qm set 200 --scsi0 local-lvm:vm-200-disk-0 --boot order=scsi0 \
    --ide2 local-lvm:cloudinit

# Configure cloud-init
qm set 200 --ciuser sdkmanager --cipassword sdkmanager \
    --ipconfig0 ip=dhcp --sshkeys ~/.ssh/authorized_keys

# Add PCI passthrough for USB controller
qm set 200 --hostpci0 0000:22:00.3,pcie=1

# Resize disk for JetPack installation
qm resize 200 scsi0 +30G

# Start VM
qm start 200
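Once the VM boots, the passed-through controller should show up in the guest as an ordinary PCI device (at a guest-side address that differs from the host's 0000:22:00.3). A quick sketch to confirm, with the parsing split into a helper (our name) so it can be exercised without hardware:

```shell
#!/bin/bash
# Succeed if lspci-style output on stdin lists a USB controller.
has_usb_controller() {
    grep -qi "usb controller"
}

if lspci 2>/dev/null | has_usb_controller; then
    echo "USB controller visible in guest; xhci_hcd should bind automatically"
else
    echo "no USB controller visible; re-check the hostpci0 line on the host"
fi
```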

3. Critical: USB Networking Kernel Modules Link to heading

The Ubuntu cloud image does not include USB networking kernel modules by default. This is critical because when the Jetson boots into initrd mode, it requires the host to have these modules loaded immediately.

Solution: Install and load modules before starting the flash:

# Install extra kernel modules
apt-get install linux-modules-extra-$(uname -r)

# Load USB networking modules
modprobe rndis_host
modprobe cdc_ether
modprobe cdc_ncm
modprobe cdc_subset

# Verify modules loaded
lsmod | grep -E 'rndis|cdc'

When the Jetson transitions to initrd mode (0955:7035), the USB network interface (usb0) now appears immediately in the VM’s network namespace.
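Since modprobe does not survive a reboot, it is also worth persisting the module list via /etc/modules-load.d so a restarted VM stays flash-ready. A sketch (the file name and helper are our choices):

```shell
#!/bin/bash
# Write the USB networking modules to a modules-load.d style file so
# they are loaded automatically on every boot.
persist_usb_net_modules() {
    printf '%s\n' rndis_host cdc_ether cdc_ncm cdc_subset > "$1"
}

# On the flashing VM (requires root):
if [ -w /etc/modules-load.d ]; then
    persist_usb_net_modules /etc/modules-load.d/jetson-usb-net.conf
fi
```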

4. Network Configuration Link to heading

The Jetson’s initrd uses IPv6 for USB networking by default:

# Interface appears automatically
ip addr show usb0
# Output:
# usb0: inet6 fc00:1:1::1/64 scope global

# Test connectivity
ping6 -c 3 fc00:1:1:0::2  # Jetson's IPv6 address

The SDK Manager automatically detects and uses IPv6 connectivity for SSH and flash operations.
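Because the flash times out if the interface is slow to come up, a small polling helper can gate the flash step on usb0 actually appearing. A sketch using iproute2 (the function name and timeout are ours):

```shell
#!/bin/bash
# Wait up to a deadline (seconds) for a network interface to appear,
# e.g. usb0 when the Jetson transitions into initrd mode.
wait_for_iface() {
    local iface="$1" timeout="${2:-30}" elapsed=0
    while ! ip link show "$iface" >/dev/null 2>&1; do
        [ "$elapsed" -ge "$timeout" ] && return 1
        sleep 1
        elapsed=$((elapsed + 1))
    done
}

# Example gate before starting the flash step:
# wait_for_iface usb0 30 && echo "usb0 up; Jetson reachable over fc00:1:1::/64"
```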

Flash Process Timeline Link to heading

  1. 07:49:38 - Flash component started, 30-second pre-check wait
  2. 07:50:12 - Board detected as jetson-orin-nano-devkit-super
  3. 07:51:25 - System image created (rootfs populated)
  4. 07:52:24 - Converting to sparse image format
  5. 07:54:54 - Device rebooted into initrd mode (0955:7035)
  6. 07:55:05 - USB network interface usb0 appeared immediately
  7. 07:55:16 - SSH connection established via IPv6
  8. 07:55:16-07:59:05 - QSPI flash (boot firmware) written
  9. 07:59:05-08:00:28 - eMMC flash (system partitions) written
  10. 08:00:28 - Flash successful, device rebooted to normal mode (0955:7020)
  11. 08:00:28-08:02:58 - First-boot auto-configuration
  12. 08:03:00 - Installation completed successfully

Total flash time: ~13 minutes

Why PCI Passthrough Succeeded Link to heading

  1. Direct Hardware Access: The VM has complete control over the USB controller with no emulation layer
  2. Timing Precision: USB protocol timing is maintained at hardware level
  3. Network Namespace: The VM’s network stack directly handles USB network interfaces
  4. No Virtualization Overhead: USB transactions happen at native speed

Key Lessons Learned Link to heading

  1. USB Device Passthrough vs Controller Passthrough: Passing through individual USB devices adds emulation overhead. PCI-level controller passthrough provides native hardware access.

  2. LXC Network Namespace Limitations: LXC containers cannot reliably handle dynamically created USB network interfaces due to network namespace isolation. Even with udev rules to move interfaces, timing and configuration issues persist.

  3. Kernel Module Requirements: USB networking kernel modules must be loaded before the Jetson enters initrd mode. Cloud images and minimal installations often lack these modules.

  4. IPv6 Support: Modern Jetson initrd images prefer IPv6 for USB networking. Ensure the host system has IPv6 enabled and properly configured.

  5. Timing Sensitivity: The Jetson’s initrd flash process has strict timing requirements. USB network interfaces must appear and be configured within seconds of the mode transition.

  6. PCI Passthrough Machine Type: Proxmox’s pcie=1 option requires the q35 machine type; the default i440fx machine type exposes only legacy PCI slots and does not support PCIe passthrough.

Recommendations Link to heading

For flashing Jetson devices in production or automated environments:

  1. Use PCI USB Controller Passthrough: If virtualization is required, pass through the entire USB controller at the PCI level to a VM.

  2. Pre-load USB Networking Modules: Ensure rndis_host, cdc_ether, cdc_ncm, and cdc_subset kernel modules are loaded before starting the flash process.

  3. Verify USB Controller Isolation: Before passthrough, ensure no other critical devices share the USB controller.

  4. Use Physical Machines When Possible: For development and testing, a physical Linux machine provides the most reliable flashing experience.

  5. Monitor USB Device Transitions: Use lsusb and dmesg to monitor device state transitions:

    • 0955:7523 = Recovery mode (APX)
    • 0955:7035 = Initrd flash mode
    • 0955:7020 = Normal operation mode
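The transitions above can be watched live from a second terminal. A sketch with the ID-to-mode mapping factored into a helper (names are ours); the polling loop is left commented out since it runs until interrupted:

```shell
#!/bin/bash
# Map a Jetson USB product ID to a human-readable mode name.
jetson_mode() {
    case "$1" in
        0955:7523) echo "recovery (APX)" ;;
        0955:7035) echo "initrd flash" ;;
        0955:7020) echo "normal operation" ;;
        *)         echo "unknown" ;;
    esac
}

# Poll once a second during a flash (Ctrl-C to stop):
# while true; do
#     id=$(lsusb 2>/dev/null | grep -oE '0955:[0-9a-f]{4}' | head -n1)
#     [ -n "$id" ] && echo "$(date +%T)  $id  $(jetson_mode "$id")"
#     sleep 1
# done
```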

References Link to heading