Jetson TK1

NVIDIA Jetson TK1

jetson-tk1-2018-04-18-001.jpg
# lsusb
Bus 002 Device 003: ID 046d:c52b Logitech, Inc. Unifying Receiver
Bus 002 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 003: ID 046d:0a1d Logitech, Inc. 
Bus 001 Device 007: ID 0955:7140 NVidia Corp. 
Bus 001 Device 002: ID 8087:0020 Intel Corp. Integrated Rate Matching Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
#
jetson-tk1-2018-04-18-002.jpg

After re-installing the factory image, the board runs Linux for Tegra (L4T) release 21.5:

# R21 (release), REVISION: 5.0, GCID: 7273100, BOARD: ardbeg, EABI: hard, DATE: Wed Jun  8 04:19:09 UTC 2016

Installing CUDA packages

$ wget http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1404/armhf/cuda-repo-ubuntu1404_6.5-14_armhf.deb
$ sudo dpkg --install cuda-repo-ubuntu1404_6.5-14_armhf.deb
$ cat /var/lib/apt/lists/*cuda*Packages | grep "Package:"

Package: cuda-6-5
Package: cuda-command-line-tools-6-5
Package: nvidia-modprobe
Package: cuda-core-6-5
Package: cuda-cublas-6-5
Package: cuda-cublas-dev-6-5
Package: cuda-cudart-6-5
Package: cuda-cudart-dev-6-5
Package: cuda-cufft-6-5
Package: cuda-cufft-dev-6-5
Package: cuda-curand-6-5
Package: cuda-curand-dev-6-5
Package: cuda-cusparse-6-5
Package: cuda-cusparse-dev-6-5
Package: cuda-documentation-6-5
Package: cuda-driver-dev-6-5
Package: cuda-drivers
Package: cuda-license-6-5
Package: cuda-minimal-build-6-5
Package: cuda-misc-headers-6-5
Package: cuda-npp-6-5
Package: cuda-npp-dev-6-5
Package: cuda-repo-ubuntu1404
Package: cuda-runtime-6-5
Package: cuda-samples-6-5
Package: cuda-toolkit-6-5
Package: cuda
Package: libcuda1-340
Package: nvidia-340-dev
Package: nvidia-340-uvm
Package: nvidia-340
Package: nvidia-settings
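
The listing above can be condensed into a sorted, de-duplicated list of package names. This is a minimal sketch; `list_packages` is a hypothetical helper name, and the commented `apt-get` lines assume the usual flow of refreshing the index and then installing one of the listed packages.

```shell
# list_packages FILE...
# Print the unique package names found in one or more apt "Packages" index files.
list_packages() {
    # -h suppresses file names when several index files are given.
    grep -h '^Package:' "$@" | awk '{ print $2 }' | sort -u
}

# Example, using the same glob as the session above:
#   list_packages /var/lib/apt/lists/*cuda*Packages
#
# Assumed next step (not shown in the notes above):
#   sudo apt-get update
#   sudo apt-get install cuda-toolkit-6-5
```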

The cluster. I have decided to use only four units, which together will draw about 20 A at full GPU load.
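
As a rough check on the power budget, the figure can be broken down per board. This sketch assumes the 20 A is split evenly across the four units and that each board is fed from a 12 V supply; the voltage is an assumption, not something stated in these notes. The helper names are hypothetical.

```shell
# per_unit_amps TOTAL_AMPS UNITS  -> amps drawn by each board
per_unit_amps() {
    awk -v a="$1" -v n="$2" 'BEGIN { printf "%.1f\n", a / n }'
}

# total_watts TOTAL_AMPS VOLTS  -> total power in watts
total_watts() {
    awk -v a="$1" -v v="$2" 'BEGIN { printf "%.0f\n", a * v }'
}

# Example:
#   per_unit_amps 20 4    # 5.0 A per board
#   total_watts 20 12     # 240 W, assuming a 12 V supply
```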

2018-04-30-Jetson-Cluster.jpg

This is probably just a temporary working configuration, but I needed a gigabit Ethernet switch and some USB hubs (keyboard, mouse, etc.).

2018-05-01-jetson-tk1-001.jpg

This is the latest design.

2018-05-09-jetson-tk1-01.jpg

2018-05-21

I compiled the John the Ripper (JtR) code and that went well. Everything seems to be working.

ubuntu@tegra-ubuntu:~/JohnTheRipper-CUDA/src$ ../run/john --list=cuda-devices
CUDA runtime 6.5, driver 6.5 - 1 CUDA device found:

CUDA Device #0
    Name:                          GK20A
    Type:                          integrated
    Compute capability:            3.2 (sm_32)
    Number of stream processors:   192 (1 x 192)
    Clock rate:                    852 Mhz
    Memory clock rate (peak)       924 Mhz
    Memory bus width               64 bits
    Peak memory bandwidth:         14 GB/s
    Total global memory:           1.0 GB
    Total shared memory per block: 48.0 KB
    Total constant memory:         64.0 KB
    L2 cache size                  128.1 KB
    Kernel execution timeout:      No
    Concurrent copy and execution: One direction
    Concurrent kernels support:    Yes
    Warp size:                     32
    Max. GPRs/thread block         32768
    Max. threads per block         1024
    Max. resident threads per MP   2048
    PCI device topology:           00:00.0

ubuntu@tegra-ubuntu:~/JohnTheRipper-CUDA/src$ ../run/john --list=formats --format=cuda
md5crypt-cuda, sha256crypt-cuda, sha512crypt-cuda, mscash-cuda, mscash2-cuda, 
phpass-cuda, pwsafe-cuda, Raw-SHA512-cuda, wpapsk-cuda, xsha512-cuda, 
Raw-SHA224-cuda, Raw-SHA256-cuda
2018-05-21-jetson-tk1-01.jpg

Sample run on some test data

ubuntu@tegra-ubuntu:~/JohnTheRipper-CUDA/run$ ./john --session=unixmd5 --format=md5crypt-cuda ./unix.txt
Using default input encoding: UTF-8
Loaded 408 password hashes with 408 different salts (md5crypt-cuda, crypt(3) $1$ [MD5 CUDA])
Remaining 405 password hashes with 405 different salts
Press 'q' or Ctrl-C to abort, almost any other key for status
0g 0:00:00:18 17.35% 1/3 (ETA: 07:20:10) 0g/s 6707p/s 6707c/s 6707C/s sputz99999+..Sputz99999c

I compiled and ran john in standalone (single-node) mode and everything went well. When I tried to execute the binaries via mpirun, it spewed errors about not finding a library. I updated /etc/ld.so.conf.d/nvidia-tegra.conf to add this:

/usr/lib/arm-linux-gnueabihf/tegra
# Len added below
/usr/local/cuda-6.5/targets/armv7-linux-gnueabihf/lib

When I ran ldconfig, it produced an error!

root@gpu02:/etc/ld.so.conf.d# ldconfig
/sbin/ldconfig.real: /usr/local/cuda-6.5/targets/armv7-linux-gnueabihf/lib/libcudnn.so.6.5 is not a symbolic link

root@gpu02:/etc/ld.so.conf.d# cd /usr/local/cuda-6.5/targets/armv7-linux-gnueabihf/lib
root@gpu02:/usr/local/cuda-6.5/targets/armv7-linux-gnueabihf/lib# ls -l *cudnn*
-rwxr-xr-x 1 root root 8978224 Apr 26 21:49 libcudnn.so
-rwxr-xr-x 1 root root 8978224 Apr 26 21:49 libcudnn.so.6.5
-rwxr-xr-x 1 root root 8978224 Apr 26 21:49 libcudnn.so.6.5.48
-rwxr-xr-x 1 root root 9308614 Apr 26 21:49 libcudnn_static.a
root@gpu02:/usr/local/cuda-6.5/targets/armv7-linux-gnueabihf/lib# rm libcudnn.so libcudnn.so.6.5
root@gpu02:/usr/local/cuda-6.5/targets/armv7-linux-gnueabihf/lib# ln -s libcudnn.so.6.5.48 libcudnn.so.6.5
root@gpu02:/usr/local/cuda-6.5/targets/armv7-linux-gnueabihf/lib# ln -s libcudnn.so.6.5.48 libcudnn.so
root@gpu02:/usr/local/cuda-6.5/targets/armv7-linux-gnueabihf/lib# ls -l *cudnn*
lrwxrwxrwx 1 root root      18 May 25 01:02 libcudnn.so -> libcudnn.so.6.5.48
lrwxrwxrwx 1 root root      18 May 25 01:02 libcudnn.so.6.5 -> libcudnn.so.6.5.48
-rwxr-xr-x 1 root root 8978224 Apr 26 21:49 libcudnn.so.6.5.48
-rwxr-xr-x 1 root root 9308614 Apr 26 21:49 libcudnn_static.a
root@gpu02:/usr/local/cuda-6.5/targets/armv7-linux-gnueabihf/lib# ldconfig
root@gpu02:/usr/local/cuda-6.5/targets/armv7-linux-gnueabihf/lib#

No errors!

This solved the issue with not being able to load the shared library.
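
The manual repair above (deleting the regular-file copies and recreating them as symlinks to the fully versioned library) can be sketched as a small reusable function. `relink_soname` is a hypothetical helper name; it only mirrors the rm / ln -s steps from the session and leaves existing symlinks alone.

```shell
# relink_soname DIR REAL_FILE LINK_NAME...
# Replace regular-file copies of a shared library with symlinks to REAL_FILE.
relink_soname() {
    dir=$1
    real=$2
    shift 2
    if [ ! -f "$dir/$real" ]; then
        echo "relink_soname: $dir/$real not found" >&2
        return 1
    fi
    for name in "$@"; do
        # Already a symlink: nothing to fix.
        [ -L "$dir/$name" ] && continue
        rm -f -- "$dir/$name"
        ln -s -- "$real" "$dir/$name"
    done
}

# Example, matching the session above (run as root, then re-run ldconfig):
#   relink_soname /usr/local/cuda-6.5/targets/armv7-linux-gnueabihf/lib \
#       libcudnn.so.6.5.48 libcudnn.so.6.5 libcudnn.so
#   ldconfig
```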

SHA-512

mpirun: Forwarding signal 10 to job
4 0g 0:22:35:24 41.23% 2/3 (ETA: 2018-05-28 07:21) 0g/s 709.8p/s 709.8c/s 709.8C/s jmllrs...kndlrtjs.
1 0g 0:22:34:23 29.04% 2/3 (ETA: 2018-05-29 06:19) 0g/s 828.8p/s 828.8c/s 828.8C/s giletrak4..enisorek4
2 0g 0:22:35:48 36.90% 2/3 (ETA: 2018-05-28 13:48) 0g/s 827.6p/s 827.6c/s 827.6C/s charking?..chromophobic?
3 0g 0:22:50:38 59.98% 2/3 (ETA: 20:24:02) 0g/s 813.8p/s 813.8c/s 813.8C/s Vararonos3..Velocipedeses3
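
Each status line above reports one MPI node, so the cluster-wide rate is the sum of the four p/s figures (roughly 3,180 p/s for the lines shown). A minimal sketch for adding them up from pasted status output follows; `sum_candidate_rate` is a hypothetical helper, and it assumes plain numeric rates with no K or M suffix.

```shell
# sum_candidate_rate
# Read john status lines on stdin and print the total of all "<number>p/s" fields.
sum_candidate_rate() {
    awk '{
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^[0-9.]+p\/s$/) {
                sub(/p\/s$/, "", $i)   # strip the unit, keep the number
                total += $i
            }
        }
    } END { printf "%.1f\n", total }'
}

# Example:
#   sum_candidate_rate < status.txt
```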

2018-06-04

I will have to rebuild the kernel to get all my wifi boards working, as the L4T 21.5 kernel seems to have been built without any wifi drivers included.

First I need to identify all the pieces in order to get a working build server up.

1. The NVIDIA resource for the kernel sources is https://developer.nvidia.com/linux-tegra-r215 (Kernel sources)

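apt-add-repository universe
apt-get update
apt-get install libncurses5-dev pkg-config -y
cd /usr/src
wget https://developer.download.nvidia.com/embedded/L4T/r21_Release_v5.0/source/kernel_src.tbz2
# Unpack the sources; the archive is expected to extract to ./kernel
tar xjf kernel_src.tbz2
cd kernel
# Start from the running kernel's configuration
zcat /proc/config.gz > .config
make menuconfig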

I have several different wireless cards in my Jetson TK1 boards, so I will build a kernel that supports all of them.

1. Realtek RTL8821AE in gpu02 host
2. Intel Corporation Wireless 7260 (rev bb) in gpu03 host
3. Qualcomm Atheros AR9285 Wireless Network Adapter (PCI-Express) (rev 01) in gpu04 host

root@gpu02:~# lspci
00:00.0 PCI bridge: NVIDIA Corporation TegraK1 PCIe x4 Bridge (rev a1)
01:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8821AE 802.11ac PCIe Wireless Network Adapter
02:00.0 PCI bridge: NVIDIA Corporation TegraK1 PCIe x1 Bridge (rev a1)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
root@gpu02:~# 
root@gpu03:~# lspci
00:00.0 PCI bridge: NVIDIA Corporation TegraK1 PCIe x4 Bridge (rev a1)
01:00.0 Network controller: Intel Corporation Wireless 7260 (rev bb)
02:00.0 PCI bridge: NVIDIA Corporation TegraK1 PCIe x1 Bridge (rev a1)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
root@gpu03:~# 
root@gpu04:~# lspci
00:00.0 PCI bridge: NVIDIA Corporation TegraK1 PCIe x4 Bridge (rev a1)
01:00.0 Network controller: Qualcomm Atheros AR9285 Wireless Network Adapter (PCI-Express) (rev 01)
02:00.0 PCI bridge: NVIDIA Corporation TegraK1 PCIe x1 Bridge (rev a1)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)
root@gpu04:~#

It seems that NVIDIA didn't have the RTL8821AE driver in the kernel source code. I am attempting to graft it in using https://mirrors.edge.kernel.org/pub/linux/kernel/projects/backports/stable/v3.10.4/ and installing the backports-3.10.4-1.tar.gz sources.
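
To see which wifi drivers a given kernel .config actually enables, a small check like the following can help. This is a sketch: `config_state` is a hypothetical helper, and the symbol names in the example (CONFIG_ATH9K, CONFIG_IWLWIFI, CONFIG_RTL8821AE) are the mainline Kconfig names, which may not all exist in this 3.10-based tree.

```shell
# config_state CONFIG_FILE SYMBOL
# Print "y" (built in), "m" (module), or "n" (unset or absent) for SYMBOL.
config_state() {
    val=$(sed -n "s/^$2=\([ym]\)\$/\1/p" "$1" | head -n 1)
    echo "${val:-n}"
}

# Example, run from the kernel source directory:
#   for sym in CONFIG_ATH9K CONFIG_IWLWIFI CONFIG_RTL8821AE; do
#       printf '%s=%s\n' "$sym" "$(config_state .config "$sym")"
#   done
```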

2018-06-07

I gave up on the RTL8821AE and instead got the Atheros AR9285 working. I had to get the driver from the backports distribution for this kernel version and build it.

My next attempt will be the Intel 7260 card. I will do this by copying the zImage and the modules over in a tar file and installing them. Hopefully, with all the wifi modules enabled in my new kernel, I can just do a binary install on all four GPU nodes. I need to ditch the Realtek card and get either a couple more Intel 7260 cards or more Atheros cards. I have spent a lot of time trying to produce a kernel that recognizes most PCIe wifi cards. We shall see!
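
The tar-file approach could look roughly like this. It is a sketch under assumptions: the kernel image is taken to live at boot/zImage and the modules at lib/modules/<release> under the given root, `pack_kernel` is a hypothetical helper, and the release string 3.10.40-wifi in the example is made up for illustration.

```shell
# pack_kernel ROOT KERNEL_RELEASE OUT_TARBALL
# Bundle ROOT/boot/zImage and ROOT/lib/modules/KERNEL_RELEASE into one tarball
# with paths relative to ROOT, so it can be unpacked over / on another node.
pack_kernel() {
    tar -czf "$3" -C "$1" boot/zImage "lib/modules/$2"
}

# Example, on the build node:
#   pack_kernel / 3.10.40-wifi /tmp/kernel-wifi.tar.gz
# On each target node (as root), after copying the tarball over:
#   tar -xzf /tmp/kernel-wifi.tar.gz -C /
#   depmod -a 3.10.40-wifi
```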

I was able to get the Intel 7260 working!

I am going to get the RTL8821AE working somehow. We will see!

2018-06-08

2018-06-08-001-jetson.jpg

I have disassembled the cluster in order to troubleshoot the PCIe wifi card issues. I have given up on the Realtek 8821. I ordered a few Atheros AR9285 cards and a few wire antennae. I am running JtR on it even now because I am working on cracking a new password. It looks like one port on the gigabit Ethernet switch has died. I have not had time to investigate yet. The activity light never illuminates now and the port is down.

An important link to keep track of is the Tegra 21.5 main page at NVIDIA: https://developer.nvidia.com/linux-tegra-r215

Atheros fix for disconnects

(as root)

echo "options ath9k nohwcrypt=1" | sudo tee /etc/modprobe.d/ath9k.conf
modprobe -rfv ath9k
modprobe -v ath9k
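
After reloading the module, it may be worth confirming that the option took effect. The sketch below reads the parameter back from sysfs; `module_param` is a hypothetical helper, and it assumes the driver exposes nohwcrypt as a readable parameter under /sys/module/ath9k/parameters.

```shell
# module_param MODULE PARAM [SYSFS_ROOT]
# Print the current value of a loaded module's parameter.
module_param() {
    cat "${3:-/sys}/module/$1/parameters/$2"
}

# Example:
#   module_param ath9k nohwcrypt    # expect 1 once the option is applied
```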

Crank up the CPU

# To obtain full performance on the CPU (at the cost of serious power draw),
# this will disable CPU scaling and force the 4 CPU cores to
# always run at max performance until reboot:

echo 0 > /sys/devices/system/cpu/cpuquiet/tegra_cpuquiet/enable
echo 1 > /sys/devices/system/cpu/cpu0/online
echo 1 > /sys/devices/system/cpu/cpu1/online
echo 1 > /sys/devices/system/cpu/cpu2/online
echo 1 > /sys/devices/system/cpu/cpu3/online
echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
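
The commands above set the governor through cpu0 only. A variant that writes the governor for every core that exposes one, and can be pointed at a different sysfs root for a dry run, might look like this. It is a sketch rather than part of the original procedure; `set_governor` and `show_governors` are hypothetical helper names.

```shell
# set_governor GOVERNOR [CPU_ROOT]
# Write GOVERNOR to every writable cpuN/cpufreq/scaling_governor under CPU_ROOT.
set_governor() {
    root=${2:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip cores without a cpufreq node (or an unmatched glob).
        [ -w "$f" ] || continue
        echo "$1" > "$f"
    done
}

# show_governors [CPU_ROOT]
# Print each governor file together with its current value.
show_governors() {
    root=${1:-/sys/devices/system/cpu}
    grep . "$root"/cpu[0-9]*/cpufreq/scaling_governor
}

# Example (as root):
#   set_governor performance
#   show_governors
```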

2018-06-24

I don't log in and use the GUI, so I decided to save about 0.5 GB of RAM by disabling it. I just renamed lightdm.conf in /etc/init.

i.e. mv /etc/init/lightdm.conf /etc/init/lightdm.conf.disabled
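
An alternative that leaves the packaged job file untouched is an Upstart override file containing the `manual` stanza, which stops the job from starting automatically. This is a sketch of that alternative, not the method used above; `disable_upstart_job` is a hypothetical helper name.

```shell
# disable_upstart_job JOB [INIT_DIR]
# Create JOB.override with "manual" so Upstart no longer auto-starts the job.
disable_upstart_job() {
    echo manual > "${2:-/etc/init}/$1.override"
}

# Example (as root):
#   disable_upstart_job lightdm
# To re-enable the GUI later, remove /etc/init/lightdm.override.
```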