Commit | Line | Data |
---|---|---|
a60cd2ef AT |
1 | # Xeon Phi Setup/Build Notes # |
2 | ||
3 | These notes cover the creation of a Debian 10.9 (buster) server with ZFS root | |
4 | which serves as host to Knights Corner Xeon Phi coprocessor cards. | |
5 | ||
6 | Each of these coprocessor cards features a P54C-derived core extended to | |
7 | support the X86-64 instruction set, 4-way SMT, and a beefy 512-bit vector | |
8 | processor bolted alongside. Sixty of these cores are connected on a roughly 1 | |
9 | terabit/s bi-directional ring bus. Each card carries 8GB of GDDR5 RAM, and each | |
10 | core has 512kB of local cache; via the ring bus and a distributed tag store, all | |
11 | caches are coherent and quickly accessible from remote cores. This hardware is | |
12 | packaged up on a PCIe card which presents a virtual network interface to the | |
13 | host. The coprocessor card runs Linux+BusyBox, allowing SSH access to a | |
14 | traditional Linux environment on a familiar 60-core x86-64 architecture. | |
15 | ||
16 | The hostname `frostburg.subgeniuskitty.com` stems from the original | |
17 | [FROSTBURG](https://en.wikipedia.org/wiki/FROSTBURG), a CM-5 designed by | |
18 | Thinking Machines. Although the fundamental connection topology of a fat tree | |
19 | was different than the ring used in this Xeon Phi, the systems are somewhat | |
20 | similar. Both feature a NUMA cluster of repackaged and extended commercial | |
21 | processor cores operating on independent instruction streams in a MIMD fashion | |
22 | focused on small local data stores. By coincidence, both also feature similar | |
23 | core counts and total memory size. | |
24 | ||
25 | The information on this page includes: | |
26 | ||
27 | - Hardware compatibility notes for Xeon Phi and Xeon host. | |
28 | ||
29 | - Installation of Debian 10.9 (buster) root on encrypted ZFS mirror with | |
30 | automated snapshots and scrubs. | |
31 | ||
32 | - (TODO) Porting the Intel kernel module to Linux kernel version 4.19.0. | |
33 | ||
34 | - (TODO) Installing MPSS toolkit on Debian (or CentOS VM). | |
35 | ||
36 | - (TODO) Building GCC toolchain for Xeon Phi. | |
37 | ||
38 | - (TODO) Installing Intel toolchain for Xeon Phi. | |
39 | ||
40 | These notes are a high-level checklist for my reference rather than a | |
41 | step-by-step installation guide for the public. That means they make no attempt | |
42 | to explain all options at each step; they mention only the options | |
43 | I use on my servers. It also means they use my domains, my file system paths, | |
44 | etc. in the examples. Don't blindly copy and paste. | |
45 | ||
46 | -------------------------------------------------------------------------------- | |
47 | ||
48 | ||
49 | ## Hardware ## | |
50 | ||
51 | The host system was kept low power both figuratively and literally. It will | |
52 | primarily serve as a host for the Phi coprocessors and bridge to the network. | |
53 | ||
54 | - **Chassis:** Supermicro 2027GR-TR2 | |
55 | ||
56 | - **Motherboard:** Supermicro X9DRG-HF+II | |
57 | ||
58 | - **CPU:** 2x Xeon E5-2637 | |
59 | ||
60 | - **RAM:** 8x 4GB DDR3 RDIMM | |
61 | ||
62 | - **Storage:** 2x Intel 160GB X-25M SSD | |
63 | ||
64 | - **Payload:** 4x Intel Xeon Phi 5110P | |
65 | ||
66 | To enter the BIOS, use the `DEL` key. Similarly, a boot device selection menu | |
67 | is obtained by pressing `F11`. The system displays two-character status codes | |
68 | in the bottom right corner of the display. | |
69 | ||
70 | Support files are stored under `hw_support/Intel Xeon Phi/supermicro/`. | |
71 | ||
72 | ||
73 | ### Memory ### | |
74 | ||
75 | Using eight identical sticks of MT36JSZF51272PZ-1G4 RAM. These are ECC DDR3 | |
76 | 2Rx4 PC3-10600 RDIMMs operating at 1.5V. Per page 2-12 of the manual | |
77 | (`MNL_1502.pdf`), DIMMs are installed in all blue memory slots. | |
78 | ||
79 | ||
80 | ### Processors & Heatsinks ### | |
81 | ||
82 | Xeon E5-2637 CPUs selected for lower power, high frequency, cheap price, and | |
83 | 'full' PCIe lane count. They only need to be a host for the real show. Per page | |
84 | 5-7 of the chassis manual (`MNL-1564.pdf`), CPU1 requires heatsink SNK-P0048PS | |
85 | and CPU2 requires heatsink SNK-P0047PS. | |
86 | ||
87 | ||
50ab1573 | 88 | ### SAS Backplane & Motherboard SATA ### |
a60cd2ef AT |
89 | |
90 | The SAS backplane is a little odd. The first eight drive bays connect via a | |
91 | pair of SFF-8087 connectors and the last two drive bays connect via standard | |
92 | 7-pin SATA connectors. | |
93 | ||
94 | Since the motherboard provides ten 7-pin SATA connectors, two cables breaking | |
95 | out SFF-8087 to quad SATA will be required. I tried using just such a cable, | |
96 | but had no luck. There doesn't appear to be anything configurable on the | |
97 | backplane itself. The backplane manual is stored at `BPN-SAS-218A.pdf`. My | |
98 | cable was of unknown origin. Per photos on some eBay auctions, the proper | |
99 | Supermicro cable appears to be part number 672042095704. In addition to the | |
100 | four SATA connectors, this cable also bundles some sort of 4-pin header, | |
101 | presumably the SGPIO connection. | |
102 | ||
103 | In the meantime, since I only intend to use two small drives in a ZFS mirror | |
104 | for the OS and home directories, with all other storage on network shares, | |
105 | simply use the last two slots and connect with normal 30"+ SATA cables. | |
106 | ||
107 | These last two drive bay slots are connected to the two white SATA ports on the | |
108 | motherboard, with the lowest numbered drive slot connected to the rear-most | |
109 | white SATA port. When SFF-8087 connectors are eventually used to increase local | |
110 | storage, relocate the boot drives to drive slots 0 and 1, and connect these | |
111 | slots to the white SATA ports. | |
112 | ||
113 | On the motherboard, the white ports are SATA3 and the black ports are SATA2. | |
114 | The line of 2x white and 4x black SATA ports is part of the primary SATA | |
115 | controller or `I_SATA`. The other line of 4x black SATA ports is part of the | |
116 | secondary or `S_SATA` controller. Put any boot drives on the `I_SATA` ports. | |
117 | ||
118 | ||
119 | ### Xeon Phi ### | |
120 | ||
121 | Section 5.1 of the Intel Xeon Phi Coprocessor Datasheet (DocID 328209-004EN) | |
122 | mentions that connecting the card via both 2x4 and 2x3 power connectors enables | |
123 | higher sustained power draw up to 245 watts versus 225 watts of other power | |
124 | cable configurations. This chassis will easily support the higher power draw | |
125 | and heat dissipation. | |
126 | ||
127 | The Xeon Phi coprocessor cards reserve PCIe MMIO address space sufficient to | |
128 | map the entire coprocessor card's RAM. Since this is >4GB, PCIe Base Address | |
129 | Registers (BAR) of greater than 32-bit size are required. This should be | |
130 | enabled in the BIOS of this particular motherboard under | |
131 | `PCIe/PCI/PnP Configuration` -> `Above 4G Decoding`. | |
132 | ||
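Once a card is installed, one way to confirm that large BARs were actually mapped is to look at the `Region` lines in `lspci -vv` output. The sketch below is a hypothetical helper (the name `large_bars` is not part of any tool) that filters such output for 64-bit memory regions whose size is reported in gigabytes. It only parses text, so it can also be tried on saved output.

```shell
# Hypothetical helper: read `lspci -vv` text on stdin and print every
# 64-bit memory region whose size is reported in gigabytes.
large_bars() {
    grep -E 'Region [0-9]+: Memory at [0-9a-f]+ \(64-bit.*\[size=[0-9]+G\]'
}

# Example against live hardware (needs pciutils and an installed card):
#   lspci -vv | large_bars
```

If nothing is printed for a card that is otherwise detected, `Above 4G Decoding` is a reasonable first setting to re-check.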
133 | In general, motherboards with chipsets equal to or newer than the C602 should | |
134 | work. This includes most Supermicro motherboards from the X9xxx generation or | |
135 | later. None of the Supermicro X8xxx generation motherboards appear to be | |
136 | compatible. | |
137 | ||
138 | The Xeon Phi 5110P, per the suffix, is passively cooled. Section 3 of the Intel | |
139 | Xeon Phi Coprocessor Datasheet (DocID 328209-004EN) details the cooling and | |
140 | mounting requirements. | |
141 | ||
142 | ||
143 | ### Optional Fans ### | |
144 | ||
145 | There are a number of optional fans for this chassis, all detailed in the | |
146 | chassis manual (`MNL-1564.pdf`). My machine includes the optional fan for | |
147 | another double-height, full-length PCIe card with backpanel IO slots, intended | |
148 | to support something like a GPU to drive monitors. Since the optional fan is | |
149 | installed and since the power budget easily supports it, this means a fifth | |
150 | Xeon Phi card could be installed, albeit with a slower PCIe connection. | |
151 | ||
152 | Regardless, since this fan is installed, whenever fewer than four Xeon Phi | |
153 | cards are installed, preferentially locate them on the left hand side of | |
154 | chassis, near the lower numbered drive bays. | |
155 | ||
156 | ||
157 | ### Power Supply ### | |
158 | ||
159 | The system contains dual redundant power supplies. Each is capable of supplying | |
160 | 1600 watts, but only when connected to a 240 volt source. When connected to a | |
161 | 120 volt source, maximum power output is 1000 watts. | |
162 | ||
163 | ||
164 | ### Rackmount ### | |
165 | ||
166 | The chassis is over 30" long and protrudes from rear of rack by approximately | |
167 | 1/2". To avoid the rear cable snagging passing carts and elbows, chassis was | |
168 | mounted at top of rack (after empty 1U). The Supermicro rails required cutting | |
169 | four notches in the vertical posts, so this is a semi-permanent home. | |
170 | ||
171 | Inserting or extracting the server from the rack at that height requires an | |
172 | extraordinary amount of free space in front of the rack and some advance | |
173 | planning. Where possible, try to do hardware modifications in-rack. The rails | |
174 | are extremely solid even when the server is fully extended. The grey | |
175 | OS-114/WQM-4 sonar test set chassis makes a solid step stool at the ideal | |
176 | height for working on the server while installed in the rack. | |
177 | ||
178 | ||
179 | ### USB Ports ### | |
180 | ||
181 | There are only two USB ports, both located on the rear of the chassis. During | |
182 | OS installation, if a mouse is required in addition to the keyboard and USB | |
183 | install drive, then a USB hub is required. | |
184 | ||
185 | -------------------------------------------------------------------------------- | |
186 | ||
187 | ||
188 | ## Debian Buster Installation ## | |
189 | ||
190 | These installation instructions use the following XFCE Debian live image. | |
191 | ||
192 | debian-live-10.9.0-amd64-xfce.iso | |
193 | ||
194 | Both the Gnome and XFCE live images were unusably slow in GUI mode. The text | |
195 | installer was fast and responsive, as were VTYs (`Ctrl`+`Alt`+`F2`) from within | |
196 | the live environment. Only the GUIs were slow, but they were slow to the point | |
197 | of being unusable, with single keypresses registering over a dozen times. Once | |
198 | Debian was installed on the SSD and booting normally, the GUI was perfectly | |
199 | usable. Since the local terminal is only used to install and start an OpenSSH | |
200 | daemon, and since this can be done from a VTY, the issue was not investigated | |
201 | further. | |
202 | ||
203 | The root on ZFS portion of this installation process is derived from the guide | |
204 | located here: | |
205 | ||
206 | <https://openzfs.github.io/openzfs-docs/Getting%20Started/Debian/Debian%20Buster%20Root%20on%20ZFS.html> | |
207 | ||
208 | ||
209 | ### Remote Access ### | |
210 | ||
211 | From the `F11` BIOS boot menu, select the UEFI entry for the USB live image. | |
212 | Lacking a mouse, press `CTRL`+`ALT`+`F2` after X is running in order to access | |
213 | a text-only VTY, already logged in as the user `user`. Install an SSH server so | |
214 | the remaining install can be done over the network. | |
215 | ||
216 | apt-get update | |
217 | apt-get install openssh-server | |
218 | systemctl enable ssh | |
219 | ||
220 | From wherever you intend to complete the install, SSH into the live Debian | |
221 | environment as user `user` with password `live`. | |
222 | ||
223 | ||
224 | ### ZFS Configuration ### | |
225 | ||
226 | Edit `/etc/apt/sources.list` to include the following entries. | |
227 | ||
228 | deb http://deb.debian.org/debian/ buster main contrib | |
229 | deb http://deb.debian.org/debian/ buster-backports main contrib | |
230 | deb-src http://deb.debian.org/debian/ buster main contrib | |
231 | ||
232 | Run `apt-get update`, then install the ZFS kernel module. Specify | |
233 | `--no-install-recommends` to avoid picking up `zfsutils-linux` since it will | |
234 | fail at this point. See <https://github.com/openzfs/zfs/issues/9599> for details. | |
235 | ||
236 | apt-get install -t buster-backports --no-install-recommends zfs-dkms | |
237 | modprobe zfs | |
238 | ||
239 | With the kernel module successfully loaded, proceed to install ZFS. | |
240 | ||
241 | apt-get install -t buster-backports zfsutils-linux | |
242 | ||
243 | After using `dd` to eliminate any existing partition tables, partition the | |
244 | disks for use with UEFI and ZFS. | |
245 | ||
246 | First, create a UEFI partition on each disk. | |
247 | ||
248 | sgdisk -n2:1M:+512M -t2:EF00 /dev/disk/by-id/ata-INTEL_SSDSA2M160G2GN_BTPO1252011L160AGN | |
249 | ||
250 | Next, create a partition for the boot pool. | |
251 | ||
252 | sgdisk -n3:0:+1G -t3:BF01 /dev/disk/by-id/ata-INTEL_SSDSA2M160G2GN_BTPO1252011L160AGN | |
253 | ||
254 | Finally, create a partition for the encrypted pool. | |
255 | ||
256 | sgdisk -n4:0:0 -t4:BF00 /dev/disk/by-id/ata-INTEL_SSDSA2M160G2GN_BTPO1252011L160AGN | |
257 | ||
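Since the same three partitions go on each disk of the mirror, the commands above can be generated in a loop. The sketch below only prints the `sgdisk` commands for review and does not touch any disk. The function name `partition_cmds` is made up for illustration, and the second disk ID is the one that appears in the pool creation commands in these notes.

```shell
# Hypothetical helper: print (do not run) the three sgdisk commands from
# these notes for every disk given as an argument.
partition_cmds() {
    for disk in "$@"; do
        echo "sgdisk -n2:1M:+512M -t2:EF00 $disk"   # UEFI partition
        echo "sgdisk -n3:0:+1G -t3:BF01 $disk"      # boot pool
        echo "sgdisk -n4:0:0 -t4:BF00 $disk"        # encrypted root pool
    done
}

partition_cmds \
    /dev/disk/by-id/ata-INTEL_SSDSA2M160G2GN_BTPO1252011L160AGN \
    /dev/disk/by-id/ata-INTEL_SSDSA2M160G2GN_BTHC72250AKD480MGN
```

Pipe the output to `sh` only after checking that the disk IDs are the intended ones.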
258 | Now that partitioning is complete, create the boot and root pools. | |
259 | ||
260 | The boot pool uses only ZFS options supported by GRUB. | |
261 | ||
262 | zpool create \ | |
263 | -o cachefile=/etc/zfs/zpool.cache \ | |
264 | -o ashift=12 -d \ | |
265 | -o feature@async_destroy=enabled \ | |
266 | -o feature@bookmarks=enabled \ | |
267 | -o feature@embedded_data=enabled \ | |
268 | -o feature@empty_bpobj=enabled \ | |
269 | -o feature@enabled_txg=enabled \ | |
270 | -o feature@extensible_dataset=enabled \ | |
271 | -o feature@filesystem_limits=enabled \ | |
272 | -o feature@hole_birth=enabled \ | |
273 | -o feature@large_blocks=enabled \ | |
274 | -o feature@lz4_compress=enabled \ | |
275 | -o feature@spacemap_histogram=enabled \ | |
276 | -o feature@zpool_checkpoint=enabled \ | |
277 | -O acltype=posixacl -O canmount=off -O compression=lz4 \ | |
278 | -O devices=off -O normalization=formD -O relatime=on -O xattr=sa \ | |
279 | -O mountpoint=/boot -R /mnt \ | |
280 | bpool mirror \ | |
281 | /dev/disk/by-id/ata-INTEL_SSDSA2M160G2GN_BTPO1252011L160AGN-part3 \ | |
282 | /dev/disk/by-id/ata-INTEL_SSDSA2M160G2GN_BTHC72250AKD480MGN-part3 | |
283 | ||
284 | Now create the root pool with ZFS encryption. | |
285 | ||
286 | zpool create \ | |
287 | -o ashift=12 \ | |
288 | -O encryption=aes-256-gcm \ | |
289 | -O keylocation=prompt -O keyformat=passphrase \ | |
290 | -O acltype=posixacl -O canmount=off -O compression=lz4 \ | |
291 | -O dnodesize=auto -O normalization=formD -O relatime=on \ | |
292 | -O xattr=sa -O mountpoint=/ -R /mnt \ | |
293 | rpool mirror \ | |
294 | /dev/disk/by-id/ata-INTEL_SSDSA2M160G2GN_BTPO1252011L160AGN-part4 \ | |
295 | /dev/disk/by-id/ata-INTEL_SSDSA2M160G2GN_BTHC72250AKD480MGN-part4 | |
296 | ||
297 | All the pools are created, so now it's time to set up filesystems. Start with | |
298 | some containers. | |
299 | ||
300 | zfs create -o canmount=off -o mountpoint=none rpool/ROOT | |
301 | zfs create -o canmount=off -o mountpoint=none bpool/BOOT | |
302 | ||
303 | Now add filesystems for boot and root. | |
304 | ||
305 | zfs create -o canmount=noauto -o mountpoint=/ rpool/ROOT/debian | |
306 | zfs mount rpool/ROOT/debian | |
307 | zfs create -o mountpoint=/boot bpool/BOOT/debian | |
308 | ||
309 | Create a filesystem to contain home directories and mount root's homedir in the | |
310 | correct location. | |
311 | ||
312 | zfs create rpool/home | |
313 | zfs create -o mountpoint=/root rpool/home/root | |
314 | chmod 700 /mnt/root | |
315 | ||
316 | Create filesystems under `/var` and exclude temporary files from snapshots. | |
317 | ||
318 | zfs create -o canmount=off rpool/var | |
319 | zfs create -o canmount=off rpool/var/lib | |
320 | zfs create rpool/var/log | |
321 | zfs create rpool/var/spool | |
322 | zfs create -o com.sun:auto-snapshot=false rpool/var/cache | |
323 | zfs create -o com.sun:auto-snapshot=false rpool/var/tmp | |
324 | chmod 1777 /mnt/var/tmp | |
325 | zfs create rpool/var/mail | |
326 | ||
327 | Create a few other misc filesystems. | |
328 | ||
329 | zfs create rpool/srv | |
330 | zfs create -o canmount=off rpool/usr | |
331 | zfs create rpool/usr/local | |
332 | ||
333 | Temporarily mount a `tmpfs` at `/run`. | |
334 | ||
335 | mkdir /mnt/run | |
336 | mount -t tmpfs tmpfs /mnt/run | |
337 | mkdir /mnt/run/lock | |
338 | ||
339 | ||
340 | ### Debian Configuration ### | |
341 | ||
342 | Install a minimal Debian system. | |
343 | ||
344 | apt-get install debootstrap | |
345 | debootstrap buster /mnt | |
346 | ||
347 | Copy the zpool cache into the new system. | |
348 | ||
349 | mkdir /mnt/etc/zfs | |
350 | cp /etc/zfs/zpool.cache /mnt/etc/zfs | |
351 | ||
352 | Set the hostname. | |
353 | ||
354 | echo frostburg > /mnt/etc/hostname | |
355 | echo "127.0.1.1 frostburg.subgeniuskitty.com frostburg" >> /mnt/etc/hosts | |
356 | ||
357 | Configure networking. | |
358 | ||
359 | vi /mnt/etc/network/interfaces.d/enp129s0f0 | |
360 | ||
361 | auto enp129s0f0 | |
362 | iface enp129s0f0 inet static | |
363 | address 192.168.1.7/24 | |
364 | gateway 192.168.1.1 | |
365 | ||
366 | vi /mnt/etc/resolv.conf | |
367 | ||
368 | search subgeniuskitty.com | |
369 | nameserver 192.168.1.1 | |
370 | ||
371 | Configure package sources. | |
372 | ||
373 | vi /mnt/etc/apt/sources.list | |
374 | ||
375 | deb http://deb.debian.org/debian buster main contrib | |
376 | deb-src http://deb.debian.org/debian buster main contrib | |
377 | ||
378 | deb http://security.debian.org/debian-security buster/updates main contrib | |
379 | deb-src http://security.debian.org/debian-security buster/updates main contrib | |
380 | ||
381 | deb http://deb.debian.org/debian buster-updates main contrib | |
382 | deb-src http://deb.debian.org/debian buster-updates main contrib | |
383 | ||
384 | vi /mnt/etc/apt/sources.list.d/buster-backports.list | |
385 | ||
386 | deb http://deb.debian.org/debian buster-backports main contrib | |
387 | deb-src http://deb.debian.org/debian buster-backports main contrib | |
388 | ||
389 | vi /mnt/etc/apt/preferences.d/90_zfs | |
390 | ||
391 | Package: libnvpair1linux libuutil1linux libzfs2linux libzfslinux-dev libzpool2linux python3-pyzfs pyzfs-doc spl spl-dkms zfs-dkms zfs-dracut zfs-initramfs zfs-test zfsutils-linux zfsutils-linux-dev zfs-zed | |
392 | Pin: release n=buster-backports | |
393 | Pin-Priority: 990 | |
394 | ||
395 | apt-get update | |
396 | ||
397 | Chroot into the new environment. | |
398 | ||
399 | mount --rbind /dev /mnt/dev | |
400 | mount --rbind /proc /mnt/proc | |
401 | mount --rbind /sys /mnt/sys | |
402 | chroot /mnt | |
403 | ||
404 | Configure the new environment as a basic system. | |
405 | ||
406 | ln -s /proc/self/mounts /etc/mtab | |
407 | apt-get update | |
408 | export TERM=vt100 | |
409 | apt-get install console-setup locales | |
410 | dpkg-reconfigure locales tzdata keyboard-configuration console-setup | |
411 | ||
412 | Install ZFS on the new system. | |
413 | ||
414 | apt-get install dpkg-dev linux-headers-amd64 linux-image-amd64 | |
415 | apt-get install zfs-initramfs | |
416 | echo REMAKE_INITRD=yes > /etc/dkms/zfs.conf | |
417 | ||
418 | Install GRUB and configure UEFI boot partition. | |
419 | ||
420 | apt-get install dosfstools | |
421 | mkdosfs -F 32 -s 1 -n EFI /dev/disk/by-id/ata-INTEL_SSDSA2M160G2GN_BTPO1252011L160AGN-part2 | |
422 | mkdir /boot/efi | |
423 | echo "/dev/disk/by-id/ata-INTEL_SSDSA2M160G2GN_BTPO1252011L160AGN-part2 /boot/efi vfat defaults 0 0" >> /etc/fstab | |
424 | mount /boot/efi | |
425 | apt-get install grub-efi-amd64 shim-signed | |
426 | apt-get remove --purge os-prober | |
427 | ||
428 | Ensure the bpool is always imported, even if `/etc/zfs/zpool.cache` doesn't | |
429 | exist or doesn't include a relevant entry. | |
430 | ||
431 | vi /etc/systemd/system/zfs-import-bpool.service | |
432 | ||
433 | [Unit] | |
434 | DefaultDependencies=no | |
435 | Before=zfs-import-scan.service | |
436 | Before=zfs-import-cache.service | |
437 | ||
438 | [Service] | |
439 | Type=oneshot | |
440 | RemainAfterExit=yes | |
441 | ExecStart=/sbin/zpool import -N -o cachefile=none bpool | |
442 | # Work-around to preserve zpool cache: | |
443 | ExecStartPre=-/bin/mv /etc/zfs/zpool.cache /etc/zfs/preboot_zpool.cache | |
444 | ExecStartPost=-/bin/mv /etc/zfs/preboot_zpool.cache /etc/zfs/zpool.cache | |
445 | ||
446 | [Install] | |
447 | WantedBy=zfs-import.target | |
448 | ||
449 | systemctl enable zfs-import-bpool.service | |
450 | ||
451 | Create a `tmpfs` mounted at `/tmp`. | |
452 | ||
453 | cp /usr/share/systemd/tmp.mount /etc/systemd/system/ | |
454 | systemctl enable tmp.mount | |
455 | ||
456 | ||
457 | ### Bootloader Configuration ### | |
458 | ||
459 | Verify ZFS boot filesystem is recognized. | |
460 | ||
461 | grub-probe /boot | |
462 | ||
463 | Refresh initrd. | |
464 | ||
465 | update-initramfs -c -k all | |
466 | ||
467 | Configure GRUB by editing `/etc/default/grub`. Remove the `quiet` option from | |
468 | `GRUB_CMDLINE_LINUX_DEFAULT` and add the following two options to the | |
469 | appropriate entries. Afterward, run `update-grub` to regenerate `grub.cfg`. | |
470 | ||
471 | GRUB_CMDLINE_LINUX="root=ZFS=rpool/ROOT/debian" | |
472 | GRUB_TERMINAL=console | |
473 | ||
474 | Install GRUB to the UEFI boot partition. | |
475 | ||
476 | grub-install --target=x86_64-efi --efi-directory=/boot/efi --bootloader-id=debian-1 --recheck --no-floppy | |
477 | ||
478 | Install GRUB on the other hard drives, incrementing `-2` to `-N` as necessary. Here `disk1` and `disk2` are placeholders for the real disk IDs. | |
479 | ||
480 | umount /boot/efi | |
481 | dd if=/dev/disk/by-id/scsi-SATA_disk1-part2 \ | |
482 | of=/dev/disk/by-id/scsi-SATA_disk2-part2 | |
483 | efibootmgr -c -g -d /dev/disk/by-id/scsi-SATA_disk2 \ | |
484 | -p 2 -L "debian-2" -l '\EFI\debian\grubx64.efi' | |
485 | mount /boot/efi | |
486 | ||
487 | Fix filesystem mount ordering. Quoting from the install reference, "We need to | |
488 | activate `zfs-mount-generator`. This makes systemd aware of the separate | |
489 | mountpoints, which is important for things like `/var/log` and `/var/tmp`. In | |
490 | turn, `rsyslog.service` depends on `var-log.mount` by way of `local-fs.target` | |
491 | and services using the `PrivateTmp` feature of systemd automatically use | |
492 | `After=var-tmp.mount`." | |
493 | ||
494 | mkdir /etc/zfs/zfs-list.cache | |
495 | touch /etc/zfs/zfs-list.cache/bpool | |
496 | touch /etc/zfs/zfs-list.cache/rpool | |
497 | zed -F | |
498 | ||
499 | From another SSH session, verify that zed updated the cache by making sure the | |
500 | previously created empty files are not empty. | |
501 | ||
502 | cat /etc/zfs/zfs-list.cache/bpool | |
503 | cat /etc/zfs/zfs-list.cache/rpool | |
504 | ||
505 | If all is well, return to the previous SSH session and terminate `zed` with | |
506 | `Ctrl`+`C`. | |
507 | ||
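The "not empty" check above can also be scripted with `test -s`, which succeeds only for files that exist and have a non-zero size. This is a minimal sketch; the function name `cache_ready` is hypothetical, and the directory argument exists only so the function can be tried against a scratch directory.

```shell
# Hypothetical helper: succeed only if the cache files for both pools
# exist and are non-empty. The directory defaults to the path used above.
cache_ready() {
    dir=${1:-/etc/zfs/zfs-list.cache}
    for pool in bpool rpool; do
        [ -s "$dir/$pool" ] || { echo "$pool: cache still empty" >&2; return 1; }
    done
}

# Example: cache_ready && echo "zed has populated both cache files"
```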
508 | Fix the paths to eliminate `/mnt`. | |
509 | ||
510 | sed -Ei "s|/mnt/?|/|" /etc/zfs/zfs-list.cache/* | |
511 | ||
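To preview what the substitution does before editing the cache files in place, the same expression can be run on sample text. The sketch below wraps it in a hypothetical `strip_mnt` filter. The sample lines are illustrative only (a dataset name, a tab, then a mountpoint) and carry fewer columns than the real cache files.

```shell
# Apply the same substitution to text on stdin instead of editing in place.
strip_mnt() {
    sed -E 's|/mnt/?|/|'
}

# Illustrative sample: a bare /mnt becomes /, and /mnt/var/log becomes /var/log.
printf 'rpool/ROOT/debian\t/mnt\nrpool/var/log\t/mnt/var/log\n' | strip_mnt
```

Only the first `/mnt` on each line is rewritten, since the expression has no `g` flag.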
512 | ||
513 | ### Reboot ### | |
514 | ||
515 | The Debian install is almost ready for use without the live Debian host | |
516 | environment. Only a few steps remain. | |
517 | ||
518 | Do a final system update. | |
519 | ||
520 | apt-get dist-upgrade | |
521 | ||
522 | Disable log compression since ZFS is already compressing at the block level. | |
523 | ||
524 | for file in /etc/logrotate.d/* ; do | |
525 | if grep -Eq "(^|[^#y])compress" "$file" ; then | |
526 | sed -i -r "s/(^|[^#y])(compress)/\1#\2/" "$file" | |
527 | fi | |
528 | done | |
529 | ||
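To see what the loop above changes without touching `/etc/logrotate.d`, the same test and substitution can be wrapped in a function and run on a scratch file. This is a sketch assuming GNU `sed`; the function name `disable_compress` and the sample stanza are made up for illustration.

```shell
# Same test and substitution as the loop above, wrapped in a function that
# takes one file so it can be tried on a scratch copy first (GNU sed).
disable_compress() {
    if grep -Eq "(^|[^#y])compress" "$1"; then
        sed -i -r "s/(^|[^#y])(compress)/\1#\2/" "$1"
    fi
}

# Demo on a temporary file; the stanza below is a made-up example.
demo=$(mktemp)
printf '/var/log/example.log {\n    weekly\n    compress\n    delaycompress\n}\n' > "$demo"
disable_compress "$demo"
cat "$demo"
rm -f "$demo"
```

In the demo, the `compress` line gains a leading `#` while `delaycompress` is left alone, because the expression refuses to match when the preceding character is `y` or `#`.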
530 | Install an SSH server so we can login again after rebooting. | |
531 | ||
532 | apt-get install openssh-server | |
533 | ||
534 | Set a root password. | |
535 | ||
536 | passwd | |
537 | ||
538 | Create a user account. | |
539 | ||
540 | zfs create rpool/home/ataylor | |
541 | adduser ataylor | |
542 | mkdir /etc/skel/.ssh && chmod 700 /etc/skel/.ssh | |
543 | cp -a /etc/skel/. /home/ataylor/ | |
544 | scp ataylor@lagavulin:/usr/home/ataylor/.ssh/id_rsa.pub /home/ataylor/.ssh/authorized_keys | |
545 | chown -R ataylor:ataylor /home/ataylor | |
546 | usermod -a -G audio,cdrom,dip,floppy,netdev,plugdev,sudo,video ataylor | |
547 | ||
548 | Snapshot the install. | |
549 | ||
550 | zfs snapshot bpool/BOOT/debian@install | |
551 | zfs snapshot rpool/ROOT/debian@install | |
552 | ||
553 | Exit the chroot and unmount all filesystems. | |
554 | ||
555 | exit | |
556 | mount | grep -v zfs | tac | awk '/\/mnt/ {print $3}' | xargs -i{} umount -lf {} | |
557 | zpool export -a | |
558 | ||
559 | Reboot the computer and remove the USB stick. Installation is complete. | |
560 | ||
561 | ||
562 | ### UNIX Userland ### | |
563 | ||
564 | Install various no-config-required userland packages before continuing. | |
565 | ||
566 | apt-get install net-tools bzip2 zip ntp htop xterm screen git \ | |
567 | build-essential pciutils smartmontools gdb valgrind | |
568 | ||
569 | ||
570 | #### X Window Manager #### | |
571 | ||
572 | Install X and dwm to ensure all dependencies are met for running my dwm-derived | |
573 | window manager. | |
574 | ||
575 | apt-get install xorg dwm numlockx | |
576 | ||
577 | Install dependencies for building my window manager. | |
578 | ||
579 | apt-get install libx11-dev libxft-dev libxinerama-dev | |
580 | ||
581 | Copy the Hophib Modern Desktop git repo to the new server. Make the following changes: | |
582 | ||
583 | - `hhmd/src/mk.conf`: Change the installation prefix from `/hh` to | |
584 | `/home/ataylor/bin` | |
585 | ||
586 | - `hhmd/src/window_manager/Makefile`: Change library and include paths from | |
587 | `/usr/local/...` to `/usr/...` | |
588 | ||
589 | - `hhmd/src/window_manager/dwm-status.c`: Change `#include <sys/time.h>` to | |
590 | `#include <time.h>` and add `#define _GNU_SOURCE` as well as | |
591 | `#define _DEFAULT_SOURCE` to the top of the file | |
592 | ||
593 | - `hhmd/src/window_manager/dwm.c`: Add `#define _POSIX_C_SOURCE 2` to the top | |
594 | of the file. | |
595 | ||
596 | - `hhmd/src/window_manager/dwm-watchdog.sh`: Change paths and executable | |
597 | names from `/hh/...` to `/home/ataylor/bin/...` and from `wm` to `dwm`. | |
598 | ||
599 | Execute `make clean install`. Verify that `dwm`, `dwm-status`, and | |
600 | `dwm-watchdog.sh` all ended up in `/home/ataylor/bin` with appropriate | |
601 | permissions. Delete the man pages that were installed in ataylor's homedir. | |
602 | ||
603 | Create `~/.xinitrc` with the following contents. | |
604 | ||
605 | /usr/bin/numlockx & | |
606 | /home/ataylor/bin/dwm-status & | |
607 | /home/ataylor/bin/dwm-watchdog.sh | |
608 | ||
609 | Verify X and my window manager start successfully and that `dwm-watchdog.sh` | |
610 | keeps X and X applications alive during a window manager live restart. | |
611 | ||
612 | ||
613 | #### VIM #### | |
614 | ||
615 | Install gvim. On Debian `gvim` is a virtual package, so name a provider such as `vim-gtk3`. | |
616 | ||
617 | apt-get install vim-gtk3 | |
618 | ||
619 | Create `~/.vimrc` with the following contents. | |
620 | ||
621 | set nocompatible | |
622 | filetype off | |
623 | set mouse=r | |
624 | set number | |
625 | syntax on | |
626 | set tabstop=4 | |
627 | set expandtab | |
628 | ||
629 | "Folding | |
630 | "http://vim.wikia.com/wiki/Folding_for_plain_text_files_based_on_indentation | |
631 | "set foldmethod=expr | |
632 | "set foldexpr=(getline(v:lnum)=~'^$')?-1:((indent(v:lnum)<indent(v:lnum+1))?('>'.indent(v:lnum+1)):indent(v:lnum)) | |
633 | "set foldtext=getline(v:foldstart) | |
634 | "set fillchars=fold:\ "(there's a space after that \) | |
635 | "highlight Folded ctermfg=DarkGreen ctermbg=Black | |
636 | "set foldcolumn=6 | |
637 | ||
638 | " Color the 100th column. | |
639 | set colorcolumn=100 | |
640 | highlight ColorColumn ctermbg = darkgray | |
641 | ||
642 | ||
643 | #### TCSH #### | |
644 | ||
645 | Install tcsh. | |
646 | ||
647 | apt-get install tcsh | |
648 | ||
649 | Change the default shell for new users by editing `/etc/adduser.conf`, setting | |
650 | the `DSHELL` variable to `/bin/tcsh`. Then use the `chsh` command to change the | |
651 | shell for root and ataylor. Create `~/.cshrc` in ataylor's and root's homedir | |
50ab1573 AT |
652 | with the following contents. Remember to also copy it to `/etc/skel` and set |
653 | permissions so it's used for any future users on the system. | |
a60cd2ef AT |
654 | |
655 | # .cshrc - csh resource script, read at beginning of execution by each shell | |
656 | ||
657 | alias h history 25 | |
658 | alias j jobs -l | |
659 | alias la ls -aF | |
660 | alias lf ls -FA | |
661 | alias ll ls -lAF --color | |
662 | alias ls ls --color | |
663 | ||
664 | # These are normally set through /etc/login.conf. You may override them here | |
665 | # if wanted. | |
666 | set path = (/sbin /bin /usr/sbin /usr/bin /usr/local/sbin /usr/local/bin $HOME/bin) | |
667 | ||
668 | setenv EDITOR vim | |
669 | setenv PAGER more | |
670 | ||
671 | if ($?prompt) then | |
672 | # An interactive shell -- set some stuff up | |
673 | set prompt = "%N@%m:%~ %# " | |
674 | set promptchars = "%#" | |
675 | ||
676 | set filec | |
677 | set history = 1000 | |
678 | set savehist = (1000 merge) | |
679 | set autolist = ambiguous | |
680 | # Use history to aid expansion | |
681 | set autoexpand | |
682 | set autorehash | |
683 | set mail = (/var/mail/$USER) | |
684 | if ( $?tcsh ) then | |
685 | bindkey "^W" backward-delete-word | |
686 | bindkey -k up history-search-backward | |
687 | bindkey -k down history-search-forward | |
688 | endif | |
689 | ||
690 | endif | |
691 | ||
692 | ||
693 | #### XScreensaver #### | |
694 | ||
695 | Install Xscreensaver and configure screen locking. | |
696 | ||
697 | apt-get install xscreensaver xscreensaver-data | |
698 | ||
699 | Run `xscreensaver-demo` and select some screensavers. If inspiration doesn't | |
700 | strike, do single screensaver mode with the `abstractile` hack; it looks good | |
701 | on pretty much any hardware. Remember to enable screen locking. | |
702 | ||
703 | Add the following line to `~/.xinitrc`, above the `dwm-watchdog.sh` line. | |
704 | ||
705 | /bin/xscreensaver -nosplash & | |
706 | ||
707 | ||
708 | #### ZFS Snapshots #### | |
709 | ||
710 | In order to configure automatic ZFS snapshots, use the `zfs-auto-snapshot` | |
711 | package. | |
712 | ||
713 | apt-get install zfs-auto-snapshot | |
714 | ||
715 | In addition to the snapshot script itself, this package includes automatically | |
716 | enabled cron entries, but it will only snapshot filesystems with the | |
717 | `com.sun:auto-snapshot` property set to `true`. Since we already manually set | |
718 | that property to `false` for `/var/cache` and `/var/tmp`, simply set it to | |
719 | `true` for the two parent pools and allow filesystems to inherit wherever | |
720 | possible. | |
721 | ||
722 | zfs set com.sun:auto-snapshot=true rpool | |
723 | zfs set com.sun:auto-snapshot=true bpool | |
724 | ||
725 | Verify that relevant filesystems inherited the property. | |
726 | ||
727 | zfs get com.sun:auto-snapshot | |
728 | ||
729 | After waiting 15+ minutes, verify that snapshots begin to appear. | |
730 | ||
731 | zfs list -t snapshot | |
732 | ||
733 | ||
734 | #### ZFS Scrubs #### | |
735 | ||
736 | Automate ZFS scrubs by creating `/etc/cron.d/zfs-scrubs` with the following | |
737 | contents. The day-of-month field must be 1-31, so these run on the first of each month. | |
738 | ||
739 | PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin | |
740 | 0 0 1 * * root /sbin/zpool scrub rpool | |
741 | 0 0 1 * * root /sbin/zpool scrub bpool | |
742 | ||
743 | ||
744 | #### Status Updates #### | |
745 | ||
746 | In order to receive status updates like failed drive notifications, we must | |
747 | first configure the system to send email through the SGK mail server. Rather | |
748 | than use `exim4` as provided by the base system, instead use `msmtp`. | |
749 | ||
750 | apt-get install msmtp-mta | |
751 | ||
752 | Create the file `/etc/msmtprc` with the following contents. | |
753 | ||
754 | # Set default values for all following accounts. | |
755 | defaults | |
756 | auth on | |
757 | tls on | |
758 | tls_trust_file /etc/ssl/certs/ca-certificates.crt | |
759 | tls_starttls off | |
760 | ||
761 | # Account: subgeniuskitty | |
762 | account default | |
763 | host mail.subgeniuskitty.com | |
764 | port 465 | |
765 | from ataylor@subgeniuskitty.com | |
766 | user ataylor@subgeniuskitty.com | |
767 | password <plaintext-password> | |
768 | ||
769 | Create the file `/etc/cron.d/status-emails` with the following contents. | |
770 | ||
771 | PATH=/etc:/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin | |
772 | SHELL=/bin/bash | |
773 | 0 0 * * 0 root /sbin/zpool status | echo -e "Subject:FROSTBURG: zpool status\n\n $(cat -)" | msmtp ataylor@subgeniuskitty.com | |
774 | ||
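The cron entry above works because `$(cat -)` inside the `echo` reads the `zpool status` output from the pipe. The same message construction can be sketched as a standalone function, which may be easier to test by hand before relying on cron. The function name `format_status_mail` is hypothetical and not part of the original setup.

```shell
#!/bin/sh
# Hypothetical helper: wrap whatever arrives on stdin in a minimal mail
# message. $1 is the subject; the body is copied from stdin unchanged.
format_status_mail() {
    printf 'Subject: %s\n\n' "$1"
    cat -
}

# Example invocation (requires zpool and a configured msmtp, so it is left
# commented out):
#   zpool status | format_status_mail "FROSTBURG: zpool status" | msmtp ataylor@subgeniuskitty.com
```

Using `printf` rather than `echo -e` avoids depending on `SHELL=/bin/bash`, since `printf` interprets `\n` the same way under any POSIX shell.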
775 | -------------------------------------------------------------------------------- | |
776 | ||
777 | ||
778 | ## Xeon Phi Kernel Module ## | |
779 | ||
780 | Linux kernel version 4.19.0, as included with Debian 10.9, appears to already | |
781 | have in-tree support for these Xeon Phi coprocessor cards: the final lines of | |
782 | the following diagnostic output show the `mic` driver and `mic_host` module. | |
783 | ||
784 | root@frostburg:~ # lspci | grep -i Co-processor | |
785 | 02:00.0 Co-processor: Intel Corporation Xeon Phi coprocessor 5100 series (rev 11) | |
786 | root@frostburg:~ # lspci -s 02:00.0 -vv | |
787 | 02:00.0 Co-processor: Intel Corporation Xeon Phi coprocessor 5100 series (rev 11) | |
788 | Subsystem: Intel Corporation Xeon Phi coprocessor 5100 series | |
789 | Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+ | |
790 | Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- | |
791 | Latency: 0, Cache Line Size: 64 bytes | |
792 | Interrupt: pin A routed to IRQ 69 | |
793 | NUMA node: 0 | |
794 | Region 0: Memory at 21c00000000 (64-bit, prefetchable) [size=8G] | |
795 | Region 4: Memory at cb900000 (64-bit, non-prefetchable) [size=128K] | |
796 | Capabilities: [44] Power Management version 3 | |
797 | Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot-,D3cold-) | |
798 | Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME- | |
799 | Capabilities: [4c] Express (v2) Endpoint, MSI 00 | |
800 | DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 <64us | |
801 | ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 75.000W | |
802 | DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported- | |
803 | RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop+ | |
804 | MaxPayload 256 bytes, MaxReadReq 512 bytes | |
805 | DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend- | |
806 | LnkCap: Port #0, Speed 5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <4us, L1 unlimited | |
807 | ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp- | |
808 | LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+ | |
809 | ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- | |
810 | LnkSta: Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt- | |
811 | DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR-, OBFF Not Supported | |
812 | DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled | |
813 | LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis- | |
814 | Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS- | |
815 | Compliance De-emphasis: -6dB | |
816 | LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1- | |
817 | EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest- | |
818 | Capabilities: [88] MSI: Enable- Count=1/16 Maskable- 64bit+ | |
819 | Address: 0000000000000000 Data: 0000 | |
820 | Capabilities: [98] MSI-X: Enable+ Count=16 Masked- | |
821 | Vector table: BAR=4 offset=00017000 | |
822 | PBA: BAR=4 offset=00018000 | |
823 | Capabilities: [100 v1] Advanced Error Reporting | |
824 | UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol- | |
825 | UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol- | |
826 | UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol- | |
827 | CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr- | |
828 | CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ | |
829 | AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn- | |
830 | Kernel driver in use: mic | |
831 | Kernel modules: mic_host | |
832 | ||
833 | However, as no virtual network device automatically showed up, and since the | |
834 | Intel manuals are plastered with warnings about using exact, sanctioned | |
835 | combinations of kernel module, MPSS software, and Phi firmware, I decided to | |
836 | avoid the kernel module included with the system and instead attempt porting | |
837 | the kernel module source code included with MPSS onto a newer Linux kernel. At | |
838 | a minimum, it appears the timer API has changed, as well as some utility | |
839 | functions related to requesting block interrupt assignments. | |
840 |