From 31f1b002d1b97e0683f3360e43c7e97bc4ddf887 Mon Sep 17 00:00:00 2001
From: Aaron Taylor
Date: Mon, 17 May 2021 17:12:59 -0700
Subject: [PATCH] Added notes for build/install of Xeon Phi kernel module to Xeon Phi server notes.

---
 data/notes/xeon_phi_server.md | 267 +++++++++++++++++++++++++++-------
 1 file changed, 216 insertions(+), 51 deletions(-)

diff --git a/data/notes/xeon_phi_server.md b/data/notes/xeon_phi_server.md
index 2288ef1..f4b869a 100644
--- a/data/notes/xeon_phi_server.md
+++ b/data/notes/xeon_phi_server.md
@@ -29,7 +29,7 @@ The information on this page includes:
   - Installation of Debian 10.9 (buster) root on encrypted ZFS mirror with
     automated snapshots and scrubs.
 
-  - (TODO) Porting the Intel kernel module to Linux kernel version 4.19.0.
+  - Porting the Xeon Phi kernel module to newer versions of the Linux kernel.
 
   - (TODO) Installing MPSS toolkit on Debian (or CentOS VM).
 
@@ -895,64 +895,229 @@ able to login to the *server* via the *endpoint*.
 
 ## Xeon Phi Kernel Module ##
 
-It appears that Linux kernel version 4.19.0 included with Debian 10.9 already
+It appears that Linux kernel version 4.19.181 included with Debian 10.9 already
 has some sort of in-tree kernel support for these Xeon Phi coprocessor cards as
-seen in the final lines of the following diagnostic output.
+seen in the final lines of the following diagnostic output. Also note that the
+card allocated an 8GB PCIe MMIO region, indicating that the 64-bit BAR setting
+in the BIOS is working as intended.
 
     root@frostburg:~ # lspci | grep -i Co-processor
     02:00.0 Co-processor: Intel Corporation Xeon Phi coprocessor 5100 series (rev 11)
     root@frostburg:~ # lspci -s 02:00.0 -vv
     02:00.0 Co-processor: Intel Corporation Xeon Phi coprocessor 5100 series (rev 11)
-            Subsystem: Intel Corporation Xeon Phi coprocessor 5100 series
-            Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
-            Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- SERR-
             Region 0: Memory at 21c00000000 (64-bit, prefetchable) [size=8G]
-            Region 4: Memory at cb900000 (64-bit, non-prefetchable) [size=128K]
-            Capabilities: [44] Power Management version 3
-                    Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot-,D3cold-)
-                    Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
-            Capabilities: [4c] Express (v2) Endpoint, MSI 00
-                    DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 <64us
-                            ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 75.000W
-                    DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
-                            RlxdOrd- ExtTag+ PhantFunc- AuxPwr- NoSnoop+
-                            MaxPayload 256 bytes, MaxReadReq 512 bytes
-                    DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
-                    LnkCap: Port #0, Speed 5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <4us, L1 unlimited
-                            ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp-
-                    LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk+
-                            ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
-                    LnkSta: Speed 5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
-                    DevCap2: Completion Timeout: Range AB, TimeoutDis+, LTR-, OBFF Not Supported
-                    DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
-                    LnkCtl2: Target Link Speed: 5GT/s, EnterCompliance- SpeedDis-
-                            Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
-                            Compliance De-emphasis: -6dB
-                    LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
-                            EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
-            Capabilities: [88] MSI: Enable- Count=1/16 Maskable- 64bit+
-                    Address: 0000000000000000 Data: 0000
-            Capabilities: [98] MSI-X: Enable+ Count=16 Masked-
-                    Vector table: BAR=4 offset=00017000
-                    PBA: BAR=4 offset=00018000
-            Capabilities: [100 v1] Advanced Error Reporting
-                    UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
-                    UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
-                    UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
-                    CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
-                    CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
-                    AERCap: First Error Pointer: 00, GenCap- CGenEn- ChkCap- ChkEn-
+
             Kernel driver in use: mic
             Kernel modules: mic_host
 
-However, as no virtual network device automatically showed up, and since the
-Intel manuals are plastered with warnings about using exact, sanctioned
-combinations of kernel module, MPSS software, and Phi firmware, I decided to
-avoid the kernel module included with the system and instead attempt porting
-the kernel module source code included with MPSS onto a newer Linux kernel. At
-a minimum, it appears the timer API has changed, as well as some utility
-functions related to requesting block interrupt assignments.
+However, since the Intel manuals are plastered with warnings about using exact,
+sanctioned combinations of kernel module, MPSS software, and Phi firmware, I
+decided to avoid the kernel module included with the system and instead attempt
+porting the kernel module source code included with MPSS onto a newer Linux
+kernel. Once I have everything operational and understand how it *should* work,
+then I can try the open-source driver.
+
+I have updated the Intel kernel driver to work with newer Linux kernels. My
+work is based upon the kernel source included with MPSS 3.8.6, the latest/last
+release from Intel. Since the Xeon Phi x100 series is EOL, I don't think Intel
+intends to release any more versions of MPSS. Check `README.md` in my
+[xeon-phi-kernel-module](https://git.subgeniuskitty.com/xeon-phi-kernel-module/.git)
+git repo for up-to-date information regarding kernel version compatibility.
+
+Before compiling the kernel module, verify that relevant kernel headers are
+installed.
+
+    % uname -a
+    Linux frostburg 4.19.0-16-amd64 #1 SMP Debian 4.19.181-1 (2021-03-19) x86_64 GNU/Linux
+    % dpkg -l | grep linux-header
+    ii  linux-headers-4.19.0-16-amd64   4.19.181-1         amd64  Header files for Linux 4.19.0-16-amd64
+    ii  linux-headers-4.19.0-16-common  4.19.181-1         all    Common header files for Linux 4.19.0-16
+    ii  linux-headers-amd64             4.19+105+deb10u11  amd64  Header files for Linux amd64 configuration (meta-package)
+
+Download and compile my updated version of the Intel kernel driver. Sample
+compilation output is included below.
+
+    % git clone git://git.subgeniuskitty.com/xeon-phi-kernel-module/
+    % cd xeon-phi-kernel-module/
+    % make clean all
+    make -C /lib/modules/4.19.0-16-amd64/build M=xeon-phi-kernel-module modules \
+        INSTALL_MOD_PATH=
+    make[1]: Entering directory '/usr/src/linux-headers-4.19.0-16-amd64'
+      CC [M]  xeon-phi-kernel-module/dma/mic_dma_lib.o
+      CC [M]  xeon-phi-kernel-module/dma/mic_dma_md.o
+      CC [M]  xeon-phi-kernel-module/host/acptboot.o
+      CC [M]  xeon-phi-kernel-module/host/ioctl.o
+      CC [M]  xeon-phi-kernel-module/host/linpm.o
+      CC [M]  xeon-phi-kernel-module/host/linpsmi.o
+      CC [M]  xeon-phi-kernel-module/host/linscif_host.o
+      CC [M]  xeon-phi-kernel-module/host/linsysfs.o
+      CC [M]  xeon-phi-kernel-module/host/linux.o
+      CC [M]  xeon-phi-kernel-module/host/linvcons.o
+      CC [M]  xeon-phi-kernel-module/host/linvnet.o
+      CC [M]  xeon-phi-kernel-module/host/micpsmi.o
+      CC [M]  xeon-phi-kernel-module/host/micscif_pm.o
+      CC [M]  xeon-phi-kernel-module/host/pm_ioctl.o
+      CC [M]  xeon-phi-kernel-module/host/pm_pcstate.o
+      CC [M]  xeon-phi-kernel-module/host/tools_support.o
+      CC [M]  xeon-phi-kernel-module/host/uos_download.o
+      CC [M]  xeon-phi-kernel-module/host/vhost/mic_vhost.o
+      CC [M]  xeon-phi-kernel-module/host/vhost/mic_blk.o
+      CC [M]  xeon-phi-kernel-module/host/vmcore.o
+      CC [M]  xeon-phi-kernel-module/micscif/micscif_api.o
+      CC [M]  xeon-phi-kernel-module/micscif/micscif_debug.o
+      CC [M]  xeon-phi-kernel-module/micscif/micscif_fd.o
+      CC [M]  xeon-phi-kernel-module/micscif/micscif_intr.o
+      CC [M]  xeon-phi-kernel-module/micscif/micscif_nm.o
+      CC [M]  xeon-phi-kernel-module/micscif/micscif_nodeqp.o
+      CC [M]  xeon-phi-kernel-module/micscif/micscif_ports.o
+      CC [M]  xeon-phi-kernel-module/micscif/micscif_rb.o
+      CC [M]  xeon-phi-kernel-module/micscif/micscif_rma_dma.o
+      CC [M]  xeon-phi-kernel-module/micscif/micscif_rma_list.o
+      CC [M]  xeon-phi-kernel-module/micscif/micscif_rma.o
+      CC [M]  xeon-phi-kernel-module/micscif/micscif_select.o
+      CC [M]  xeon-phi-kernel-module/micscif/micscif_smpt.o
+      CC [M]  xeon-phi-kernel-module/micscif/micscif_sysfs.o
+      CC [M]  xeon-phi-kernel-module/micscif/micscif_va_gen.o
+      CC [M]  xeon-phi-kernel-module/micscif/micscif_va_node.o
+      CC [M]  xeon-phi-kernel-module/vnet/micveth_dma.o
+      CC [M]  xeon-phi-kernel-module/vnet/micveth_param.o
+      LD [M]  xeon-phi-kernel-module/mic.o
+      Building modules, stage 2.
+      MODPOST 1 modules
+      CC      xeon-phi-kernel-module/mic.mod.o
+      LD [M]  xeon-phi-kernel-module/mic.ko
+    make[1]: Leaving directory '/usr/src/linux-headers-4.19.0-16-amd64'
+
+At this point you can manually load/install the new kernel module (`mic.ko`)
+which is found in the current directory, or execute `make install`. The latter
+command also installs the SCIF header file, as well as putting some config files
+under `/usr/local/etc/`. The information in those config files won't be picked
+up by the system (we will install configs in the correct location in a moment),
+but it is useful as a reference. Sample `make install` output is shown below.
+
+    # make install
+    make -C /lib/modules/4.19.0-16-amd64/build M=/home/ataylor/xeon-phi-kernel-module modules_install \
+        INSTALL_MOD_PATH=
+    make[1]: Entering directory '/usr/src/linux-headers-4.19.0-16-amd64'
+      INSTALL /home/ataylor/xeon-phi-kernel-module/mic.ko
+      DEPMOD  4.19.0-16-amd64
+    Warning: modules_install: missing 'System.map' file. Skipping depmod.
+    make[1]: Leaving directory '/usr/src/linux-headers-4.19.0-16-amd64'
+    install -d /usr/local/etc/sysconfig/modules
+    install mic.modules /usr/local/etc/sysconfig/modules
+    install -d /usr/local/etc/modprobe.d
+    install -m644 mic.conf /usr/local/etc/modprobe.d
+    install -d /usr/local/etc/udev/rules.d
+    install -m644 udev-mic.rules /usr/local/etc/udev/rules.d/50-udev-mic.rules
+    install -d /lib/modules/4.19.0-16-amd64
+    install -m644 Module.symvers /lib/modules/4.19.0-16-amd64/scif.symvers
+    install -d /usr/src/linux-headers-4.19.0-16-amd64/include/modules
+    install -m644 include/scif.h /usr/src/linux-headers-4.19.0-16-amd64/include/modules
+
+Create the file `/etc/modprobe.d/mic.conf` with the following contents,
+intended to accomplish two things. First, blacklist the in-tree MIC kernel
+module that shipped with our kernel, including all associated modules, and
+second, configure the Intel MIC kernel module which we just built and installed.
+The options shown are drawn from the defaults in
+`/usr/local/etc/modprobe.d/mic.conf`.
+
+    # Blacklist the in-tree kernel modules associated with the Knights Corner Xeon
+    # Phi so that we can load the Intel kernel module.
+
+    # These two modules depend on the various bus modules that follow.
+    blacklist mic_host
+    blacklist mic_x100_dma
+
+    blacklist cosm_bus
+    blacklist vop_bus
+    blacklist scif_bus
+    blacklist mic_bus
+
+    # ^^^------ Blacklisting the in-tree MIC kernel module.
+    # ==============================================================================
+    # vvv------ Configuring the Intel MIC kernel module.
+
+    # The following options apply to the Intel Many Integrated Core (MIC) driver.
+    # Unless otherwise noted, the value "1" enables the feature and "0" disables
+    # it.
+    #
+    # Option:      p2p
+    # Description: Enables use of SCIF interface peer to peer communication.
+    #
+    # Option:      p2p_proxy
+    # Description: Enables use of SCIF P2P Proxy DMA which converts DMA
+    #              reads into DMA writes for performance on certain Intel
+    #              platforms.
+    #
+    # Option:      reg_cache
+    # Description: Enables SCIF Registration Caching.
+    #
+    # Option:      huge_page
+    # Description: Enables SCIF Huge Page Support.
+    #
+    # Option:      watchdog
+    # Description: Enables SCIF watchdog for Lost Node detection.
+    #
+    # Option:      watchdog_auto_reboot
+    # Description: Configures behavior of MIC host driver upon detection of a lost
+    #              node. This option is a nop if watchdog=0. Setting value "1"
+    #              allows host driver to reboot node back to "online" state,
+    #              whereas value "0" only allows the host driver to reset the node
+    #              back to "ready" state, leaving the user responsible for rebooting
+    #              the node (or not).
+    #
+    # Option:      crash_dump
+    # Description: Enables uOS Kernel Crash Dump Captures.
+    #
+    # Option:      ulimit
+    # Description: Enables ulimit checks on max locked memory for scif_register.
+    #
+    options mic reg_cache=1 huge_page=1 watchdog=1 watchdog_auto_reboot=1 crash_dump=1 p2p=1 p2p_proxy=1 ulimit=0
+    options mic_host reg_cache=1 huge_page=1 watchdog=1 watchdog_auto_reboot=1 crash_dump=1 p2p=1 p2p_proxy=1 ulimit=0
+
+Finally, add the line `mic` to the file `/etc/modules-load.d/modules.conf`,
+instructing the system to load this kernel module on boot, then run `depmod` to
+ensure the system is aware of the new kernel module, followed by a reboot to
+verify everything works.
+
+After the system comes back up, verify that the module loaded with your desired
+options using the `systool` command, sample output below.
+
+    # systool -v -m mic
+    Module = "mic"
+
+      Attributes:
+        coresize            = "741376"
+        initsize            = "0"
+        initstate           = "live"
+        refcnt              = "0"
+        taint               = "OE"
+        uevent              =
+
+      Parameters:
+        crash_dump          = "Y"
+        huge_page           = "Y"
+        msi                 = "Y"
+        p2p_proxy           = "Y"
+        p2p                 = "Y"
+        pm_qos_cpu_dma_lat  = "-1"
+        psmi                = "N"
+        ramoops_count       = "4"
+        reg_cache           = "Y"
+        ulimit              = "N"
+        vnet                = "dma"
+        vnet_addr           = "0"
+        vnet_num_buffers    = "62"
+        watchdog_auto_reboot= "Y"
+        watchdog            = "Y"
+
+      Sections:
+
+
+--------------------------------------------------------------------------------
+
+
+## Intel MPSS ##
-- 
2.20.1
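
Note on the post-reboot verification step in the kernel module notes above: comparing the `systool` output against `/etc/modprobe.d/mic.conf` by eye can be scripted. The following is only a sketch and not part of the patch. The function name `check_mic_params` is hypothetical, and it assumes the loaded module exposes its parameters under `/sys/module/mic/parameters`, where boolean parameters normally read back as `Y`/`N` rather than `1`/`0`. Both paths are arguments so the function can be pointed at other files.

```shell
#!/bin/sh
# Sketch: compare the "options mic ..." line of a modprobe config against the
# values the loaded module reports. check_mic_params is a hypothetical helper.
#
# Usage: check_mic_params [conf_file] [params_dir]
check_mic_params() {
    conf=${1:-/etc/modprobe.d/mic.conf}
    params=${2:-/sys/module/mic/parameters}
    status=0

    # Keep only the first "options mic " line ("options mic_host" does not
    # match) and strip the leading "options mic" words.
    opts=$(sed -n 's/^options[[:space:]]\{1,\}mic[[:space:]]\{1,\}//p' "$conf" | head -n 1)

    for kv in $opts; do
        name=${kv%%=*}
        want=${kv#*=}

        # Boolean parameters are expected to read back as Y/N, so accept
        # either the literal value or its Y/N equivalent.
        case $want in
            1) want_sysfs=Y ;;
            0) want_sysfs=N ;;
            *) want_sysfs=$want ;;
        esac

        if [ ! -r "$params/$name" ]; then
            echo "MISSING $name"
            status=1
            continue
        fi

        have=$(cat "$params/$name")
        if [ "$have" = "$want" ] || [ "$have" = "$want_sysfs" ]; then
            echo "OK $name=$have"
        else
            echo "MISMATCH $name: want $want_sysfs, have $have"
            status=1
        fi
    done

    return "$status"
}
```

Calling `check_mic_params` with no arguments uses the default paths. A non-zero return status means at least one configured option was missing or differed from the running module.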