Linux Kernel 4.2 is here! The good and the bad for the Pogoplug v4/mobile

Arch released the 4.2 kernel last weekend, and I jumped on it right away to test.  Unfortunately, we have already hit a bug (actually two) in one of the big features I was looking forward to: DMA support in the crypto driver for Marvell SoCs.

  1. Non-DT (device tree) kernels, such as linux-kirkwood, showed the new driver, marvell_cesa (the replacement for mv_cesa), loading, but it did not show up as available to the kernel crypto API.  The only module listed as providing crypto was the kernel itself, and we should have been seeing marvell_cesa there.  I pointed this out to the Arch developers, and they removed marvell_cesa from linux-kirkwood and enabled mv_cesa again.
  2. DT (linux-kirkwood-dt) kernels suffered from a different problem.  Working with one of the developers at Arch, I found that a patch to the device tree for kirkwood SoCs was never merged in 4.2.  That patch provides the compatible property the driver needs in order to use the TDMA block.  Without it, the device tree still reported a property that caused the driver to disable DMA.  Hence, performance was no better than mv_cesa (in some cases worse, but more consistent).
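The check from item 1 can be sketched in a few lines of Python. This assumes the list of crypto providers comes from /proc/crypto, where each algorithm entry carries "driver" and "module" fields; the sample text is illustrative and was not captured from a real Pogoplug.

```python
from pathlib import Path


def parse_proc_crypto(text):
    """Split /proc/crypto-style text into one dict per algorithm entry."""
    entries = []
    current = {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current entry.
            if current:
                entries.append(current)
                current = {}
            continue
        key, _, value = line.partition(":")
        current[key.strip()] = value.strip()
    if current:
        entries.append(current)
    return entries


def providers(entries):
    """Return the set of 'module' values, i.e. who provides each algorithm."""
    return {entry.get("module", "") for entry in entries}


# Illustrative sample only: every algorithm is provided by the kernel itself.
SAMPLE = """\
name         : cbc(aes)
driver       : cbc(aes-generic)
module       : kernel
priority     : 100

name         : sha1
driver       : sha1-generic
module       : kernel
priority     : 0
"""

if __name__ == "__main__":
    path = Path("/proc/crypto")
    text = path.read_text() if path.exists() else SAMPLE
    for entry in parse_proc_crypto(text):
        print(entry.get("driver"), entry.get("module"), entry.get("priority"))
```

If the hardware driver had registered its algorithms, the module column would be expected to show something other than "kernel" for at least some entries.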

I am in the process of patching and compiling a new DT kernel to determine whether we can get the new marvell_cesa driver working with DMA on the Pogoplug v4/mobile.  If it works, hopefully we can carry this patch in linux-kirkwood-dt until it is merged upstream.
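To see which compatible strings a running DT kernel exposes for the crypto engine, something like the following sketch could be used. A device tree "compatible" property is a list of NUL-terminated strings, which is all the parsing relies on. The two compatible strings and the "crypto" node-name prefix are assumptions recalled from the kernel sources, not confirmed here, so verify them against your own tree.

```python
from pathlib import Path

# ASSUMPTION: the string the driver treats as DMA (TDMA) capable.
DMA_CAPABLE = "marvell,kirkwood-crypto"
# ASSUMPTION: the string the unpatched kirkwood device tree reports.
NO_DMA = "marvell,orion-crypto"


def parse_compatible(raw):
    """Decode a device tree 'compatible' property (NUL-separated strings)."""
    return [part.decode("ascii") for part in raw.split(b"\0") if part]


def advertises_dma(raw, dma_string=DMA_CAPABLE):
    """True if the property lists the (assumed) DMA-capable string."""
    return dma_string in parse_compatible(raw)


def find_crypto_nodes(root=Path("/proc/device-tree")):
    """Yield (node path, compatible list) for nodes named 'crypto...'."""
    if not root.exists():
        return
    for prop in root.rglob("compatible"):
        if prop.parent.name.startswith("crypto"):
            yield prop.parent, parse_compatible(prop.read_bytes())


if __name__ == "__main__":
    for node, strings in find_crypto_nodes():
        status = "DMA capable" if DMA_CAPABLE in strings else "no DMA"
        print(node, strings, status)
```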

For the issue with the non-DT kernel (linux-kirkwood), we may be out of luck without manually patching the marvell_cesa driver.  The authors did not intend to support DMA on non-DT kernels, although according to the messages they did intend the driver to work without DMA there.  In any case, if you don’t need crypto acceleration at all, or don’t need DMA for it, then there is no major reason to use the DT kernel.

Assuming I can get this working by patching the kirkwood device tree, I should then be able to provide some impressions on the performance of the driver on the Pogoplug v4/mobile.  Please stay tuned for that.

Now, the other bad.  There was a feature called DAX that I was eyeing to help improve disk I/O performance for certain types of operations.  While this was primarily intended for memory devices, where it wouldn’t actually make sense to have a page cache, the page cache itself was part of the problem on the Pogoplug.  Unfortunately, while this code appears to have been merged into Linux 4.2 for EXT4 and XFS, it is not available to us on ARM.  There is a small paragraph under Shortcomings in the DAX Documentation that explains why.

The DAX code does not work correctly on architectures which have virtually
mapped caches such as ARM, MIPS and SPARC.

There was an old EXT mount option called nobh, deprecated since roughly 2010, that might have been interesting to try.  Unfortunately, nobh is ignored in modern kernels, so it cannot be tested.  You can still attempt to mount an EXT file system with this option, and while it won’t return an error, you can see from the dmesg output that it was ignored.
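A small Python sketch for pulling the relevant lines out of the kernel log after such a mount attempt. It only filters on the option name and does not assume the exact wording of the kernel message; reading dmesg may require root on systems that restrict the kernel log.

```python
import shutil
import subprocess


def lines_mentioning(log_text, option="nobh"):
    """Return kernel log lines that mention the given mount option."""
    return [line for line in log_text.splitlines() if option in line]


if __name__ == "__main__":
    # Only run dmesg if it is actually installed on this system.
    if shutil.which("dmesg"):
        result = subprocess.run(["dmesg"], capture_output=True, text=True)
        for line in lines_mentioning(result.stdout):
            print(line)
```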

The silver lining in this post is that for those of us who were looking for DMA in the Marvell crypto driver, we may soon have a LUKS rock star in the Pogoplug v4/mobile.  It just didn’t work out of the box, and that is what we’re trying to fix.

Pogoplug v4 Performance Tuning Future Enhancements

There are some exciting changes coming to the Linux kernel for both ext4 and xfs file systems that have the potential to greatly increase I/O performance on the Pogoplug v4. As we know, one of the limits of the Pogoplug v4 is related to memory operation performance. Once the kernel has been released for linux-kirkwood, I will begin testing these changes for possible inclusion into my performance tuning guide.

I plan to make a posting about using LUKS with the Pogoplug to enhance security on your data. There is hardware crypto with the kirkwood that does work, but the mv_cesa driver has some design issues that affect performance on the kirkwood due to not using DMA. Recall the aforementioned issues with memory operations on this platform.

There is some evidence that the DMA issue with mv_cesa may be fixed in the future.  Stay tuned and if/when that comes, I’ll be sure to test it.  If they successfully implement DMA in mv_cesa, this should greatly increase throughput of the hardware crypto engine, speeding up LUKS, OpenSSL, and OpenSSH by extension.